
HANDS-ON VIRTUAL LAB FOR DATA ENGINEERS AND ARCHITECTS:

Building an Open Data Lakehouse on AWS with Presto and Apache Hudi


Thursday, March 30 at 8am PT | 11am ET | 4pm GMT | 9:30pm IST

Presented by: Ahana and Onehouse

Come prepared to have your video on while following along with the instructor – this is an interactive session and we encourage participation from everyone attending!

You may be familiar with the Data Lakehouse, an emerging architecture that brings the flexibility, scale and cost management benefits of the data lake together with the data management capabilities of the data warehouse. In this workshop, we’ll get hands-on building an Open Data Lakehouse – an approach that brings open technologies and formats to your lakehouse. 

For the purpose of this workshop, we’ll use Presto for the open source SQL query engine, Apache Hudi for ACID transactions, and AWS S3 for the data lake. You’ll get hands-on with Presto and Hudi. We’ll show you how to deploy each, connect them, set up your Hudi tables for ACID transactions, and finally run queries on your S3 data.

By the end, you should be well-versed in Presto and Hudi and have the building blocks created for your own Open Data Lakehouse.

Course Outline:

  • Introduction to the Open Data Lakehouse, including what Presto (query engine) and Apache Hudi (transaction layer) are
  • Deploying Presto in AWS with Ahana Cloud
  • Querying S3 with Presto
  • Integrating Hudi with Presto
  • Inserting data into Hudi and querying your Hudi table with Presto
  • Future roadmap – additional Hudi support coming to Presto, such as ACID compliance and table versioning

By the end of this lab, you’ll know how to run queries with Presto and Hudi to optimize your AWS S3 data lake.
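
To give a rough feel for the "insert into Hudi, then query with Presto" steps in the outline, here is a minimal sketch rather than the lab's actual material. It assumes data is written with Hudi's Spark datasource and that the table is exposed to Presto through a Hive catalog. The table name, field names, catalog, and schema are all hypothetical placeholders.

```python
def hudi_write_options(table, record_key, partition_field, precombine_field):
    """Build Spark datasource options for an upsert into a Hudi table."""
    return {
        "hoodie.table.name": table,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
    }


def presto_count_query(catalog, schema, table):
    """Build a Presto SQL statement that counts rows in the Hudi table."""
    return f"SELECT count(*) AS trips FROM {catalog}.{schema}.{table}"


# Hypothetical names, for illustration only.
options = hudi_write_options("trips", "trip_id", "city", "updated_at")
query = presto_count_query("hive", "default", "trips")

# With a Spark session and a DataFrame `df`, the write step would look like:
#   df.write.format("hudi").options(**options).mode("append").save("s3://<your-bucket>/trips")
# The query string would then be run from the Presto CLI or any Presto client.
print(query)
```

The record key identifies a row, and the precombine field decides which version wins when two incoming records share a key, which is what lets repeated writes behave as upserts rather than blind appends.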

Instructors

Nadine Farah

Head of Developer Relations, Onehouse


Rohan Pednekar

Product Manager, Ahana
