February 2022 Virtual Lab


Architecting your Data Lake for Fast Queries


In this 90-minute, hands-on lab, you will learn best practices for optimizing your AWS S3-based data lake and for querying and analyzing it with better performance.

Tuesday, February 22 | 10am PT

Presented by: Ahana

In this 90-minute, hands-on virtual lab, we’ll go through best practices for optimizing your AWS S3-based data lake for better query throughput and performance.

Come prepared to have your video on while following along with the instructor – this is an interactive session and we encourage participation from everyone attending!

This event has ended.


What you’ll learn:

  • A quick overview of the open data lake analytics stack
  • Using Presto as the query engine on your data lake
  • Identifying when a query can benefit from:
    • A different partition column
    • Horizontal cluster scaling
    • Changing data catalog properties
  • Increasing query performance by optimizing storage
    • Columnar storage format
    • Understanding when/how to partition data
  • RaptorX and Caching
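
To give a feel for why the choice of partition column matters, here is a minimal sketch in plain Python. It is not Presto or Ahana code: the sample rows, the column names, and both helper functions are hypothetical, and the only thing it borrows from the real world is the Hive-style `column=value/` prefix layout commonly used for partitioned tables on S3. It shows that a filter on the partition column lets the engine skip every other prefix, while the same filter against a layout partitioned by a different column has to read all of them.

```python
from collections import defaultdict

# Hypothetical sample rows: (event_date, region, payload).
ROWS = [
    ("2022-02-20", "us-west", "a"),
    ("2022-02-20", "us-east", "b"),
    ("2022-02-21", "us-west", "c"),
    ("2022-02-22", "us-west", "d"),
    ("2022-02-22", "eu-west", "e"),
]


def partition_rows(rows, column_index, column_name):
    """Group rows under Hive-style 'name=value/' prefixes."""
    layout = defaultdict(list)
    for row in rows:
        layout[f"{column_name}={row[column_index]}/"].append(row)
    return dict(layout)


def prefixes_to_scan(layout, column_name, wanted_value):
    """Return only the prefixes a filter on the partition column must read."""
    target = f"{column_name}={wanted_value}/"
    return [prefix for prefix in layout if prefix == target]


by_date = partition_rows(ROWS, 0, "event_date")
by_region = partition_rows(ROWS, 1, "region")

# A filter on event_date can prune a date-partitioned layout...
pruned = prefixes_to_scan(by_date, "event_date", "2022-02-22")

# ...but cannot prune a region-partitioned one, so every prefix is read.
unpruned = list(by_region)

print(len(pruned), "of", len(by_date), "prefixes read (partitioned by date)")
print(len(unpruned), "of", len(by_region), "prefixes read (partitioned by region)")
```

If most of your queries filter on date, the first layout reads far less data; if they filter on region, the second one wins. That trade-off is the kind of decision the partitioning portion of the lab is about.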

Course Outline:

  • Introduction to Ahana Cloud for Presto
  • Work on optimization
    • Partitioning
    • Compression
    • Formats
    • Hive connector tunables – caching, session properties
  • How to do horizontal cluster scaling

By the end of this lab, you’ll know how to optimize your AWS S3 data lake and run fast, efficient Presto queries against it.