The Open Data Lakehouse
Data warehouse workloads with better price performance and more flexibility
Combining the best of each: The Open Data Lakehouse
The Open Data Lakehouse brings the reliability and performance of the data warehouse together with the flexibility and better price performance of the data lake, enabling SQL and ML/AI use cases on your data.
Presto is the SQL query engine for the Open Data Lakehouse, enabling warehouse workloads on your data lake for better price performance.
India’s leading instant delivery service moves to Open Data Lakehouse
See how Blinkit powers 200K orders/day and deliveries in under 10 minutes
Benefits of a Data Lakehouse
Better Price Performance
More control of your compute costs with Presto
As businesses need more analytics on more data, compute costs can skyrocket in your data warehouse. With Presto for the Open Data Lakehouse, you get more control over your compute costs for better price performance.
Flexible and Efficient
Support your BI/Dashboarding and Data Science/AI/ML workloads
Do more with your unstructured and semi-structured data. The Open Data Lakehouse opens up more use cases by enabling you to query all types of data and run your AI/ML frameworks on big data in a flexible and efficient way.
No vendor lock-in, no proprietary data formats
Store data in open formats (Apache Parquet, Apache ORC, and more) so you can use any compute engine. Leverage open source technologies (Presto, TensorFlow, and more) to avoid lock-in.
Open Data Lakehouse Components
Presto is the open source SQL query engine for the Data Lakehouse. It enables ad hoc analytics on your data to power your dashboarding and reporting needs. Query data where it lives, with no need to migrate it to proprietary data formats.
Keeping data updated in a data warehouse can be challenging. It typically requires constant ETL from source to destination, which adds time, cost, and duplicate copies of data. Transaction management with technologies like Apache Hudi, Apache Iceberg, or Delta Lake enables incremental data ingestion, change capture for inserts and deletions, and ACID transactions.
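To make the idea concrete, here is a minimal, purely conceptual Python sketch of incremental ingestion: a batch of upserts and deletes is applied to a copy of the current table snapshot and published all-or-nothing. It does not use the Apache Hudi, Apache Iceberg, or Delta Lake APIs. The names `Change`, `Table`, and `commit`, and the sample order rows, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    """One change record; a hypothetical shape, not a real table-format API."""
    op: str                      # "upsert" or "delete"
    key: str                     # primary key of the affected row
    row: Optional[dict] = None   # new row contents (upserts only)


@dataclass
class Table:
    """In-memory stand-in for a lakehouse table with versioned snapshots."""
    snapshots: List[Dict[str, dict]] = field(default_factory=lambda: [{}])

    @property
    def current(self) -> Dict[str, dict]:
        return self.snapshots[-1]

    def commit(self, changes: List[Change]) -> int:
        """Apply a batch atomically and return the new snapshot version."""
        staged = dict(self.current)  # work on a copy; readers keep the old snapshot
        for change in changes:
            if change.op == "upsert":
                if change.row is None:
                    raise ValueError(f"upsert for {change.key!r} has no row")
                staged[change.key] = change.row
            elif change.op == "delete":
                staged.pop(change.key, None)
            else:
                raise ValueError(f"unknown op: {change.op!r}")
        # Only reached if every change was valid: publish the new snapshot.
        self.snapshots.append(staged)
        return len(self.snapshots) - 1


orders = Table()
orders.commit([
    Change("upsert", "o1", {"status": "placed"}),
    Change("upsert", "o2", {"status": "placed"}),
])
orders.commit([
    Change("upsert", "o1", {"status": "delivered"}),  # update an existing row
    Change("delete", "o2"),                            # remove a row
])
print(orders.current)  # {'o1': {'status': 'delivered'}}
```

Because a batch is staged on a copy and only appended once every change has been validated, a failed batch leaves the published snapshot untouched. That is the property the table formats above provide at data lake scale, with very different machinery underneath.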
Security & Governance
Bring the security and governance of data warehouses to the data lakehouse with technologies like AWS Lake Formation or Apache Ranger. You choose whichever works best for your needs. Define access control policies down to the row level, enabling you to handle sensitive data.
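As a rough illustration of what a row-level policy means, the Python sketch below maps each role to a predicate and returns only the rows that predicate allows. It is a conceptual model only and does not reflect how AWS Lake Formation or Apache Ranger define or enforce policies. The role names, the `POLICIES` table, the `visible_rows` helper, and the sample rows are all made up for this example.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
RowFilter = Callable[[Row], bool]

# Hypothetical policies: each role maps to a predicate deciding which rows it may see.
POLICIES: Dict[str, RowFilter] = {
    "regional_analyst_in": lambda row: row["region"] == "IN",
    "auditor": lambda row: True,
}


def visible_rows(role: str, rows: Iterable[Row]) -> List[Row]:
    """Return only the rows the role's policy allows; unknown roles see nothing."""
    allowed = POLICIES.get(role)
    if allowed is None:
        return []  # deny by default
    return [row for row in rows if allowed(row)]


orders = [
    {"order_id": 1, "region": "IN", "amount": 250},
    {"order_id": 2, "region": "US", "amount": 900},
]
print([r["order_id"] for r in visible_rows("regional_analyst_in", orders)])  # [1]
print([r["order_id"] for r in visible_rows("auditor", orders)])              # [1, 2]
print(visible_rows("guest", orders))                                         # []
```

Denying access by default for unknown roles is a deliberate choice in this sketch: a missing policy should hide sensitive rows rather than expose them.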
The catalog describes all of the data that’s stored in your system to make it usable, so you can analyze it to create dashboards and reports. Having a catalog like AWS Glue or Hive Metastore, or an open source option like Amundsen, in your Open Data Lakehouse is critical.
Making it easy: SaaS for Presto
Get a powerful SQL query engine as SaaS for your data lakehouse. Ahana is a managed service for Presto that’s simple to use and cost-effective.
Getting Started with your Open Data Lakehouse in AWS
Ready to start building your Open Data Lakehouse in AWS? Here are some resources to get you started.
Schedule a Demo
We’ll show you how to migrate data warehouse workloads or build an open data lakehouse from scratch.
Customer use case
See why Blinkit moved from the data warehouse to the Open Data Lakehouse
See how to build an Open Data Lakehouse stack in our on-demand webinar.
Free Presto trial – Get Started with the Open Data Lakehouse in AWS
No credit card required