Presto on AWS
What is Presto on AWS?
Before answering that question, let’s first define Presto. PrestoDB is an open-source distributed SQL query engine for running interactive analytic queries against many types of data sources. Presto was originally developed by Facebook and later donated to the Linux Foundation’s Presto Foundation. It was designed and written from the ground up for interactive analytics, and it approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.
Presto enables self-service ad-hoc analytics on large amounts of data. With Presto, you can query data where it lives, including Hive, Amazon S3, Hadoop, Cassandra, relational databases, NoSQL databases, or even proprietary data stores. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization.
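As a rough sketch of what such a cross-source query might look like, the Python snippet below builds a single SQL statement that joins a table in a Hive catalog (for example, data stored in Amazon S3) with a table in a MySQL catalog. The catalog, schema, table, and column names are hypothetical; the real names depend on how connectors are configured in your cluster. The optional `run_query` helper assumes the `presto-python-client` package is installed and that a coordinator is reachable at the given host and port.

```python
def build_federated_query(hive_table: str, mysql_table: str) -> str:
    """Return a SQL statement that joins tables from two Presto catalogs.

    Presto addresses tables as catalog.schema.table, so one statement can
    reference several data sources at once.
    """
    return (
        "SELECT c.region, SUM(o.total) AS revenue\n"
        f"FROM {hive_table} AS o\n"
        f"JOIN {mysql_table} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.region"
    )


def run_query(sql: str, host: str = "localhost", port: int = 8080):
    """Run the statement on a Presto coordinator (assumed to be reachable)."""
    import prestodb  # from presto-python-client; imported lazily on purpose

    conn = prestodb.dbapi.connect(
        host=host, port=port, user="analyst", catalog="hive", schema="default"
    )
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()


# Hypothetical names: the "hive" and "mysql" catalogs must exist in the cluster.
FEDERATED_SQL = build_federated_query("hive.sales.orders", "mysql.crm.customers")
```

Printing `FEDERATED_SQL` shows the statement you would submit; calling `run_query(FEDERATED_SQL)` only works against a live cluster with both catalogs configured.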
AWS and PrestoDB are a powerful combination. If you want to run Presto on AWS, it’s easy to spin up a managed Presto cluster with Amazon EMR through the AWS Management Console, the AWS CLI, or the Amazon EMR API.
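For the API route, a minimal sketch using boto3 might look like the following. The cluster name, release label, instance types, and node count are placeholder assumptions, and the two IAM role names are the EMR defaults, which may not exist in every account. The request is built as a plain dictionary first so it can be inspected before anything is launched.

```python
def presto_cluster_request(name: str, release_label: str, core_nodes: int) -> dict:
    """Build keyword arguments for the EMR RunJobFlow API with Presto installed."""
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Presto"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Coordinator", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Workers", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # Keep the cluster running so it can serve interactive queries.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default EMR role names; assumed to exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(request: dict) -> str:
    """Start the cluster and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request can be built without it

    emr = boto3.client("emr")
    return emr.run_job_flow(**request)["JobFlowId"]


# Placeholder values; choose a release label that is available in your region.
REQUEST = presto_cluster_request("presto-demo", "emr-6.15.0", core_nodes=2)
```

Calling `launch(REQUEST)` creates billable resources, so review the dictionary first and terminate the cluster when you are done.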
You can also give Ahana Cloud a try. Ahana is a managed service for Presto that takes care of the devops for you and provides everything you need to build your SQL Data Lakehouse using Presto.
Running Presto on AWS gives you the flexibility, scalability, performance, and cost-effective features of the cloud while allowing you to take advantage of Presto’s distributed query engine.
How does Presto work with AWS?
The quickest answer is that PrestoDB is the compute engine that sits on top of the storage layer of your SQL Data Lakehouse. In this case, the storage is Amazon S3.
Some AWS services work with Presto, notably Amazon EMR and Amazon Athena. Both are managed services that handle the integration, testing, setup, configuration, and cluster tuning for you, which makes them the most direct Amazon options for deploying Presto in the cloud. Both are widely used, but each comes with challenges, particularly around price-performance.
There are some differences between Presto on EMR and Athena. Amazon EMR lets you provision as many compute instances as you want within minutes. Amazon Athena lets you use Presto on AWS’s serverless platform, with no servers, virtual machines, or clusters to set up, manage, or tune.
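To illustrate the serverless side of that comparison, here is a hedged boto3 sketch that submits a query to Athena without provisioning any cluster. The database name, table name, and S3 output location are hypothetical placeholders.

```python
def athena_query_request(sql: str, database: str, output_location: str) -> dict:
    """Build keyword arguments for Athena's StartQueryExecution API."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def submit(request: dict) -> str:
    """Submit the query and return its execution ID (needs boto3 and credentials)."""
    import boto3  # imported lazily so the request can be built without it

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]


# Hypothetical database, table, and bucket names.
ATHENA_REQUEST = athena_query_request(
    "SELECT COUNT(*) FROM orders",
    database="sales",
    output_location="s3://example-athena-results/",
)
```

With EMR you size and launch the cluster yourself before running queries, whereas here the only inputs are the SQL text and a place to store results.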
Many Amazon Athena users run into issues, however, with scale and concurrent queries. Athena versus Presto is a common comparison, and many users evaluate both options. Learn more about those challenges and why users are moving to Ahana Cloud, SaaS for Presto on AWS.
To get started quickly with PrestoDB for your SQL Data Lakehouse on AWS, check out the services from Ahana Cloud. Ahana offers two versions of its solution: a Full-Edition and a Free-Forever Community Edition. Each option includes components of the SQL Lakehouse, as well as support from Ahana. Explore Ahana’s managed service for PrestoDB.
Presto was originally designed to run interactive queries against data warehouses, but it has since evolved into a unified SQL engine for open data lake analytics, covering both interactive and batch workloads.
Both AWS Athena and Ahana Cloud are based on the popular open-source Presto project. The biggest difference between the two is that Athena uses a serverless architecture, while Ahana Cloud is a managed service for Presto servers.
In this blog, we discuss AWS Athena vs Presto and some of the reasons, such as AWS pricing, why you might choose to deploy PrestoDB on your own instead of using the AWS Athena service.