In this brief post, we’ll discuss the 5 main reasons why data platform engineers decide to move their data analytics workloads from Amazon Athena to Ahana Cloud for Presto.
While AWS Athena’s serverless architecture means users don’t need to scale, provision, or manage any servers, the serverless approach comes with trade-offs in performance and pricing, along with several technical limitations.
What are AWS Athena and Ahana Cloud for Presto?
Presto is an open source distributed SQL query engine designed for petabyte-scale interactive analytics against a wide range of data sources, from your data lake to traditional relational databases.
Ahana Cloud for Presto provides a fully managed Presto cloud service in AWS, with support for a wide range of native Presto connectors, IO caching, and configurations optimized for your workload.
AWS Athena is a serverless interactive query service built on Presto that developers use to query AWS S3-based data lakes and other data sources.
While there are some benefits to AWS Athena, let’s look at why the data engineers we talk to migrate to Ahana Cloud.
1. Need for Concurrency & Partitions
AWS Athena’s maximum concurrency is limited to 20-25 queries depending on the region, and users must request quota increases to go beyond that. Some users even observe a maximum concurrency nearer to 3. Athena users can only run up to 5 queries simultaneously for each account, and Athena restricts each account to 100 databases. Athena’s partition limit is 20K partitions per table when using the Hive catalog. These limitations pose challenges if a complex query is queued in front of more latency-sensitive work, such as serving results to a user-facing dashboard.
Ahana Cloud, on the other hand, runs as many queries as you need, when you need them. You have full transparency into what’s going on under the hood. Concurrency is not capped by a service quota, because you can simply scale the number of distributed workers.
2. Need for Performance predictability
When using AWS Athena, you don’t control the number of underlying servers that AWS allocates to run your queries. Because the Athena service is shared, its performance characteristics can change frequently and substantially. One minute there may be 50 servers, the next only 10.
With Ahana Cloud, because you have full control of your deployment, performance is consistent and often faster than Athena.
3. Need for more Data source connectors
AWS Athena doesn’t use native Presto connectors, so you’ll need to use the limited options AWS provides or build your own with the AWS Lambda service.
In Ahana Cloud, you can define and manage data sources in the SaaS console, and attach or detach them from any cluster with the click of a button. You can connect your existing Amazon database services such as RDS for MySQL, RDS for PostgreSQL, Elasticsearch, and Amazon Redshift.
4. Need for control over the underlying engine
AWS Athena’s serverless nature may make it easy to use, but it also means users cannot add sessions or resources, and have little insight into the engine when debugging.
In Ahana Cloud, however, you control the number of Presto nodes in your deployment, and you choose the node instance types for optimum price/performance. The dashboards give you full visibility into performance and query management, which makes those choices easier.
5. Need for Price predictability
AWS Athena billing is per query, based on the volume of data scanned, which makes it inefficient and expensive at scale. Because costs are hard to control and predict, this leads to “bill shock” for some users. If one query scans one terabyte, that’s $5 for a query that may run for only a few seconds.
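The per-query arithmetic can be sketched in a few lines of Python. This is a rough estimate, not a billing calculator: it uses the $5 per terabyte figure quoted above, assumes a decimal terabyte (10^12 bytes), and ignores any minimum charge or rounding the service may apply. The function names are hypothetical.

```python
# Rough estimate of per-query cost from the volume of data scanned.
# Assumptions: $5 per TB (the figure used in this post), decimal terabytes,
# no minimum charge and no rounding.

BYTES_PER_TB = 10**12
PRICE_PER_TB_USD = 5.0


def scan_cost_usd(bytes_scanned: int, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Cost in USD of one query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / BYTES_PER_TB * price_per_tb


def monthly_cost_usd(bytes_per_query: int, queries_per_day: int, days: int = 30) -> float:
    """Cost in USD of running the same query repeatedly for `days` days."""
    return scan_cost_usd(bytes_per_query) * queries_per_day * days


if __name__ == "__main__":
    # One query scanning 1 TB.
    print(scan_cost_usd(10**12))
    # Hypothetical dashboard: 500 queries a day, 200 GB scanned each.
    print(monthly_cost_usd(200 * 10**9, 500))
```

Under these assumptions, a hypothetical dashboard issuing 500 queries a day at 200 GB each comes to roughly $15,000 a month, a total that is hard to anticipate from individual query costs of about $1.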
Ahana is cloud-native and runs on Amazon Elastic Kubernetes Service (EKS), helping you reduce operational costs with its automated cluster management, increased resilience, speed, and ease of use. Plus, Ahana uses pay-as-you-go pricing: you only pay for what you use. Using the same example, $5 lets you run a 6-node cluster of r5.xlarge instances for an hour, enough for hundreds of queries instead of just one.
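A minimal sketch of the break-even point between the two pricing models, using only the figures in this post: $5 per terabyte scanned for per-query billing, and roughly $5 per hour for the example cluster. Both rates are parameters, the function names are hypothetical, and the sketch assumes the cluster can actually process the given volume within the hour, which depends on the workload.

```python
# Compare per-scan billing with a fixed hourly cluster cost.
# Assumptions: $5 per TB scanned and about $5 per cluster-hour (both taken
# from this post); cluster capacity is not modelled.


def break_even_queries_per_hour(
    cluster_cost_per_hour: float,
    tb_per_query: float,
    price_per_tb: float = 5.0,
) -> float:
    """Queries per hour at which per-scan billing equals the cluster cost."""
    if tb_per_query <= 0 or price_per_tb <= 0:
        raise ValueError("tb_per_query and price_per_tb must be positive")
    return cluster_cost_per_hour / (tb_per_query * price_per_tb)


def cheaper_option(
    queries_per_hour: float,
    tb_per_query: float,
    cluster_cost_per_hour: float = 5.0,
    price_per_tb: float = 5.0,
) -> str:
    """Return which model costs less for one hour of the given workload."""
    per_scan_cost = queries_per_hour * tb_per_query * price_per_tb
    if per_scan_cost > cluster_cost_per_hour:
        return "cluster"
    if per_scan_cost < cluster_cost_per_hour:
        return "per-scan"
    return "equal"


if __name__ == "__main__":
    # Queries that scan 10 GB (0.01 TB) each.
    print(break_even_queries_per_hour(5.0, 0.01))
    print(cheaper_option(300, 0.01))
    print(cheaper_option(10, 0.01))
```

With queries that scan 10 GB each, the break-even is about 100 queries per hour. Above that rate the fixed-cost cluster is cheaper under these assumptions, and below it per-scan billing is.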
Summary
AWS Athena’s serverless architecture makes it really easy to get started. However, the service has many limitations that can cause problems, and many data engineering teams have spent hours trying to diagnose them. Because of these limitations, AWS Athena can run slowly and increase operational costs.
Ahana Cloud for Presto is the first fully integrated, cloud-native managed service for Presto. It makes it simpler for cloud and data platform teams of all sizes to provide self-service SQL analytics to their data analysts and scientists, without the limits of AWS Athena.
Ahana Cloud is available in AWS. You can sign up and start using our service today for free.