Starburst vs. Athena: Evaluating different Presto vendors

The High Level Overview – Athena vs. Starburst

Starburst and Amazon Athena are both query engines used to query data from object stores such as Amazon S3. Athena is a serverless service based on open-source Presto technology, while Starburst is the corporate entity behind a fork of Presto called Trino. An alternative to these offerings is Ahana Cloud, a managed service for Presto.

All of these tools will cover similar ground in terms of use cases and workloads. Understanding the specific limitations and advantages of each tool will help you decide which one is right for you.

What is Starburst?
Starburst Enterprise is a data platform that leverages Trino, a fork of the original Presto project, as its query engine. It enables users to query, analyze, and process data from multiple sources. Starburst Galaxy is the cloud-based distribution of Starburst Enterprise.
What is Amazon Athena?
Amazon Athena is a serverless, interactive query service that makes it easy to analyze data stored in Amazon S3 using standard SQL.
What is Ahana Cloud?
Ahana Cloud is a managed service for Presto on AWS that gives you more control over your deployment. Typically users see up to 5x better price performance as compared to Athena.

Performance

We are defining performance as the ability to maintain fast query response times, and whether doing so requires a lot of manual optimization.

According to the vendor:

Below is a summary of the claims made in each vendor’s promotional materials related to their products’ performance. 

Starburst
Starburst’s website mentions that the product provides enhanced performance by using Cached Views and pushdown capabilities. These features allow for faster read performance on Parquet files, the ability to generate optimal query plans, improved query performance and decreased network traffic.
Athena
The AWS website mentions that Athena is optimized for fast performance with Amazon S3 and automatically executes queries in parallel for quick results, even on large datasets. 
Ahana Cloud
Ahana has multi-level data caching with RaptorX which includes one-click caching built-in to every Presto cluster. This can give you up to 30X query performance improvements.

According to user reviews:

Below is a summary of the claims made in user reviews on websites such as G2, Reddit, and Stack Overflow, related to each tool’s performance. Users generally regard Starburst and Athena as having good performance, but note that Starburst may require more customization and technical expertise, and Athena may need more optimization and sometimes has concurrency issues.

Starburst
Reviewers who were happy with Starburst’s performance mentioned that it provides quick and efficient access to data, is able to handle large volumes of data and concurrent queries, and has good pluggability, portability, and parallelism. Some reviewers noted that tuning can be cumbersome, and that storing metadata in the Hive metastore creates overheads which can slow down performance. Others mentioned the cost associated with customization, the need for technical expertise to deploy Starburst Enterprise, and occasional performance issues when dealing with large datasets.
Athena
Many reviewers see Athena as fast and reliable, and capable of handling large volumes of data. Negative aspects mentioned include Athena not supporting stored procedures, the possibility of performance issues if too many partitions are used, concurrency issues, inability to scale the service, and the need to optimize queries and data.

Scale

We are defining scale as how effectively a data tool can handle larger volumes of data and whether it is a good fit for more advanced use cases.

According to the vendor:

Below is a summary of the claims made in each vendor’s promotional materials related to their products’ scale. 

Starburst
The Starburst website claims that Starburst offers fast access to data stored on multiple sources, such as AWS S3, Microsoft Azure Data Lake Storage (ADLS), Google Cloud Storage (GCS), and more. It also provides unified access to Hive, Delta Lake, and Iceberg. It has features such as high availability, auto scaling with graceful scaledown, and monitoring dashboards.
Athena
The AWS website claims that Athena automatically executes queries in parallel, so results are fast, even with large datasets and complex queries. Athena is also highly available and executes queries using compute resources across multiple facilities, automatically routing queries appropriately if a particular facility is unreachable. Additionally, Athena integrates out-of-the-box with AWS Glue, which allows users to create a unified metadata repository across various services, crawl data sources to discover data and populate their Data Catalog with new and modified table and partition definitions, and maintain schema versioning.
Ahana Cloud
Ahana has an autoscaling feature that helps you manage your Presto clusters by automatically adjusting the number of worker nodes in the Ahana-managed Presto cluster. You can read the docs for more information.

According to user reviews:

Below is a summary of the claims made in user reviews on websites such as G2, Reddit, and Stack Overflow, related to each tool’s scale. Users see both tools as capable of operating at scale, but both have limitations in this respect as well.

Starburst
Multiple reviews note that Starburst Data is capable of handling larger volumes of data, can join disparate data sources, and is highly configurable and scalable. Potential issues with scalability noted in the reviews include the need for manual tuning, reliance on technical resources on Starburst’s side, and the need to restart a catalog after adding a new one. Issues with log files and security configurations are also mentioned.
Athena
Some reviews suggest that Athena is well-suited for larger volumes of data and more advanced use cases, with features such as data transfer speed and integration with Glue being mentioned positively. However, other reviews suggest that Athena may not be able to handle larger volumes of data effectively due to issues such as lack of feature parity with Presto, lack of a standard relational table type, and difficulty in debugging queries.

Usability, Ease of Use and Configuration

We define usability as whether a software tool is simple to install and operate, and how much effort users need to invest in order to accomplish their tasks. We assume that data tools that use familiar languages and syntaxes such as SQL are easier to use than tools that require specialized knowledge.

According to the vendor:

Below is a summary of the claims made in each vendor’s promotional materials related to their products’ ease of use. 

Starburst
The Starburst website claims that Starburst is easy to use and can be connected to multiple data sources in just a few clicks. It provides features such as Worksheets, a workbench to run ad hoc queries and explore configured data sources, and Starburst Admin, a collection of Ansible playbooks for installing and managing Starburst Enterprise platform (SEP) or Trino clusters.
Athena
The AWS website claims that Athena requires no infrastructure or administration setup. Athena is built on Presto, so users can run queries against large datasets in Amazon S3 using ANSI SQL.
Ahana Cloud
Ahana gives you Presto simplified – no installation, no AWS AMIs or CFTs, and no configuration needed. You can be running in 30 minutes, you get a built-in catalog and one-click integration to your data sources, and it’s all cloud native running on AWS EKS.

According to user reviews:

Overall, users have found Starburst and Athena to be relatively easy to use, but have also mentioned some drawbacks related to complex customization, lack of features, and difficulty debugging.

Starburst
Several reviewers mention that Starburst is easy to deploy, configure, and scale, and that the customer support is helpful. However, some reviews also mention negatives such as the need for complex customization to achieve optimal settings, difficulty in configuring certificates with Apache Ranger, and unclear error messages when trying to integrate with a Hive database.
Athena
Reviewers are happy with the ease of deploying Athena in their AWS account, and mention that setting up tables and views and writing queries is simple. However, some reviews also mention drawbacks such as the lack of support for stored procedures, and the lack of feature parity between Athena and Presto. Another issue that comes up is that debugging queries can be difficult due to unclear error messages.

Cost

  • Athena charges a flat price of $5 per terabyte of data scanned. Costs can be reduced by compressing and partitioning data.
  • Starburst’s pricing is more complex, as it is based on credits and cluster size. The examples given on the company’s pricing page suggest a minimum spend of a few thousand dollars per month.
  • Ahana Cloud is pay-as-you-go through your AWS bill based on the compute you use. There’s a pricing calculator you can use to get an idea.
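To make the Athena figure above concrete, here is a minimal sketch of the per-scan arithmetic in Python. It assumes only the flat $5 per terabyte rate quoted in this article. The function names, the 4x compression ratio, and the 25% partition fraction are hypothetical values chosen for illustration, not measurements or vendor figures.

```python
# Flat rate quoted in this article; check current AWS pricing before relying on it.
PRICE_PER_TB_USD = 5.0


def athena_scan_cost(tb_scanned: float, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Estimated cost in USD for a query that scans `tb_scanned` terabytes."""
    if tb_scanned < 0:
        raise ValueError("tb_scanned must be non-negative")
    return tb_scanned * price_per_tb


def reduced_scan(tb_raw: float, compression_ratio: float = 1.0,
                 partition_fraction: float = 1.0) -> float:
    """Terabytes actually scanned after compression and partition pruning.

    compression_ratio: raw size divided by compressed size (hypothetical input).
    partition_fraction: share of partitions the query has to read, from 0 to 1.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    if not 0 <= partition_fraction <= 1:
        raise ValueError("partition_fraction must be between 0 and 1")
    return tb_raw / compression_ratio * partition_fraction


# Illustrative scenario: a query over 10 TB of raw, unpartitioned data,
# versus the same data compressed 4x with the query reading 25% of partitions.
baseline = athena_scan_cost(10)
optimized = athena_scan_cost(reduced_scan(10, compression_ratio=4, partition_fraction=0.25))
print(f"baseline: ${baseline:.2f}, optimized: ${optimized:.2f}")
```

Under these assumed numbers the scan shrinks from 10 TB to 0.625 TB, which is why compression and partitioning have such a direct effect on a bill that is driven by data scanned.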

While the specifics of your cloud bill will eventually depend on the way you use these tools and the amount of data you process in them, Athena and Ahana Cloud have a simpler cost structure and offer a more streamlined on-demand model.

Need a better alternative to Athena and Starburst?

Get a demo of Ahana to learn how we deliver superior price/performance, control and usability for Presto.
