Starburst vs Snowflake

The High Level Overview

Starburst and Snowflake both operate in the data analytics space, but they differ significantly in architecture and use cases. Starburst is the company behind Trino, a SQL query engine forked from Presto, whereas Snowflake is a cloud data warehouse that stores data in a proprietary format, although it uses cloud storage to provide elasticity. An alternative to Starburst is Ahana Cloud, a managed service for Presto.

Snowflake is more often considered an alternative to Redshift or other cloud data warehouses. These are typically used where workloads are predictable, or where organizations are willing to pay a premium for very fast query performance. Storing large volumes of semi-structured data in a data warehouse is typically expensive, and in these cases many organizations would consider a serverless alternative such as Ahana or Amazon Athena.

What is Starburst?

Starburst Enterprise is a data platform that leverages Trino, a fork of the original Presto project, as its query engine. It enables users to query, analyze, and process data from multiple sources. Starburst Galaxy is the cloud-based distribution of Starburst Enterprise.

What is Snowflake?

Snowflake is a cloud-based data warehouse that provides a SQL interface for querying, loading, and analyzing data. It also provides tools for data sharing, security, and governance.

What is Ahana Cloud?

Ahana Cloud is a managed service for Presto on AWS that gives you more control over your deployment than a serverless service such as Athena. It enables users to query, analyze, and process data from multiple sources.

Performance

We are defining performance as the ability to maintain fast query response times, and whether doing so requires a lot of manual optimization.

According to the vendor:

Below is a summary of the claims made in each vendor’s promotional materials related to their products’ performance. 

Starburst

Starburst’s website states that the product improves performance through Cached Views and pushdown capabilities. According to the vendor, these features deliver faster reads on Parquet files, better query plans, improved query performance, and reduced network traffic.

Snowflake

The Snowflake website claims that Snowflake’s multi-cluster resource isolation ensures reliable, fast performance for both ad-hoc and batch workloads, and that this performance holds even at larger scale.

Ahana Cloud

Ahana states that its multi-level data lake caching can give customers up to 30X query performance improvements. It also positions itself as offering better price-performance, particularly compared to Athena.

According to user reviews:

Below is a summary of the claims made in user reviews on websites such as G2, Reddit, and Stack Overflow, related to each tool’s performance. Users generally have positive opinions about Starburst’s performance, but find it difficult to customize and integrate with external databases. Snowflake’s performance is seen as an advantage, but users note that it is expensive for some use cases.

Starburst

– Several reviewers mention that Starburst is easy to deploy, configure, and scale.

– However, some reviews also mention negatives such as the need for complex customization to achieve optimal settings, difficulty in configuring certificates with Apache Ranger, and unclear error messages when trying to integrate with a Hive database.

Snowflake

– Many reviewers have generally positive opinions about Snowflake’s performance, although it’s clear from the reviews that this performance comes at a high cost. They mention positive aspects such as its ability to handle multiple users at once, instantaneous cluster scalability, fast query performance, and automatic compute scaling.

– Negative aspects mentioned include credit limits, expensive pricing for real-time use cases or large queries, cost of compute, time required to learn Snowflake’s scaling, and missing developer features.

Ahana Cloud

Ahana is similar to Athena in that you get fast and reliable data analytics at scale. Unlike Athena, you get more control over your Presto deployment, which avoids issues with concurrency limits and non-deterministic performance.

Scale

We are defining scale as how effectively a data tool can handle larger volumes of data and whether it is a good fit for more advanced use cases.

According to the vendor:

Below is a summary of the claims made in each vendor’s promotional materials related to their products’ scale. 

Starburst

The Starburst website claims that Starburst offers fast access to data stored on multiple sources, such as AWS S3, Microsoft Azure Data Lake Storage (ADLS), Google Cloud Storage (GCS), and more. It also provides unified access to Hive, Delta Lake, and Iceberg. It has features such as high availability, auto scaling with graceful scaledown, and monitoring dashboards.

Snowflake

The Snowflake website claims that Snowflake can instantly and cost-efficiently scale to handle virtually any number of concurrent users and workloads without impacting performance, and that Snowflake is built for high availability and high reliability, and designed to support effortless data management, security, governance, availability, and data resiliency.

Ahana Cloud

Ahana has built-in autoscaling, which automatically adjusts the number of worker nodes in an Ahana-managed Presto cluster. This keeps performance efficient and helps to avoid excess costs.

According to user reviews:

Below is a summary of the claims made in user reviews on websites such as G2, Reddit, and Stack Overflow, related to each tool’s scale. Overall, users think that both Starburst and Snowflake are capable of handling larger volumes of data, but they note other potential scalability issues.

Starburst

– Multiple reviews note that Starburst is capable of handling larger volumes of data, can join disparate data sources, and is highly configurable and scalable.

– Potential issues with scalability noted in the reviews include the need for manual tuning, reliance on technical resources on Starburst’s side, and the need to restart a catalog after adding a new one. Issues with log files and security configurations are also mentioned.

Snowflake

– Reviewers note that Snowflake is capable of handling larger volumes of data. They also mention that it has features such as cluster scalability, flexible pricing models, and integrations with third-party tools that can help with scaling. 

– However, some reviewers also mention potential limitations such as the lack of full functionality for unstructured data, the difficulty of pricing out the product, and the lack of command line tools for integration.

Usability, Ease of Use and Configuration

We define usability as whether a software tool is simple to install and operate, and how much effort users need to invest in order to accomplish their tasks.

According to the vendor:

Below is a summary of the claims made in each vendor’s promotional materials related to their products’ ease of use. 

Starburst

The Starburst website claims that Starburst is easy to use and can be connected to multiple data sources in just a few clicks. It provides features such as Worksheets, a workbench to run ad hoc queries and explore configured data sources, and Starburst Admin, a collection of Ansible playbooks for installing and managing Starburst Enterprise platform (SEP) or Trino clusters.

Snowflake

The Snowflake website claims that Snowflake is a fully managed service, which can help users automate infrastructure-related tasks; and that Snowflake provides robust SQL support and the Snowpark developer framework for Python, Java, and Scala, allowing customers to work with data in multiple ways.

Ahana Cloud

Ahana is a managed service, which means you get more control over your deployment than you would with Athena, while the service still takes care of the configuration parameters under the hood.

According to user reviews:

Below is a summary of the claims made in user reviews on websites such as G2, Reddit, and Stack Overflow, related to each tool’s usability. Overall, users find both Starburst and Snowflake easy to use, although they note some areas for improvement in each.

Starburst

– Several reviewers mention that Starburst is easy to deploy, configure, and scale, and that the customer support is helpful.

– However, some reviews also mention negatives such as the need for complex customization to achieve optimal settings, difficulty in configuring certificates with Apache Ranger, and unclear error messages when trying to integrate with a Hive database.

Snowflake

– Reviewers have mostly positive opinions about Snowflake’s ease of use and configuration. Several mention that Snowflake is easy to deploy, configure, and use, with many online training options available and no infrastructure maintenance required. 

– On the negative side, some reviews mention that there are too many tiers with their own credit limits, making it economically non-viable, and that the GUI for SQL Worksheets (Classic as well as Snowsight) could be improved. Additionally, some reviews mention that troubleshooting error messages and missing documentation can be challenging, and that they would like to see better POSIX support.

Cost

  • Starburst’s pricing is based on credits and cluster size. The examples given on the company’s pricing page suggest a minimum spend of a few thousand dollars per month.
  • Snowflake is priced on two consumption-based metrics, compute usage and data storage, with different tiers available. Storage costs begin at a flat rate of $23 USD per compressed TB of data stored per month, while compute costs are $0.00056 per second for each credit consumed on Snowflake Standard Edition, and $0.0011 per second for each credit consumed on Business Critical Edition.
  • Ahana Cloud uses pay-as-you-go pricing based on your consumption. A pricing calculator is available if you want to see what your deployment model would cost.

As we can see, Snowflake follows data warehouse pricing models, where users pay both for storage and compute. A recurring theme in many of the reviews is that costs are hard to control, especially for real-time or big data use cases. Starburst’s pricing can be difficult to predict based on the information available online, but the company is clearly leaning towards an enterprise pricing model that looks at annual commitment rather than pay-as-you-go.
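To make the storage-plus-compute model concrete, here is a minimal sketch that turns the Snowflake rates quoted above into a rough monthly estimate. It is illustrative only: the helper name `estimate_monthly_cost` is hypothetical, the storage rate is assumed to be billed per month, and compute is assumed to come from a single warehouse charged at the quoted per-second rate only while it is running. Larger warehouse sizes, discounts, and data transfer are not modeled.

```python
# Rates quoted in the Cost section (USD).
STORAGE_USD_PER_TB_MONTH = 23.0  # assumption: the $23 storage rate is per month
COMPUTE_USD_PER_SECOND = {
    "standard": 0.00056,
    "business_critical": 0.0011,
}


def estimate_monthly_cost(stored_tb, active_hours, edition="standard"):
    """Rough monthly estimate in USD: storage plus compute.

    Hypothetical helper for illustration. Assumes one warehouse billed
    at the quoted per-second rate only while it is running.
    """
    storage = stored_tb * STORAGE_USD_PER_TB_MONTH
    compute = active_hours * 3600 * COMPUTE_USD_PER_SECOND[edition]
    return storage + compute


if __name__ == "__main__":
    # Same workload priced on both editions: 10 TB stored, 100 compute hours.
    for edition in COMPUTE_USD_PER_SECOND:
        cost = estimate_monthly_cost(stored_tb=10, active_hours=100, edition=edition)
        print(f"{edition}: ${cost:,.2f}")
```

Under these assumptions, 10 TB of storage costs $230 on either edition, while 100 hours of compute adds roughly $202 on Standard and $396 on Business Critical. Compute quickly becomes the larger share of the bill as usage grows, which is consistent with the reviewers’ comments about costs being hard to control.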

Sources