Ahana Cloud for Presto Versus Amazon EMR

In this brief post, we’ll discuss some of the benefits of Ahana Cloud over Amazon Elastic MapReduce (EMR). While EMR offers a choice of several big data compute frameworks, that flexibility comes with an operational and configuration burden. For low-latency interactive querying on big data that just works, Ahana Cloud for Presto offers a much lower operational burden along with Presto-specific optimizations.

Presto is an open source distributed SQL query engine designed for petabyte-scale interactive analytics against a wide range of data sources, from your data lake to traditional relational databases. In fact, you can run federated queries across your data sources. Developed at Facebook, Presto is supported by the Presto Foundation, an independent nonprofit organization under the auspices of the Linux Foundation. Presto is used by leading technology companies, such as Facebook, Twitter, Uber, and Netflix.
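To make the idea of a federated query concrete, here is a minimal sketch in Python that only builds the SQL text. Presto addresses tables as catalog.schema.table, which is what lets a single statement join across sources. The catalog names (hive, mysql), the schemas, tables, and columns, and both helper functions are hypothetical and chosen purely for illustration.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_revenue_query(lake_catalog: str = "hive", db_catalog: str = "mysql") -> str:
    """Build one SQL statement that joins a data lake table with a relational table.

    All catalog, schema, table, and column names here are hypothetical.
    """
    orders = qualified(lake_catalog, "sales", "orders")      # e.g. files in a data lake
    customers = qualified(db_catalog, "crm", "customers")    # e.g. a relational table
    return (
        "SELECT c.region, SUM(o.amount) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.region\n"
        "ORDER BY revenue DESC"
    )
```

Calling federated_revenue_query() returns the statement as a string; running it against a cluster would require a Presto client and catalogs with those names, which this sketch does not assume.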

Amazon EMR is a big data platform hosted in AWS. EMR allows you to provision a cluster with one or more big data technologies, such as Hadoop, Apache Spark, Apache Hive, and Presto. Ahana Cloud for Presto is the easiest cloud-native managed service for Presto, empowering data teams of all sizes. As a focused Presto solution, here are a few of Ahana Cloud’s benefits over Amazon EMR:

Less configuration. Born of the Hadoop era, Presto has many configuration parameters, spread across several files, that need to be set and tuned correctly. With EMR, you have to configure these yourself. With Ahana Cloud, we tune more than 200 parameters out of the box, so when you spin up a cluster, you get excellent query performance from the get-go. Ahana Cloud also provides an Apache Superset sandbox out of the box, so administrators can validate connecting to, querying, and visualizing their data.

Easy-to-modify configuration. Ahana Cloud offers the ability to not only spin up and terminate clusters, but also stop and restart them, allowing you to change the number of Presto workers and add or remove data sources. With EMR, any manual change to the number of Presto workers or to the data sources requires a new cluster or manually restarting the services yourself. Further, adding and removing data sources in Ahana Cloud is done through a convenient user interface instead of modifying low-level configuration files.

[Image: Ahana data sources]
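For a sense of what those low-level configuration files look like: in a self-managed Presto deployment, each data source is typically described by a properties file under etc/catalog/ on the cluster nodes. The Python sketch below renders two such files. The host names, credentials, and helper function names are hypothetical, and the exact properties required depend on the connector and the Presto version.

```python
from typing import Dict


def render_properties(props: Dict[str, str]) -> str:
    """Render key=value lines in the Java properties style used by Presto."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


def catalog_files() -> Dict[str, str]:
    """Map catalog file paths to their contents. All hosts and credentials are made up."""
    return {
        # A Hive catalog pointing at a (hypothetical) metastore for data lake tables.
        "etc/catalog/hive.properties": render_properties({
            "connector.name": "hive-hadoop2",
            "hive.metastore.uri": "thrift://metastore.example.internal:9083",
        }),
        # A MySQL catalog pointing at a (hypothetical) relational database.
        "etc/catalog/mysql.properties": render_properties({
            "connector.name": "mysql",
            "connection-url": "jdbc:mysql://mysql.example.internal:3306",
            "connection-user": "presto_reader",
            "connection-password": "change-me",
        }),
    }
```

The file name determines the catalog name, so hive.properties would expose a catalog called hive. Adding or removing a data source by hand means editing files like these and restarting the services, which is the step the user interface replaces.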

Optimizations. As a Presto managed service, Ahana Cloud will continually provide optimizations relevant to Presto. For example, Ahana recently released data lake I/O caching. Based on the RubiX open source project and enabled with a single click, the caching eliminates redundant reads from your data lake when the same data is read repeatedly. This caching results in up to a 5x improvement in query performance and up to an 85% reduction in latency for concurrent workloads. Finally, idle clusters that are processing no queries can automatically scale down to a single Presto worker to save costs while still allowing for a quick warm-up.
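The general idea behind this kind of I/O caching can be sketched as a small read-through cache. This is a simplified illustration in Python, not RubiX's or Ahana's actual implementation; the class name, the (path, offset, length) cache key, and the least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Tiny LRU read-through cache keyed by (path, offset, length)."""

    def __init__(self, fetch: Callable[[str, int, int], bytes], capacity: int = 128) -> None:
        self._fetch = fetch          # the slow call, e.g. a read against the data lake
        self._capacity = capacity    # maximum number of cached blocks
        self._blocks = OrderedDict() # oldest entry first, newest entry last
        self.hits = 0
        self.misses = 0

    def read(self, path: str, offset: int, length: int) -> bytes:
        key = (path, offset, length)
        if key in self._blocks:
            self._blocks.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self._blocks[key]
        self.misses += 1
        data = self._fetch(path, offset, length)  # only misses reach the data lake
        self._blocks[key] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)  # evict the least recently used block
        return data
```

Wrapping a remote read function in ReadThroughCache means a repeated read of the same block is served from memory, so the hits and misses counters give a rough picture of how many redundant reads were avoided.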


If you are experienced at tuning Presto and want full control of the infrastructure management, Amazon EMR may be the choice for you. If simplicity and accelerated go-to-market without needing to manage a complex infrastructure are what you seek, then Ahana Cloud for Presto is the way to go. Sign up for our free trial today.