Turbocharge Business Intelligence Using Redash and Presto SQL query engine

Ahana Cloud for Presto is a SaaS managed service that speeds up leading business intelligence, data visualization, and SQL tools like Redash.

Interactive, ad hoc queries and faster data visualizations for Redash

What is Redash?

Redash is an open source business intelligence (BI) system that uses SQL for data analysis and visualization. It enables the collaborative creation of visualizations and dashboards from different data sources using SQL. The data sources are not restricted to SQL-compliant databases but include NoSQL databases like MongoDB and object storage systems like Amazon S3. The visualizations, dashboards, reusable snippets, and other Redash artifacts can then be shared internally or publicly to provide insight and support data-driven decision-making. This allows organizations and individuals to strategize and make decisions based on the latest data available to them. Companies using Redash include Mozilla, SoundCloud, Atlassian, and Cloudflare.

Advantages of Using Redash

There are many advantages to using Redash for data analytics and visualization. Being open source, it can be installed for free and modified to suit each specific use case. It can be self-hosted at no cost or accessed via premium cloud offerings, which helps avoid vendor lock-in. Because data processing is done in SQL, it is accessible to the many users who already know SQL. Redash's web-based interface and collaborative nature make it easy to create, share, and use BI artifacts across diverse teams and departments.

image from redash.io

Redash Internals

Redash consists of a JavaScript frontend and a Python backend. The frontend is built on React; older versions used AngularJS. It handles user interaction and contains the SQL editor, dashboards, and visualizations. The backend consists of a Flask web server that provides a REST API and a PostgreSQL database that handles data caching. Queries are executed by Celery workers, which connect to the various data sources and do the actual data processing. Redash is delivered as a web-based single-page app (SPA) designed for high scalability and availability.
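The request flow described above can be illustrated with a toy model. This is only a sketch of the idea (an API that answers from a cache or queues work, and a worker that runs the query and stores the result), not Redash's actual code. The class `ToyRedash`, its methods, and the in-memory queue and cache are hypothetical stand-ins for the REST API, the Celery workers, and the PostgreSQL cache.

```python
import hashlib
from collections import deque


class ToyRedash:
    """Toy model of the API -> queue -> worker -> cache flow (not Redash code)."""

    def __init__(self, run_on_data_source):
        # Callable that stands in for a connection to a data source.
        self.run_on_data_source = run_on_data_source
        self.queue = deque()  # stands in for the job queue feeding the workers
        self.cache = {}       # stands in for results cached in PostgreSQL

    @staticmethod
    def _key(sql):
        # Identify a query by a hash of its text.
        return hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()

    def submit(self, sql):
        """API side: return a cached result, or queue the query for a worker."""
        key = self._key(sql)
        if key in self.cache:
            return {"status": "cached", "rows": self.cache[key]}
        self.queue.append((key, sql))
        return {"status": "queued", "job": key}

    def work_once(self):
        """Worker side: take one job, run it, and cache the rows."""
        if not self.queue:
            return False
        key, sql = self.queue.popleft()
        self.cache[key] = self.run_on_data_source(sql)
        return True
```

In this model the frontend never talks to a data source directly: it submits SQL, and a later identical submission is served from the cache once a worker has finished the job.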

One way to use Redash is through direct connections to individual supported data stores. The other is to connect Redash to a distributed SQL query engine like Presto, which enables higher performance, more concurrent workloads, and seamless access to multiple data sources. Paired with Presto, Redash becomes an even more powerful tool.

What is Presto?

Presto is a massively parallel processing (MPP) query engine meant to interact with different data sources and process data extremely fast. This is achieved by keeping data and intermediate results in memory rather than writing them to disk between stages, as Hadoop MapReduce does. Using Redash and Presto together separates the BI system from the querying system, so different teams can use, optimize, and manage them independently.
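The in-memory, parallel style of processing can be sketched in a few lines. Presto itself is written in Java and is far more sophisticated; the snippet below is only a toy illustration of the pattern, with hypothetical names (`partial_count`, `distributed_count`) and a made-up `country` column. Each worker aggregates its own split in memory, and a coordinator merges the partial results without writing anything to disk.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_count(split):
    """Worker stage: aggregate one split of rows entirely in memory."""
    return Counter(row["country"] for row in split)


def distributed_count(splits, workers=4):
    """Coordinator stage: fan splits out to workers, then merge the partials."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(partial_count, splits):
            total.update(partial)  # merged in memory, no intermediate files
    return dict(total)
```

This is roughly what a query like `SELECT country, count(*) ... GROUP BY country` looks like when it is split across workers, under the simplifying assumption that every split fits in memory.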

How Redash works with Presto

Combining Redash and Presto allows the development of free and distributed BI systems as both of them are open source. Typically, this is achieved by having a frontend Redash cluster that communicates with a backend Presto cluster. The backend handles query processing and data management while the frontend provides the user interface. 

Ahana Cloud customer Cartona uses Redash with Presto and Ahana Cloud to power its dashboards. Learn more about their use case in their presentation.

PrestoCon Day, 2020 Talk by eCommerce company Cartona

Having Presto as the query engine provides better performance and access to more data sources via the Presto connectors.

The architecture consists of Redash connected to a Presto cluster with one or more connected data sources. Presto handles the data access and in-memory processing of queries. Redash handles the visualization of reports and dashboards. One configures Redash to connect to the Presto cluster by setting the name value to Presto and providing the other properties like host, port, and catalog as appropriate. Because the two systems are separate, the Presto cluster can be scaled by adding or removing processing nodes to meet the requirements of the Redash users. Users can then run queries against the data sources easily, as both Presto and Redash use a SQL interface. Integrating them offers other advantages such as data federation, fast query processing, and the ability to run different clusters, each optimized to meet the needs of business analysts or data scientists.
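The connection properties mentioned above can be collected in a small helper. This is a minimal sketch under stated assumptions: the payload shape (`name`, `type`, `options`), the `presto` type string, the `schema` and `username` options, the default port 8080, and the host name in the usage note are illustrative and should be checked against the Redash and Presto versions in use. The function `presto_data_source` is hypothetical, not part of Redash.

```python
import json


def presto_data_source(host, port=8080, catalog="hive",
                       schema="default", username="redash"):
    """Build an illustrative settings dict for a Presto data source in Redash.

    All default values here are assumptions, not values from the article.
    """
    if not host:
        raise ValueError("host is required")
    return {
        "name": "Presto",   # the name value described in the text
        "type": "presto",   # assumed type identifier
        "options": {
            "host": host,
            "port": int(port),
            "catalog": catalog,
            "schema": schema,
            "username": username,
        },
    }


def to_json(settings):
    """Serialize the settings, e.g. to keep them in version control."""
    return json.dumps(settings, indent=2, sort_keys=True)
```

For example, `to_json(presto_data_source("presto.example.internal", catalog="hive"))` produces a document that can be reviewed before the same values are entered in the Redash data source form.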

Ahana Cloud is the cloud-native SaaS managed service for Presto. See how you can turbocharge Redash in 30 minutes!

Get Started with Presto & Redash