Presto Training & Learning Center

The Ahana™ Learning Center covers beginner to advanced level Presto topics, questions, and answers to help you learn Presto.

Topics

Ahana Cofounder Will Present Session At Next Gen Big Data Platforms Meetup hosted by LinkedIn About Open Data Lake Analytics

Nov 2, 2021 · 3 min read

Ahana Cofounder and Chief Product Officer Dipti Borkar will present a session at Next Gen Big Data Platforms Meetup hosted by LinkedIn about open data lake analytics.

Presto 105: Running Presto with AWS Glue as catalog on your Laptop

Oct 21, 2021 · 10 min read

Introduction: This is the 5th tutorial in our Getting Started with Presto series. To recap, here are the first 4 tutorials: Presto 101: Installing & Configuring Presto locally, Presto 102: …

0 to Presto with AWS Oct 2021

Oct 20, 2021 · 2 min read

Data lakes are widely used and have become extremely affordable, especially with the advent of technologies like AWS S3. During this webinar, Gary Stafford, Solutions Architect at AWS, and Dipti Borkar, Cofounder & CPO at Ahana, will share how to build an open data lake stack with Presto and AWS S3.

How to Build an Open Data Lake Analytics Stack

Oct 12, 2021 · 2 min read

In this webinar, we’ll discuss how to build an Open Data Lake Analytics stack using 4 key components: open source technologies, open formats, open interfaces & open cloud.

Presto 104: Running Presto with Hive Metastore on your Laptop

Oct 8, 2021 · 10 min read

Introduction: This is the 4th tutorial in our Getting Started with Presto series. To recap, here are the first 3 tutorials: Presto 101: Installing & Configuring Presto locally, Presto 102: …

Unlocking the Value of Your Data Lake

Oct 6, 2021 · 2 min read

During this webinar Ahana co-founder and Chief Product Officer Dipti Borkar will discuss how to unlock the value of your data lake with the emerging Open Data Lake analytics architecture.

Connecting to Presto with Superset

Oct 5, 2021 · 8 min read

This blog post will provide you with an understanding of how to connect Superset to Presto. TL;DR: Superset refers to a connection to a distinct data source as a database. …

Presto Tutorial 103: PrestoDB cluster on GCP

Sep 21, 2021 · 26 min read

Introduction: This tutorial is Part III of our Getting started with PrestoDB series. As a reminder, PrestoDB is an open source distributed SQL query engine. In tutorial 102 we covered …

How to use mathematical functions and operators and aggregate functions for Presto?

Sep 21, 2021 · 6 min read

Presto offers several classes of mathematical functions that operate on single values and mathematical operators that allow for operations on values across columns. In addition, aggregate functions can operate on …

Ahana Cofounder Will Co-lead Session At OSPOCon 2021 About Presto SQL Query Engine

Sep 21, 2021 · 3 min read

San Mateo, Calif. – September 21, 2021 – Ahana, the SaaS for Presto company, today announced that its Cofounder and Chief Product Officer Dipti Borkar will co-lead a session with …

Announcing the workload profile feature in Ahana Cloud

Sep 20, 2021 · 3 min read

Ahana Cloud for Presto is the first fully integrated, cloud native managed service that simplifies the ability of cloud and data platform teams. With the managed Presto service, we provide …

Presto on AWS: Exploring different Presto services

Sep 16, 2021 · 2 min read

In this webinar we will discuss why companies are using Ahana Cloud for their Presto deployments and give an overview of Ahana including the Ahana SaaS console, how easy it is to add data sources like AWS S3 and integrate catalogs like Hive, and features like Data Lake Caching for 5x performance and autoscaling.

Ahana Joins AWS ISV Accelerate Program to Expand Access to Its Presto Managed Service for Fast SQL on Amazon S3 Data Lakes

Sep 14, 2021 · 4 min read

Ahana also selected into the invite-only AWS Global Startup Program. San Mateo, Calif. – September 14, 2021 – Ahana, the Presto company, today announced it has been accepted into the …

Ahana 101: An introduction to Ahana Cloud for Presto on AWS, SaaS for Presto on AWS

Sep 9, 2021 · 2 min read

In this webinar we will discuss why companies are using Ahana Cloud for their Presto deployments and give an overview of Ahana including the Ahana SaaS console, how easy it is to add data sources like AWS S3 and integrate catalogs like Hive, and features like Data Lake Caching for 5x performance and autoscaling.

Presto 101: An introduction to open source Presto

Sep 2, 2021 · 2 min read

In this session, Dipti will introduce the Presto technology and share why it’s becoming so popular. Companies like Facebook, Uber, Twitter, Alibaba, and many more use Presto for interactive ad hoc queries, reporting and dashboarding, data lake analytics, and much more. We’ll also show a quick demo on getting Presto running in AWS.

What is a Presto lag example?

Aug 31, 2021 · 2 min read

The Presto lag function is a window function that returns the value at a given offset before the current row in a window. One common use case for the lag function is …

SQL on the Data Lake, Using open source Presto to unlock the value of your data lake

Aug 26, 2021 · 2 min read

In this webinar, Dipti will discuss why open source Presto has quickly become the de-facto query engine for the data lake. Presto enables ad hoc data discovery where you can use SQL to run queries whenever you want, wherever your data resides. With Presto, you can unlock the value of your data lake.

How do I get the date_diff from previous rows?

Aug 24, 2021 · 1 min read

To find the difference in time between consecutive dates in a result set, Presto offers window functions. Take the example table below which contains sample data of users who watched …

Data Warehouse or Data Lake, which one do I use?

Aug 19, 2021 · 43 min read

In this webinar, you’ll hear from industry analyst John Santaferraro and Ahana cofounder and CPO Dipti Borkar who will discuss the data landscape and how many companies are thinking about their data warehouse/data lake strategy.

Tutorial: How to run SQL queries with Presto on Amazon Redshift

Aug 10, 2021 · 8 min read

Presto has evolved into a unified SQL engine on top of cloud data lakes for both interactive queries and batch workloads with multiple data sources. This tutorial is …

How do I use the approx_percentile function in Presto?

Aug 9, 2021 · 6 min read

The Presto approx_percentile function is one of the approximate aggregate functions, and it returns an approximate percentile for a set of values (e.g., a column). In this short article, we will explain …

Announcing the Ahana $20M Series A – Furthering our Vision of Open Data Lake Analytics with Presto

Aug 3, 2021 · 6 min read

I’m very excited to announce that Ahana, the SaaS for Presto company, has raised a jumbo $20M Series A round from lead investor Third Point Ventures. Our SaaS managed service …

Presto Company Ahana Raises $20M Series A Led By Third Point Ventures To Redefine Open Data Lake Analytics

Aug 3, 2021 · 4 min read

Funding comes on the heels of major momentum in customer and community adoption for Presto. San Mateo, Calif. – August 3, 2021 – Ahana, the SaaS for Presto company, today announced …

Autoscale your Presto cluster in Ahana Cloud

Jul 29, 2021 · 4 min read

We’re excited to announce that autoscaling is now available on Ahana Cloud. In this initial release, the autoscaling feature will monitor the worker nodes’ average CPU utilization of your Presto …

Community Roundtable: Open Data Lakes with Presto, Apache Hudi & AWS Glue and S3

Jul 28, 2021 · 52 min read

Join us for this roundtable discussion where experts from each layer in this stack (Presto, AWS, and Apache Hudi) will discuss why we’re seeing pronounced adoption of this next generation of cloud data lake analytics and how these technologies enable open, flexible, and highly performant analytics in the cloud.

Tutorial: How to run SQL queries with Presto on Google BigQuery

Jul 20, 2021 · 7 min read

Presto has evolved into a unified SQL engine on top of cloud data lakes for both interactive queries and batch workloads with multiple data sources. This tutorial is …

Snowflake may not be the silver bullet you wanted for your long term data strategy… here’s why

Jul 20, 2021 · 7 min read

Since COVID, every business has pivoted and moved everything online, accelerating digital transformation with data and AI. Self-service, accelerated analytics has become more and more critical for businesses and Snowflake …

Can I write back or update data in my Hadoop / Apache Hive cluster through Presto?

Jul 13, 2021 · 2 min read

Using Presto with a Hadoop cluster for SQL analytics is pretty common, especially in on-premises deployments. With Presto, you can read and query data from the Hadoop datanodes but …

How do I convert Unix Epoch time to a date or something more human readable with SQL?

Jul 13, 2021 · 1 min read

Unix Epoch time often gets stored in the database as-is. This is not very human readable, so conversion is required for reports and dashboards. Example of Unix Epoch …

How do I transfer data from a Hadoop / Hive cluster to a Presto cluster?

Jul 13, 2021 · 2 min read

Hadoop is a system that manages both compute and data together. Hadoop cluster nodes have the HDFS file system and may also have different types of engines like Apache Hive, …

Hands-on Presto Tutorial: How to run Presto on Kubernetes

Jul 12, 2021 · 10 min read

What is Presto? Presto is a distributed query engine designed from the ground up for data lake analytics and interactive query workloads. Presto supports connectivity to a wide variety of …

Presto 102 Tutorial: Install PrestoDB on a Laptop or PC

Jul 9, 2021 · 4 min read

Summary: PrestoDB is an open source distributed parallel SQL query engine. In tutorial 101 we walk through manual installation and configuration on a bare metal server or on a VM. It …

Enabling spill to disk for optimal price per performance

Jul 7, 2021 · 4 min read

Presto was born out of the need for low-latency interactive queries on large scale data, and hence has been continually optimized for that use case. In such scenarios, the best practice is …

Presto substring operations: How do I get the X characters from a string of a known length?

Jul 7, 2021 · 2 min read

Presto provides an overloaded substring function to extract characters from a string. We will use the string “Presto String Operations” to demonstrate the use of this function. Extract last 7 …

Presto 101 Tutorial: Installing & Configuring Presto

Jun 30, 2021 · 8 min read

Installing & Configuring Presto locally. Presto Installation: Presto can be installed manually or using Docker images on: Single Node: Both coordinator and workers run on the same machine. Or even …

Spark SQL | What is Spark SQL & Spark SQL Guide | Ahana

Jun 30, 2021 · 2 min read

What is Spark SQL? Spark is a general purpose computation engine for large-scale data processing. At Spark’s inception, the primary abstraction was a resilient distributed dataset (RDD), an immutable distributed …

Hive vs Presto vs Spark

Jun 30, 2021 · 3 min read

What is Apache Hive? Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive provides an SQL-like interface called …

Query Data Lake With Presto | Presto Google Cloud | Ahana

Jun 24, 2021 · 3 min read

How do I query a data lake with Presto? A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. …

Presto EMR S3 Timeout Error | Presto Query Timeout | Ahana

Jun 24, 2021 · 2 min read

Why am I getting a Presto EMR S3 timeout error? If you’re using AWS EMR Presto, you can use the S3 Select pushdown feature to push down compute operations (i.e. …

Ahana Demonstrates Major Momentum in Customer and Community Adoption for Presto 1H 2021

Jun 24, 2021 · 6 min read

The Presto company also shows significant product momentum with numerous accolades and industry recognition. San Mateo, Calif. – June 24, 2021 – Ahana, the Presto company, today announced major momentum …

Why I’m betting on PrestoDB, and why you should too!

Jun 13, 2021 · 7 min read

By Dipti Borkar, Ahana Cofounder, Chief Product Officer & Chief Evangelist. I’ve been in open source software companies and communities for over 10 years now, and in the database industry …

Do I need to move my data to query it with Presto?

Jun 3, 2021 · 1 min read

No, Presto queries your data in place, so you don’t need to move it. If you’re using AWS S3 for your data lake, for example, you wouldn’t need to ingest it …

5 main reasons Data Engineers move from AWS Athena to Ahana Cloud

Jun 1, 2021 · 4 min read

In this brief post, we’ll discuss the 5 main reasons why data platform engineers decide to move their data analytics workloads from Amazon Athena to Ahana Cloud for Presto. While …

Ahana Cloud for Presto Versus Amazon EMR

May 27, 2021 · 3 min read

In this brief post, we’ll discuss some of the benefits of Ahana Cloud over Amazon Elastic MapReduce (EMR). While EMR offers optionality in the number of big data compute frameworks, …

Streaming Data Processing Using Apache Kafka and Presto

May 12, 2021 · 3 min read

Kafka Quick Start: Kafka is a distributed data streaming framework meant to enable the creation of highly scalable distributed systems. Developed at LinkedIn in 2008 and open-sourced in 2011, it …

Business Intelligence And Data Analysis With Druid and Presto

May 12, 2021 · 4 min read

Apache Druid Helicopter View: Apache Druid is a distributed, columnar database aimed at developing analytical solutions. It offers a real-time analytics database able to ingest and query massive amounts of …

Flexible And Low Latency OLAP Using Apache Pinot and Presto for real time analytics

May 12, 2021 · 3 min read

Apache Pinot Overview: Apache Pinot is a distributed, low latency online analytical processing (OLAP) platform used for carrying out fast big data analytics. Developed at LinkedIn in 2014, the highly …

Turbocharge your Analytics with MongoDB And Presto

May 12, 2021 · 4 min read

High-Level View Of MongoDB: MongoDB is a NoSQL distributed document database meant to handle diverse data management requirements. Its design goals include creating an object-oriented, highly available, scalable, efficient, and …

CRN® Recognizes Ahana on Its 2021 Big Data 100 List As One of The Coolest Business Analytics Companies

May 5, 2021 · 4 min read

Ahana also named to CRN’s 10 Hot Big Data Companies You Should Watch in 2021 list. San Mateo, Calif. – May 5, 2021 – Ahana, the self-service analytics company for …

Presto Sync Partition Metastore & Metadata | Presto Sync | Ahana

May 4, 2021 · 2 min read

How do I sync my partition and metastore in Presto? Sync partition metadata is used to sync the metastore with information on the file system/S3 for the external table. Depending …

Ahana Cofounder Will Present Session At Next Gen Big Data Platforms Meetup hosted by LinkedIn About Open Data Lake Analytics

Nov 2, 20213 min read

Ahana Cofounder and Chief Product Officer Dipti Borkar will present a session at Next Gen Big Data Platforms Meetup hosted by LinkedIn about open data lake analytics.

Presto 105: Running Presto with AWS Glue as catalog on your Laptop

Oct 21, 202110 min read

Introduction This is the 5th tutorial in our Getting Started with Presto series. To recap, here are the first 4 tutorials: Presto 101: Installing & Configuring Presto locally Presto 102: … Continue reading Presto 105: Running Presto with AWS Glue as catalog on your Laptop

0 to Presto with AWS Oct 2021

Oct 20, 20212 min read

Data lakes are widely used and have become extremely affordable, especially with the advent of technologies like AWS S3. During this webinar, Gary Stafford, Solutions Architect at AWS, and Dipti Borkar, Cofounder & CPO at Ahana, will share how to build an open data lake stack with Presto and AWS S3.

How to Build an Open Data Lake Analytics Stack

Oct 12, 20212 min read

In this webinar, we’ll discuss how to build an Open Data Lake Analytics stack using 4 key components: open source technologies, open formats, open interfaces & open cloud.

Presto 104: Running Presto with Hive Metastore on your Laptop

Oct 8, 202110 min read

Introduction This is the 4th tutorial in our Getting Started with Presto series. To recap, here are the first 3 tutorials: Presto 101: Installing & Configuring Presto locally Presto 102: … Continue reading Presto 104: Running Presto with Hive Metastore on your Laptop

Unlocking the Value of Your Data Lake

Oct 6, 20212 min read

During this webinar Ahana co-founder and Chief Product Officer Dipti Borkar will discuss how to unlock the value of your data lake with the emerging Open Data Lake analytics architecture.

Connecting to Presto with Superset

Oct 5, 20218 min read

This blog post will provide you with an understanding of how to connect Superset to Presto. TL;DR Superset refers to a connection to a distinct data source as a database. … Continue reading Connecting to Presto with Superset

Presto Tutorial 103: PrestoDB cluster on GCP

Sep 21, 202126 min read

Introduction This tutorial is Part III of our Getting started with PrestoDB series. As a reminder, Prestodb is an open source distributed SQL query engine. In tutorial 102 we covered … Continue reading Presto Tutorial 103: PrestoDB cluster on GCP

How to use mathematical functions and operators and aggregate functions for Presto?

Sep 21, 20216 min read

Presto offers several classes of mathematical functions that operate on single values and mathematical operators that allow for operations on values across columns. In addition, aggregate functions can operator on … Continue reading How to use mathematical functions and operators and aggregate functions for Presto?

Ahana Cofounder Will Co-lead Session At OSPOCon 2021 About Presto SQL Query Engine

Sep 21, 20213 min read

San Mateo, Calif. – September 21, 2021 — Ahana, the SaaS for Presto company, today announced that its Cofounder and Chief Product Officer Dipti Borkar will co-lead a session with … Continue reading Ahana Cofounder Will Co-lead Session At OSPOCon 2021 About Presto SQL Query Engine

Announcing the workload profile feature in Ahana Cloud

Sep 20, 20213 min read

Ahana Cloud for Presto is the first fully integrated, cloud native managed service that simplifies the ability of cloud and data platform teams. With the managed Presto service, we provide … Continue reading Announcing the workload profile feature in Ahana Cloud

Presto on AWS: Exploring different Presto services

Sep 16, 20212 min read

In this webinar we will discuss why companies are using Ahana Cloud for their Presto deployments and give an overview of Ahana including the Ahana SaaS console, how easy it is to add data sources like AWS S3 and integrate catalogs like Hive, and features like Data Lake Caching for 5x performance and autoscaling.

Ahana Joins AWS ISV Accelerate Program to Expand Access to Its Presto Managed Service for Fast SQL on Amazon S3 Data Lakes

Sep 14, 20214 min read

Ahana also selected into the invite-only AWS Global Startup Program San Mateo, Calif. – September 14, 2021 — Ahana, the Presto company, today announced it has been accepted into the … Continue reading Ahana Joins AWS ISV Accelerate Program to Expand Access to Its Presto Managed Service for Fast SQL on Amazon S3 Data Lakes

Ahana 101: An introduction to Ahana Cloud for Presto on AWS, SaaS for Presto on AWS

Sep 9, 20212 min read

In this webinar we will discuss why companies are using Ahana Cloud for their Presto deployments and give an overview of Ahana including the Ahana SaaS console, how easy it is to add data sources like AWS S3 and integrate catalogs like Hive, and features like Data Lake Caching for 5x performance and autoscaling.

Presto 101: An introduction to open source Presto

Sep 2, 20212 min read

In this session, Dipti will introduce the Presto technology and share why it’s becoming so popular – in fact, companies like Facebook, Uber, Twitter, Alibaba, and much more use Presto for interactive ad hoc queries, reporting & dashboarding data lake analytics, and much more. We’ll also show a quick demo on getting Presto running in AWS.

What is a Presto lag example?

Aug 31, 20212 min read

The Presto lag function a window function that returns the value of an offset before the current row in a window. One common use case for the lag function is … Continue reading What is a Presto lag example?

SQL on the Data Lake, Using open source Presto to unlock the value of your data lake

Aug 26, 20212 min read

In this webinar, Dipti will discuss why open source Presto has quickly become the de-facto query engine for the data lake. Presto enables ad hoc data discovery where you can use SQL to run queries whenever you want, wherever your data resides. With Presto, you can unlock the value of your data lake.

How do I get the date_diff from previous rows?

Aug 24, 20211 min read

To find the difference in time between consecutive dates in a result set, Presto offers window functions. Take the example table below which contains sample data of users who watched … Continue reading How do I get the date_diff from previous rows?

Data Warehouse or Data Lake, which one do I use?

Aug 19, 202143 min read

In this webinar, you’ll hear from industry analyst John Santaferraro and Ahana cofounder and CPO Dipti Borkar who will discuss the data landscape and how many companies are thinking about their data warehouse/data lake strategy.

Tutorial: How to run SQL queries with Presto on Amazon Redshift

Aug 10, 20218 min read

Presto has evolved into a unified SQL engine on top of cloud data lakes for both interactive queries as well as batch workloads with multiple data sources. This tutorial is … Continue reading Tutorial: How to run SQL queries with Presto on Amazon Redshift

How do I use the approx_percentile function in Presto?

Aug 9, 20216 min read

The Presto approx_percentile is one of the approximate aggregate functions, and it returns an approximate percentile for a set of values (e.g. column). In this short article, we will explain … Continue reading How do I use the approx_percentile function in Presto?

Announcing the Ahana $20M Series A – Furthering our Vision of Open Data Lake Analytics with Presto

Aug 3, 20216 min read

I’m very excited to announce that Ahana, the SaaS for Presto company, has raised a jumbo $20M Series A round from lead investor Third Point Ventures. Our SaaS managed service … Continue reading Announcing the Ahana $20M Series A – Furthering our Vision of Open Data Lake Analytics with Presto

Presto Company Ahana Raises $20M Series A Led By Third Point Ventures To Redefine Open Data Lake Analytics

Aug 3, 20214 min read

Funding comes on heels of major momentum in customer and community adoption for Presto San Mateo, Calif. – August 3, 2021 — Ahana, the SaaS for Presto company, today announced … Continue reading Presto Company Ahana Raises $20M Series A Led By Third Point Ventures To Redefine Open Data Lake Analytics

Autoscale your Presto cluster in Ahana Cloud

Jul 29, 20214 min read

We’re excited to announce that autoscaling is now available on Ahana Cloud. In this initial release, the autoscaling feature will monitor the worker nodes’ average CPU Utilization of your presto … Continue reading Autoscale your Presto cluster in Ahana Cloud

Community Roundtable: Open Data Lakes with Presto, Apache Hudi & AWS Glue and S3

Jul 28, 202152 min read

Join us for this roundtable discussion where experts from each layer in this stack – Presto, AWS, and Apache Hudi – will discuss why we’re seeing a pronounced adoption to this next generation of cloud data lake analytics and how these technologies enable open, flexible, and highly performant analytics in the cloud.

Tutorial: How to run SQL queries with Presto on Google BigQuery

Jul 20, 20217 min read

Presto has evolved into a unified SQL engine on top of cloud data lakes for both interactive queries as well as batch workloads with multiple data sources. This tutorial is … Continue reading Tutorial: How to run SQL queries with Presto on Google BigQuery

Snowflake may not be the silver bullet you wanted for your long term data strategy… here’s why

Jul 20, 20217 min read

Since COVID, every business has pivoted and moved everything online, accelerating digital transformation with data and AI. Self-service, accelerated analytics has become more and more critical for businesses and Snowflake … Continue reading Snowflake may not be the silver bullet you wanted for your long term data strategy… here’s why

Can I write back or update data in my Hadoop / Apache Hive cluster through Presto?

Jul 13, 20212 min read

Using Presto with a Hadoop cluster for SQL analytics is pretty common especially in on premise deployments.  With Presto, you can read and query data from the Hadoop datanodes but … Continue reading Can I write back or update data in my Hadoop / Apache Hive cluster through Presto?

How do I convert Unix Epoch time to a date or something more human readable with SQL?

Jul 13, 20211 min read

Many times the Unix Epoch Time gets stored in the database. But this is not very human readable and conversion is required for reports and dashboards.  Example of Unix Epoch … Continue reading How do I convert Unix Epoch time to a date or something more human readable with SQL?

How do I transfer data from a Hadoop / Hive cluster to a Presto cluster?

Jul 13, 20212 min read

Hadoop is a system that manages both compute and data together. Hadoop cluster nodes have the HDFS file system and may also have different types of engines like Apache Hive, … Continue reading How do I transfer data from a Hadoop / Hive cluster to a Presto cluster?

Hands-on Presto Tutorial: How to run Presto on Kubernetes

Jul 12, 202110 min read

What is Presto? Presto is a distributed query engine designed from the ground up for data lake analytics and interactive query workloads. Presto supports connectivity to a wide variety of … Continue reading Hands-on Presto Tutorial: How to run Presto on Kubernetes

Presto 102 Tutorial: Install PrestoDB on a Laptop or PC

Jul 9, 20214 min read

Summary Prestodb is an open source distributed parallel query SQL engine. In tutorial 101 we walk through manual installation and configuration on a bare metal server or on a VM. It … Continue reading Presto 102 Tutorial: Install PrestoDB on a Laptop or PC

Enabling spill to disk for optimal price per performance

Jul 7, 20214 min read

Presto was born out of the need for low-latency interactive queries on large scale data, and hence, continually optimized for that use case. In such scenarios, the best practice is … Continue reading Enabling spill to disk for optimal price per performance

Presto substring operations: How do I get the X characters from a string of a known length?

Jul 7, 20212 min read

Presto provides an overloaded substring function to extract characters from a string. We will use the string “Presto String Operations” to demonstrate the use of this function. Extract last 7 … Continue reading Presto substring operations: How do I get the X characters from a string of a known length?

Presto 101 Tutorial: Installing & Configuring Presto

Jun 30, 20218 min read

Installing & Configuring Presto locally Presto Installation Presto can be installed manually or using docker images on: Single Node: Both co-ordinator and workers run on the same machine.  or even … Continue reading Presto 101 Tutorial: Installing & Configuring Presto

Spark SQL | What is Spark SQL & Spark SQL Guide | Ahana

Jun 30, 20212 min read

What is Spark SQL? Spark is a general purpose computation engine for large-scale data processing. At Spark’s inception, the primary abstraction was a resilient distributed dataset (RDD), an immutable distributed … Continue reading Spark SQL | What is Spark SQL & Spark SQL Guide | Ahana

Hive vs Presto vs Spark

Jun 30, 20213 min read

What is Apache Hive? Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive provides an SQL-like interface called … Continue reading Hive vs Presto vs Spark

Query Data Lake With Presto | Presto Google Cloud | Ahana

Jun 24, 20213 min read

How do I query a data lake with Presto? A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. … Continue reading Query Data Lake With Presto | Presto Google Cloud | Ahana

Presto EMR S3 Timeout Error | Presto Query Timeout | Ahana

Jun 24, 20212 min read

Why am I getting a Presto EMR S3 timeout error? If you’re using AWS EMR Presto, you can use the S3 select pushdown feature to push down compute operations (i.e. … Continue reading Presto EMR S3 Timeout Error | Presto Query Timeout | Ahana

Ahana Demonstrates Major Momentum in Customer and Community Adoption for Presto 1H 2021

Jun 24, 2021 · 6 min read

The Presto company also shows significant product momentum with numerous accolades and industry recognition. San Mateo, Calif. – June 24, 2021 – Ahana, the Presto company, today announced major momentum …

Why I’m betting on PrestoDB, and why you should too!

Jun 13, 2021 · 7 min read

By Dipti Borkar, Ahana Cofounder, Chief Product Officer & Chief Evangelist. I’ve been in open source software companies and communities for over 10 years now, and in the database industry …

Do I need to move my data to query it with Presto?

Jun 3, 2021 · 1 min read

No, Presto queries your data in-place so you don’t need to move it. If you’re using AWS S3 for your data lake, for example, you wouldn’t need to ingest it …

5 main reasons Data Engineers move from AWS Athena to Ahana Cloud

Jun 1, 2021 · 4 min read

In this brief post, we’ll discuss the 5 main reasons why data platform engineers decide to move their data analytics workloads from Amazon Athena to Ahana Cloud for Presto. While …

Ahana Cloud for Presto Versus Amazon EMR

May 27, 2021 · 3 min read

In this brief post, we’ll discuss some of the benefits of Ahana Cloud over Amazon Elastic MapReduce (EMR). While EMR offers optionality in the number of big data compute frameworks, …

Streaming Data Processing Using Apache Kafka and Presto

May 12, 2021 · 3 min read

Kafka Quick Start: Kafka is a distributed data streaming framework meant to enable the creation of highly scalable distributed systems. Developed at LinkedIn in 2008 and open-sourced in 2011, it …

Business Intelligence And Data Analysis With Druid and Presto

May 12, 2021 · 4 min read

Apache Druid Helicopter View: Apache Druid is a distributed, columnar database aimed at developing analytical solutions. It offers a real-time analytics database able to ingest and query massive amounts of …

Flexible And Low Latency OLAP Using Apache Pinot and Presto for real time analytics

May 12, 2021 · 3 min read

Apache Pinot Overview: Apache Pinot is a distributed, low-latency online analytical processing (OLAP) platform used for carrying out fast big data analytics. Developed at LinkedIn in 2014, the highly …

Turbocharge your Analytics with MongoDB And Presto

May 12, 2021 · 4 min read

High-Level View Of MongoDB: MongoDB is a NoSQL distributed document database meant to handle diverse data management requirements. Its design goals include creating an object-oriented, highly available, scalable, efficient, and …

CRN® Recognizes Ahana on Its 2021 Big Data 100 List As One of The Coolest Business Analytics Companies

May 5, 2021 · 4 min read

Ahana also named to CRN’s 10 Hot Big Data Companies You Should Watch in 2021 list. San Mateo, Calif. – May 5, 2021 – Ahana, the self-service analytics company for …

Presto Sync Partition Metastore & Metadata | Presto Sync | Ahana

May 4, 2021 · 2 min read

How do I sync my partition and metastore in Presto? Sync partition metadata is used to sync the metastore with information on the file system/S3 for the external table. Depending …
