Advanced SQL Queries with Presto
Advanced SQL features and functions are used by analysts when, for example, complex calculations are needed, or when many tables (perhaps from multiple sources) need to be joined, when dealing with nested or repeated data, dealing with time-series data or complex data types like maps, arrays, structs and JSON, or perhaps a combination of all these things.
Presto’s ANSI SQL engine supports numerous advanced functions which can be split into the following categories – links to the PrestoDB documentation are provided for convenience:
- Functions for JSON and other complex data types – https://prestodb.io/docs/current/functions/json.html
- Advanced aggregation techniques – https://prestodb.io/docs/current/functions/aggregate.html
- Window functions – https://prestodb.io/docs/current/functions/window.html
- Array and map functions – https://prestodb.io/docs/current/functions/array.html and https://prestodb.io/docs/current/functions/map.html
- Lambda expressions to assist with nested data – https://prestodb.io/blog/2020/03/02/presto-lambda
- Geospatial – https://prestodb.io/docs/current/functions/geospatial.html
- Sampling – https://prestodb.io/docs/current/sql/select.html “Table Sampling” section.
Running advanced SQL queries can benefit greatly from Presto’s distributed, in-memory processing architecture and cost-based optimizer.