presto logs

Where to Find Presto Logs

If you’re wondering where to find Presto logs, this article explains it. Presto needs a data directory for storing logs and other data. It is recommended to create this directory outside the installation directory, so that it is easily preserved when upgrading Presto. Presto’s node properties file, etc/node.properties, sets the location of the data directory: set the node.data-dir property to the filesystem path of the intended data directory, and Presto will store logs and other data there.
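As a minimal sketch of the lookup described above, the shell function below reads node.data-dir from a node.properties file and prints the log directory beneath it. The function name presto_log_dir is hypothetical, and the var/log suffix is an assumption taken from the example environment in this article, so adjust it if your deployment differs.

```shell
# Print the log directory implied by node.data-dir in a node.properties file.
# Usage: presto_log_dir /path/to/etc/node.properties
# presto_log_dir is a hypothetical helper, not part of Presto.
presto_log_dir() {
  props_file="$1"
  # Keep the value of the last node.data-dir entry and trim trailing whitespace.
  data_dir=$(sed -n 's/^[[:space:]]*node\.data-dir[[:space:]]*=[[:space:]]*//p' "$props_file" \
    | tail -n 1 | sed 's/[[:space:]]*$//')
  if [ -z "$data_dir" ]; then
    echo "node.data-dir not set in $props_file" >&2
    return 1
  fi
  # Assumption: logs live under var/log inside the data directory.
  printf '%s\n' "$data_dir/var/log"
}
```

You could then list the logs with something like ls -al "$(presto_log_dir etc/node.properties)".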

For example, my environment logs to /var/lib/presto/data/var/log, where we find the HTTP request log and other logs:

$ ls -al /var/lib/presto/data/var/log
-rw-r--r-- 1 root root 5202277 Sep 25 21:53 http-request.log
-rw-r--r-- 1 root root  238037 Sep 12 01:23 http-request.log-2020-09-11.0.log.gz
-rw-r--r-- 1 root root  636367 Sep 13 00:00 http-request.log-2020-09-12.0.log.gz
-rw-r--r-- 1 root root  583609 Sep 14 00:00 http-request.log-2020-09-13.0.log.gz
-rw-r--r-- 1 root root  297803 Sep 16 12:53 http-request.log-2020-09-14.0.log.gz
-rw-r--r-- 1 root root  225971 Sep 17 00:00 http-request.log-2020-09-16.0.log.gz
-rw-r--r-- 1 root root  221244 Sep 25 10:43 http-request.log-2020-09-17.0.log.gz
...

Note that logging levels are controlled by the etc/log.properties file. Each logger named in that file can be set to one of four levels: DEBUG, INFO, WARN and ERROR.
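Each line in etc/log.properties maps a logger name to a level. The Presto deployment docs use com.facebook.presto=INFO as their example. To see which levels are currently in effect, a small helper like the one below can print the active entries. The name show_log_levels is hypothetical, and the sketch assumes comment lines start with #.

```shell
# Print the logger=LEVEL entries from a log.properties file,
# skipping blank lines and lines that start with '#'.
# Usage: show_log_levels /path/to/etc/log.properties
# show_log_levels is a hypothetical helper, not part of Presto.
show_log_levels() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

Run it as show_log_levels etc/log.properties from the Presto installation directory. A change to the file typically takes effect only after the server is restarted.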

Query Logs: Running, completed, and failed queries are displayed in the Presto Console UI. You can also capture query logs with the Presto Event Listener, which lets custom code listen to events happening inside the Presto engine and react to them. Event listeners are invoked for the following events in the Presto query workflow: 1) query creation, 2) query completion, and 3) split completion. There’s a worked example at http://dharmeshkakadia.github.io/presto-event-listener/ and the doc page is at https://prestodb.io/docs/current/develop/event-listener.html. This method enables you to collect all the queries submitted to Presto for later analysis.

Ahana & Presto Logs

If you are using Ahana Cloud, it exposes the query log in a catalog that can be queried easily, for example with presto-cli. The catalog is called ahana_querylog and it uses the Event Listener mechanism described above:

$ presto --server https://devtest.james.staging.ahn-dev.app 
presto:demo> show catalogs;
    Catalog     
----------------
 ahana_hive     
 ahana_querylog 
 glue           
 glue2          
 jmx            
 mysql-db       
 system         
 tpcds          
 tpch           
(9 rows)

presto:public> use ahana_querylog.public;
presto:public> show tables;

       Table       
-------------------
 querylog          
 schema_migrations 
(2 rows)

presto:public> select * from querylog;
      type      | cluster_name |           ts            | seq |    user     |                               
----------------+--------------+-------------------------+-----+-------------+-----------------
 queryCreated   | devtest      | 2020-09-22 13:03:32.000 |   1 | jamesmesney | show catalogs                 
 queryCreated   | devtest      | 2020-09-22 13:03:54.000 |   3 | jamesmesney | use ahana_querylog            
 queryCreated   | devtest      | 2020-09-22 13:04:15.000 |   4 | jamesmesney | use ahana_hive                
 queryCreated   | devtest      | 2020-09-22 13:06:28.000 |   5 | jamesmesney | SHOW FUNCTIONS                
 queryCreated   | devtest      | 2020-09-22 13:15:13.000 |   8 | jamesmesney | show catalogs                 
 queryCreated   | devtest      | 2020-09-22 13:15:19.000 |  10 | jamesmesney | use ahana_hive                
 queryCompleted | devtest      | 2020-09-22 13:15:19.000 |  11 | jamesmesney | use ahana_hive                
 queryCreated   | devtest      | 2020-09-22 13:15:20.000 |  13 | jamesmesney | SELECT table_name 
 queryCreated   | devtest      | 2020-09-22 13:15:24.000 |  15 | jamesmesney | show tables                   
 queryCreated   | devtest      | 2020-09-22 13:15:40.000 |  16 | jamesmesney | SHOW FUNCTIONS                
 queryCreated   | devtest      | 2020-09-22 13:15:44.000 |  21 | jamesmesney | show tables                   
 queryCompleted | devtest      | 2020-09-22 13:15:55.000 |  25 | jamesmesney | use ahana_querylog            
 queryCreated   | devtest      | 2020-09-22 13:15:20.000 |  12 | jamesmesney | SHOW FUNCTIONS                
 splitCompleted | devtest      | 2020-09-22 13:15:44.000 |  22 | NULL        | NULL                          
 queryCreated   | devtest      | 2020-09-22 13:15:55.000 |  24 | jamesmesney | use ahana_querylog     
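Because the query log is an ordinary table, you can also summarize it non-interactively. The sketch below wraps presto-cli in a hypothetical function, querylog_event_counts, that counts the logged events by type. It assumes the presto executable is on your PATH and that the catalog, schema, and table names match the session shown above. The server URL is whatever you pass in.

```shell
# Count query-log events per type using presto-cli.
# Usage: querylog_event_counts https://your-presto-server
# querylog_event_counts is a hypothetical helper; the catalog, schema and
# table names are taken from the example session above.
querylog_event_counts() {
  server="$1"
  presto --server "$server" \
    --catalog ahana_querylog \
    --schema public \
    --execute 'SELECT "type", count(*) AS events FROM querylog GROUP BY "type" ORDER BY "type"'
}
```

Each output row pairs an event type, such as queryCreated, with the number of times it was logged.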

So, now you know where and how to find Presto logs. Are you looking for more information on all things Presto? If so, check out some of our past presentations.
