How to Share Data with Presto
Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.
Presto enables data sharing in two ways.
1) With its broad list of data connectors, Presto provides a simple way to exploit data virtualization through federated SQL queries that access data across multiple data sources. Data in RDBMSs, NoSQL databases, legacy data warehouses, and files in object storage can all be accessed and, if required, combined in a single SQL query. Presto queries data where it lives, such as in Hive, Cassandra, relational databases, or even proprietary data stores, which allows analytics across your entire organization by a variety of users and applications. A complete list of Presto’s connectors can be found at https://prestodb.io/docs/current/connector.html
2) By connecting your Presto cluster to other systems, tools, and applications. Access to data is shared through the Presto API, or through SQL via client libraries, ODBC, or JDBC interfaces, as described in the sections below.
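To make the first approach concrete, here is a minimal sketch of a federated query issued from Java. It assumes that two catalogs named hive and mysql are configured on the cluster. The schemas, tables, and columns (web.page_views, crm.customers, customer_id, customer_name) are hypothetical, as is the coordinator address, and the Presto JDBC driver must be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FederatedQueryExample {
    // Hypothetical catalogs, schemas, and tables: replace them with names
    // that exist on your cluster. Each table is addressed as catalog.schema.table,
    // so one statement can join data held in two different systems.
    public static final String FEDERATED_SQL =
        "SELECT c.customer_name, count(*) AS views "
        + "FROM hive.web.page_views v "
        + "JOIN mysql.crm.customers c ON v.customer_id = c.customer_id "
        + "GROUP BY c.customer_name "
        + "ORDER BY views DESC "
        + "LIMIT 10";

    public static void main(String[] args) throws SQLException {
        // Assumed coordinator address; no default catalog is needed because
        // the query uses fully qualified table names.
        String url = "jdbc:presto://localhost:8080";
        try (Connection connection = DriverManager.getConnection(url, "test", null);
             Statement statement = connection.createStatement();
             ResultSet rs = statement.executeQuery(FEDERATED_SQL)) {
            while (rs.next()) {
                System.out.println(rs.getString("customer_name") + "\t" + rs.getLong("views"));
            }
        }
    }
}
```

Because the join runs inside Presto, the application does not need to copy data between the two systems first.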
API
Presto’s HTTP API is the communication protocol between server and client. It’s used to send query statements for execution on the server and to receive results back to the client. See https://github.com/prestodb/presto/wiki/HTTP-Protocol for details and usage notes.
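The request and response cycle can be sketched in Java as follows. This is an illustration under stated assumptions rather than a full client: it assumes Java 11 or later, a coordinator at localhost:8080, and the user name test. It posts the SQL text to /v1/statement with an X-Presto-User header and then follows the nextUri field of each response until none is returned. The extractNextUri helper is a hypothetical shortcut that uses a regular expression; a real client should parse the JSON properly and handle the error field.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrestoStatementExample {
    // Matches "nextUri":"<value>" in a response body (shortcut, not a JSON parser).
    private static final Pattern NEXT_URI = Pattern.compile("\"nextUri\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the nextUri value if the response contains one.
    public static Optional<String> extractNextUri(String json) {
        Matcher matcher = NEXT_URI.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();

        // Submit the statement; host, port, and user are assumptions.
        HttpRequest submit = HttpRequest.newBuilder(URI.create("http://localhost:8080/v1/statement"))
            .header("X-Presto-User", "test")
            .POST(HttpRequest.BodyPublishers.ofString("SELECT 1"))
            .build();
        String body = client.send(submit, HttpResponse.BodyHandlers.ofString()).body();
        System.out.println(body);

        // Keep fetching result pages while the server supplies a nextUri.
        Optional<String> next = extractNextUri(body);
        while (next.isPresent()) {
            HttpRequest poll = HttpRequest.newBuilder(URI.create(next.get()))
                .header("X-Presto-User", "test")
                .GET()
                .build();
            body = client.send(poll, HttpResponse.BodyHandlers.ofString()).body();
            System.out.println(body);
            next = extractNextUri(body);
        }
    }
}
```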
As an example, you can make a simple REST call to Presto to get a JSON dump of recently run queries using the following URL syntax:
http://<prestoServerHost>:<port>/v1/query
The default port is 8080.
You can optionally append a query ID to retrieve the details of a single query. In this example the query ID is 20200926_204458_00000_68x9u:
http://myHost:8080/v1/query/20200926_204458_00000_68x9u
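The same calls can be made from code. The following is a minimal sketch, assuming Java 11 or later and the host name myHost from the example above. The class and method names are hypothetical, and the X-Presto-User header is included on the assumption that your cluster expects a user name on requests; adjust or remove it to match your security setup.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PrestoQueryInfoExample {
    // Builds http://<host>:<port>/v1/query, or .../v1/query/<queryId>
    // when a non-empty query ID is supplied.
    public static URI queryInfoUri(String host, int port, String queryId) {
        String path = "/v1/query";
        if (queryId != null && !queryId.isEmpty()) {
            path += "/" + queryId;
        }
        return URI.create("http://" + host + ":" + port + path);
    }

    // Performs a GET and returns the JSON body as a string.
    public static String fetch(URI uri, String user) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri)
            .header("X-Presto-User", user)
            .GET()
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // All recently run queries.
        System.out.println(fetch(queryInfoUri("myHost", 8080, null), "test"));
        // A single query, using the query ID from the example above.
        System.out.println(fetch(queryInfoUri("myHost", 8080, "20200926_204458_00000_68x9u"), "test"));
    }
}
```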
JDBC
Presto can be accessed using SQL from Java through the JDBC driver. The download link is in the documentation: https://prestodb.io/docs/current/installation/jdbc.html. The following JDBC URL connection string formats are supported:
jdbc:presto://host:port
jdbc:presto://host:port/catalog
jdbc:presto://host:port/catalog/schema
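Here’s example Java code that establishes a connection and runs a query:
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class PrestoJdbcExample {
        public static void main(String[] args) throws SQLException {
            // system.runtime.nodes lists the nodes in the cluster in current
            // Presto releases (very old releases exposed this as sys.node).
            String sql = "SELECT * FROM system.runtime.nodes";
            // Replace "catalog" and "schema" with names that exist on your cluster.
            String url = "jdbc:presto://localhost:8080/catalog/schema";
            // Connect as user "test" with no password; the resources are
            // closed automatically in reverse order.
            try (Connection connection = DriverManager.getConnection(url, "test", null);
                 Statement statement = connection.createStatement();
                 ResultSet rs = statement.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString("node_id"));
                }
            }
        }
    }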
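As a further sketch, the three URL formats listed above can be assembled by a small helper, and connection settings can be passed as java.util.Properties instead of method arguments. The helper and class names here are hypothetical. The property names user, password, and SSL are the ones the Presto JDBC documentation describes as far as I am aware; check them against the driver version you use. The hive catalog and default schema are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class PrestoJdbcUrlExample {
    // Builds one of the three supported URL forms; catalog and schema may be null.
    public static String jdbcUrl(String host, int port, String catalog, String schema) {
        if (catalog == null && schema != null) {
            throw new IllegalArgumentException("A schema can only be given together with a catalog");
        }
        StringBuilder url = new StringBuilder("jdbc:presto://").append(host).append(':').append(port);
        if (catalog != null) {
            url.append('/').append(catalog);
            if (schema != null) {
                url.append('/').append(schema);
            }
        }
        return url.toString();
    }

    // Opens a connection using driver properties rather than positional arguments.
    public static Connection connect(String url, String user, String password) throws SQLException {
        Properties properties = new Properties();
        properties.setProperty("user", user);
        if (password != null) {
            // Assumption: password authentication is used together with SSL.
            properties.setProperty("password", password);
            properties.setProperty("SSL", "true");
        }
        return DriverManager.getConnection(url, properties);
    }

    public static void main(String[] args) throws SQLException {
        String url = jdbcUrl("localhost", 8080, "hive", "default");
        System.out.println(url);
        try (Connection connection = connect(url, "test", null)) {
            System.out.println(connection.getCatalog());
        }
    }
}
```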
ODBC
Several free and paid-for options exist:
- Free: Prestogres is a gateway server that allows clients to use the PostgreSQL protocol to run queries on Presto, so a PostgreSQL ODBC driver can be used to connect: https://github.com/treasure-data/prestogres/blob/master/README.md
- Paid: https://www.simba.com/drivers/presto-odbc-jdbc/
- Paid: https://www.cdata.com/drivers/presto/odbc/
Client libraries
Presto client libraries for C, Go, Java, Node.js, PHP, Python, R, and Ruby are available at https://prestodb.io/resources.html#libraries
We hope the information above helps you use and share data with Presto.