How Much Memory Should I Give A Presto Worker Node?
Presto is an in-memory query engine, so memory configuration and management are naturally important.
Presto’s JVM memory settings nearly always need tuning – you shouldn’t be running Presto with the default setting of 16GB of memory per worker/coordinator.
The -Xmx flag specifies the maximum memory allocation pool for a Java virtual machine. Change the -Xmx16G entry in the jvm.config file to a value based on your cluster’s per-node capacity and number of nodes. See https://prestodb.io/presto-admin/docs/current/installation/presto-configuration.html for how to do this.
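As a concrete illustration, a worker’s jvm.config might look like the sketch below after the change. Only the -Xmx line matters here; the surrounding flags are typical JVM options shown for context, not a recommended full configuration, and the 25G value assumes a 32GB node sized per the rule of thumb that follows.

```
-server
-Xmx25G
-XX:+UseG1GC
-XX:+HeapDumpOnOutOfMemoryError
```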
Rule of Thumb
It is recommended that you set aside 15-20% of total physical memory for the OS. For example, if you are using EC2 “r5.xlarge” instances, which have 32GB of memory, then 32GB minus 20% is 25.6GB, so you would set -Xmx25G in the jvm.config file for the coordinator and workers (or -Xmx27G if you want to go with 15% for the OS).
This assumes no other services are running on the server/instance, so the maximum memory can be given to Presto.
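The arithmetic above can be sketched as a small helper. The function name and rounding-down behavior are illustrative assumptions, not part of Presto itself:

```python
import math

def worker_xmx_gb(total_gb: float, os_reserve_pct: float = 20.0) -> int:
    """Suggest a whole-GB -Xmx value for a Presto worker/coordinator,
    leaving os_reserve_pct of physical memory for the OS."""
    return math.floor(total_gb * (1 - os_reserve_pct / 100))

# r5.xlarge has 32GB of physical memory:
print(worker_xmx_gb(32, 20))  # 25 -> use -Xmx25G
print(worker_xmx_gb(32, 15))  # 27 -> use -Xmx27G
```

Rounding down keeps the heap safely inside the memory left after the OS reservation.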
Beyond the JVM heap setting above, there are two memory-related settings you should check before starting Presto. For most workloads, Presto’s other memory settings will work perfectly well when left at their defaults; there are, however, configurable parameters controlling memory allocation that can be useful for specific workloads. The practical guidelines below will help you decide 1) whether you need to change your Presto memory configuration, and 2) which parameters to change.
You may want to change Presto’s memory configuration to optimise for ETL workloads versus analytical workloads, or for high query concurrency versus single-query scenarios. There’s a great in-depth blog post on Presto’s memory management, written by one of Presto’s contributors, at https://prestodb.io/blog/2019/08/19/memory-tracking which will guide you in making more detailed tweaks.
Configuration Files & Parameters
When first deploying Presto there are two memory settings that need checking. Locate the config.properties files for both the coordinator and the workers. The two important parameters here are query.max-memory and query.max-memory-per-node. Again, see https://prestodb.io/presto-admin/docs/current/installation/presto-configuration.html for rules of thumb and how to configure these parameters based on the available memory.
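As an illustration, the relevant lines in config.properties might look like the following. The values are assumptions for a small cluster, not recommendations: query.max-memory caps the total distributed memory a single query may use across the cluster, while query.max-memory-per-node caps what a query may use on any one worker.

```
query.max-memory=50GB
query.max-memory-per-node=10GB
```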
The guidelines and links above should help you decide how much memory to give a worker node. To avoid all memory configuration work, we recommend using Ahana Cloud for Presto – a fully managed service for Presto that needs zero configuration.