Setting Up Presto Resource Groups

Before you start allowing users onto your Presto cluster, it’s best practice to configure resource groups. A resource group is a query admission control and workload management mechanism that manages resource allocation. Once configured, Presto’s resource group manager can place limits on CPU and/or memory usage, enforce queueing policies on queries, and divide resources among groups and sub-groups. Each query belongs to a single resource group and consumes resources from that group (and its “ancestor” groups). Presto tracks the memory and CPU usage of queries, and these statistics can be made available through the JMX catalog.

Set-up is fairly straightforward, but there are some subtleties to be aware of. For example, once a query starts executing, the resource group manager has no control over it: when a group hits a limit, running queries carry on and new queries are queued instead.
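Because the limits act only when a query is submitted, the decision can be pictured roughly as follows. This is a simplified Python sketch of the idea, not Presto’s actual implementation: the Group class and the admit function are hypothetical names, and the sketch ignores ancestor groups and CPU limits.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical stand-in for one resource group (not a Presto class)."""
    hard_concurrency_limit: int
    max_queued: int
    soft_memory_limit_bytes: int
    running: int = 0
    queued: int = 0
    memory_used_bytes: int = 0


def admit(group: Group) -> str:
    """Decide what happens to a newly submitted query: run, queue, or reject."""
    can_run = (
        group.running < group.hard_concurrency_limit
        and group.memory_used_bytes < group.soft_memory_limit_bytes
    )
    if can_run:
        group.running += 1
        return "run"
    if group.queued < group.max_queued:
        group.queued += 1
        return "queue"
    # The queue is full, so the query is refused outright.
    return "reject"


g = Group(hard_concurrency_limit=1, max_queued=1, soft_memory_limit_bytes=10**9)
print([admit(g) for _ in range(3)])  # ['run', 'queue', 'reject']
```

Note that nothing in the sketch touches a query that is already running, which is the subtlety described above.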

There follows a worked example which sets up four resource groups with controls based on memory usage: 

  1. Group1 can use up to 40% of the (available) cluster memory
  2. Group2 can use up to 30% of memory
  3. An administration group which has high priority for anyone logging in as “admin”
  4. An adhoc group for all other queries. 

Step 1: Add an etc/resource-groups.properties file with the following contents to enable resource groups:

resource-groups.configuration-manager=file
resource-groups.config-file=etc/resource-groups.json

Step 2: Create your etc/resource-groups.json file with your group1 (40% memory) and group2 (30% memory). Some recommendations:

  • Generally, allow a maximum global memory usage of 80% for Presto user activity. The remaining 20% is the recommended overhead for Presto to use as workspace.
  • Configure your maxQueued parameter (required) to the maximum number of queued queries you want the group to tolerate. Once this limit is reached, new queries are rejected with an error message.
  • Configure hardConcurrencyLimit: this is the maximum number of queries the group may run concurrently. Set on a top-level group, it caps concurrency for everything beneath that group.
  • Configure your softMemoryLimit (the maximum amount of distributed memory each group may use before new queries are queued) for each group. You can specify it as a percentage (e.g. 40%) of the cluster’s memory.
  • We’ll leave the cpuLimit parameters alone, which are optional anyway. These can be looked at later depending on your workload and performance requirements. 
  • Consider setting “jmxExport”: true to export the group’s statistics via JMX, allowing you to monitor resource behaviour.
  • I recommend a group for admin use.  I’ve called this simply “admin” in the example json file below.  
  • I recommend an adhoc group for all other queries – a “catch-all”. I added this to the example json file below.
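To make the memory percentages concrete, here is a small Python sketch that turns the softMemoryLimit values from this example into absolute amounts and checks that the sub-groups add up to the global limit. The cluster size of 500 GB is a made-up figure for illustration only, and parse_percent is a hypothetical helper, not part of Presto.

```python
def parse_percent(limit: str) -> float:
    """Turn a percentage string such as '40%' into the number 40.0."""
    return float(limit.rstrip("%"))


# Hypothetical cluster memory, chosen only for illustration.
CLUSTER_MEMORY_GB = 500

# softMemoryLimit values used in this worked example.
global_limit = "80%"
sub_group_limits = {"group1": "40%", "group2": "30%", "adhoc": "10%"}

for name, limit in sub_group_limits.items():
    gigabytes = parse_percent(limit) * CLUSTER_MEMORY_GB / 100
    print(f"{name}: {limit} of the cluster = {gigabytes:.0f} GB")

total = sum(parse_percent(limit) for limit in sub_group_limits.values())
print(f"sub-groups total {total:.0f}%, global limit is {global_limit}")
if total > parse_percent(global_limit):
    print("warning: sub-groups together exceed the global limit")
```

With these numbers the three sub-groups account for exactly the 80% allowed to the global group.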

Step 3: You also need to set up selectors, which let the cluster know who or what you are so that your query can be assigned to a resource group. This can be based on the id of the user running the query, or on a “source” string literal that you provide on the CLI with the --source option. There’s a way to pass this via JDBC calls too. I have added a “group1” and a “group2” selector to the example configuration below, plus “admin”, and “adhoc” for everything else.
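Selectors are evaluated in order and the first match wins, so the catch-all has to come last. The Python sketch below mimics that lookup using the selectors from the example configuration below. It is an illustration rather than Presto’s code: resolve_group is a hypothetical name, and matching each regex against the whole string is an assumption made here.

```python
import re
from typing import Optional

# Selectors in the same order as the example configuration.
SELECTORS = [
    {"user": "admin", "group": "admin"},
    {"source": ".*group1.*", "group": "global.group1"},
    {"source": ".*group2.*", "group": "global.group2"},
    {"group": "global.adhoc.other.${USER}"},  # no conditions: matches everything
]


def resolve_group(user: str, source: Optional[str] = None) -> Optional[str]:
    """Return the group of the first selector whose conditions all match."""
    for selector in SELECTORS:
        if "user" in selector and not re.fullmatch(selector["user"], user):
            continue
        if "source" in selector and not re.fullmatch(selector["source"], source or ""):
            continue
        # Expand the ${USER} template so each user gets their own sub-group.
        return selector["group"].replace("${USER}", user)
    return None


print(resolve_group("admin"))            # admin
print(resolve_group("alice", "group1"))  # global.group1
print(resolve_group("bob"))              # global.adhoc.other.bob
```

If the catch-all selector were listed first, every query would land in the adhoc group regardless of user or source.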

Step 4: Test it! Examine Presto’s log when you start the cluster to make sure your JSON config is valid. If all is well, run a query with the Presto CLI and pass a source that matches one of your selectors, like this:

$ presto --server localhost:8090 --source group1
presto> show catalogs;

Check that the group is being picked up correctly in the Presto UI: the right group (“group1”) should be displayed for the test query.

There is a great example and more detailed description of all the params in the docs at https://prestodb.io/docs/current/admin/resource-groups.html

Once you have the basic resource groups set up, you can tune them. Consider the optional schedulingPolicy, which controls how queued queries are selected to run next. If your resource groups differ in importance, you can also set their schedulingWeight (default 1) to control how their queued queries are selected for execution: a higher weight means a higher priority. For example, users’ adhoc/interactive queries might be set to 10, while batch/ETL-type queries are left at 1. Note that schedulingWeight only takes effect when the parent group uses one of the weighted scheduling policies (weighted or weighted_fair). You can also have Presto recognise DDL automatically and treat such queries differently by giving them their own group.
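As a rough feel for what those weights mean, the Python sketch below computes each group’s idealised share of query starts as weight divided by the sum of weights. This is only an approximation under stated assumptions: the parent uses a weighted policy, both groups always have queries waiting, and the group names and the expected_shares helper are hypothetical.

```python
def expected_shares(weights: dict) -> dict:
    """Map each group to weight / sum(weights), its idealised share of query starts."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Weights from the text: interactive queries at 10, batch/ETL left at the default of 1.
shares = expected_shares({"adhoc": 10, "batch": 1})
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

So a weight of 10 against 1 means the interactive group gets roughly ten of every eleven slots that free up, not absolute priority.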

Here’s a sample etc/resource-groups.json file with the four groups defined:

{
  "rootGroups": [
    {
      "name": "global",
      "softMemoryLimit": "80%",
      "hardConcurrencyLimit": 100,
      "maxQueued": 1000,
      "jmxExport": true,
      "subGroups": [
        {
          "name": "group1",
          "softMemoryLimit": "40%",
          "hardConcurrencyLimit": 5,
          "maxQueued": 100,
          "schedulingWeight": 1
        },
        {
          "name": "group2",
          "softMemoryLimit": "30%",
          "hardConcurrencyLimit": 5,
          "maxQueued": 100,
          "schedulingWeight": 1
        },
        {
          "name": "adhoc",
          "softMemoryLimit": "10%",
          "hardConcurrencyLimit": 50,
          "maxQueued": 1,
          "schedulingWeight": 10,
          "subGroups": [
            {
              "name": "other",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 2,
              "maxQueued": 1,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair",
              "subGroups": [
                {
                  "name": "${USER}",
                  "softMemoryLimit": "10%",
                  "hardConcurrencyLimit": 1,
                  "maxQueued": 100
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "name": "admin",
      "softMemoryLimit": "100%",
      "hardConcurrencyLimit": 50,
      "maxQueued": 100,
      "schedulingPolicy": "query_priority",
      "jmxExport": true
    }
  ],
  "selectors": [
    {
      "user": "admin",
      "group": "admin"
    },
    {
      "source": ".*group1.*",
      "group": "global.group1"
    },
    {
      "source": ".*group2.*",
      "group": "global.group2"
    },
    {
      "group": "global.adhoc.other.${USER}"
    }
  ]
}
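A stray or missing comma is an easy mistake to make in a hand-edited file like this, and it only shows up in the log at start-up. A quick pre-flight check along the following lines can catch it earlier. This is a Python sketch, not an official tool: check_config and check_groups are hypothetical helpers, and the list of fields treated as required is an assumption you should adjust to match the documentation for your Presto version.

```python
import json

# Assumption: the per-group fields this sketch insists on.
REQUIRED_FIELDS = ("name", "maxQueued", "hardConcurrencyLimit", "softMemoryLimit")


def check_groups(groups, path=""):
    """Recursively collect 'missing field' problems for a list of group definitions."""
    problems = []
    for group in groups:
        full_name = f"{path}.{group.get('name', '?')}".lstrip(".")
        for field in REQUIRED_FIELDS:
            if field not in group:
                problems.append(f"{full_name}: missing {field}")
        problems += check_groups(group.get("subGroups", []), full_name)
    return problems


def check_config(text: str):
    """Parse the JSON text and return a list of problems (empty means none found)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    return check_groups(config.get("rootGroups", []))


good = '{"rootGroups": [{"name": "admin", "softMemoryLimit": "100%", "hardConcurrencyLimit": 50, "maxQueued": 100}]}'
bad = '{"rootGroups": [{"name": "global" "softMemoryLimit": "80%"}]}'  # comma missing

print(check_config(good))  # []
print(check_config(bad))   # one "invalid JSON: ..." entry
```

To check the real file, read it and pass the contents in, for example check_config(open("etc/resource-groups.json").read()).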

I hope that helps you get up and running with Presto resource groups.