How do I sync my partition and metastore in Presto?

The sync_partition_metadata procedure brings the metastore in line with the partitions that actually exist on the file system (for example, S3) for an external table. Depending on the number of partitions, the sync can take some time.

Here is a quick reference from the Presto docs: https://prestodb.io/docs/current/connector/hive.html?highlight=sync_partition_metadata

Procedures

  • system.create_empty_partition(schema_name, table_name, partition_columns, partition_values)
    Create an empty partition in the specified table.
  • system.sync_partition_metadata(schema_name, table_name, mode, case_sensitive)
    Check and update partitions list in metastore. There are three modes available:
    • ADD : add any partitions that exist on the file system but not in the metastore.
    • DROP: drop any partitions that exist in the metastore but not on the file system.
    • FULL: perform both ADD and DROP.
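
To make the three modes concrete, here is a minimal Python sketch that models them as set operations. It is only an illustration of the semantics described above, not how Presto implements the procedure; the function name sync_partitions and the partition names are hypothetical.

```python
def sync_partitions(metastore, file_system, mode):
    """Return the partitions the metastore would list after a sync.

    Simplified model: each partition is a plain string such as "ds=2024-01-01".
    """
    mode = mode.upper()
    if mode not in {"ADD", "DROP", "FULL"}:
        raise ValueError(f"unknown mode: {mode}")

    metastore = set(metastore)
    file_system = set(file_system)
    result = set(metastore)

    if mode in {"ADD", "FULL"}:
        # Register partitions found on the file system but missing from the metastore.
        result |= file_system - metastore
    if mode in {"DROP", "FULL"}:
        # Remove partitions the metastore lists but the file system no longer has.
        result -= metastore - file_system
    return result


# Hypothetical state: one stale partition in the metastore, one new one on disk.
metastore = {"ds=2024-01-01", "ds=2024-01-02"}
file_system = {"ds=2024-01-02", "ds=2024-01-03"}

for mode in ("ADD", "DROP", "FULL"):
    print(mode, sorted(sync_partitions(metastore, file_system, mode)))
```

In this model, ADD leaves the stale partition in place, DROP never picks up the new one, and only FULL ends with the metastore matching the file system exactly.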

The case_sensitive argument is optional. The default value is true for compatibility with Hive’s MSCK REPAIR TABLE behavior, which expects the partition column names in file system paths to use lowercase (e.g. col_x=SomeValue). Partitions on the file system not conforming to this convention are ignored, unless the argument is set to false.
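
Putting the arguments together, the following Python sketch assembles a CALL statement for the procedure. The helper name build_sync_call, the catalog name hive, and the schema and table names are assumptions for illustration; use whatever catalog name your Hive connector is registered under, and run the resulting statement through your usual Presto client.

```python
VALID_MODES = {"ADD", "DROP", "FULL"}


def quote(value):
    """Quote a value as a SQL string literal, doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sync_call(schema_name, table_name, mode="FULL",
                    case_sensitive=True, catalog="hive"):
    """Build a CALL statement for system.sync_partition_metadata.

    The catalog name is an assumption; it depends on your deployment.
    """
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    # case_sensitive=False also picks up partition paths whose column
    # names are not lowercase.
    flag = "true" if case_sensitive else "false"
    return (
        f"CALL {catalog}.system.sync_partition_metadata("
        f"{quote(schema_name)}, {quote(table_name)}, {quote(mode)}, {flag})"
    )


# Hypothetical schema and table names.
print(build_sync_call("web", "page_views", mode="add", case_sensitive=False))
# prints: CALL hive.system.sync_partition_metadata('web', 'page_views', 'ADD', false)
```

Since case_sensitive is optional, the fourth argument can be dropped from the generated statement when the default of true is what you want.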

If you want to get started with Presto easily, check out Ahana Cloud. It’s SaaS for Presto and takes away all the complexities of tuning, management and more. It’s free to try out for 14 days, then it’s pay-as-you-go through the AWS marketplace.