What is the difference between a managed table and an external table?
The main difference between a managed and external table is that when you drop an external table, the underlying data files stay intact. This is because the user is expected to manage the data files and directories. With a managed table, the underlying directories and data get wiped out when the table is dropped.
External table: a table created with the 'external_location' property in its WITH clause.
Managed table: a table created without 'external_location', in a schema whose WITH clause sets 'location'.
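As a minimal sketch of the two definitions above, the statements below create one table of each kind through Presto's Hive connector. The catalog name `hive`, the schema `web`, the bucket `my-bucket`, and the table and column names are all hypothetical and chosen only for illustration.

```sql
-- Hypothetical schema: 'location' tells the metastore where managed data lives.
CREATE SCHEMA hive.web
WITH (location = 's3://my-bucket/warehouse/web/');

-- Managed table: no external_location, so its files are stored under the
-- schema location and are removed when the table is dropped.
CREATE TABLE hive.web.page_views (
  view_time timestamp,
  user_id   bigint,
  page_url  varchar
)
WITH (format = 'ORC');

-- External table: external_location points at files that already exist
-- and that some other process is expected to maintain.
CREATE TABLE hive.web.request_logs (
  request_time timestamp,
  url          varchar,
  ip           varchar
)
WITH (
  format = 'TEXTFILE',
  external_location = 's3://my-bucket/data/logs/'
);
```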
By default, you cannot INSERT INTO an external table: the setting hive.non-managed-table-writes-enabled=false prevents you from doing so.
The expectation is that the data in an external table is managed externally, e.g. by Spark, Hadoop, Python scripts, or another external ETL process.
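To illustrate the write restriction, here is a hedged sketch of a statement that Presto is expected to reject under the default configuration. The table `hive.web.request_logs` and its columns are hypothetical; the example assumes it was created earlier with 'external_location'.

```sql
-- Assumes hive.web.request_logs is an external table (hypothetical name).
-- With hive.non-managed-table-writes-enabled=false (the default), this
-- INSERT should fail, because Presto does not write to external tables.
INSERT INTO hive.web.request_logs
VALUES (TIMESTAMP '2024-01-01 00:00:00', '/index.html', '10.0.0.1');
```

Setting hive.non-managed-table-writes-enabled=true in the Hive catalog's properties file would allow the same statement, at the cost of Presto writing into a location that another process is supposed to own.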
Below are the major differences between internal (managed) and external tables in Apache Hive.
|INTERNAL OR MANAGED TABLE|EXTERNAL TABLE|
|---|---|
|By default, Hive creates an internal (managed) table.|Use the EXTERNAL option/clause to create an external table.|
|Hive owns both the metadata and the table data, and manages the lifecycle of the table.|Hive manages the table metadata but not the underlying files.|
|Dropping an internal table drops the metadata from the Hive Metastore and the files from HDFS.|Dropping an external table drops only the metadata from the Metastore, without touching the actual files on HDFS/S3.|
|Metadata for inserts, new partitions, etc. is updated automatically in the metastore during inserts.|You need to explicitly sync partitions (in Presto, with the system.sync_partition_metadata procedure) to bring changes on S3 into the metastore.|
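The drop and partition-sync rows of the table can be sketched in Presto SQL as follows. The catalog `hive`, the schema `web`, and the table names are hypothetical; `events_by_day` is assumed to be a partitioned external table whose partition directories are written by an outside process.

```sql
-- Managed table: removes the metastore entry AND the underlying data files.
DROP TABLE hive.web.page_views;

-- External table: removes only the metastore entry; the files under its
-- external_location are left in place.
DROP TABLE hive.web.request_logs;

-- Partitions added or removed directly on S3 are not visible until the
-- metastore is synced. 'FULL' adds missing partitions and drops stale ones;
-- 'ADD' and 'DROP' are the other supported modes.
CALL hive.system.sync_partition_metadata('web', 'events_by_day', 'FULL');
```

After dropping an external table, re-creating it with the same 'external_location' should make the existing files queryable again, since they were never deleted.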
In short, use managed tables when the metastore should manage the lifecycle of the table, or when generating temporary tables. Use external tables when files are already present or in remote locations, and the files should remain even if the table is dropped.