Terminology for Datalake tables

Understanding the following terms is useful when you work with Datalake tables.

Containers (buckets) and partitions play important roles in how data is organized in object storage. The following terms describe key elements of object storage and of the tables that are stored in it:
Container
A container is a logical abstraction that is used to provide the location for the data. There is no folder concept in object storage; only containers (buckets) and keys.
Data file (Iceberg table)
A physical file containing a table’s data written in file formats such as Parquet and ORC.
Datalake Table
A Datalake table is a collection of Open Data Format (ODF) files in a remote storage directory. The file format can be ORC, Parquet, or AVRO. TEXTFILE and JSONFILE are also supported for Hive Datalake tables. The content of the table is the combination of all of the data files. All data files in a Datalake table must have the same file format.
Externally Managed Table
Externally managed tables are tables that are imported from or exported to an external metastore. Although these tables are visible in Db2 and can be operated on there, they are cataloged in an external metastore that is not owned by Db2. The external metastore where a Datalake table is cataloged is defined by a table property: iceberg.catalog for Iceberg tables and bigsql.external.catalog for non-Iceberg tables. The value of either property is the external-metastore-name that was specified on the SYSHADOOP.REGISTER_EXT_METASTORE procedure used to register the external metastore. See Table Properties for more information on these table properties. See Restrictions and limitations for a list of restrictions on these tables.
External Metastore
A metastore that belongs to a separate or remote SQL engine. You can register the external metastore with Db2 and import its table metadata into the Db2 local metastore.
File path

A file path is the complete path to the file where you want to store data. The S3 file system implementation allows zero-length files to be treated like directories, and file names that contain a forward slash (/) are treated like nested directories. The file path includes the container name, an optional file path, and a file name.

In object storage, the file path is used when a table is created. All files in the same file path contribute to the table data. All files in the file path must use the same file format (PARQUET, ORC or TEXTFILE) and cannot include a mix of file formats. You can add more data by adding another file to the file path.
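
The composition of a file path described above can be illustrated with a minimal Python sketch. The helper names (split_file_path, formats_in_path) and the sample bucket and key names are hypothetical, and the sketch assumes that the file format can be inferred from the file name extension, which is a simplification and not something Db2 relies on.

```python
from pathlib import PurePosixPath
from typing import List, NamedTuple, Set


class ObjectPath(NamedTuple):
    container: str   # container (bucket) name
    prefix: str      # optional directory-like part of the key, "" if absent
    file_name: str   # last segment of the key


def split_file_path(path: str) -> ObjectPath:
    """Split 'container/optional/path/file' into its three parts."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected at least a container and a file name")
    return ObjectPath(parts[0], "/".join(parts[1:-1]), parts[-1])


def formats_in_path(keys: List[str]) -> Set[str]:
    """Return the lowercased file extensions found among the keys."""
    return {PurePosixPath(k).suffix.lower().lstrip(".") for k in keys}


# Hypothetical keys that all sit under the same file path.
keys = [
    "sales-bucket/warehouse/orders/part-0001.parquet",
    "sales-bucket/warehouse/orders/part-0002.parquet",
]
first = split_file_path(keys[0])
print(first.container, first.prefix, first.file_name)

# A table's file path should contain a single file format.
print("single format:", len(formats_in_path(keys)) == 1)
```

Adding a third key with the same prefix and extension to the list would model adding more data to the table; adding a key with a different extension would make the format check fail.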

Manifest file (Iceberg table)
Contains links to one or more data files, along with metadata about the table data stored in each file.
Manifest list file (Iceberg table)
Contains links to the manifest files that belong to a snapshot. Each manifest list file stores metadata about its manifests, including partition statistics and data file counts.
Metadata file (Iceberg table)
Contains a snapshot of a table’s state for a given point in time. It contains metadata about the table such as the schema (column and partition definitions), table properties, and a list of snapshots for a table.
Metastore
Stores information about the structure of tables, their columns, data types, partitioning, and other details necessary to manage and access data.
Partition
A partition is a set of files whose rows share a common column value. Partitioning divides the data into multiple file paths that are treated like directories. It can greatly improve query performance, because a query that filters on the partitioning column needs to read only the files in the matching partitions.
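
As a rough illustration of how partitioning maps rows to directory-like file paths, the following Python sketch groups rows by a partitioning column. It assumes the Hive-style column=value naming for partition paths; the function names, table path, and sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def partition_path(table_path: str, column: str, value: str) -> str:
    """Build the path for one partition (assumed column=value naming)."""
    return f"{table_path.rstrip('/')}/{column}={value}"


def group_rows_by_partition(
    rows: List[dict], table_path: str, column: str
) -> Dict[str, List[dict]]:
    """Map each partition path to the rows that share its column value."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        path = partition_path(table_path, column, str(row[column]))
        groups[path].append(row)
    return dict(groups)


# Hypothetical rows, partitioned on the "country" column.
rows = [
    {"order_id": 1, "country": "US"},
    {"order_id": 2, "country": "DE"},
    {"order_id": 3, "country": "US"},
]
layout = group_rows_by_partition(rows, "sales-bucket/warehouse/orders", "country")
for path in sorted(layout):
    print(path, len(layout[path]))
```

In this layout, a query that filters on country = 'DE' would only need the files under the country=DE path.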
SerDe

The term SerDe is short for Serializer/Deserializer. Hive uses the SerDe interface (and file formats) for I/O, and this interface is used by the Big SQL I/O engine. The interface handles both serialization and deserialization while also interpreting the results of serialization as individual fields for processing. SerDes are included for handling ORC, Parquet, and TEXTFILE.

For an introduction to SerDe, see Hive SerDe.

Snapshot (Iceberg table)
A snapshot represents the state of a table at a point in time and is used to access the complete set of data files in the table at that time. Each snapshot points to a manifest list file.
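
The relationship between the Iceberg terms in this list (metadata file, snapshot, manifest list file, manifest file, data file) can be sketched as a small Python model. This is a simplified illustration with hypothetical class and file names, not the actual Iceberg file layout or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manifest:
    data_files: List[str]  # paths of the data files tracked by this manifest


@dataclass
class ManifestList:
    manifests: List[Manifest]  # manifests that belong to one snapshot


@dataclass
class Snapshot:
    snapshot_id: int
    manifest_list: ManifestList  # each snapshot points to one manifest list


@dataclass
class TableMetadata:
    snapshots: List[Snapshot] = field(default_factory=list)


def data_files_for(snapshot: Snapshot) -> List[str]:
    """Walk snapshot -> manifest list -> manifests -> data files."""
    return [f for m in snapshot.manifest_list.manifests for f in m.data_files]


# Hypothetical table history: the second snapshot adds one more data file.
m1 = Manifest(["data/part-0001.parquet", "data/part-0002.parquet"])
m2 = Manifest(["data/part-0003.parquet"])
table = TableMetadata(
    snapshots=[
        Snapshot(1, ManifestList([m1])),
        Snapshot(2, ManifestList([m1, m2])),
    ]
)
for snap in table.snapshots:
    print(snap.snapshot_id, data_files_for(snap))
```

Because the second snapshot reuses the first manifest, the earlier table state remains readable through snapshot 1 while snapshot 2 exposes the additional file.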