
Apache Iceberg

Apache Iceberg is an open-source table format for large analytic datasets stored in data lakes. It offers atomic commits, concurrent writes, and schema evolution, which improve the reliability and performance of a data lake. Iceberg's table layout also simplifies data file management and enables fine-grained partitioning.

Private Connection

To set up a private connection for Apache Iceberg, follow the Amazon S3 instructions. Iceberg stores both its data files and its metadata on S3, so you only need to configure the corresponding S3 paths.

In addition to the Amazon S3 access policy, you must configure a separate policy for the Iceberg catalog backend.

At the time of writing, Apache Iceberg supports AWS Glue, DynamoDB, RDS JDBC, Nessie, and other service-based catalogs. For details, see the Apache Iceberg Integrations documentation.
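
As an illustration of how the S3 policy and the catalog policy fit together, the following is a minimal sketch that builds an IAM policy document for the case where AWS Glue is the catalog. It is not an official or complete policy. The function name `build_iceberg_glue_policy`, the bucket, prefix, region, account ID, and database values are all hypothetical placeholders, and the action lists assume read and write access. A read-only setup would likely need fewer actions, and other catalogs (DynamoDB, RDS JDBC, Nessie) need different permissions altogether.

```python
import json


def build_iceberg_glue_policy(bucket, prefix, region, account_id, database):
    """Return an IAM policy document (as a dict) covering the S3 paths
    and a Glue catalog database. All argument values are placeholders
    supplied by the caller; this helper is illustrative only."""
    prefix = prefix.strip("/")
    # Object-level ARN: scope to the prefix if one is given.
    if prefix:
        object_arn = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        object_arn = f"arn:aws:s3:::{bucket}/*"
    glue_base = f"arn:aws:glue:{region}:{account_id}"

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read and write Iceberg data and metadata files.
                "Sid": "IcebergS3Objects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": object_arn,
            },
            {
                # Listing is granted on the bucket, not on objects.
                "Sid": "IcebergS3List",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Additional policy for the catalog backend (Glue assumed).
                "Sid": "IcebergGlueCatalog",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:CreateTable",
                    "glue:UpdateTable",
                    "glue:DeleteTable",
                ],
                "Resource": [
                    f"{glue_base}:catalog",
                    f"{glue_base}:database/{database}",
                    f"{glue_base}:table/{database}/*",
                ],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; replace with your own bucket, region, and account.
    policy = build_iceberg_glue_policy(
        bucket="example-bucket",
        prefix="warehouse",
        region="us-east-1",
        account_id="123456789012",
        database="analytics",
    )
    print(json.dumps(policy, indent=2))
```

The first two statements correspond to the S3 paths mentioned above, and the third is the additional catalog policy. Keeping them as separate statements makes it easier to swap the catalog statement when a different catalog backend is used.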