Manage Apache Iceberg Catalog

After you create an Apache Iceberg catalog, you can discover and use Iceberg tables stored in Amazon S3 directly from Flink SQL in Ververica Cloud. This topic shows how to create, verify, use, and delete Iceberg catalogs.

Supported Version: VERA Engine 4.3

Background Information

Iceberg uses a catalog backend to resolve table metadata (namespaces/databases and tables). Ververica Cloud supports Iceberg catalogs backed by:

AWS Glue Data Catalog: Recommended when you want a managed metastore.
Hadoop catalog (catalog-type = 'hadoop'): A directory-based catalog that keeps table metadata under the warehouse path, without an external metastore.

Prerequisites

Before you create an Iceberg catalog, ensure you have the following:

An S3 bucket and a dedicated prefix to use as the Iceberg warehouse.
Network access (or a private connection) to Amazon S3, plus IAM permissions to access the bucket. For more information, see Amazon S3.
If you use AWS Glue as the catalog backend: IAM permissions to access AWS Glue, plus network access to the Glue endpoints.

Create an Iceberg Catalog

You can create catalogs in the SQL Editor using the CREATE CATALOG statement.

Create a Glue Catalog

To create a Glue catalog, execute the following statement:

CREATE CATALOG iceberg_glue_catalog WITH (
  'type'         = 'iceberg',
  'catalog-impl' = 'org.apache.iceberg.aws.glue.GlueCatalog',
  'io-impl'      = 'org.apache.iceberg.aws.s3.S3FileIO',
  'warehouse'    = 's3://<your-bucket>/<your-prefix>/iceberg-glue-catalog/'
);

Create a Hadoop Catalog

Use a Hadoop catalog when you want a warehouse-based catalog without an external metastore. For S3-backed warehouses in Ververica Cloud, use the s3a:// scheme in the warehouse path and set 'hadoop.fs.s3a.impl' to the Ververica filesystem adapter shown in the statement below.

To create a Hadoop catalog, execute the following statement:

CREATE CATALOG iceberg_hadoop_catalog WITH (
  'type'               = 'iceberg',
  'catalog-type'       = 'hadoop',
  'warehouse'          = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
  -- FileIO (choose one)
  'io-impl'            = 'org.apache.iceberg.hadoop.HadoopFileIO', -- recommended with s3a://
  -- 'io-impl'         = 'org.apache.iceberg.aws.s3.S3FileIO',
  -- Required for s3a:// warehouses on Ververica Cloud
  'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);

Verify and Browse Catalogs

After you create a catalog, you can verify and browse it using the following commands:

SHOW CATALOGS;
USE CATALOG iceberg_glue_catalog;   -- or iceberg_hadoop_catalog
SHOW DATABASES;
SHOW TABLES;
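
As a further check, you can inspect the session's active catalog and database before browsing. The sketch below assumes the SQL Editor accepts the standard Flink SQL statements SHOW CURRENT CATALOG, SHOW CURRENT DATABASE, and USE. The database name test_db is a placeholder borrowed from the examples in this topic and may not exist in your catalog yet.

```sql
-- Confirm which catalog the session is currently using.
SHOW CURRENT CATALOG;

-- Switch to a database inside the current catalog, then list its tables.
-- test_db is a placeholder; substitute a name returned by SHOW DATABASES.
USE test_db;
SHOW CURRENT DATABASE;
SHOW TABLES;
```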

Create and Use Databases and Tables

Iceberg catalogs support CREATE DATABASE and CREATE TABLE statements. Use fully qualified identifiers in the format <catalog>.<database>.<table>.
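
To illustrate the identifier format, the two queries in the following sketch should resolve to the same table. The first spells out the catalog, database, and table. The second relies on the session context set by USE CATALOG and USE. The sketch assumes that iceberg_glue_catalog, test_db, and test_table already exist.

```sql
-- Fully qualified: resolves regardless of the current catalog or database.
SELECT * FROM iceberg_glue_catalog.test_db.test_table;

-- Same table, addressed through the session context instead.
USE CATALOG iceberg_glue_catalog;
USE test_db;
SELECT * FROM test_table;
```

Fully qualified names are the safer choice in saved scripts, because they do not depend on statements that ran earlier in the session.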

Create a Database

To create a database, execute the following statement:

CREATE DATABASE IF NOT EXISTS iceberg_glue_catalog.test_db;

Create a Table

To create a table, execute the following statement:

CREATE TABLE IF NOT EXISTS iceberg_glue_catalog.test_db.test_table (
  name STRING,
  age  INT
);
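
To confirm that the table was registered with the expected columns, a quick inspection such as the following may help. This is a sketch that assumes the standard Flink SQL statements DESCRIBE and SHOW TABLES FROM are available in the SQL Editor.

```sql
-- List the tables in the database; test_table should appear.
SHOW TABLES FROM iceberg_glue_catalog.test_db;

-- Show the column names and types of the new table.
DESCRIBE iceberg_glue_catalog.test_db.test_table;
```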

Write to a Table

The following example uses the DataGen connector to write data to an Iceberg table:

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_glue_catalog.test_db.test_table
SELECT name, age FROM datagen;
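
For a quick smoke test that needs no source connector, you can also insert a few literal rows. The sketch below uses made-up values for illustration and assumes the table created in the previous step.

```sql
-- Insert two literal rows; the values are arbitrary sample data.
INSERT INTO iceberg_glue_catalog.test_db.test_table
VALUES ('alice', 30), ('bob', 25);
```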

Read From a Table

To read data from an Iceberg table, execute the following statement:

SELECT * FROM iceberg_glue_catalog.test_db.test_table;
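
Beyond a full scan, a row count or a filtered query is often enough to confirm that a write job committed data. The following sketch assumes the table and columns defined earlier in this topic. The filter on age is arbitrary, and how results are displayed may depend on whether the query runs in batch or streaming mode.

```sql
-- Count the rows currently visible in the table.
SELECT COUNT(*) AS row_count
FROM iceberg_glue_catalog.test_db.test_table;

-- Preview a handful of rows that match a simple predicate.
SELECT name, age
FROM iceberg_glue_catalog.test_db.test_table
WHERE age >= 0
LIMIT 10;
```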

End-to-End Examples

Example 1: DataGen to Iceberg (Glue catalog)

CREATE CATALOG iceberg_glue_catalog WITH (
  'type'         = 'iceberg',
  'catalog-impl' = 'org.apache.iceberg.aws.glue.GlueCatalog',
  'io-impl'      = 'org.apache.iceberg.aws.s3.S3FileIO',
  'warehouse'    = 's3://<your-bucket>/<your-prefix>/iceberg-glue-catalog/'
);

CREATE DATABASE IF NOT EXISTS iceberg_glue_catalog.test_db;

CREATE TABLE IF NOT EXISTS iceberg_glue_catalog.test_db.test_table (
  name STRING,
  age  INT
);

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_glue_catalog.test_db.test_table
SELECT name, age FROM datagen;

Example 2: DataGen to Iceberg (Hadoop catalog)

CREATE CATALOG iceberg_hadoop_catalog WITH (
  'type'               = 'iceberg',
  'catalog-type'       = 'hadoop',
  'warehouse'          = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
  'io-impl'            = 'org.apache.iceberg.hadoop.HadoopFileIO',
  'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);

CREATE DATABASE IF NOT EXISTS iceberg_hadoop_catalog.test_db;

CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.test_table (
  name STRING,
  age  INT
);

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_hadoop_catalog.test_db.test_table
SELECT name, age FROM datagen;

Example 3: Iceberg to Iceberg Copy (Hadoop catalog)

-- Job 1: Create and populate source_table
CREATE CATALOG iceberg_hadoop_catalog WITH (
  'type'               = 'iceberg',
  'catalog-type'       = 'hadoop',
  'warehouse'          = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
  'io-impl'            = 'org.apache.iceberg.hadoop.HadoopFileIO',
  'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);

CREATE DATABASE IF NOT EXISTS iceberg_hadoop_catalog.test_db;

CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.source_table (
  name STRING,
  age  INT
);

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_hadoop_catalog.test_db.source_table
SELECT name, age FROM datagen;

-- Job 2: Read from source_table and write to dest_table.
-- Run this after Job 1 has written data; it reuses the catalog and database created in Job 1.
CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.dest_table (
  name STRING,
  age  INT
);

INSERT INTO iceberg_hadoop_catalog.test_db.dest_table
SELECT name, age FROM iceberg_hadoop_catalog.test_db.source_table;

Delete a Catalog

To delete a catalog, execute the DROP CATALOG statement:

DROP CATALOG iceberg_glue_catalog;
-- or
DROP CATALOG iceberg_hadoop_catalog;
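
If you also want to remove the test objects created in this topic, one possible cleanup order is table, then database, then catalog. The sketch below assumes the standard Flink SQL IF EXISTS forms are accepted. Whether DROP TABLE also removes the underlying data files in S3 depends on the catalog implementation, so verify that behavior before running it against a shared warehouse.

```sql
-- Drop the test table first; a non-empty database typically cannot be dropped.
DROP TABLE IF EXISTS iceberg_hadoop_catalog.test_db.test_table;

-- Drop the now-empty database.
DROP DATABASE IF EXISTS iceberg_hadoop_catalog.test_db;

-- Finally, remove the catalog definition; IF EXISTS avoids an error on reruns.
DROP CATALOG IF EXISTS iceberg_hadoop_catalog;
```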
