Manage Apache Iceberg Catalogs
After you create an Apache Iceberg catalog, you can discover and use Iceberg tables stored in Amazon S3 directly from Flink SQL in Ververica Cloud. This topic shows how to create, verify, use, and delete Iceberg catalogs.
Supported Version: VERA Engine 4.3
Background Information
Iceberg uses a catalog backend to resolve table metadata, that is, namespaces (databases) and the tables they contain. Ververica Cloud supports Iceberg catalogs backed by:
- AWS Glue Data Catalog: Recommended when you want a managed metastore.
- Hadoop catalog ('catalog-type' = 'hadoop'): A catalog based on a warehouse directory that requires no external metastore.
Prerequisites
Before you create an Iceberg catalog, ensure you have the following:
- An S3 bucket and a dedicated prefix to use as the Iceberg warehouse.
- Network access to S3, either directly or over a private connection, and IAM permissions to access S3. For more information, see Amazon S3.
- If you use AWS Glue as the catalog backend: IAM permissions to access AWS Glue and network access to the Glue endpoints.
Create an Iceberg Catalog
You can create catalogs in the SQL Editor using the CREATE CATALOG statement.
Create a Glue Catalog
To create a Glue catalog, execute the following statement:
CREATE CATALOG iceberg_glue_catalog WITH (
'type' = 'iceberg',
'catalog-impl' = 'org.apache.iceberg.aws.glue.GlueCatalog',
'io-impl' = 'org.apache.iceberg.aws.s3.S3FileIO',
'warehouse' = 's3://<your-bucket>/<your-prefix>/iceberg-glue-catalog/'
);
Create a Hadoop Catalog
Use a Hadoop catalog when you want a warehouse-based catalog without an external metastore. For S3-backed warehouses in Ververica Cloud, use the s3a:// scheme in the warehouse path and set 'hadoop.fs.s3a.impl' to the S3A filesystem adapter.
To create a Hadoop catalog, execute the following statement:
CREATE CATALOG iceberg_hadoop_catalog WITH (
'type' = 'iceberg',
'catalog-type' = 'hadoop',
'warehouse' = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
-- FileIO (choose one)
'io-impl' = 'org.apache.iceberg.hadoop.HadoopFileIO', -- recommended with s3a://
-- 'io-impl' = 'org.apache.iceberg.aws.s3.S3FileIO',
-- Required for s3a:// warehouses on Ververica Cloud
'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);
Verify and Browse Catalogs
After you create a catalog, you can verify and browse it using the following statements:
SHOW CATALOGS;
USE CATALOG iceberg_glue_catalog; -- or iceberg_hadoop_catalog
SHOW DATABASES;
SHOW TABLES;
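To confirm which catalog and database the session is using after USE CATALOG, a short check such as the following can help. This is a minimal sketch that assumes the standard Flink SQL SHOW CURRENT statements are available in the Ververica Cloud SQL Editor. The catalog name is the one created earlier in this topic.

```sql
-- Switch to the Iceberg catalog created earlier in this topic.
USE CATALOG iceberg_glue_catalog;

-- Confirm the active catalog and database for this session.
SHOW CURRENT CATALOG;
SHOW CURRENT DATABASE;
```

SHOW TABLES lists the tables of the current database only, so its result changes when you switch databases with USE <database>.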
Create and Use Databases and Tables
Iceberg catalogs support CREATE DATABASE and CREATE TABLE statements. Use fully qualified identifiers in the format <catalog>.<database>.<table>.
Create a Database
To create a database, execute the following statement:
CREATE DATABASE IF NOT EXISTS iceberg_glue_catalog.test_db;
Create a Table
To create a table, execute the following statement:
CREATE TABLE IF NOT EXISTS iceberg_glue_catalog.test_db.test_table (
name STRING,
age INT
);
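To check that the table was registered with the expected columns, you can inspect its schema. The following sketch assumes the standard Flink SQL DESCRIBE statement is supported for Iceberg catalog tables in your environment.

```sql
-- Show the column names and types of the table created above.
DESCRIBE iceberg_glue_catalog.test_db.test_table;
```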
Write to a Table
The following example uses the DataGen connector to write data to an Iceberg table:
CREATE TEMPORARY TABLE datagen (
name STRING,
age INT
) WITH (
'connector' = 'datagen',
'rows-per-second' = '10',
'number-of-rows' = '100'
);
INSERT INTO iceberg_glue_catalog.test_db.test_table
SELECT name, age FROM datagen;
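For a quick smoke test without a source connector, a few literal rows can be written with INSERT INTO ... VALUES. This is only a sketch: the sample names and ages are made up for illustration, and the column order must match the table definition (name, age).

```sql
-- Write two hand-written rows to the Iceberg table.
-- The values are illustrative only.
INSERT INTO iceberg_glue_catalog.test_db.test_table
VALUES
  ('alice', 30),
  ('bob', 25);
```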
Read from a Table
To read data from an Iceberg table, execute the following statement:
SELECT * FROM iceberg_glue_catalog.test_db.test_table;
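Reads are ordinary Flink SQL queries, so filters and aggregates work in the usual way. The following sketch shows a filtered projection and a row count. The age threshold of 18 is an arbitrary value chosen for illustration.

```sql
-- Return only the rows that satisfy a predicate.
SELECT name, age
FROM iceberg_glue_catalog.test_db.test_table
WHERE age >= 18;

-- Count the rows currently stored in the table.
SELECT COUNT(*) AS row_count
FROM iceberg_glue_catalog.test_db.test_table;
```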
End-to-End Examples
Example 1: DataGen to Iceberg (Glue catalog)
CREATE CATALOG iceberg_glue_catalog WITH (
'type' = 'iceberg',
'catalog-impl' = 'org.apache.iceberg.aws.glue.GlueCatalog',
'io-impl' = 'org.apache.iceberg.aws.s3.S3FileIO',
'warehouse' = 's3://<your-bucket>/<your-prefix>/iceberg-glue-catalog/'
);
CREATE DATABASE IF NOT EXISTS iceberg_glue_catalog.test_db;
CREATE TABLE IF NOT EXISTS iceberg_glue_catalog.test_db.test_table (
name STRING,
age INT
);
CREATE TEMPORARY TABLE datagen (
name STRING,
age INT
) WITH (
'connector' = 'datagen',
'rows-per-second' = '10',
'number-of-rows' = '100'
);
INSERT INTO iceberg_glue_catalog.test_db.test_table
SELECT name, age FROM datagen;
Example 2: DataGen to Iceberg (Hadoop catalog)
CREATE CATALOG iceberg_hadoop_catalog WITH (
'type' = 'iceberg',
'catalog-type' = 'hadoop',
'warehouse' = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
'io-impl' = 'org.apache.iceberg.hadoop.HadoopFileIO',
'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);
CREATE DATABASE IF NOT EXISTS iceberg_hadoop_catalog.test_db;
CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.test_table (
name STRING,
age INT
);
CREATE TEMPORARY TABLE datagen (
name STRING,
age INT
) WITH (
'connector' = 'datagen',
'rows-per-second' = '10',
'number-of-rows' = '100'
);
INSERT INTO iceberg_hadoop_catalog.test_db.test_table
SELECT name, age FROM datagen;
Example 3: Iceberg to Iceberg Copy (Hadoop catalog)
-- Job 1: Create and populate source_table
CREATE CATALOG iceberg_hadoop_catalog WITH (
'type' = 'iceberg',
'catalog-type' = 'hadoop',
'warehouse' = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
'io-impl' = 'org.apache.iceberg.hadoop.HadoopFileIO',
'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);
CREATE DATABASE IF NOT EXISTS iceberg_hadoop_catalog.test_db;
CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.source_table (
name STRING,
age INT
);
CREATE TEMPORARY TABLE datagen (
name STRING,
age INT
) WITH (
'connector' = 'datagen',
'rows-per-second' = '10',
'number-of-rows' = '100'
);
INSERT INTO iceberg_hadoop_catalog.test_db.source_table
SELECT name, age FROM datagen;
-- Job 2: Read from source_table and write to dest_table
CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.dest_table (
name STRING,
age INT
);
INSERT INTO iceberg_hadoop_catalog.test_db.dest_table
SELECT name, age FROM iceberg_hadoop_catalog.test_db.source_table;
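One way to sanity-check the copy is to compare the row counts of the two tables once both jobs have finished. This is a sketch rather than a full validation. If Job 2 copied every row and dest_table was empty beforehand, the two counts should be equal.

```sql
-- Row count of the source table.
SELECT COUNT(*) AS source_rows
FROM iceberg_hadoop_catalog.test_db.source_table;

-- Row count of the destination table.
SELECT COUNT(*) AS dest_rows
FROM iceberg_hadoop_catalog.test_db.dest_table;
```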
Delete a Catalog
To delete a catalog, execute the DROP CATALOG statement:
DROP CATALOG iceberg_glue_catalog;
-- or
DROP CATALOG iceberg_hadoop_catalog;
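If a cleanup script might run more than once, the IF EXISTS clause avoids an error when the catalog is already gone. The sketch below assumes Ververica Cloud follows open-source Flink SQL here, where DROP CATALOG accepts IF EXISTS and removes only the catalog registration, not the data in the warehouse.

```sql
-- Drop the catalog only if it is still registered.
DROP CATALOG IF EXISTS iceberg_hadoop_catalog;

-- Confirm that the catalog no longer appears in the list.
SHOW CATALOGS;
```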