Manage Apache Iceberg Catalog

Applies to: BYOC

After you create an Apache Iceberg catalog, you can discover and use Iceberg tables stored in Amazon S3 directly from Flink SQL in your Ververica deployment. This topic shows how to create, verify, use, and delete Iceberg catalogs.

Supported Version: VERA Engine 4.3

Background Information

Iceberg uses a catalog backend to resolve table metadata (namespaces/databases and tables). Ververica Cloud supports Iceberg catalogs backed by:

- AWS Glue Data Catalog: Recommended when you want a managed metastore.
- Hadoop catalog (catalog-type = 'hadoop'): A catalog based on a warehouse directory, with no external metastore.

Prerequisites

Before you create an Iceberg catalog, ensure you have the following:

- An S3 bucket and a dedicated prefix to use as the Iceberg warehouse.
- Network access to Amazon S3, either direct or over a private connection, and IAM permissions to access the bucket. For more information, see Amazon S3.
- If you use AWS Glue as the catalog backend: IAM permissions to access AWS Glue and network access to the Glue endpoints.

Create an Iceberg Catalog

You can create catalogs in the SQL Editor using the CREATE CATALOG statement.

Create a Glue Catalog

To create a Glue catalog, execute the following statement:

SQL

CREATE CATALOG iceberg_glue_catalog WITH (
  'type'         = 'iceberg',
  'catalog-impl' = 'org.apache.iceberg.aws.glue.GlueCatalog',
  'io-impl'      = 'org.apache.iceberg.aws.s3.S3FileIO',
  'warehouse'    = 's3://<your-bucket>/<your-prefix>/iceberg-glue-catalog/'
);

Create a Hadoop Catalog

Use a Hadoop catalog when you want a warehouse-based catalog without an external metastore. For S3-backed warehouses in Ververica Cloud, use the s3a:// scheme in the warehouse path and set hadoop.fs.s3a.impl to the S3A filesystem adapter shown below.

To create a Hadoop catalog, execute the following statement:

SQL

CREATE CATALOG iceberg_hadoop_catalog WITH (
  'type'               = 'iceberg',
  'catalog-type'       = 'hadoop',
  'warehouse'          = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
  -- FileIO (choose one)
  'io-impl'            = 'org.apache.iceberg.hadoop.HadoopFileIO', -- recommended with s3a://
  -- 'io-impl'         = 'org.apache.iceberg.aws.s3.S3FileIO',
  -- Required for s3a:// warehouses on Ververica Cloud
  'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);

Verify and Browse Catalogs

After you create a catalog, you can verify and browse it using the following commands:

SQL

SHOW CATALOGS;
USE CATALOG iceberg_glue_catalog;   -- or iceberg_hadoop_catalog
SHOW DATABASES;
SHOW TABLES;
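
To confirm that USE CATALOG took effect, the session's current context can be printed. This is a minimal sketch using statements from open-source Flink SQL; it assumes that VERA Engine accepts them unchanged.

```sql
-- Print the catalog the session is currently using.
SHOW CURRENT CATALOG;

-- Print the database the session is currently using within that catalog.
SHOW CURRENT DATABASE;
```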

Create and Use Databases and Tables

Iceberg catalogs support CREATE DATABASE and CREATE TABLE statements. Use fully qualified identifiers in the format <catalog>.<database>.<table>.

Create a Database

To create a database, execute the following statement:

SQL

CREATE DATABASE IF NOT EXISTS iceberg_glue_catalog.test_db;

Create a Table

To create a table, execute the following statement:

SQL

CREATE TABLE IF NOT EXISTS iceberg_glue_catalog.test_db.test_table (
  name STRING,
  age  INT
);
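
After the table is created, its schema can be checked before any data is written. The sketch below uses the DESCRIBE statement from open-source Flink SQL and assumes the example table above exists.

```sql
-- List the column names and types of the example table.
DESCRIBE iceberg_glue_catalog.test_db.test_table;
```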

Write to a Table

The following example uses the DataGen connector to write data to an Iceberg table:

SQL

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_glue_catalog.test_db.test_table
SELECT name, age FROM datagen;
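
For a quick smoke test without a generator source, a few literal rows can be written with a VALUES clause instead. This is a minimal sketch that assumes the example table from this topic exists; the row values are made up for illustration.

```sql
-- Insert two hand-written rows (illustrative values) into the example table.
INSERT INTO iceberg_glue_catalog.test_db.test_table
VALUES ('alice', 30), ('bob', 25);
```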

Read From a Table

To read data from an Iceberg table, execute the following statement:

SQL

SELECT * FROM iceberg_glue_catalog.test_db.test_table;
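
The fully qualified name can be shortened once the session has a current catalog and database. The sketch below assumes the example catalog, database, and table from this topic already exist; the filter and the LIMIT are arbitrary and only illustrate a narrower query.

```sql
-- Set the session context so the table can be referenced by its short name.
USE CATALOG iceberg_glue_catalog;
USE test_db;

-- Reads from iceberg_glue_catalog.test_db.test_table via the short name.
SELECT name, age
FROM test_table
WHERE age > 0
LIMIT 10;
```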

End-to-End Examples

Example 1: Datagen to Iceberg (Glue catalog)

SQL

CREATE CATALOG iceberg_glue_catalog WITH (
  'type'         = 'iceberg',
  'catalog-impl' = 'org.apache.iceberg.aws.glue.GlueCatalog',
  'io-impl'      = 'org.apache.iceberg.aws.s3.S3FileIO',
  'warehouse'    = 's3://<your-bucket>/<your-prefix>/iceberg-glue-catalog/'
);

CREATE DATABASE IF NOT EXISTS iceberg_glue_catalog.test_db;

CREATE TABLE IF NOT EXISTS iceberg_glue_catalog.test_db.test_table (
  name STRING,
  age  INT
);

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_glue_catalog.test_db.test_table
SELECT name, age FROM datagen;

Example 2: Datagen to Iceberg (Hadoop catalog)

SQL

CREATE CATALOG iceberg_hadoop_catalog WITH (
  'type'               = 'iceberg',
  'catalog-type'       = 'hadoop',
  'warehouse'          = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
  'io-impl'            = 'org.apache.iceberg.hadoop.HadoopFileIO',
  'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);

CREATE DATABASE IF NOT EXISTS iceberg_hadoop_catalog.test_db;

CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.test_table (
  name STRING,
  age  INT
);

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_hadoop_catalog.test_db.test_table
SELECT name, age FROM datagen;

Example 3: Iceberg to Iceberg Copy (Hadoop catalog)

SQL

-- Job 1: Create and populate source_table
CREATE CATALOG iceberg_hadoop_catalog WITH (
  'type'               = 'iceberg',
  'catalog-type'       = 'hadoop',
  'warehouse'          = 's3a://<your-bucket>/<your-prefix>/iceberg-hadoop-catalog/',
  'io-impl'            = 'org.apache.iceberg.hadoop.HadoopFileIO',
  'hadoop.fs.s3a.impl' = 'com.ververica.connectors.iceberg.fs.HadoopFileSystemAdapter'
);

CREATE DATABASE IF NOT EXISTS iceberg_hadoop_catalog.test_db;

CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.source_table (
  name STRING,
  age  INT
);

CREATE TEMPORARY TABLE datagen (
  name STRING,
  age  INT
) WITH (
  'connector'        = 'datagen',
  'rows-per-second'  = '10',
  'number-of-rows'   = '100'
);

INSERT INTO iceberg_hadoop_catalog.test_db.source_table
SELECT name, age FROM datagen;

-- Job 2: Read from source_table and write to dest_table
CREATE TABLE IF NOT EXISTS iceberg_hadoop_catalog.test_db.dest_table (
  name STRING,
  age  INT
);

INSERT INTO iceberg_hadoop_catalog.test_db.dest_table
SELECT name, age FROM iceberg_hadoop_catalog.test_db.source_table;

Delete a Catalog

To delete a catalog, execute the DROP CATALOG statement:

SQL

DROP CATALOG iceberg_glue_catalog;
-- or
DROP CATALOG iceberg_hadoop_catalog;
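
When a cleanup script may run more than once, the drop can be made safe to repeat. The sketch below relies on the IF EXISTS clause of open-source Flink SQL and assumes VERA Engine supports it as well. In open-source Flink, dropping a catalog removes only its registration and leaves the Iceberg data files in S3 untouched; verify that behavior in your environment before relying on it.

```sql
-- Drop the catalog only if it is registered, so re-running does not fail.
DROP CATALOG IF EXISTS iceberg_hadoop_catalog;

-- Confirm that the catalog no longer appears in the list.
SHOW CATALOGS;
```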
