Manage Delta Catalog

The Delta Lake catalog allows VERA Engine to discover Delta tables stored in external metastores and perform metadata-level operations.

Supported Version: VERA Engine 4.3

Background Information

The Delta Lake catalog acts as a wrapper around other Flink catalogs to maintain data hierarchy and persistence. It supports the following catalog types:

| Catalog | Functionality | Persisted |
| --- | --- | --- |
| Delta Catalog | Schema retrieval and management | Yes (supports multiple filesystems like S3) |
| Databricks Unity Catalog | Table hierarchy maintenance | Yes (on Databricks Managed Service) |
| OSS Unity Catalog | Table hierarchy maintenance | Yes (on standalone instances) |
| GenericInMemoryCatalog | Table hierarchy maintenance | No (lives only within a single Flink SQL execution) |
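
For quick local experiments, the Delta Lake catalog can wrap GenericInMemoryCatalog so that no external metastore is required. The following is a minimal sketch; the catalog name delta_dev is illustrative, and the resulting metadata lives only for the duration of the Flink SQL session:

-- Wrap the in-memory catalog; nothing is persisted across sessions.
CREATE CATALOG delta_dev WITH (
  'type' = 'delta-catalog',
  'catalog-type' = 'in-memory'
);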

Features

The Delta Lake catalog enables you to:

  • Discover Delta Lake tables in an external metastore.
  • Expose Delta tables as Flink tables.
  • Perform metadata operations like LIST, GET, ALTER, and DROP.
  • Automatically derive schemas and partition columns from the Delta transaction log.

Prerequisites

  • A storage location for Delta tables (for example, an S3 bucket) accessible from Ververica Cloud.
  • An instance of your chosen metastore (for example, Databricks Unity Catalog or OSS Unity Catalog) for persistent storage.
  • If using Databricks Unity Catalog, a Databricks account and personal access token (PAT).

Create a Delta Catalog

You can create a Delta catalog in the SQL Editor.

SQL Syntax

CREATE CATALOG delta_catalog WITH (
  'type' = 'delta-catalog',
  'catalog-type' = 'unity', -- can be 'unity' or 'in-memory'
  'unity.host' = 'https://<your-host>',
  'unity.catalog.name' = '<your-catalog-name>'
);

Common Options

| Option | Required | Description |
| --- | --- | --- |
| type | Yes | Always set to delta-catalog. |
| catalog-type | Yes | The wrapped catalog type (for example, unity or in-memory). |

Unity Catalog Options

If unity.databricks.token is provided, the catalog uses the Databricks Unity Catalog adapter. Otherwise, it uses the open source (OSS) Unity Catalog adapter.

| Option | Required | Default | Description |
| --- | --- | --- | --- |
| unity.host | Yes | (none) | Unity Catalog host URL. |
| unity.databricks.token | Yes (Databricks only) | (none) | Databricks personal access token (PAT). Omit it to use the OSS adapter. |
| unity.catalog.name | Yes | (none) | Name of the catalog in Unity Catalog. |
| unity.uc.default.schema | No | default | Default schema name. |
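
Putting these options together, a Databricks-backed catalog definition might look like the following sketch; the host, token, and catalog name are placeholders to replace with your own values:

CREATE CATALOG delta_catalog WITH (
  'type' = 'delta-catalog',
  'catalog-type' = 'unity',
  'unity.host' = 'https://<your-host>',
  'unity.databricks.token' = '<your-pat>',      -- omit this option to use the OSS adapter
  'unity.catalog.name' = '<your-catalog-name>',
  'unity.uc.default.schema' = 'default'         -- optional; defaults to 'default'
);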

Use the Delta Catalog

After creating the catalog, you can use standard Flink SQL commands to interact with it.

-- Select the catalog
USE CATALOG delta_catalog;

-- List and use databases
SHOW DATABASES;
CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;

-- List and describe tables
SHOW TABLES;
DESCRIBE my_delta_table;

Metadata Operations

Table Metadata Exposure

For each Delta table, the catalog exposes:

  • Table schema (columns and data types)
  • Partition columns
  • Merged properties from Unity and Delta metadata
  • Table comments
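
A quick way to inspect what the catalog exposes for a given table is the standard Flink SQL DESCRIBE and SHOW CREATE TABLE statements; my_delta_table is a placeholder table name:

-- Print the schema and partition columns derived from the Delta transaction log.
DESCRIBE my_delta_table;

-- Print the full table definition, including merged properties and comments.
SHOW CREATE TABLE my_delta_table;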

Alter Table

You can update table properties and comments via the catalog. Structural changes to columns or partitions must be performed through Delta Lake operations directly.
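
For example, a property update goes through a standard Flink ALTER TABLE statement; the property key below is purely illustrative:

-- Update a table property via the catalog.
-- Column and partition changes are not supported here and must be done in Delta Lake itself.
ALTER TABLE my_delta_table SET ('user.tag' = 'reviewed');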

Drop Table

Dropping a table removes the entry from the catalog but does not delete the underlying Delta data.
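
A minimal sketch; only the catalog entry is removed, while the Delta data and transaction log remain in storage:

-- Remove the table entry from the catalog; the underlying Delta files are untouched.
DROP TABLE my_delta_table;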

Managed vs External Tables

All Delta tables exposed via the catalog are treated as external tables. Ververica Cloud does not manage the data lifecycle and will not implicitly delete Delta data.

Validation Steps (Databricks + S3)

To validate your setup with Databricks and Amazon S3:

  1. Set up S3: Create a private connection in Ververica Cloud to your S3 bucket.
  2. Configure Databricks:
    • Create an external location and storage credentials in Databricks Catalog Explorer.
    • Ensure the IAM role for Databricks has access to the S3 bucket.
  3. Create the Catalog in VVC:
    CREATE CATALOG vv_unity_catalog WITH (
      'type' = 'delta-catalog',
      'catalog-type' = 'unity',
      'unity.host' = 'https://<databricks-workspace-id>.cloud.databricks.com',
      'unity.databricks.token' = '<your-pat>',
      'hadoop.fs.s3a.impl' = 'io.delta.flink.internal.table.fs.HadoopFileSystemAdapter'
    );
  4. Verify Data Flow:
    • Create a table in the new catalog.
    • Run a streaming query to insert data (see the sketch after this list).
    • Verify the records appear in both Databricks and S3.
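
The following sketch illustrates step 4. It assumes the Delta connector's 'connector' = 'delta' and 'table-path' options and uses placeholder bucket, table, and column names; adapt them to your environment:

-- Switch to the catalog created in step 3.
USE CATALOG vv_unity_catalog;
CREATE DATABASE IF NOT EXISTS test_db;
USE test_db;

-- Create a Delta table in the new catalog (schema and path are illustrative).
CREATE TABLE orders_delta (
  order_id BIGINT,
  amount DOUBLE,
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'delta',
  'table-path' = 's3a://<your-bucket>/delta/orders'
);

-- A datagen source used only to produce test records.
CREATE TEMPORARY TABLE orders_gen (
  order_id BIGINT,
  amount DOUBLE,
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5'
);

-- Streaming insert; the records should then appear in both Databricks and the S3 bucket.
INSERT INTO orders_delta SELECT order_id, amount, order_time FROM orders_gen;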

Limits

The Delta Lake catalog does not:

  • Create underlying Delta data files (handled by the connector).
  • Modify table schemas (must happen in Delta Lake).
  • Manage the Delta transaction log lifecycle.