Docs Home
Viewing docs for
Self-ManagedNot available for BYOC

Catalogs & Databases

On this page

Catalogs are used to store all metadata about database objects, such as databases, tables, table attributes, functions, and views. The catalog metadata is accessed when a SQL query is parsed, validated, and optimized. Only database objects which are registered in a catalog can be referenced in SQL queries.

A catalog object can be addressed with a fully or a partially specified name. The fully-specified name of a table is <catalog-name>`.`<database-name>`.`<table-name>. For a partially-specified name the prefix of catalog and/or database name can be omitted if those are currently selected. Catalog, database, table, or function names that do not comply with the SQL naming specification need to be quoted with the backtick (` ) character.

Built-In Catalog

Every namespace has its own built-in VVP Catalog (#vvp-catalog) named vvp with a default database named default. The default catalog of a namespace cannot be dropped.

Packaged Catalogs

Ververica Platform packages the following catalogs:

  • VVP Catalog (#vvp-catalog)
  • Apache Hive® Catalog (#apache-hive-catalog)
  • JDBC Catalog (#jdbc-catalog)

Support for additional packaged catalogs, like Confluent® schema registry, is planned in an upcoming release.

VVP Catalog

VVP Catalog is a catalog implementation that is integrated with Ververica Platform. It does not require additional setup steps and does not have any additional dependencies. VVP Catalog stores its metadata in Ververica Platform's persistence layer.

A VVP Catalog mycatalog with a default database mydb is created with the following DDL statement:

SQL
1CREATE CATALOG mycatalog
2WITH (
3  'type' = 'vvp',             -- mandatory
4  'defaultDatabase' = 'main'  -- mandatory
5);

VVP Catalog ensures that the configured default database is always present. The default database can be deleted but is automatically re-created as soon as the catalog is accessed again.

The following database objects are currently supported by VVP Catalog:

  • Databases

See Database DDL (#database-ddl) to learn how to create, alter, and drop databases.

  • Tables See Tables & Views to learn how to create, alter, and drop tables.
  • User-Defined Functions See Functions to learn how to implement and manage user-defined functions.

Not supported features at this point are:

  • Table Partitions
  • Table, Column, and Partition Statistics

We will add these features in future releases depending on user feedback.

JDBC Catalog

With the JDBC catalog you can connect to a remote database and expose its tables in Apache Flink® SQL.

`sql showLineNumbers CREATE CATALOG pg_catalog WITH ( 'type' = 'jdbc', 'base-url' = 'jdbc:postgresql://…', 'default-database' = '…', 'username' = '…', 'password' = '…' ) `

See the documentation for more information on the JDBC catalog and available options.

Manage Packaged Catalogs

Packaged catalogs consist of

  • A YAML metadata file providing Ververica Platform with structured information about the catalog.
  • The JAR file(s) including the catalog, its factory, and all required dependencies.

These files are located in the Ververica Platform Gateway docker image in a dedicated directory for each catalog under /vvp/sql/opt/catalogs.

You can add your own packaged catalog, making it available in all Ververica Platform namespaces, by patching the Docker image. Simply add a new folder for the catalog under the directory given above and include the metadata file as well as all relevant JAR files.

Similarly, you can remove or modify packaged catalogs by removing over replacing the contents of its folder.

Catalog Metadata File

The metadata file of a catalog must be named catalog-meta.yaml and has the following schema:

YAML
1catalog:
2  type:             # Factory identifier of the catalog, e.g. "jdbc". This must be the same as the folder name.
3  packaged:         # This must be set to true for packaged catalogs.
4  readOnly:         # If true, the catalog will be read-only and all write operations will be blocked.
5  properties:       # List of configuration options.
6    - key:          # The key of the configuration option, e.g. "type".
7      required:     # Whether this is a required configuration option.
8      description:  # Description of the configuration option.
9      defaultValue: # Default value if this option is not specified.
10    -

Patching the Gateway Docker Image

The following example shows the Dockerfile to add a new catalog called my-catalog to the VVP Gateway Docker image.

TEXT
1FROM registry.ververica.com/v2.10/vvp-gateway:2.10.3
2COPY catalog-meta.yaml /vvp/sql/opt/catalogs/my-catalog/catalog-meta.yaml
3COPY my-catalog-0.1.jar /vvp/sql/opt/catalogs/my-catalog/my-catalog-0.1.jar

Custom Catalogs

If you need to use a catalog for which Ververica Platform does not provide a packaged catalog, you can register a custom catalog. Custom catalogs are namespaced resources, and unlike packaged catalogs can only be used in namespaces in which they have been registered.

A custom catalog is defined by one or more JAR files which include all required dependencies. These files must either be located in the Universal Blob Storage or at HTTP locations which are accessible by Ververica Platform.

Every catalog has a unique identifier defined by the catalog's implementation of the factory class (see org.apache.flink.table.factories.CatalogFactory). This identifier is referenced in the WITH clause of a CREATE CATALOG statement using the type property.

Since catalog identifiers must be unique, it is not possible to add a custom catalog if there exists another catalog (packaged or custom) with the same identifier already in the same namespace.

You can create, update and delete custom catalogs using the corresponding REST endpoints or through the Ververica Platform UI. The UI supports uploading JAR files to Universal Blob Storage and extracting available catalog implementations from the uploaded files.

Apache Hive® Catalog

Apache Flink® Hive Catalog imports table metadata directly from your Apache Hive® Metastore. Once configured, you can read from and write into Hive tables with Flink SQL. The following section describes how a Hive catalog can be added after providing it as a custom catalog in Ververica Platform.

In order to create a Hive Catalog, you first need to make your hive-site.xml available to Ververica Platform by mounting it into the Ververica Platform containers. For example, you can store your hive-site.xml in a Kubernetes ConfigMap,

YAML
1apiVersion: v1
2kind: ConfigMap
3metadata:
4  name: hive-metastore-config-client
5  namespace: vvp
6  labels:
7    app.kubernetes.io/name: vvp
8data:
9  hive-site.xml: |
10    <?xml version="1.0"?>
11    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
12    <configuration>
13      <property>
14        <name>hive.metastore.uris</name>
15        <value>thrift://hive-metastore.hive.svc:9083</value>
16      </property>
17      ...
18    </configuration>

that you then mount into Ververica Platform via the corresponding Helm chart values.

YAML
1# values.yaml
2volumeMounts:
3- name: hive-site
4  mountPath: /etc/hive
5
6volumes:
7  - name: hive-site
8    configMap:
9      name: hive-metastore-config-client

Afterwards, you can create a Hive Catalog via the CREATE CATALOG DDL statement.

SQL
1CREATE CATALOG hive_metastore
2WITH (
3  'type' = 'hive',                  -- mandatory
4  'hive-conf-dir' = '/etc/hive',    -- mandatory
5  'hive-version' = '3.1.2'          -- mandatory
6)

Ververica Platform supports Amazon S3 (via Hadoop's S3 filesystem implementation s3a) as well as Apache Hadoop® HDFS as Hive storage layer out-of-the-box. If you are using a different storage layer, please reach out.

Limitations

  • Hive built-in functions and user-defined functions stored in Hive are not supported by Ververica Platform yet.
  • Hive's SQL dialect is not supported.

Catalog DDL

Catalogs can be registered to a namespace. Once a catalog is registered, its tables can be accessed by SQL queries.

CREATE CATALOG

Registers a new catalog to the namespace.

SQL
1CREATE CATALOG catalog_name
2WITH (key1=val1, key2=val2, ...)

The name of the new catalog must comply with Ververica Platform's resource name rules. For example, capital letters and the underscore (_) character are not allowed in catalog names.

The type of the catalog must be specified using the 'type' property. For example, a new VVP Catalog named mycatalog with a default database called main is created with the following statement:

SQL
1CREATE CATALOG mycatalog
2WITH (
3  'type' = 'vvp',
4  'defaultDatabase' = 'main');

See Packaged Catalogs (#packaged-catalogs) for a list of catalog types which come bundled with Ververica Platform.

DROP CATALOG

Unregisters the catalog from namespace. The metadata of the catalog (databases, tables, views, functions, etc.) is not deleted.

SQL
1DROP CATALOG [IF EXISTS] catalog_name

Database DDL

Databases are used to organize tables, views, and functions in a catalog. You can create, alter, or drop databases.

DDL commands normally run live against the target database. In an environment where Auto-DDL is disallowed, for example when deploying to a production environment, disable automatic database updates and configure Ververica Platform to generate an offline changeset from which to create or update the database. See Configuring Offline DDL Updates (#create-database), below.

CREATE DATABASE

Creates a new database in a catalog.

SQL
1CREATE DATABASE [IF NOT EXISTS] [catalog_name.]db_name
2[COMMENT database_comment]
3[WITH (key1=val1, key2=val2, ...)]

ALTER DATABASE

Sets or overrides one ore more properties of an existing database.

SQL
1ALTER DATABASE [catalog_name.]db_name SET (key1=val1, key2=val2, ...)

DROP DATABASE

Drops a database from a catalog.

SQL
1DROP DATABASE [IF EXISTS] [catalog_name.]db_name [ (RESTRICT | CASCADE) ]

RESTRICT Attempting to drop a non-empty database, i.e., a database that contains tables or functions, will fail. This is the default behavior.

CASCADE All tables and functions of a database are dropped before the database itself is dropped.

Was this helpful?