Flink Configuration
Flink configuration options provided on the SessionCluster resource are applied at the Flink cluster level. This page describes how the Flink configuration is applied to your session cluster and highlights important configuration options.
Overview
The Flink configuration is specified as part of the SessionCluster spec.
kind: SessionCluster
spec:
  flinkConfiguration:
    key: value
Please consult the official Flink documentation for a listing of available configuration options.
Environment Variables
You can reference environment variables inside flinkConfiguration through shell format strings:
kind: SessionCluster
spec:
  flinkConfiguration:
    s3.access-key: ${S3_ACCESS_KEY}
This allows you to store sensitive information such as passwords, keys and tokens as Kubernetes secrets and reference them by setting up an environment variable via Kubernetes Pod Templates or a custom Flink Docker image.
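As a rough sketch of the Pod Template route, the snippet below exposes a Kubernetes secret as the S3_ACCESS_KEY environment variable. It assumes that the SessionCluster spec accepts a kubernetes.pods.envVars list with Kubernetes-style valueFrom.secretKeyRef entries. Check the Kubernetes Pod Templates documentation for the exact field names. The secret name s3-credentials and its key accessKey are hypothetical.

```yaml
kind: SessionCluster
spec:
  kubernetes:
    pods:
      # Assumed field layout; verify against the Pod Templates documentation.
      envVars:
        - name: S3_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: s3-credentials   # hypothetical Kubernetes secret
              key: accessKey         # hypothetical key within that secret
  flinkConfiguration:
    # Resolved from the environment variable defined above.
    s3.access-key: ${S3_ACCESS_KEY}
```

With this layout the secret value never appears in the SessionCluster resource itself, only the reference to it.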
File Systems
The default Flink Docker images provided by Ververica Platform include FileSystem implementations for popular blob storage providers.
Blob Storage Provider | Scheme | FileSystem Implementation |
---|---|---|
File System | file | LocalFileSystem |
AWS S3 | s3, s3p | PrestoS3FileSystem |
AWS S3 | s3a | S3AFileSystem |
Microsoft ABS | wasbs | NativeAzureFileSystem |
Apache Hadoop® HDFS | hdfs | HadoopFileSystem |
Microsoft ABS Workload Identity | wiaz | Ververica custom plugin implementation |
If you use Universal Blob Storage, all relevant Flink options, including credentials, are configured at the Flink cluster level. Please consult the official Flink documentation for details about manual configuration of file systems.
Note that additional file system implementations have to be loaded as Flink plugins, which requires a custom Flink Docker image.
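For the manual file system configuration mentioned above (that is, without Universal Blob Storage), a minimal sketch for S3 could look like the following. The option names s3.access-key, s3.secret-key and state.checkpoints.dir are standard Flink options. The bucket name my-bucket and the environment variable names are placeholders chosen for illustration.

```yaml
kind: SessionCluster
spec:
  flinkConfiguration:
    # Credentials are read from environment variables (placeholder names).
    s3.access-key: ${S3_ACCESS_KEY}
    s3.secret-key: ${S3_SECRET_KEY}
    # The s3 scheme maps to PrestoS3FileSystem in the default images.
    # my-bucket is a hypothetical bucket name.
    state.checkpoints.dir: s3://my-bucket/flink/checkpoints
```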
High-Availability (JobManager Failover)
Flink requires an external high-availability service to recover the internal state of the Flink JobManager process (including metadata about checkpoints) after a failure.
By default, Flink JobManager failover is not enabled. For production installations, it is highly recommended to configure Flink with such a service. Please refer to the Flink configuration documentation for details on how to configure high-availability services manually.
Kubernetes
Ververica Platform supports two options for Flink high-availability services on Kubernetes that do not have any additional dependencies: Flink Kubernetes and Ververica Platform Kubernetes. See Kubernetes High-Availability Service for a discussion of both.
Flink Kubernetes
The high-availability service is enabled via the following configuration:
kind: SessionCluster
spec:
  flinkConfiguration:
    high-availability: kubernetes
If Universal Blob Storage is not configured, you must additionally provide the high-availability.storageDir configuration.
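A configuration without Universal Blob Storage might therefore look like the sketch below. The storage location s3://my-bucket/flink/ha is a hypothetical path. Any URI whose scheme is backed by one of the file system implementations listed earlier should work in its place.

```yaml
kind: SessionCluster
spec:
  flinkConfiguration:
    high-availability: kubernetes
    # Required when Universal Blob Storage is not configured.
    # Hypothetical bucket and path; replace with your own durable storage.
    high-availability.storageDir: s3://my-bucket/flink/ha
```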