Session Clusters
Session clusters are long-lived Apache Flink® clusters that can be used to execute multiple applications simultaneously or run short-lived, interactive jobs on demand. It is possible to execute Deployments on session clusters by using session mode.
Limitations
Support for session clusters currently has some limitations compared with Deployments:
SSL/TLS: Auto-provisioned SSL/TLS for Flink intra-cluster and external communication is not supported. SSL/TLS has to be configured manually.
Autopilot: Autoscaling is not supported for session clusters and limited to Deployments running in session mode.
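Because SSL/TLS has to be set up by hand, one possible starting point is to set Flink's own SSL options under spec.flinkConfiguration. The fragment below is only a sketch under assumptions: the option names are standard Flink configuration keys, but the keystore and truststore paths and the passwords are hypothetical placeholders. The files must also be made available inside the Flink containers by means not shown here.

```yaml
spec:
  flinkConfiguration:
    # Encrypt Flink-internal (intra-cluster) communication.
    security.ssl.internal.enabled: "true"
    security.ssl.internal.keystore: /path/to/internal.keystore        # hypothetical path
    security.ssl.internal.keystore-password: "CHANGE_ME"              # placeholder
    security.ssl.internal.key-password: "CHANGE_ME"                   # placeholder
    security.ssl.internal.truststore: /path/to/internal.truststore    # hypothetical path
    security.ssl.internal.truststore-password: "CHANGE_ME"            # placeholder
    # Encrypt external (REST) communication.
    security.ssl.rest.enabled: "true"
    security.ssl.rest.keystore: /path/to/rest.keystore                # hypothetical path
    security.ssl.rest.keystore-password: "CHANGE_ME"                  # placeholder
    security.ssl.rest.key-password: "CHANGE_ME"                       # placeholder
```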
Specification
Session clusters are managed via namespaced SessionCluster resources which are configured similarly to Deployments. However, SessionClusters have fewer configurable options than Deployments since this resource only configures the Flink cluster itself and not the applications that will run on it.
Desired State
A SessionCluster resource has a desired state specified at spec.state. The desired state can be either:
- RUNNING when the cluster should be provisioned and kept running, or
- STOPPED when the cluster should be torn down, along with all currently running applications
All Deployments running on a session cluster must be terminated before the session cluster can be stopped.
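As a minimal sketch, requesting a shutdown amounts to changing spec.state. The fragment shows only the relevant keys, and the name my-session-cluster is a hypothetical placeholder rather than a value taken from this page.

```yaml
kind: SessionCluster
apiVersion: v1
metadata:
  name: my-session-cluster   # hypothetical name, for illustration only
spec:
  # Previously RUNNING. Terminate all Deployments on the cluster first.
  state: STOPPED
```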
Changing a Running SessionCluster
Only the desired state and the number of TaskManagers of a session cluster may be changed while the cluster is in a non-terminal state. A SessionCluster is in a "terminal state" when its desired state is STOPPED and no operation is in progress on the cluster, such as starting, stopping, or updating.
Scaling down a running session cluster (by reducing the value of spec.numberOfTaskManagers) can cause applications running on the cluster to restart.
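For illustration, a change to a running cluster would touch only the two mutable keys. The fragment below is a sketch that assumes the cluster was previously running with five TaskManagers. All other keys of the spec are assumed to stay unchanged, and how the update is submitted is not covered here.

```yaml
spec:
  state: RUNNING
  # Reduced from 5. Scaling down can cause applications
  # running on the cluster to restart.
  numberOfTaskManagers: 3
```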
Full Example
The following snippet is a complete example of a SessionCluster, including optional keys.
kind: SessionCluster
apiVersion: v1
metadata:
  name:
  labels:
    env: testing
spec:
  state: RUNNING
  deploymentTargetName: default
  flinkVersion: 1.15
  flinkImageRegistry: registry.ververica.com/v2.9
  flinkImageRepository: flink
  flinkImageTag: 1.15.3-scala_2.12
  numberOfTaskManagers: 5
  resources:
    jobmanager:
      cpu: 2
      memory: 1g
    taskmanager:
      cpu: 16
      memory: 32g
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: 32
  logging:
    loggingProfile: default
    log4jLoggers:
      "": INFO
      org.apache.flink.streaming.examples: DEBUG
  kubernetes:
    pods:
      envVars:
        - name: KEY
          value: VALUE