Kubernetes High-Availability Service
Ververica Platform supports several alternatives for High Availability (HA) services.
- Flink Kubernetes supports High-Availability (HA) Kubernetes clusters out of the box. Ververica Platform with Flink Kubernetes therefore supports Kubernetes HA out of the box.
- Ververica Platform Kubernetes was introduced to enable HA services at a time when Flink Kubernetes did not directly support HA. While Ververica Platform Kubernetes is still available in the platform, it has been deprecated since Ververica Platform 2.10 and will be removed in Ververica Platform 2.12. See the comparison of the two implementations below.
- Other HA options include, for example, HA support via ZooKeeper.
For configuration options, see Flink Configuration.
HA services based on Kubernetes do not require multiple JobManagers because Kubernetes itself restarts the JobManager pod as required. However, running more than one JobManager can shorten recovery time.
Flink Kubernetes vs Ververica Platform Kubernetes
Apache Flink® versions before 1.12 did not support Kubernetes HA out of the box. Instead, HA support was based on ZooKeeper. Ververica therefore developed its own HA-capable Kubernetes service, Ververica Platform Kubernetes, to run HA Flink clusters on Kubernetes.
Ververica Platform 2.8.0 introduced support for Flink Kubernetes with out-of-the-box HA support. Since then, both Flink Kubernetes and Ververica Platform Kubernetes have been available in the platform. Both support HA Flink clusters on Kubernetes, but their implementations differ in how TaskManagers behave, which in turn can affect the JobManager:
- Flink Kubernetes: TaskManagers use Watchers to monitor for changes to ConfigMaps, which in theory puts less pressure on the API server at large scale compared to querying for changes.
- Ververica Platform Kubernetes: TaskManagers periodically query the Kubernetes API server for ConfigMaps. The number of requests therefore scales with the number of TaskManagers and puts added pressure on the API server at large scale, making the JobManager a potential single point of failure.
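The difference in API server load between the two approaches can be illustrated with a rough back-of-the-envelope model. The sketch below is not taken from either implementation. The function names, the poll interval, and the assumption that each TaskManager holds exactly one long-lived watch are all hypothetical, and the model ignores watch reconnects and timeouts.

```python
def polling_requests(task_managers, duration_s, poll_interval_s):
    """Requests sent to the API server if every TaskManager polls once per interval.

    The poll interval is a hypothetical parameter, not a documented default.
    """
    polls_per_task_manager = int(duration_s // poll_interval_s)
    return task_managers * polls_per_task_manager


def watch_load(task_managers, config_map_changes):
    """Return (connections opened, change events delivered) for a watch-based design.

    Assumes one long-lived watch per TaskManager and no reconnects.
    """
    connections = task_managers
    events = task_managers * config_map_changes
    return connections, events


if __name__ == "__main__":
    # Hypothetical scenario: 500 TaskManagers observed for one hour.
    polled = polling_requests(task_managers=500, duration_s=3600, poll_interval_s=5)
    connections, events = watch_load(task_managers=500, config_map_changes=2)
    print(f"polling: {polled} requests")
    print(f"watching: {connections} connections, {events} change events")
```

In this simplified model the polling load grows with both the number of TaskManagers and the observation time, while the watch-based load grows with the number of TaskManagers and the number of actual ConfigMap changes, which are typically rare outside of failover.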
As of Ververica Platform 2.10.0, Ververica Platform Kubernetes is deprecated, and it will be removed in Ververica Platform 2.12. We recommend migrating Flink applications that use Ververica Platform Kubernetes to Flink Kubernetes.