In this getting started guide, you will install Ververica Platform, integrate it with MinIO for Universal Blob Storage, and deploy your first Apache Flink application using Ververica Platform.
Ververica Platform runs on top of Kubernetes. To get started locally, we recommend using minikube, but any other Kubernetes cluster (version 1.11 or later) will work as well.
Minikube relies on virtualization support from your operating system as well as a hypervisor (e.g. VirtualBox). Please check the official installation guide for details.
Minikube on macOS (Homebrew)
$ brew install kubectl minikube
Minikube on Windows (Chocolatey)
$ choco install kubernetes-cli minikube
Minikube on Linux
Spinning up a Kubernetes Cluster
First, start minikube. The platform (including a small Apache Flink application) requires at least 8 GB of memory and 4 CPUs.
$ minikube start --memory=8G --cpus=4
If this went well, you can continue and check if all system pods are ready.
$ kubectl get pods -n kube-system
Depending on your exact version of minikube, the output should look similar to the following:
NAME                               READY   STATUS    RESTARTS   AGE
coredns-5644d7b6d9-56zhg           1/1     Running   1          2m
coredns-5644d7b6d9-fdnts           1/1     Running   1          2m
etcd-minikube                      1/1     Running   1          2m
kube-addon-manager-minikube        1/1     Running   1          2m
kube-apiserver-minikube            1/1     Running   1          2m
kube-controller-manager-minikube   1/1     Running   1          2m
kube-proxy-9w92r                   1/1     Running   1          2m
kube-scheduler-minikube            1/1     Running   1          2m
storage-provisioner                1/1     Running   2          2m
If all pods are running, you are good to go.
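If you would rather script this check than read the table by eye, the following is a minimal sketch. It assumes the default column layout of kubectl get pods, with STATUS in the third column. The function name all_pods_running is made up for this guide and is not part of kubectl.

```shell
# Hypothetical helper (not part of kubectl): reads "kubectl get pods" output
# on stdin and succeeds only if at least one pod is listed and every pod
# reports STATUS "Running" (assumed to be the third column).
all_pods_running() {
  awk 'NR > 1 { seen = 1; if ($3 != "Running") bad = 1 }
       END { if (seen && !bad) exit 0; exit 1 }'
}

# Usage:
#   kubectl get pods -n kube-system | all_pods_running && echo "cluster is ready"
```

The header row is skipped, and empty output counts as "not ready", so the check also fails when the cluster cannot be reached.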
“Helm helps you manage Kubernetes applications — Helm Charts help you define, install, and upgrade even the most complex Kubernetes application.” - helm.sh
We distribute Ververica Platform as a Helm chart. To install Helm, please follow the instructions in the official installation guide or use one of the one-liners below.
Helm on macOS (Homebrew)
$ brew install helm
Helm on Windows (Chocolatey)
$ choco install kubernetes-helm
Helm on Linux
As before, there is a package available for most distros and package managers. For details check the official installation guide.
Setting Up Tiller
This guide is based on the Ververica Platform playground repository which contains scripts and Helm values files to make for a smooth getting-started experience. Please clone the repository before continuing; all commands below are meant to be executed from the repository root directory.
$ git clone --branch release-2.1 git@github.com:ververica/ververica-platform-playground.git
$ cd ververica-platform-playground
For this playground, you will create two Kubernetes namespaces: vvp will host the control plane of Ververica Platform and other services, while the Apache Flink jobs managed by the platform will run in the vvp-jobs namespace.
You can skip all of the installation steps outlined below by running the setup.sh script included in the playground repository.
Before installing any of the components you need to create the Kubernetes namespaces vvp and vvp-jobs.
$ kubectl create namespace vvp
$ kubectl create namespace vvp-jobs
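Note that kubectl create namespace reports an error if the namespace already exists, which can get in the way when you repeat this guide. A minimal sketch of a repeatable variant is shown below. The helper name ensure_namespace is hypothetical; it only wraps kubectl get namespace and kubectl create namespace.

```shell
# Hypothetical helper: creates a namespace only if it does not exist yet,
# so the step can be run more than once without errors.
ensure_namespace() {
  kubectl get namespace "$1" >/dev/null 2>&1 || kubectl create namespace "$1"
}

# Usage:
#   ensure_namespace vvp
#   ensure_namespace vvp-jobs
```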
Install MinIO with Helm, using the official Helm chart from the
Then, install Ververica Platform using helm. The required configuration differs slightly depending on the product edition you would like to install.
To access the web user interface or the REST API, set up a port forward to the Ververica Platform Kubernetes service:
$ kubectl --namespace vvp port-forward services/vvp-ververica-platform 8080:80
The web interface and API are both now available under localhost:8080. The UI will show that you do not have any Deployments yet.
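The port forward can take a moment to become usable. As a rough sketch, you could poll the forwarded port from a second terminal before opening the UI. The helper name wait_for_http and the default of 30 attempts are assumptions made for this example. It relies only on standard curl options and treats any response below HTTP 400 as success.

```shell
# Hypothetical helper: polls a URL with curl until it answers without an
# HTTP error, sleeping one second between attempts.
#   $1 = URL to poll, $2 = maximum number of attempts (assumed default: 30)
wait_for_http() {
  wfh_url=$1
  wfh_attempts=${2:-30}
  while [ "$wfh_attempts" -gt 0 ]; do
    if curl --silent --fail --output /dev/null "$wfh_url"; then
      return 0
    fi
    wfh_attempts=$((wfh_attempts - 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_for_http http://localhost:8080 && echo "Ververica Platform is reachable"
```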
Deployments are the core resource to manage Apache Flink jobs within Ververica Platform. A Deployment specifies the desired state of an application and its configuration. At the same time, Ververica Platform tracks and reports each Deployment’s status and derives other resources from it. Whenever the Deployment specification is modified, Ververica Platform will ensure the running application will eventually reflect this change.
Before you create your first Deployment you need to create a Deployment Target.
A Deployment Target links a Deployment to the Kubernetes namespace into which your Apache Flink applications will be deployed. In this case, you can use the vvp-jobs namespace that you created earlier.
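To make the link between a Deployment Target and a Kubernetes namespace concrete, here is a sketch of what such a resource could look like when created through the REST API instead of the UI. The helper name deployment_target_manifest, the resource name, the Ververica Platform namespace default, and the endpoint in the usage comment are assumptions for illustration. Check them against the API documentation of your platform version before relying on them.

```shell
# Hypothetical helper: prints a Deployment Target manifest that points at the
# given Kubernetes namespace. The resource name (same as the namespace) and
# the Ververica Platform namespace "default" are assumptions.
deployment_target_manifest() {
  cat <<EOF
kind: DeploymentTarget
apiVersion: v1
metadata:
  name: $1
  namespace: default
spec:
  kubernetes:
    namespace: $1
EOF
}

# Usage (endpoint path is an assumption, verify it for your version):
#   deployment_target_manifest vvp-jobs |
#     curl -X POST -H "Content-Type: application/yaml" --data-binary @- \
#       http://localhost:8080/api/v1/namespaces/default/deployment-targets
```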
Now, you can create your first Deployment.
Ververica Platform will now go ahead and create a highly available Flink cluster, which runs your Flink application in the vvp-jobs namespace. Checkpointing and savepoints, as well as Flink master failover, have been configured automatically by the platform.
Once the Deployment has reached the RUNNING state, you can also check out the Flink UI.
One of the core features of Ververica Platform is application lifecycle management for stateful stream processing applications. As part of this, Ververica Platform takes care of migrating your distributed state consistently when you make changes to your application. For example, you can change your Deployment by changing the parallelism, i.e. rescaling your Flink job.
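As an illustration of such a change, the sketch below builds a partial Deployment specification that only sets the parallelism. The helper name parallelism_patch is hypothetical. The field path spec.template.spec.parallelism and the PATCH endpoint in the usage comment reflect our understanding of the Deployment resource and should be treated as assumptions. The Deployment ID is left as a placeholder.

```shell
# Hypothetical helper: prints a partial Deployment specification that only
# changes the parallelism. Rejects anything that is not a positive integer.
parallelism_patch() {
  case $1 in
    ''|*[!0-9]*) echo "parallelism must be a positive integer" >&2; return 1 ;;
  esac
  [ "$1" -gt 0 ] || { echo "parallelism must be a positive integer" >&2; return 1; }
  cat <<EOF
spec:
  template:
    spec:
      parallelism: $1
EOF
}

# Usage (endpoint path is an assumption; replace <deployment-id> yourself):
#   parallelism_patch 2 |
#     curl -X PATCH -H "Content-Type: application/yaml" --data-binary @- \
#       http://localhost:8080/api/v1/namespaces/default/deployments/<deployment-id>
```

Sending only the changed field keeps the rest of the Deployment specification untouched, which mirrors editing a single value in the UI.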
Under the hood, Ververica Platform now performs an application upgrade according to the configured Upgrade and Restore Strategy. For your Deployment, these default to STATEFUL and LATEST_STATE, respectively. With these settings, Ververica Platform triggers a graceful shutdown of your Flink application while taking a consistent snapshot of its state via a savepoint. It then restarts your application from the latest snapshot, which is the one taken during shutdown. You can see a list of all past savepoints and retained checkpoints for this Deployment in the Snapshots tab.
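For reference, the two strategies are part of the Deployment specification. The sketch below prints the corresponding fragment. The helper name strategy_patch is hypothetical, and the accepted values listed in the code reflect our understanding of the platform rather than an authoritative list, so check the Deployment documentation for your version.

```shell
# Hypothetical helper: prints the upgrade and restore strategy section of a
# Deployment specification. The accepted values below are assumptions.
#   $1 = upgrade strategy, $2 = restore strategy
strategy_patch() {
  case $1 in STATEFUL|STATELESS|NONE) ;; *) return 1 ;; esac
  case $2 in LATEST_STATE|LATEST_SAVEPOINT|NONE) ;; *) return 1 ;; esac
  cat <<EOF
spec:
  upgradeStrategy:
    kind: $1
  restoreStrategy:
    kind: $2
EOF
}

# Usage: print the defaults mentioned in this guide
#   strategy_patch STATEFUL LATEST_STATE
```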
Ververica Platform can be integrated with metrics collection and visualization systems to help monitor and debug your Flink applications.
The setup.sh script included in the playground repository accepts a flag, --with-metrics, which additionally installs Prometheus, a metrics collection and storage system, and Grafana, a time series visualization web application, to demonstrate this kind of integration.
This setup uses Global Deployment Defaults to ensure each Flink job is configured to use the built-in Prometheus metrics reporter and that each Kubernetes Pod running Flink gets an annotation that makes it discoverable by the Prometheus server.
After installing the platform with ./setup.sh --with-metrics, run the following command to port-forward Grafana:
$ kubectl --namespace vvp port-forward services/grafana 3000:80
If you already installed the platform without metrics, you must first run
Then, when viewing one of your Deployments in the web UI, click the “Metrics” button to be linked to a sample monitoring dashboard in Grafana for that Deployment. It may take a few minutes for metrics to appear.
To understand this setup, check out the following files:
- values-prometheus.yaml: Configuration for the Prometheus Helm chart. This example uses the default configuration except for disabling components we don’t need.
- values-grafana.yaml: Configuration for the Grafana Helm chart. A datasource and dashboard are preconfigured, and auth is disabled to make for a convenient demonstration.
- values-vvp-with-metrics.yaml: See the globalDeploymentDefaults section for how the Prometheus metrics reporter and Pod annotations are automatically configured for all Deployments, and see ui.linkTemplates.metrics for how the “Metrics” button in the web UI is connected to Grafana.
Now that you have a Ververica Platform instance and your first Deployment up and running, there are multiple areas you can look into to learn more about the platform.
- Have a look at the Advanced tab for configuring Deployments and learn more about Apache Flink Deployments and how to manage their lifecycle
- Take a deep dive into administration and customization, e.g. around logging and metrics, deployment defaults or access control
- Have a look at how to manage artifacts