Built-in Connectors
Apache Flink provides a variety of built-in connectors to facilitate the integration of Flink with different data sources and sinks (also called destinations). These connectors make it easy to read and write data from/to various systems in a scalable and fault-tolerant manner. In this section, we will introduce some of the most commonly used built-in connectors in Apache Flink.
With the Console Network Detection feature, you can use an IP address or a domain name to check whether the running environment of a fully managed Flink deployment can reach its upstream and downstream systems. See the FAQ section for more information.
Apache Paimon
At its core, Apache Paimon is a dynamic data lake storage layer, streamlined for both streaming and batch data processing. It supports high-throughput data writing while offering low-latency data querying, and it is built for compatibility with Flink-based Ververica Cloud. If you're aiming to set up your data lake storage on Hadoop Distributed File System (HDFS) or Ververica Cloud, Apache Paimon is your go-to solution.
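As a sketch of how this looks in practice, Paimon tables are typically managed through a Paimon catalog declared in Flink SQL. The catalog name, warehouse path, and table schema below are placeholders, not values from this document:

```sql
-- Declare a Paimon catalog backed by a (hypothetical) HDFS warehouse path.
CREATE CATALOG paimon_catalog WITH (
  'type' = 'paimon',
  'warehouse' = 'hdfs:///path/to/warehouse'
);

USE CATALOG paimon_catalog;

-- Tables created in this catalog are stored as Paimon tables and can be
-- written to and queried in both streaming and batch mode.
CREATE TABLE word_counts (
  word STRING,
  cnt  BIGINT,
  PRIMARY KEY (word) NOT ENFORCED
);
```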
Apache Kafka
Apache Kafka is a distributed streaming platform designed for high-throughput, fault-tolerant, and scalable data streaming. Flink's Kafka connector allows you to consume and produce data from and to Kafka topics.
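A minimal Flink SQL declaration of a Kafka-backed table might look like the following; the topic name, broker address, and schema are illustrative placeholders:

```sql
-- A table reading JSON records from a (hypothetical) 'orders' topic.
CREATE TABLE orders (
  order_id STRING,
  amount   DOUBLE,
  ts       TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'broker:9092',
  'properties.group.id' = 'flink-orders',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);
```

The same table can be used as a source (`SELECT ... FROM orders`) or as a sink (`INSERT INTO orders ...`).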
Upsert Kafka SQL Connector
The Upsert Kafka SQL Connector allows Apache Flink to integrate with Apache Kafka for reading and writing data using upsert semantics. This is particularly useful when working with changelog streams or streaming upserts, where each record represents an update or deletion of a previous record based on a primary key.
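The key difference from the plain Kafka connector is that an upsert-kafka table must declare a primary key, which is used as the Kafka message key; a record with a non-null value is an upsert on that key, and a null value is a deletion. A sketch with placeholder names:

```sql
-- Changelog table keyed by user_id; each Kafka record updates or deletes
-- the row for that key.
CREATE TABLE user_balances (
  user_id STRING,
  balance DECIMAL(18, 2),
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'user-balances',
  'properties.bootstrap.servers' = 'broker:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);
```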
Amazon Kinesis Data Streams
Amazon Kinesis Data Streams is a managed, real-time data streaming service provided by Amazon Web Services (AWS). Flink's Kinesis connector enables you to consume and produce data from and to Kinesis data streams.
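In Flink SQL, a Kinesis stream can be exposed as a table roughly like this; the stream name, region, and schema are assumptions for illustration:

```sql
-- A table reading JSON records from a (hypothetical) Kinesis stream.
CREATE TABLE clicks (
  user_id STRING,
  url     STRING,
  ts      TIMESTAMP(3)
) WITH (
  'connector' = 'kinesis',
  'stream' = 'click-stream',
  'aws.region' = 'us-east-1',
  'scan.stream.initpos' = 'LATEST',
  'format' = 'json'
);
```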
DataGen
The DataGen connector in Apache Flink allows you to create tables with in-memory data generation, which is particularly useful for developing and testing queries locally without the need to access external systems such as Kafka. DataGen tables can include computed column syntax for flexible record generation.
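For example, a DataGen table can combine bounded sequence fields, bounded random fields, and a computed column. The schema and bounds below are illustrative:

```sql
-- In-memory source: 10 generated rows per second, no external system needed.
CREATE TABLE sensor_readings (
  sensor_id   INT,
  temperature DOUBLE,
  ts AS LOCALTIMESTAMP  -- computed column, evaluated for each generated row
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10',
  'fields.sensor_id.kind' = 'sequence',   -- 1, 2, 3, ... up to 1000
  'fields.sensor_id.start' = '1',
  'fields.sensor_id.end' = '1000',
  'fields.temperature.min' = '-10.0',     -- random values in [-10.0, 40.0]
  'fields.temperature.max' = '40.0'
);
```

Because the `sensor_id` sequence is bounded, this source finishes after 1000 rows, which is convenient for deterministic local tests.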
Faker
The Faker connector leverages the popular Java Faker library to generate random data based on predefined patterns. This allows you to create tables with data that closely resembles real-world data, enabling you to develop and test your Flink applications more effectively.
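Fields in a Faker table are driven by Java Faker expressions. A sketch with placeholder column names (note that Faker is a community-maintained connector, not part of the core Flink distribution):

```sql
-- Each column is populated from a Java Faker expression.
CREATE TABLE customers (
  name  STRING,
  city  STRING,
  email STRING
) WITH (
  'connector' = 'faker',
  'fields.name.expression'  = '#{Name.fullName}',
  'fields.city.expression'  = '#{Address.city}',
  'fields.email.expression' = '#{Internet.emailAddress}'
);
```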
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine built on top of Apache Lucene. Flink's Elasticsearch connector enables you to write data to Elasticsearch indices and perform real-time search and analytics operations on the stored data.
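A sink table for Elasticsearch 7 might be declared as follows; the host, index name, and schema are placeholders. Declaring a primary key makes the sink operate in upsert mode, using the key as the document id:

```sql
-- Elasticsearch 7 sink; rows are indexed into 'user-events' documents.
CREATE TABLE es_sink (
  user_id STRING,
  event   STRING,
  ts      TIMESTAMP(3),
  PRIMARY KEY (user_id) NOT ENFORCED  -- becomes the document id (upsert mode)
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = 'http://localhost:9200',
  'index' = 'user-events'
);
```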
MySQL & MySQL CDC
Apache Flink provides built-in connectors for MySQL that enable both batch processing and real-time change data capture (CDC) from MySQL databases. You can read and write MySQL tables through Flink's JDBC connector, and capture row-level changes as they occur using the MySQL CDC connector.
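A CDC source table reading the binlog of a MySQL table can be sketched like this; all connection details and names below are placeholders, and the connector comes from the separate flink-cdc-connectors project:

```sql
-- Streams inserts, updates, and deletes from shop.orders as a changelog.
CREATE TABLE mysql_orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'mysql-host',
  'port' = '3306',
  'username' = 'flink',
  'password' = '***',       -- placeholder credential
  'database-name' = 'shop',
  'table-name' = 'orders'
);
```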
PostgreSQL
While Apache Flink does not provide a dedicated built-in connector for PostgreSQL, you can still integrate Flink with PostgreSQL using the JDBC connector or the Change Data Capture (CDC) approach.
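Via JDBC, a PostgreSQL table can be declared much like any other JDBC source or sink, provided the PostgreSQL JDBC driver is on the classpath. The URL, table, and credentials below are placeholders:

```sql
-- JDBC table backed by a (hypothetical) PostgreSQL database.
CREATE TABLE pg_orders (
  order_id BIGINT,
  amount   DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:postgresql://pg-host:5432/shop',
  'table-name' = 'orders',
  'username' = 'flink',
  'password' = '***'        -- placeholder credential
);
```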
Redis
Redis is an open-source, in-memory data structure store that can be used as a database, cache, and message broker. Flink's Redis connector provides seamless integration with Redis, enabling you to read and write data from/to Redis data structures.
Most of the documentation about built-in connectors comes from the official Apache Flink® documentation.
Refer to the Credits page for more information.