DataGen
Background Information
Datagen is a connector used mainly for debugging. It periodically generates random data matching the column types declared in the Datagen source table. If you need test data to quickly verify business logic during development or testing, you can use the Datagen connector to generate it. Datagen source tables also support computed column syntax, which makes data generation more flexible.
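As a minimal sketch of the computed column support mentioned above, the table below mixes generated columns with columns derived from them. The table name and column names are hypothetical, and the 'rows-per-second' option is assumed to behave as it does in the Apache Flink DataGen connector; check the WITH parameters of your own version before relying on it.

```sql
-- Hypothetical table for illustration; names are not from the original page.
CREATE TABLE datagen_computed_demo (
  id BIGINT,                      -- generated randomly by the connector
  score INT,                      -- generated randomly by the connector
  score_doubled AS score * 2,     -- computed column derived from a generated column
  proc_time AS PROCTIME()         -- computed column exposing processing time
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5'         -- assumed option: emit roughly 5 rows per second
);
```

Only the physical columns (id and score here) are produced by the generator. The computed columns are evaluated from each generated row, so they need no generator settings of their own.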
The Datagen connector supports the following:
Prerequisite
None.
Syntax
SQL
CREATE TABLE datagen_source (
  name VARCHAR,
  score BIGINT
) WITH (
  'connector' = 'datagen'
);
WITH Parameter
Unique to Source
Generators
Datagen currently provides two generators for producing data.
- Random Generator: Generates random values. You can specify minimum and maximum values for the generated data.
- Sequence Generator: Generates ordered values within a given range and stops when the sequence reaches its end value, so a table that uses a sequence generator is bounded. You can specify start and end values for the sequence.
The supported generators for each type are as follows:
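To illustrate the two generators described in the list above, here is a hedged sketch that configures one column with each. The table and column names are hypothetical, and the 'fields.<column>.*' option keys are assumed to match those of the Apache Flink DataGen connector, since the parameter table is not reproduced on this page.

```sql
-- Hypothetical table for illustration; names are not from the original page.
CREATE TABLE datagen_generator_demo (
  id BIGINT,
  score INT,
  name STRING
) WITH (
  'connector' = 'datagen',
  -- Sequence generator: id runs from 1 to 100, then the source finishes.
  'fields.id.kind' = 'sequence',
  'fields.id.start' = '1',
  'fields.id.end' = '100',
  -- Random generator with explicit bounds for a numeric column.
  'fields.score.kind' = 'random',
  'fields.score.min' = '0',
  'fields.score.max' = '100',
  -- Random generator for a string column: assumed option for string length.
  'fields.name.length' = '8'
);
```

Because the id column uses the sequence generator, this table is bounded: under these assumptions it should produce 100 rows and then end, even though the other columns use the random generator.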
Example of Use
Source example
Datagen is often used with the LIKE clause to simulate a table:
SQL
CREATE TABLE Orders (
  order_number BIGINT,
  price DECIMAL(32,2),
  buyer ROW<first_name STRING, last_name STRING>,
  order_time TIMESTAMP(3)
) WITH (...);

-- create a bounded mock table
CREATE TEMPORARY TABLE GenOrders
WITH (
  'connector' = 'datagen',
  'number-of-rows' = '10'
)
LIKE Orders (EXCLUDING ALL);
Note
This page is derived from the official Apache Flink® documentation.
Refer to the Credits page for more information.