CSV
This article introduces you to the usage and type mapping of the CSV format.
Background information
The CSV format allows reading and writing CSV data based on the CSV structure. Currently, the CSV structure is derived based on the table structure.
Example of use
An example of creating a table using the Kafka connector and the CSV format is as follows.
CREATE TABLE user_behavior (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'csv',
  'csv.ignore-parse-errors' = 'true',
  'csv.allow-comments' = 'true'
)
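Once defined, the table can be queried like any other. A minimal illustrative query against the `user_behavior` table above might be:

```sql
SELECT user_id, item_id, behavior
FROM user_behavior
WHERE behavior = 'buy';
```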
Configuration options
Parameter | Description | Required | Default | Type |
---|---|---|---|---|
format | The format to use. To use the CSV format, set this parameter to csv. | yes | none | String |
csv.field-delimiter | Specifies the field delimiter; only a single character is allowed, and the default is the comma (,). The value can be given as: a backslash escape for special characters, for example \t for the tab character; or a Unicode escape for special characters in plain SQL text, for example 'csv.field-delimiter' = U&'\0001' for the 0x01 character. | no | , | String |
csv.disable-quote-character | The parameter values are as follows: true: disables the quote character around field values; in this case csv.quote-character cannot be set. false (default): field values may be enclosed in quote characters. | no | false | Boolean |
csv.quote-character | Specifies the quote character used to enclose field values; defaults to the double quote ("). | no | " | String |
csv.allow-comments | The parameter values are as follows: true: comment lines, which start with #, are ignored. If you allow comments, also enable csv.ignore-parse-errors so that empty lines are tolerated. false (default): comment lines are not ignored. | no | false | Boolean |
csv.ignore-parse-errors | The parameter values are as follows: true: when a parse error occurs, the affected field is set to null and processing continues. false (default): when a parse error occurs, an exception is thrown and the job fails. | no | false | Boolean |
csv.array-element-delimiter | Specifies the string separating array and row elements, defaults to a semicolon (;). | no | ; | String |
csv.escape-character | Specifies the escape character, disabled by default. | no | none | String |
csv.null-literal | Specifies the string to convert null values to, disabled by default. | no | none | String |
csv.write-bigdecimal-in-scientific-notation | The parameter values are as follows: true (default): DECIMAL (BigDecimal) values are written in scientific notation, for example 100000 is written as 1E+5. false: DECIMAL values are written as-is, for example 100000 remains 100000. | no | true | Boolean |
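To illustrate csv.field-delimiter and csv.null-literal, a hypothetical filesystem-backed table using a tab delimiter could be declared as follows (the table name and path are placeholders):

```sql
CREATE TABLE tab_separated (
  user_id BIGINT,
  behavior STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/user_behavior',
  'format' = 'csv',
  -- '\t' specifies the tab character via a backslash escape
  'csv.field-delimiter' = '\t',
  -- null values are written as, and read from, the literal string 'n/a'
  'csv.null-literal' = 'n/a'
);
```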
Type mapping
Currently the CSV schema is derived from the table schema. Flink uses the Jackson databind API to parse and generate CSV strings. The mapping between Flink SQL data types and CSV types is as follows.
Flink SQL type | CSV type |
---|---|
CHAR/VARCHAR/STRING | string |
BOOLEAN | boolean |
BINARY / VARBINARY | string with encoding: base64 |
DECIMAL | number |
TINYINT | number |
SMALLINT | number |
INT | number |
BIGINT | number |
FLOAT | number |
DOUBLE | number |
DATE | string with format: date |
TIME | string with format: time |
TIMESTAMP | string with format: date-time |
INTERVAL | number |
ARRAY | array |
ROW | object |
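The mapping above can be illustrated outside of Flink. The following sketch (plain Python, not Flink code; the helper function is made up for illustration) shows how values of a few Flink SQL types would be rendered as CSV fields according to the table:

```python
import base64
from datetime import datetime

# Illustration only: render values the way the Flink CSV type mapping
# describes (BINARY as base64, BOOLEAN as a boolean literal,
# TIMESTAMP as an ISO-8601 date-time string, numbers as-is).
def to_csv_field(value):
    if isinstance(value, bytes):
        # BINARY / VARBINARY -> string with base64 encoding
        return base64.b64encode(value).decode("ascii")
    if isinstance(value, bool):
        # BOOLEAN -> boolean literal (checked before int, since bool is an int subclass)
        return "true" if value else "false"
    if isinstance(value, datetime):
        # TIMESTAMP -> string with date-time format
        return value.isoformat(sep="T", timespec="milliseconds")
    # numeric types (TINYINT .. DOUBLE, DECIMAL) -> number
    return str(value)

row = [12345, True, b"\x01\x02", datetime(2023, 1, 1, 12, 0, 0)]
print(",".join(to_csv_field(v) for v in row))
# -> 12345,true,AQI=,2023-01-01T12:00:00.000
```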
Usage notes
Writing files in CSV format to S3 object storage is currently not supported. For details, see FLINK-30635.
This page is derived from the official Apache Flink® documentation.
Refer to the Credits page for more information.