Parquet
The Apache Parquet format allows reading and writing Parquet data.
Example of Use
SQL
CREATE TABLE user_behavior (
  user_id BIGINT,
  item_id BIGINT,
  category_id BIGINT,
  behavior STRING,
  ts TIMESTAMP(3),
  dt STRING
) PARTITIONED BY (dt) WITH (
  'connector' = 'filesystem',
  'path' = '/tmp/user_behavior',
  'format' = 'parquet'
)
Format Options
Data Type Mapping
Currently, the Parquet format type mapping is compatible with Apache Hive, but differs from Apache Spark:
- Timestamp: timestamp types are mapped to int96 regardless of precision.
- Decimal: decimal types are mapped to a fixed-length byte array whose length is determined by the precision.
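To make the two mappings above more concrete, the following Python sketch shows how the physical sizes could be derived. It is an illustration under stated assumptions, not the format's actual implementation: the function names are hypothetical, the decimal length is assumed to be the smallest signed two's-complement width that can hold any unscaled value of the given precision (the Hive-compatible rule), and the int96 layout is assumed to follow the common Hive/Impala convention of 8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number, both little-endian, in UTC.

```python
import struct

# Julian day number of the Unix epoch (1970-01-01).
JULIAN_DAY_OF_EPOCH = 2440588
NANOS_PER_DAY = 86_400 * 1_000_000_000


def min_bytes_for_decimal_precision(precision: int) -> int:
    """Hypothetical helper: smallest byte length for a decimal's unscaled value.

    Assumes a signed two's-complement encoding, so n bytes can hold
    magnitudes up to 2**(8n - 1) - 1, which must cover 10**precision - 1.
    """
    if precision < 1:
        raise ValueError("precision must be at least 1")
    num_bytes = 1
    while 2 ** (8 * num_bytes - 1) < 10 ** precision:
        num_bytes += 1
    return num_bytes


def encode_int96_timestamp(epoch_nanos: int) -> bytes:
    """Hypothetical helper: pack a UTC timestamp into a 12-byte int96 value.

    Assumed layout: 8 bytes nanoseconds-of-day, then 4 bytes Julian day,
    both little-endian. The width is 12 bytes whatever the precision is.
    """
    days, nanos_of_day = divmod(epoch_nanos, NANOS_PER_DAY)
    return struct.pack("<qi", nanos_of_day, days + JULIAN_DAY_OF_EPOCH)


if __name__ == "__main__":
    for p in (9, 18, 38):
        print(f"DECIMAL({p}, s) -> {min_bytes_for_decimal_precision(p)} bytes")
    print(f"TIMESTAMP -> {len(encode_int96_timestamp(0))} bytes")
```

Under these assumptions, the `ts TIMESTAMP(3)` column in the example table would occupy 12 bytes per value, the same as a `TIMESTAMP(9)` column would, while a decimal column grows in steps as its declared precision increases.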
The following table lists the type mapping from Flink type to Parquet type.