DataGen SQL Connector

Scan Source: Bounded | Scan Source: Unbounded

The DataGen connector creates tables whose rows are produced by data generation rules.

The DataGen connector can work with Computed Column syntax. This allows you to generate records flexibly.

The DataGen connector is built-in.

Attention: Complex types (Array, Map, Row) are not supported. Construct values of these types with computed columns instead.
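
As a rough sketch of that workaround, the table below generates only scalar fields and assembles an array and a map from them in computed columns. The table and column names are hypothetical, and the example assumes the standard Flink SQL value constructors ARRAY[...] and MAP[...] are accepted in computed columns by your Flink version.

```sql
-- Hypothetical table; all names are for illustration only.
CREATE TABLE datagen_complex (
 f_a INT,
 f_b INT,
 f_key STRING,
 -- complex values are built from the generated scalar fields,
 -- not generated by the connector itself
 f_array AS ARRAY[f_a, f_b],
 f_map AS MAP[f_key, f_a]
) WITH (
 'connector' = 'datagen',
 'fields.f_key.length' = '5'
);
```

A Row value could presumably be built the same way from generated scalar fields.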

How to create a DataGen table

Boundedness: reading finishes once data generation for the table's fields is complete, so whether the table is bounded depends on whether its fields are bounded.

For each field, there are two ways to generate data:

  • Random generator: the default generator. You can specify the minimum and maximum values of the random numbers. For char/varchar/string fields, you can specify the length. It is an unbounded generator.
  • Sequence generator: you can specify the start and end values of the sequence. It is a bounded generator; reading ends when the sequence reaches the end value.

The following table uses both generators:

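CREATE TABLE datagen (
 f_sequence INT,
 f_random INT,
 f_random_str STRING,
 -- computed column, evaluated per row rather than generated by the connector
 ts AS localtimestamp,
 WATERMARK FOR ts AS ts
) WITH (
 'connector' = 'datagen',

 -- optional options --

 -- emit 5 rows per second
 'rows-per-second'='5',

 -- bounded sequence from 1 to 1000
 'fields.f_sequence.kind'='sequence',
 'fields.f_sequence.start'='1',
 'fields.f_sequence.end'='1000',

 -- random integers between 1 and 1000
 'fields.f_random.min'='1',
 'fields.f_random.max'='1000',

 -- random strings of length 10
 'fields.f_random_str.length'='10'
)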
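
To make the boundedness rule concrete, here is a minimal sketch of a table that is expected to stop producing rows on its own. It combines a sequence field with a random field; per the rule above, reading should end once the sequence reaches its end value. The table and column names are hypothetical.

```sql
-- Hypothetical table; names are for illustration only.
CREATE TABLE datagen_bounded (
 id BIGINT,
 score INT
) WITH (
 'connector' = 'datagen',

 -- bounded: the sequence runs from 1 to 10 and then ends
 'fields.id.kind' = 'sequence',
 'fields.id.start' = '1',
 'fields.id.end' = '10',

 -- random field, restricted to the range 0..100
 'fields.score.min' = '0',
 'fields.score.max' = '100'
);

-- Expected to finish once the id sequence is exhausted.
SELECT * FROM datagen_bounded;
```

If every field used the random generator instead, the same query would be expected to run until it is cancelled.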

Connector Options

Option           Required  Default                  Type             Description
connector        required  (none)                   String           Specifies which connector to use; must be 'datagen'.
rows-per-second  optional  10000                    Long             Number of rows emitted per second; controls the emit rate.
fields.#.kind    optional  random                   String           Generator for the field '#'. Can be 'sequence' or 'random'.
fields.#.min     optional  (Minimum value of type)  (Type of field)  Minimum value of the random generator. Applies to numeric types.
fields.#.max     optional  (Maximum value of type)  (Type of field)  Maximum value of the random generator. Applies to numeric types.
fields.#.length  optional  100                      Integer          Length of strings produced by the random generator. Applies to char/varchar/string.
fields.#.start   optional  (none)                   (Type of field)  Start value of the sequence generator.
fields.#.end     optional  (none)                   (Type of field)  End value of the sequence generator.
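
For comparison, a minimal sketch that sets only the required option might look like the following. Going by the defaults listed above, every field would use the random generator across the full range of its type, strings would have length 100, and rows would be emitted at 10000 per second without end. The table and column names are hypothetical.

```sql
-- Hypothetical table; only the required option is set,
-- so all other options fall back to their defaults.
CREATE TABLE datagen_defaults (
 id INT,
 name STRING
) WITH (
 'connector' = 'datagen'
);
```

Lowering 'rows-per-second' is a reasonable first adjustment when inspecting the output interactively.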