Apache Kafka SQL Connector

This documentation is for an older version of Apache Flink. We recommend using the latest stable version.

Scan Source: Unbounded
Sink: Streaming Append Mode

Dependencies
How to create a Kafka table
Available Metadata
Connector Options
Features
Data Type Mapping

The Kafka connector allows for reading data from and writing data into Kafka topics.

Dependencies

To use the Kafka connector, the following dependencies are required, both for projects using a build automation tool (such as Maven or SBT) and for the SQL Client with SQL JAR bundles.

Kafka version	Maven dependency	SQL Client JAR
universal	`<dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-connector-kafka_2.11</artifactId> <version>1.12.7</version> </dependency>`	Download

The Kafka connectors are not currently part of the binary distribution. See how to link with them for cluster execution here.

How to create a Kafka table

The example below shows how to create a Kafka table:

CREATE TABLE KafkaTable (
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING,
  `ts` TIMESTAMP(3) METADATA FROM 'timestamp'
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'csv'
)

Available Metadata

The following connector metadata can be accessed as metadata columns in a table definition.

The R/W column defines whether a metadata field is readable (R) and/or writable (W). Read-only columns must be declared VIRTUAL to exclude them during an INSERT INTO operation.

Key	Data Type	Description	R/W
`topic`	`STRING NOT NULL`	Topic name of the Kafka record.	`R`
`partition`	`INT NOT NULL`	Partition ID of the Kafka record.	`R`
`headers`	`MAP<STRING, BYTES> NOT NULL`	Headers of the Kafka record as a map of raw bytes.	`R/W`
`leader-epoch`	`INT NULL`	Leader epoch of the Kafka record if available.	`R`
`offset`	`BIGINT NOT NULL`	Offset of the Kafka record in the partition.	`R`
`timestamp`	`TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL`	Timestamp of the Kafka record.	`R/W`
`timestamp-type`	`STRING NOT NULL`	Timestamp type of the Kafka record. Either "NoTimestampType", "CreateTime" (also set when writing metadata), or "LogAppendTime".	`R`

The extended CREATE TABLE example demonstrates the syntax for exposing these metadata fields:

CREATE TABLE KafkaTable (
  `event_time` TIMESTAMP(3) METADATA FROM 'timestamp',
  `partition` BIGINT METADATA VIRTUAL,
  `offset` BIGINT METADATA VIRTUAL,
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'csv'
);
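
As a brief illustration of the R/W distinction, the sketch below reads the virtual columns and writes the persisted one. It assumes the KafkaTable definition above and a hypothetical table `Clicks` with columns (click_time TIMESTAMP(3), user_id BIGINT, item_id BIGINT, behavior STRING); the queries and `Clicks` are illustrative and not part of the original documentation.

```sql
-- Reading: metadata columns are selected like any other column.
-- `partition` and `offset` are reserved keywords, so they are quoted.
SELECT `partition`, `offset`, event_time, user_id, behavior
FROM KafkaTable;

-- Writing: VIRTUAL columns (`partition`, `offset`) are excluded from the
-- sink schema, so the query supplies only event_time (written to the Kafka
-- record timestamp) followed by the physical columns.
-- `Clicks` is a hypothetical table used only for illustration.
INSERT INTO KafkaTable
SELECT click_time, user_id, item_id, behavior
FROM Clicks;
```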

Format Metadata

The connector is able to expose metadata of the value format for reading. Format metadata keys are prefixed with 'value.'.

The following example shows how to access both Kafka and Debezium metadata fields:

CREATE TABLE KafkaTable (
  `event_time` TIMESTAMP(3) METADATA FROM 'value.source.timestamp' VIRTUAL,  -- from Debezium format
  `origin_table` STRING METADATA FROM 'value.source.table' VIRTUAL, -- from Debezium format
  `partition_id` BIGINT METADATA FROM 'partition' VIRTUAL,  -- from Kafka connector
  `offset` BIGINT METADATA VIRTUAL,  -- from Kafka connector
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'value.format' = 'debezium-json'
);

Connector Options

Option	Required	Default	Type	Description
connector	required	(none)	String	Specify what connector to use, for Kafka use: `'kafka'`.
topic	required for sink, optional for source (use 'topic-pattern' instead if not set)	(none)	String	Topic name(s) to read data from when the table is used as a source. A source also accepts a list of topics separated by semicolons, like `'topic-1;topic-2'`. Note that only one of "topic-pattern" and "topic" can be specified for sources. When the table is used as a sink, this is the name of the topic to write data to. Topic lists are not supported for sinks.
topic-pattern	optional	(none)	String	The regular expression for a pattern of topic names to read from. All topics with names that match the specified regular expression will be subscribed to by the consumer when the job starts running. Note that only one of "topic-pattern" and "topic" can be specified for sources.
properties.bootstrap.servers	required	(none)	String	Comma separated list of Kafka brokers.
properties.group.id	required by source	(none)	String	The id of the consumer group for Kafka source, optional for Kafka sink.
properties.*	optional	(none)	String	Sets and passes arbitrary Kafka configurations. Suffix names must match the configuration keys defined in the Kafka Configuration documentation. Flink removes the "properties." key prefix and passes the transformed keys and values to the underlying KafkaClient. For example, you can disable automatic topic creation via `'properties.allow.auto.create.topics' = 'false'`. Some configurations cannot be set because Flink overrides them, e.g. `'key.deserializer'` and `'value.deserializer'`.
format	required	(none)	String	The format used to deserialize and serialize the value part of Kafka messages. Please refer to the formats page for more details and more format options. Note: Either this option or the `'value.format'` option is required.
key.format	optional	(none)	String	The format used to deserialize and serialize the key part of Kafka messages. Please refer to the formats page for more details and more format options. Note: If a key format is defined, the `'key.fields'` option is required as well. Otherwise the Kafka records will have an empty key.
key.fields	optional	[]	List<String>	Defines an explicit list of physical columns from the table schema that configure the data type for the key format. By default, this list is empty and thus a key is undefined. The list should look like `'field1;field2'`.
key.fields-prefix	optional	(none)	String	Defines a custom prefix for all fields of the key format to avoid name clashes with fields of the value format. By default, the prefix is empty. If a custom prefix is defined, both the table schema and `'key.fields'` will work with prefixed names. When constructing the data type of the key format, the prefix will be removed and the non-prefixed names will be used within the key format. Please note that this option requires that `'value.fields-include'` must be set to `'EXCEPT_KEY'`.
value.format	required	(none)	String	The format used to deserialize and serialize the value part of Kafka messages. Please refer to the formats page for more details and more format options. Note: Either this option or the `'format'` option is required.
value.fields-include	optional	ALL	Enum Possible values: [ALL, EXCEPT_KEY]	Defines a strategy for how to deal with key columns in the data type of the value format. By default, `'ALL'` physical columns of the table schema will be included in the value format, which means that key columns appear in the data type for both the key and value format.
scan.startup.mode	optional	group-offsets	String	Startup mode for Kafka consumer, valid values are `'earliest-offset'`, `'latest-offset'`, `'group-offsets'`, `'timestamp'` and `'specific-offsets'`. See the following Start Reading Position for more details.
scan.startup.specific-offsets	optional	(none)	String	Specify offsets for each partition in case of `'specific-offsets'` startup mode, e.g. `'partition:0,offset:42;partition:1,offset:300'`.
scan.startup.timestamp-millis	optional	(none)	Long	Start from the specified epoch timestamp (milliseconds) used in case of `'timestamp'` startup mode.
scan.topic-partition-discovery.interval	optional	(none)	Duration	Interval for consumer to discover dynamically created Kafka topics and partitions periodically.
sink.partitioner	optional	'default'	String	Output partitioning from Flink's partitions into Kafka's partitions. Valid values are: `default`: use the Kafka default partitioner to partition records; `fixed`: each Flink partition ends up in at most one Kafka partition; `round-robin`: a Flink partition is distributed to Kafka partitions in a sticky round-robin fashion, which only works when record keys are not specified; or a custom `FlinkKafkaPartitioner` subclass, e.g. `'org.mycompany.MyPartitioner'`. See the following Sink Partitioning for more details.
sink.semantic	optional	at-least-once	String	Defines the delivery semantic for the Kafka sink. Valid values are `'at-least-once'`, `'exactly-once'` and `'none'`. See Consistency guarantees for more details.
sink.parallelism	optional	(none)	Integer	Defines the parallelism of the Kafka sink operator. By default, the parallelism is determined by the framework using the same parallelism of the upstream chained operator.
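
As a small sketch of the `properties.*` pass-through described in the table, the definition below disables automatic topic creation on the client. The table name, topic, and columns are hypothetical; only the option keys come from the table above.

```sql
-- Hypothetical source table; names and columns are for illustration only.
CREATE TABLE OrdersSource (
  `order_id` BIGINT,
  `amount` DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  -- Flink strips the "properties." prefix and hands the Kafka client
  -- allow.auto.create.topics=false
  'properties.allow.auto.create.topics' = 'false',
  'format' = 'json'
);
```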

Features

Key and Value Formats

Both the key and value part of a Kafka record can be serialized to and deserialized from raw bytes using one of the given formats.

Value Format

Since a key is optional in Kafka records, the following statement reads and writes records with a configured value format but without a key format. The 'format' option is a synonym for 'value.format'. All format options are prefixed with the format identifier.

CREATE TABLE KafkaTable (
  `ts` TIMESTAMP(3) METADATA FROM 'timestamp',
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  ...

  'format' = 'json',
  'json.ignore-parse-errors' = 'true'
)

The value format will be configured with the following data type:

ROW<`user_id` BIGINT, `item_id` BIGINT, `behavior` STRING>

Key and Value Format

The following example shows how to specify and configure key and value formats. The format options are prefixed with either the 'key' or 'value' plus format identifier.

CREATE TABLE KafkaTable (
  `ts` TIMESTAMP(3) METADATA FROM 'timestamp',
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  ...

  'key.format' = 'json',
  'key.json.ignore-parse-errors' = 'true',
  'key.fields' = 'user_id;item_id',

  'value.format' = 'json',
  'value.json.fail-on-missing-field' = 'false',
  'value.fields-include' = 'ALL'
)

The key format includes the fields listed in 'key.fields' (using ';' as the delimiter) in the same order. Thus, it will be configured with the following data type:

ROW<`user_id` BIGINT, `item_id` BIGINT>

Since the value format is configured with 'value.fields-include' = 'ALL', key fields will also end up in the value format’s data type:

ROW<`user_id` BIGINT, `item_id` BIGINT, `behavior` STRING>
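
For contrast, here is a sketch of the same table with 'value.fields-include' set to 'EXCEPT_KEY', so that the key fields are left out of the value format. The topic, broker address, and group id are placeholder values borrowed from the earlier examples.

```sql
CREATE TABLE KafkaTable (
  `ts` TIMESTAMP(3) METADATA FROM 'timestamp',
  `user_id` BIGINT,
  `item_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',

  'key.format' = 'json',
  'key.fields' = 'user_id;item_id',

  'value.format' = 'json',
  'value.fields-include' = 'EXCEPT_KEY'
);
-- key format:   ROW<`user_id` BIGINT, `item_id` BIGINT>
-- value format: ROW<`behavior` STRING>
```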

Overlapping Format Fields

The connector cannot split the table’s columns into key and value fields based on schema information if both key and value formats contain fields of the same name. The 'key.fields-prefix' option makes it possible to give key columns a unique name in the table schema while keeping the original names when configuring the key format.

The following example shows a key and value format that both contain a version field:

CREATE TABLE KafkaTable (
  `k_version` INT,
  `k_user_id` BIGINT,
  `k_item_id` BIGINT,
  `version` INT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  ...

  'key.format' = 'json',
  'key.fields-prefix' = 'k_',
  'key.fields' = 'k_version;k_user_id;k_item_id',

  'value.format' = 'json',
  'value.fields-include' = 'EXCEPT_KEY'
)

The value format must be configured in 'EXCEPT_KEY' mode. The formats will be configured with the following data types:

key format:
ROW<`version` INT, `user_id` BIGINT, `item_id` BIGINT>

value format:
ROW<`version` INT, `behavior` STRING>

Topic and Partition Discovery

The config options topic and topic-pattern specify the topics or topic pattern to consume for a source. The topic option accepts a list of topics separated by semicolons, like 'topic-1;topic-2'. The topic-pattern option uses a regular expression to discover matching topics. For example, if topic-pattern is test-topic-[0-9], then all topics whose names match that regular expression (starting with test-topic- and ending with a single digit) will be subscribed to by the consumer when the job starts running.

To allow the consumer to discover dynamically created topics after the job has started running, set a non-negative value for scan.topic-partition-discovery.interval. This allows the consumer to discover partitions of new topics with names that also match the specified pattern.

Please refer to Kafka DataStream Connector documentation for more about topic and partition discovery.

Note that topic list and topic pattern only work in sources. In sinks, Flink currently only supports a single topic.
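
The following sketch combines a topic pattern with periodic discovery. The table name, column names, and the 30-second interval are assumptions chosen for illustration; the `topic` metadata column is included so each row shows which matched topic it came from.

```sql
-- Hypothetical table reading every topic that matches test-topic-[0-9].
CREATE TABLE TestTopics (
  `topic` STRING METADATA VIRTUAL,  -- name of the topic the record was read from
  `payload` STRING
) WITH (
  'connector' = 'kafka',
  'topic-pattern' = 'test-topic-[0-9]',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  -- look for new matching topics and partitions periodically (assumed value)
  'scan.topic-partition-discovery.interval' = '30 s',
  'format' = 'csv'
);
```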

Start Reading Position

The config option scan.startup.mode specifies the startup mode for the Kafka consumer. Valid values are:

group-offsets: start from committed offsets in ZooKeeper / Kafka brokers of a specific consumer group.
earliest-offset: start from the earliest offset possible.
latest-offset: start from the latest offset.
timestamp: start from user-supplied timestamp for each partition.
specific-offsets: start from user-supplied specific offsets for each partition.

The default value is group-offsets, which consumes from the last committed offsets in ZooKeeper / Kafka brokers.

If timestamp is specified, another config option scan.startup.timestamp-millis is required to specify a specific startup timestamp in milliseconds since January 1, 1970 00:00:00.000 GMT.

If specific-offsets is specified, another config option scan.startup.specific-offsets is required to specify specific startup offsets for each partition, e.g. an option value partition:0,offset:42;partition:1,offset:300 indicates offset 42 for partition 0 and offset 300 for partition 1.
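
The two modes that need a companion option can be sketched as follows. Table names, columns, and the chosen timestamp are hypothetical; the offsets string is the one used in the text above.

```sql
-- Start from a timestamp: 1609459200000 ms since the epoch
-- is 2021-01-01 00:00:00 UTC.
CREATE TABLE FromTimestamp (
  `user_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'timestamp',
  'scan.startup.timestamp-millis' = '1609459200000',
  'format' = 'csv'
);

-- Start from explicit offsets: offset 42 for partition 0,
-- offset 300 for partition 1.
CREATE TABLE FromOffsets (
  `user_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'specific-offsets',
  'scan.startup.specific-offsets' = 'partition:0,offset:42;partition:1,offset:300',
  'format' = 'csv'
);
```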

Changelog Source

Flink natively supports Kafka as a changelog source. If the messages in a Kafka topic are change events captured from other databases using CDC tools, you can use a CDC format to interpret them as INSERT/UPDATE/DELETE messages in the Flink SQL system. Flink provides two CDC formats, debezium-json and canal-json, to interpret change events captured by Debezium and Canal. The changelog source is useful in many cases, such as synchronizing incremental data from databases to other systems, auditing logs, maintaining materialized views on databases, and temporal joins against the changing history of a database table. See debezium-json and canal-json for more about how to use the CDC formats.
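
A minimal sketch of a changelog source, assuming a hypothetical topic `products_binlog` that carries Debezium change events for a table with the columns shown. The table and column names are illustrative only.

```sql
-- Each Debezium event is interpreted as an INSERT, UPDATE, or DELETE row.
CREATE TABLE ProductsChangelog (
  `id` BIGINT,
  `name` STRING,
  `weight` DECIMAL(10, 2)
) WITH (
  'connector' = 'kafka',
  'topic' = 'products_binlog',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'debezium-json'
);

-- The aggregate is updated as change events arrive, so it reflects the
-- current state of the captured table rather than the raw event count.
SELECT name, AVG(weight) AS avg_weight
FROM ProductsChangelog
GROUP BY name;
```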

Sink Partitioning

The config option sink.partitioner specifies output partitioning from Flink’s partitions into Kafka’s partitions. By default, Flink uses the Kafka default partitioner to partition records. It uses the sticky partition strategy for records with null keys and a murmur2 hash to compute the partition for records with a defined key.

In order to control the routing of rows into partitions, a custom sink partitioner can be provided. The 'fixed' partitioner writes the records of one Flink partition into the same Kafka partition, which can reduce the cost of network connections.
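
For example, a sink table that opts into the 'fixed' partitioner might look like the sketch below. The table name, topic, and columns are hypothetical.

```sql
-- Hypothetical sink: each Flink partition writes to at most one Kafka partition.
CREATE TABLE BehaviorSink (
  `user_id` BIGINT,
  `behavior` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior_out',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  'sink.partitioner' = 'fixed'
);
```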

Consistency guarantees

By default, a Kafka sink ingests data with at-least-once guarantees into a Kafka topic if the query is executed with checkpointing enabled.

With Flink’s checkpointing enabled, the Kafka connector can provide exactly-once delivery guarantees.

Besides enabling Flink’s checkpointing, you can choose among three operating modes by passing the appropriate sink.semantic option:

none: Flink will not guarantee anything. Produced records can be lost or they can be duplicated.
at-least-once (default setting): This guarantees that no records will be lost (although they can be duplicated).
exactly-once: Kafka transactions will be used to provide exactly-once semantics. Whenever you write to Kafka using transactions, do not forget to set the desired isolation.level (read_committed or read_uncommitted, the latter being the default value) for any application consuming records from Kafka.

Please refer to Kafka documentation for more caveats about delivery guarantees.
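
The sketch below pairs a transactional sink with a reader that only sees committed records. Table names, topic, and columns are hypothetical, and it assumes checkpointing is already enabled for the job; the isolation level is passed through the `properties.*` mechanism.

```sql
-- Sink: writes inside Kafka transactions.
-- Checkpointing must be enabled for the job running this sink.
CREATE TABLE PaymentsSink (
  `payment_id` BIGINT,
  `amount` DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'payments',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json',
  'sink.semantic' = 'exactly-once'
);

-- Downstream reader of the same topic: skip records from open or
-- aborted transactions instead of using the read_uncommitted default.
CREATE TABLE PaymentsSource (
  `payment_id` BIGINT,
  `amount` DOUBLE
) WITH (
  'connector' = 'kafka',
  'topic' = 'payments',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'properties.isolation.level' = 'read_committed',
  'format' = 'json'
);
```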

Source Per-Partition Watermarks

Flink supports emitting per-partition watermarks for Kafka. Watermarks are generated inside the Kafka consumer. The per-partition watermarks are merged in the same way as watermarks are merged during streaming shuffles. The output watermark of the source is determined by the minimum watermark among the partitions it reads. If some partitions in the topics are idle, the watermark generator will not advance. You can alleviate this problem by setting the 'table.exec.source.idle-timeout' option in the table configuration.

Please refer to Kafka watermark strategies for more details.
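
A minimal sketch of a Kafka table that declares a watermark, assuming the event time is carried as a physical field `ts` in the message payload. The table name, columns, and the 5-second bound are hypothetical.

```sql
CREATE TABLE TimedBehavior (
  `user_id` BIGINT,
  `behavior` STRING,
  `ts` TIMESTAMP(3),
  -- tolerate events that arrive up to 5 seconds out of order (assumed bound)
  WATERMARK FOR `ts` AS `ts` - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'user_behavior',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'format' = 'json'
);
-- Note: 'table.exec.source.idle-timeout' belongs to the table configuration,
-- not to the WITH clause of this table.
```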

Data Type Mapping

Kafka stores message keys and values as bytes, so Kafka doesn’t have schema or data types. Kafka messages are deserialized and serialized by formats, e.g. csv, json, avro. Thus, the data type mapping is determined by the specific format. Please refer to the Formats pages for more details.
