public class FlinkKafkaConsumer08<T> extends FlinkKafkaConsumerBase<T>
The Flink Kafka Consumer participates in checkpointing and guarantees that no data is lost during a failure, and that the computation processes elements "exactly once". (Note: These guarantees naturally assume that Kafka itself does not lose any data.)
Flink's Kafka Consumer is designed to be compatible with Kafka's High-Level Consumer API (0.8.x). Most of Kafka's configuration variables can be used with this consumer as well.
Offsets whose records have been read and are checkpointed will be committed back to ZooKeeper by the offset handler. In addition, the offset handler finds the point where the source initially starts reading from the stream, when the streaming job is started.
Please note that Flink snapshots the offsets internally as part of its distributed checkpoints. The offsets committed to Kafka / ZooKeeper are only to bring the outside view of progress in sync with Flink's view of the progress. That way, monitoring and other jobs can get a view of how far the Flink Kafka consumer has consumed a topic.
If checkpointing is disabled, the consumer will periodically commit the current offset to ZooKeeper.
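The relationship between internally snapshotted offsets and the offsets committed to ZooKeeper can be illustrated with a small model. This is a simplified sketch, not Flink's implementation: the class `OffsetCommitSketch` and its methods are hypothetical, a plain map stands in for ZooKeeper, and checkpoint IDs are assumed to increase over time.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified, hypothetical model of checkpoint-aligned offset committing. */
class OffsetCommitSketch {
    // Latest offset read per partition; this is what a checkpoint snapshots.
    private final Map<Integer, Long> currentOffsets = new HashMap<>();
    // Snapshots whose checkpoints are not yet confirmed, in checkpoint order.
    private final LinkedHashMap<Long, Map<Integer, Long>> pendingCheckpoints = new LinkedHashMap<>();
    // Stand-in for the externally visible offsets (ZooKeeper in the 0.8 consumer).
    private final Map<Integer, Long> committedOffsets = new HashMap<>();

    void recordRead(int partition, long offset) {
        currentOffsets.put(partition, offset);
    }

    /** Called when a checkpoint is taken: remember a copy of the current offsets. */
    void snapshotState(long checkpointId) {
        pendingCheckpoints.put(checkpointId, new HashMap<>(currentOffsets));
    }

    /** Called when a checkpoint is confirmed: only now publish its offsets. */
    void notifyCheckpointComplete(long checkpointId) {
        Map<Integer, Long> snapshot = pendingCheckpoints.get(checkpointId);
        if (snapshot == null) {
            return; // unknown or already handled checkpoint
        }
        committedOffsets.putAll(snapshot);
        // Discard this checkpoint and every older pending one.
        Iterator<Long> ids = pendingCheckpoints.keySet().iterator();
        while (ids.hasNext()) {
            long id = ids.next();
            ids.remove();
            if (id == checkpointId) {
                break;
            }
        }
    }

    Map<Integer, Long> committedOffsets() {
        return committedOffsets;
    }
}
```

In this model, offsets read after a snapshot are not visible externally until a later checkpoint that includes them completes, which is why the committed offsets can lag behind Flink's internal view.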
When using a Kafka topic to send data between Flink jobs, we recommend using the TypeInformationSerializationSchema and TypeInformationKeyValueSerializationSchema.
NOTE: The implementation currently accesses partition metadata when the consumer is constructed. That means that the client that submits the program needs to be able to reach the Kafka brokers or ZooKeeper.
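As a rough illustration of the configuration discussed above, the following sketch builds a `Properties` object of the kind the constructors accept. The helper class `Kafka08ConsumerProperties` is hypothetical, and the choice of keys (`zookeeper.connect`, `bootstrap.servers`, `group.id`, `auto.offset.reset`) is an assumption based on common Kafka 0.8 consumer settings; check the connector documentation for your Flink version.

```java
import java.util.Properties;

/** Hypothetical helper that assembles consumer properties for a Kafka 0.8 setup. */
class Kafka08ConsumerProperties {

    /** All arguments are placeholders supplied by the caller. */
    static Properties build(String zookeeperConnect, String brokers, String groupId) {
        Properties props = new Properties();
        // ZooKeeper quorum used for committing offsets (assumed key).
        props.setProperty("zookeeper.connect", zookeeperConnect);
        // Brokers contacted for partition metadata (assumed key).
        props.setProperty("bootstrap.servers", brokers);
        // Consumer group under which offsets are stored.
        props.setProperty("group.id", groupId);
        // Kafka 0.8 accepts "smallest" or "largest" here.
        props.setProperty("auto.offset.reset", "smallest");
        return props;
    }
}
```

The result would then be passed as the `props` argument, for example `new FlinkKafkaConsumer08<>("my-topic", new SimpleStringSchema(), props)`, where the topic name is a placeholder. Because partition metadata is fetched when the consumer is constructed, the hosts named here must be reachable from the submitting client.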
SourceFunction.SourceContext<T>
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_GET_PARTITIONS_RETRIES: Default number of retries for getting the partition info. |
static String | GET_PARTITIONS_RETRIES_KEY: Configuration key for the number of retries for getting the partition info. |
allSubscribedPartitions, deserializer, KEY_DISABLE_METRICS, LOG, MAX_NUM_PENDING_CHECKPOINTS
Constructor and Description |
---|
FlinkKafkaConsumer08(List<String> topics, DeserializationSchema<T> deserializer, Properties props): Creates a new Kafka streaming source consumer for Kafka 0.8.x. This constructor allows passing multiple topics to the consumer. |
FlinkKafkaConsumer08(List<String> topics, KeyedDeserializationSchema<T> deserializer, Properties props): Creates a new Kafka streaming source consumer for Kafka 0.8.x. This constructor allows passing multiple topics and a key/value deserialization schema. |
FlinkKafkaConsumer08(String topic, DeserializationSchema<T> valueDeserializer, Properties props): Creates a new Kafka streaming source consumer for Kafka 0.8.x. |
FlinkKafkaConsumer08(String topic, KeyedDeserializationSchema<T> deserializer, Properties props): Creates a new Kafka streaming source consumer for Kafka 0.8.x. This constructor allows passing a KeyedDeserializationSchema for reading key/value pairs, offsets, and topic names from Kafka. |
Modifier and Type | Method and Description |
---|---|
protected AbstractFetcher<T,?> | createFetcher(SourceFunction.SourceContext<T> sourceContext, List<KafkaTopicPartition> thisSubtaskPartitions, SerializedValue<AssignerWithPeriodicWatermarks<T>> watermarksPeriodic, SerializedValue<AssignerWithPunctuatedWatermarks<T>> watermarksPunctuated, StreamingRuntimeContext runtimeContext): Creates the fetcher that connects to the Kafka brokers, pulls data, deserializes the data, and emits it into the data streams. |
static List<KafkaTopicPartitionLeader> | getPartitionsForTopic(List<String> topics, Properties properties): Sends a request to Kafka to get the partitions for the given topics. |
protected static void | validateZooKeeperConfig(Properties props): Validates the ZooKeeper configuration, checking for required parameters. |
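`validateZooKeeperConfig` is described as checking for required parameters. A minimal sketch of such a check follows. It is not Flink's implementation: the class `ZooKeeperConfigCheck` and its helpers are hypothetical, and treating `zookeeper.connect` and `group.id` as mandatory and the two timeout keys as optional integers is an assumption.

```java
import java.util.Properties;

/** Hypothetical stand-in for a required-parameter check; not Flink's code. */
class ZooKeeperConfigCheck {

    static void validate(Properties props) {
        // Assumption: these two keys are mandatory for the 0.8 consumer.
        requirePresent(props, "zookeeper.connect");
        requirePresent(props, "group.id");
        // Assumption: timeouts are optional but must be integers when given.
        requireIntegerIfSet(props, "zookeeper.session.timeout.ms");
        requireIntegerIfSet(props, "zookeeper.connection.timeout.ms");
    }

    private static void requirePresent(Properties props, String key) {
        if (props.getProperty(key) == null) {
            throw new IllegalArgumentException(
                    "Required property '" + key + "' has not been set");
        }
    }

    private static void requireIntegerIfSet(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            return; // optional key, nothing to check
        }
        try {
            Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Property '" + key + "' must be an integer, got: " + value, e);
        }
    }
}
```

Failing fast with an `IllegalArgumentException` surfaces misconfiguration when the job is assembled rather than later at runtime.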
assignPartitions, assignTimestampsAndWatermarks, assignTimestampsAndWatermarks, cancel, close, getProducedType, logPartitionInfo, notifyCheckpointComplete, restoreState, run, setSubscribedPartitions, snapshotState
getIterationRuntimeContext, getRuntimeContext, open, setRuntimeContext
public static final String GET_PARTITIONS_RETRIES_KEY
public static final int DEFAULT_GET_PARTITIONS_RETRIES
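To show how a retry key and its default typically work together, the sketch below reads a retry count from a `Properties` object and falls back to a default when the key is absent. The class `PartitionRetryConfig` is hypothetical, and the literal values `"flink.get-partitions.retry"` and `3` are assumptions. Real code should reference `FlinkKafkaConsumer08.GET_PARTITIONS_RETRIES_KEY` and `FlinkKafkaConsumer08.DEFAULT_GET_PARTITIONS_RETRIES` instead of hard-coding them.

```java
import java.util.Properties;

/** Hypothetical illustration of resolving a retry count with a default. */
class PartitionRetryConfig {
    // Assumed literal values; prefer the constants on FlinkKafkaConsumer08.
    static final String RETRIES_KEY = "flink.get-partitions.retry";
    static final int DEFAULT_RETRIES = 3;

    static int numRetries(Properties props) {
        String raw = props.getProperty(RETRIES_KEY);
        if (raw == null) {
            return DEFAULT_RETRIES; // key not configured
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Property '" + RETRIES_KEY + "' must be an integer, got: " + raw, e);
        }
    }
}
```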
public FlinkKafkaConsumer08(String topic, DeserializationSchema<T> valueDeserializer, Properties props)
topic - The name of the topic that should be consumed.
valueDeserializer - The de-/serializer used to convert between Kafka's byte messages and Flink's objects.
props - The properties used to configure the Kafka consumer client, and the ZooKeeper client.
public FlinkKafkaConsumer08(String topic, KeyedDeserializationSchema<T> deserializer, Properties props)
This constructor allows passing a KeyedDeserializationSchema for reading key/value pairs, offsets, and topic names from Kafka.
topic - The name of the topic that should be consumed.
deserializer - The keyed de-/serializer used to convert between Kafka's byte messages and Flink's objects.
props - The properties used to configure the Kafka consumer client, and the ZooKeeper client.
public FlinkKafkaConsumer08(List<String> topics, DeserializationSchema<T> deserializer, Properties props)
topics - The Kafka topics to read from.
deserializer - The de-/serializer used to convert between Kafka's byte messages and Flink's objects.
props - The properties that are used to configure both the fetcher and the offset handler.
public FlinkKafkaConsumer08(List<String> topics, KeyedDeserializationSchema<T> deserializer, Properties props)
topics - The Kafka topics to read from.
deserializer - The keyed de-/serializer used to convert between Kafka's byte messages and Flink's objects.
props - The properties that are used to configure both the fetcher and the offset handler.
protected AbstractFetcher<T,?> createFetcher(SourceFunction.SourceContext<T> sourceContext, List<KafkaTopicPartition> thisSubtaskPartitions, SerializedValue<AssignerWithPeriodicWatermarks<T>> watermarksPeriodic, SerializedValue<AssignerWithPunctuatedWatermarks<T>> watermarksPunctuated, StreamingRuntimeContext runtimeContext) throws Exception
Specified by: createFetcher in class FlinkKafkaConsumerBase<T>
sourceContext - The source context to emit data to.
thisSubtaskPartitions - The set of partitions that this subtask should handle.
watermarksPeriodic - Optional, a serialized timestamp extractor / periodic watermark generator.
watermarksPunctuated - Optional, a serialized timestamp extractor / punctuated watermark generator.
runtimeContext - The task's runtime context.
Throws: Exception - The method should forward exceptions.
public static List<KafkaTopicPartitionLeader> getPartitionsForTopic(List<String> topics, Properties properties)
topics - The names of the topics.
properties - The properties for the Kafka consumer that is used to query the partitions for the topics.
protected static void validateZooKeeperConfig(Properties props)
props - Properties to check.
Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.