public class FileSystemTableSink extends Object implements AppendStreamTableSink<RowData>, PartitionableTableSink, OverwritableTableSink
TableSink
.Modifier and Type | Class and Description |
---|---|
static class |
FileSystemTableSink.ProjectionBulkFactory
Project row to non-partition fields.
|
static class |
FileSystemTableSink.TableBucketAssigner
Table bucket assigner, wrap
PartitionComputer . |
static class |
FileSystemTableSink.TableRollingPolicy
Table
RollingPolicy , it extends CheckpointRollingPolicy for bulk writers. |
Constructor and Description |
---|
FileSystemTableSink(ObjectIdentifier tableIdentifier,
boolean isBounded,
TableSchema schema,
Path path,
List<String> partitionKeys,
String defaultPartName,
Map<String,String> properties)
Construct a file system table sink.
|
Modifier and Type | Method and Description |
---|---|
FileSystemTableSink |
configure(String[] fieldNames,
TypeInformation<?>[] fieldTypes)
Returns a copy of this
TableSink configured with the field names and types of the
table to emit. |
boolean |
configurePartitionGrouping(boolean supportsGrouping)
If returns true, sink can trust all records will definitely be grouped by partition fields
before consumed by the
TableSink , i.e. |
DataStreamSink<RowData> |
consumeDataStream(DataStream<RowData> dataStream)
Consumes the DataStream and return the sink transformation
DataStreamSink . |
static DataStreamSink<RowData> |
createStreamingSink(Configuration conf,
Path path,
List<String> partitionKeys,
ObjectIdentifier tableIdentifier,
boolean overwrite,
DataStream<RowData> inputStream,
StreamingFileSink.BucketsBuilder<RowData,String,? extends StreamingFileSink.BucketsBuilder<RowData,?,?>> bucketsBuilder,
TableMetaStoreFactory msFactory,
FileSystemFactory fsFactory,
long rollingCheckInterval) |
DataType |
getConsumedDataType()
Returns the data type consumed by this
TableSink . |
TableSchema |
getTableSchema()
Returns the schema of the consumed table.
|
void |
setOverwrite(boolean overwrite)
Configures whether the insert should overwrite existing data or not.
|
void |
setStaticPartition(Map<String,String> partitions)
Sets the static partition into the
TableSink . |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getFieldNames, getFieldTypes, getOutputType
public FileSystemTableSink(ObjectIdentifier tableIdentifier, boolean isBounded, TableSchema schema, Path path, List<String> partitionKeys, String defaultPartName, Map<String,String> properties)
isBounded
- whether the input of sink is bounded.schema
- schema of the table.path
- directory path of the file system table.partitionKeys
- partition keys of the table.defaultPartName
- The default partition name in case the dynamic partition column value
is null/empty string.properties
- properties.public final DataStreamSink<RowData> consumeDataStream(DataStream<RowData> dataStream)
StreamTableSink
DataStreamSink
. The
returned DataStreamSink
will be used to set resources for the sink operator.consumeDataStream
in interface StreamTableSink<RowData>
public static DataStreamSink<RowData> createStreamingSink(Configuration conf, Path path, List<String> partitionKeys, ObjectIdentifier tableIdentifier, boolean overwrite, DataStream<RowData> inputStream, StreamingFileSink.BucketsBuilder<RowData,String,? extends StreamingFileSink.BucketsBuilder<RowData,?,?>> bucketsBuilder, TableMetaStoreFactory msFactory, FileSystemFactory fsFactory, long rollingCheckInterval)
public FileSystemTableSink configure(String[] fieldNames, TypeInformation<?>[] fieldTypes)
TableSink
TableSink
configured with the field names and types of the
table to emit.public void setOverwrite(boolean overwrite)
OverwritableTableSink
setOverwrite
in interface OverwritableTableSink
public void setStaticPartition(Map<String,String> partitions)
PartitionableTableSink
TableSink
. The static partition may be partial of
all partition columns. See the class Javadoc for more details.
The static partition is represented as a Map<String, String>
which maps from
partition field name to partition value. The partition values are all encoded as strings,
i.e. encoded using String.valueOf(...). For example, if we have a static partition f0=1024, f1="foo", f2="bar"
. f0 is an integer type, f1 and f2 are string types. They will
all be encoded as strings: "1024", "foo", "bar". And can be decoded to original literals
based on the field types.
setStaticPartition
in interface PartitionableTableSink
partitions
- user specified static partitionpublic TableSchema getTableSchema()
TableSink
getTableSchema
in interface TableSink<RowData>
TableSchema
of the consumed table.public DataType getConsumedDataType()
TableSink
TableSink
.getConsumedDataType
in interface TableSink<RowData>
TableSink
.public boolean configurePartitionGrouping(boolean supportsGrouping)
PartitionableTableSink
TableSink
, i.e. the sink will receive all elements of one
partition and then all elements of another partition, elements of different partitions will
not be mixed. For some sinks, this can be used to reduce number of the partition writers to
improve writing performance.
This method is used to configure the behavior of input whether to be grouped by partition, if true, at the same time the sink should also configure itself, i.e. set an internal field that changes the writing behavior (writing one partition at a time).
configurePartitionGrouping
in interface PartitionableTableSink
supportsGrouping
- whether the execution mode supports grouping, e.g. grouping (usually
use sort to implement) is only supported in batch mode, not supported in streaming mode.supportsGrouping
is false, it should never return true (requires
grouping), otherwise it will fail.Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.