public class HiveTableSink extends OutputFormatTableSink<Row> implements PartitionableTableSink, OverwritableTableSink
Constructor and Description |
---|
HiveTableSink(org.apache.hadoop.mapred.JobConf jobConf,
ObjectPath tablePath,
CatalogTable table) |
Modifier and Type | Method and Description |
---|---|
TableSink<Row> |
configure(String[] fieldNames,
TypeInformation<?>[] fieldTypes)
Returns a copy of this
TableSink configured with the field names and types of the
table to emit. |
boolean |
configurePartitionGrouping(boolean supportsGrouping)
If returns true, sink can trust all records will definitely be grouped by partition fields
before consumed by the
TableSink , i.e. |
DataType |
getConsumedDataType()
Returns the data type consumed by this
TableSink . |
OutputFormat<Row> |
getOutputFormat()
Returns an
OutputFormat for writing the data of the table. |
TableSchema |
getTableSchema()
Returns the schema of the consumed table.
|
void |
setOverwrite(boolean overwrite)
Configures whether the insert should overwrite existing data or not.
|
void |
setStaticPartition(Map<String,String> partitionSpec)
Sets the static partition into the
TableSink . |
consumeDataStream, emitDataStream
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getFieldNames, getFieldTypes, getOutputType
public HiveTableSink(org.apache.hadoop.mapred.JobConf jobConf, ObjectPath tablePath, CatalogTable table)
public OutputFormat<Row> getOutputFormat()
OutputFormatTableSink
OutputFormat
for writing the data of the table.getOutputFormat
in class OutputFormatTableSink<Row>
public TableSink<Row> configure(String[] fieldNames, TypeInformation<?>[] fieldTypes)
TableSink
TableSink
configured with the field names and types of the
table to emit.public DataType getConsumedDataType()
TableSink
TableSink
.getConsumedDataType
in interface TableSink<Row>
TableSink
.public TableSchema getTableSchema()
TableSink
getTableSchema
in interface TableSink<Row>
TableSchema
of the consumed table.public boolean configurePartitionGrouping(boolean supportsGrouping)
PartitionableTableSink
TableSink
, i.e. the sink will receive all elements of one
partition and then all elements of another partition, elements of different partitions
will not be mixed. For some sinks, this can be used to reduce number of the partition
writers to improve writing performance.
This method is used to configure the behavior of input whether to be grouped by partition, if true, at the same time the sink should also configure itself, i.e. set an internal field that changes the writing behavior (writing one partition at a time).
configurePartitionGrouping
in interface PartitionableTableSink
supportsGrouping
- whether the execution mode supports grouping,
e.g. grouping (usually use sort to implement) is only supported
in batch mode, not supported in streaming mode.supportsGrouping
is false, it should never return true (requires grouping), otherwise it will fail.public void setStaticPartition(Map<String,String> partitionSpec)
PartitionableTableSink
TableSink
. The static partition may be partial
of all partition columns. See the class Javadoc for more details.
The static partition is represented as a Map<String, String>
which maps from
partition field name to partition value. The partition values are all encoded as strings,
i.e. encoded using String.valueOf(...). For example, if we have a static partition
f0=1024, f1="foo", f2="bar"
. f0 is an integer type, f1 and f2 are string types.
They will all be encoded as strings: "1024", "foo", "bar". And can be decoded to original
literals based on the field types.
setStaticPartition
in interface PartitionableTableSink
partitionSpec
- user specified static partitionpublic void setOverwrite(boolean overwrite)
OverwritableTableSink
setOverwrite
in interface OverwritableTableSink
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.