@PublicEvolving public interface SupportsComputedColumnPushDown
Enables pushing computed columns down into a ScanTableSource.
Computed columns add columns to the table's schema. They are defined by logical expressions that reference other, physically existing columns.
An example in SQL looks like:
CREATE TABLE t (str STRING, ts AS PARSE_TIMESTAMP(str), i INT) -- `ts` is a computed column
By default, if this interface is not implemented, computed columns are added to the physically produced row in a subsequent operation after the source.
However, it can be beneficial to perform the computation as early as possible, close to where the data is actually generated. This matters most when computed columns are used for generating watermarks: in that case a source must push the computation down as deep as possible so that it can happen within a source's data partition.
This interface provides a SupportsComputedColumnPushDown.ComputedColumnConverter that needs to be applied to every row during runtime.
Note: The final output data type emitted by a source changes from the physically produced data type to the full data type of the table's schema. For the example above, this means:
ROW<str STRING, i INT> // before conversion
ROW<str STRING, ts TIMESTAMP(3), i INT> // after conversion
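The type change above can be illustrated with a minimal sketch. It deliberately does not use the Flink classes: rows are modeled as plain `Object[]` arrays, `ComputedColumnSketch` and `CONVERTER` are hypothetical names, and `LocalDateTime.parse` is only a stand-in for the `PARSE_TIMESTAMP` expression from the example DDL.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.function.UnaryOperator;

public class ComputedColumnSketch {

    // Hypothetical stand-in for the pushed converter (not the Flink type):
    // maps a physical row [str, i] to the full schema row [str, ts, i].
    public static final UnaryOperator<Object[]> CONVERTER = physical -> {
        String str = (String) physical[0];
        // Assumption: ISO-8601 parsing stands in for PARSE_TIMESTAMP(str).
        LocalDateTime ts = LocalDateTime.parse(str);
        return new Object[] {str, ts, physical[1]};
    };

    public static void main(String[] args) {
        Object[] physical = {"2021-01-01T12:00:00", 42}; // ROW<str STRING, i INT>
        Object[] full = CONVERTER.apply(physical);       // ROW<str, ts, i>
        System.out.println(Arrays.toString(full));
    }
}
```

The point of the sketch is only the shape of the conversion: the source produces two physical fields, and the converter inserts the computed field at its position in the table's schema.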
Note: If a source implements SupportsProjectionPushDown, the projection must be applied to the physical data in the first step. The SupportsComputedColumnPushDown (already aware of the projection) will then use the projected physical data and insert computed columns into the result. In the example below, the projections [i, d] are derived from the DDL (c requires i) and query (d and c are required). The pushed converter will rely on this order and will process [i, d] to produce [d, c].
CREATE TABLE t (i INT, s STRING, c AS i + 2, d DOUBLE);
SELECT d, c FROM t;
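The two steps for this query can be sketched in plain Java. This is an illustrative model under stated assumptions, not Flink code: rows are `Object[]` arrays, the physical row layout is assumed to be [i, s, d] (the non-computed columns of `t` in declaration order), and `ProjectionThenComputeSketch`, `project`, and `convert` are hypothetical names.

```java
import java.util.Arrays;

public class ProjectionThenComputeSketch {

    // Step 1: apply the pushed projection to the physical row [i, s, d],
    // keeping only [i, d] in exactly that order.
    public static Object[] project(Object[] physical) {
        return new Object[] {physical[0], physical[2]};
    }

    // Step 2: the converter relies on the projected order [i, d]
    // and produces the query's output [d, c] with c = i + 2.
    public static Object[] convert(Object[] projected) {
        int i = (Integer) projected[0];
        Object d = projected[1];
        return new Object[] {d, i + 2};
    }

    public static void main(String[] args) {
        Object[] physical = {1, "unused", 2.5};
        Object[] result = convert(project(physical));
        System.out.println(Arrays.toString(result));
    }
}
```

If the projection emitted [d, i] instead, the converter in this sketch would read the wrong fields, which is why the order of the projected fields matters.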
Modifier and Type | Interface and Description |
---|---|
static interface | SupportsComputedColumnPushDown.ComputedColumnConverter: Generates and adds computed columns to RowData if necessary. |
Modifier and Type | Method and Description |
---|---|
void | applyComputedColumn(SupportsComputedColumnPushDown.ComputedColumnConverter converter, DataType outputDataType) |
void applyComputedColumn(SupportsComputedColumnPushDown.ComputedColumnConverter converter, DataType outputDataType)
Provides a converter that converts the produced RowData containing the physical fields of the external system into a new RowData with pushed-down computed columns.
Note: Use the passed data type instead of TableSchema.toPhysicalRowDataType() for describing the final output data type when creating TypeInformation. If the source implements SupportsProjectionPushDown, the projection is already considered in both the converter and the given output data type.
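A source would typically remember both arguments and use them later at runtime. The following is a simplified model of that pattern, assuming local stand-ins rather than the Flink types: the converter is a `UnaryOperator<Object[]>`, the output data type is a plain `String`, and `PushDownSourceSketch`, `producedType`, and `emit` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class PushDownSourceSketch {

    // Defaults model a source that has not received a push-down:
    // rows pass through unchanged and the physical type is reported.
    private UnaryOperator<Object[]> converter = UnaryOperator.identity();
    private String outputDataType = "physical row type";

    // Mirrors the shape of applyComputedColumn(converter, outputDataType):
    // store both so the runtime part of the source can use them.
    public void applyComputedColumn(UnaryOperator<Object[]> converter, String outputDataType) {
        this.converter = converter;
        this.outputDataType = outputDataType;
    }

    // The type to describe downstream is the passed one, not the physical one.
    public String producedType() {
        return outputDataType;
    }

    // Runtime path: the converter is applied to every physical row.
    public List<Object[]> emit(List<Object[]> physicalRows) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] row : physicalRows) {
            out.add(converter.apply(row));
        }
        return out;
    }
}
```

Storing the passed output type alongside the converter keeps the two consistent, which matches the note above about not deriving the output type from the physical schema.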
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.