@PublicEvolving public interface SupportsPartitionPushDown
Partitions split the data stored in an external system into smaller portions that are identified
by one or more string-based partition keys. A single partition is represented as a
Map<String, String>
which maps each partition key to a partition value. Partition keys and their order are defined by the catalog table.
For example, data can be partitioned by region and within a region partitioned by month. The order of the partition keys (in the example: first by region then by month) is defined by the catalog table. A list of partitions could be:
List( ['region'='europe', 'month'='2020-01'], ['region'='europe', 'month'='2020-02'], ['region'='asia', 'month'='2020-01'], ['region'='asia', 'month'='2020-02'] )
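The partition list above could be built in Java roughly as follows. This is a minimal sketch: the class PartitionListExample and its methods are hypothetical helpers, not part of Flink, and LinkedHashMap is used only to keep the key order (region first, then month) visible in the map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not a Flink class.
final class PartitionListExample {

    // Builds one partition: each partition key maps to one partition value.
    // LinkedHashMap preserves insertion order (region, then month).
    static Map<String, String> partition(String region, String month) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("region", region);
        spec.put("month", month);
        return Collections.unmodifiableMap(spec);
    }

    // Returns the four partitions from the example above.
    static List<Map<String, String>> examplePartitions() {
        List<Map<String, String>> partitions = new ArrayList<>();
        partitions.add(partition("europe", "2020-01"));
        partitions.add(partition("europe", "2020-02"));
        partitions.add(partition("asia", "2020-01"));
        partitions.add(partition("asia", "2020-02"));
        return partitions;
    }

    public static void main(String[] args) {
        // Prints one partition map per line.
        examplePartitions().forEach(System.out::println);
    }
}
```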
By default, if this interface is not implemented, the data is read entirely with a subsequent filter operation after the source.
For efficiency, the planner can pass the list of required partitions, and a source must exclude
all other partitions from reading (including reading the metadata). See applyPartitions(List).
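The following sketch illustrates the idea of remembering the pushed-down partitions and skipping everything else. It is a stand-alone class so that it runs without Flink on the classpath; in a real connector the class would presumably implement SupportsPartitionPushDown on a table source. The class name PrunedSourceSketch and the helper shouldRead are assumptions made for this example, and the applyPartitions signature shown here should be checked against the Flink version in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a partitioned table source; not a Flink class.
final class PrunedSourceSketch {

    // null means the planner pushed nothing down, so every partition is read.
    private List<Map<String, String>> remainingPartitions;

    // Mirrors the shape of applyPartitions: the planner hands over the
    // partitions that are still required after pruning.
    void applyPartitions(List<Map<String, String>> remainingPartitions) {
        this.remainingPartitions = new ArrayList<>(remainingPartitions);
    }

    // Hypothetical helper: decides whether a partition is read at runtime.
    boolean shouldRead(Map<String, String> partition) {
        return remainingPartitions == null || remainingPartitions.contains(partition);
    }

    public static void main(String[] args) {
        PrunedSourceSketch source = new PrunedSourceSketch();
        source.applyPartitions(List.of(Map.of("region", "europe", "month", "2020-01")));
        System.out.println(source.shouldRead(Map.of("region", "europe", "month", "2020-01")));
        System.out.println(source.shouldRead(Map.of("region", "asia", "month", "2020-01")));
    }
}
```

Comparing whole maps keeps the sketch short; a real source would more likely translate the remaining partitions into paths or split descriptors once, rather than checking membership per partition.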
By default, the list of all partitions is queried from the catalog if necessary. However, depending
on the external system, it might be necessary to query the list of partitions in a connector-specific
way instead of using the catalog information. See listPartitions().
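As a rough illustration of the two cases, the sketch below returns Optional.empty() when the connector has no listing of its own (so the catalog would be asked) and otherwise derives the partitions from connector-side information. The class PartitionListingSketch, its constructor, and the "key=value/key=value" path layout are assumptions made for this example; they are not taken from Flink.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a connector that may know its own partitions.
final class PartitionListingSketch {

    // Assumed layout: paths such as "region=europe/month=2020-01".
    // null means the connector cannot list partitions itself.
    private final List<String> partitionPaths;

    PartitionListingSketch(List<String> partitionPaths) {
        this.partitionPaths = partitionPaths;
    }

    // Parses one path into an ordered map of partition key to partition value.
    static Map<String, String> parse(String path) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String segment : path.split("/")) {
            String[] keyValue = segment.split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Not a key=value segment: " + segment);
            }
            spec.put(keyValue[0], keyValue[1]);
        }
        return spec;
    }

    // Mirrors the shape of listPartitions: empty means "use the catalog".
    Optional<List<Map<String, String>>> listPartitions() {
        if (partitionPaths == null) {
            return Optional.empty();
        }
        List<Map<String, String>> partitions = new ArrayList<>();
        for (String path : partitionPaths) {
            partitions.add(parse(path));
        }
        return Optional.of(partitions);
    }

    public static void main(String[] args) {
        System.out.println(new PartitionListingSketch(null).listPartitions().isPresent());
        System.out.println(
                new PartitionListingSketch(List.of("region=asia/month=2020-02")).listPartitions());
    }
}
```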
Note: After partitions are pushed into the source, the runtime will not perform a subsequent filter operation for partition keys.
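|Modifier and Type|Method and Description|
|---|---|
|void|applyPartitions(List<Map<String,String>> remainingPartitions) Provides a list of remaining partitions.|
|Optional<List<Map<String,String>>>|listPartitions() Returns a list of all partitions that a source can read if available.|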
A single partition maps each partition key to a partition value.
If Optional.empty() is returned, the list of partitions is queried from the catalog.
See the documentation of
SupportsPartitionPushDown for more information.
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.