Creates a new DataSet by aggregating the specified field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of elements with the same key.
This only works on CaseClass DataSets.
Creates a new DataSet by aggregating the specified tuple field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of tuples with the same key.
This only works on Tuple DataSets.
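As a rough illustration of the tuple-field aggregation described above, the following sketch assumes the Flink Scala DataSet API (`flink-scala`) is on the classpath. The object name, method name, and sample data are hypothetical.

```scala
import org.apache.flink.api.java.aggregation.Aggregations
import org.apache.flink.api.scala._

object AggregateSketch {
  // Hypothetical sample data: (word, count) pairs.
  def sumPerKey(): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val counts: DataSet[(String, Int)] =
      env.fromElements(("a", 1), ("b", 5), ("a", 3), ("b", 2))

    counts
      .groupBy(0)                     // key: tuple field 0
      .aggregate(Aggregations.SUM, 1) // aggregate tuple field 1 within each group
      .collect()
      .sortBy(_._1)                   // collect() order is not guaranteed, so sort locally
  }

  def main(args: Array[String]): Unit =
    sumPerKey().foreach(println)
}
```

With this input, each key should end up with the sum of its counts. The `sum`, `min`, and `max` methods are shorthand for the same call with the corresponding aggregation.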
Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the combine method is called once per partition to combine a group of results. This operator is suitable for combining values into an intermediate format before a proper groupReduce, where the data is shuffled across the nodes for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function requires the input and output types to be the same. The CombineFunction, on the other hand, can have an arbitrary output type.
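The two-phase pattern described above (combine per partition, then a full group reduce) could look roughly like this. It is a sketch that assumes the Flink Scala DataSet API is on the classpath; the object name and the word data are hypothetical.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

object CombineGroupSketch {
  def wordCounts(): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val words: DataSet[String] = env.fromElements("a", "b", "a", "a", "b")

    // Pre-aggregation: each call only sees the elements of a group that are
    // present in one partition, so the output may hold several partial counts
    // per word. The output type (String, Int) differs from the input type.
    val partial: DataSet[(String, Int)] = words
      .groupBy((w: String) => w)
      .combineGroup { (in: Iterator[String], out: Collector[(String, Int)]) =>
        var key: String = null
        var count = 0
        for (w <- in) { key = w; count += 1 }
        out.collect((key, count))
      }

    // Full reduction after the shuffle: merge the partial counts per word.
    partial
      .groupBy(0)
      .reduceGroup { (in: Iterator[(String, Int)]) =>
        val items = in.toList
        (items.head._1, items.map(_._2).sum)
      }
      .collect()
      .sortBy(_._1) // collect() order is not guaranteed, so sort locally
  }
}
```

The intermediate `partial` data set is not meant to be consumed directly, since its contents depend on how the input happens to be partitioned. Only the result of the final `reduceGroup` is stable.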
Creates a new DataSet containing the first n elements of each group of this DataSet.
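A minimal sketch of taking the first n elements per group, assuming the Flink Scala DataSet API is on the classpath. The sample data and names are hypothetical. The example sorts each group first, because otherwise the elements that are kept are arbitrary.

```scala
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

object FirstNSketch {
  def topTwoPerKey(): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val scores: DataSet[(String, Int)] =
      env.fromElements(("a", 1), ("a", 7), ("a", 4), ("b", 2), ("b", 9))

    scores
      .groupBy(0)
      .sortGroup(1, Order.DESCENDING) // order each group by score, highest first
      .first(2)                       // keep two elements per group
      .collect()
      .sortBy(t => (t._1, -t._2))     // collect() order is not guaranteed, so sort locally
  }
}
```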
Gets the custom partitioner to be used for this grouping, or null, if none was defined.
Syntactic sugar for aggregate with MAX
Applies a special case of a reduce transformation, maxBy, on a grouped DataSet. The transformation consecutively calls a ReduceFunction until only a single element remains, which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.
Syntactic sugar for aggregate with MIN
Applies a special case of a reduce transformation, minBy, on a grouped DataSet. The transformation consecutively calls a ReduceFunction until only a single element remains, which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.
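To make the maxBy and minBy descriptions above concrete, here is a small sketch that assumes the Flink Scala DataSet API is on the classpath. The three-field tuples and all names are hypothetical. Each group is reduced to the single whole element whose field 1 is largest or smallest.

```scala
import org.apache.flink.api.scala._

object MinMaxBySketch {
  // Hypothetical sample data: (key, value, label).
  private def readings(env: ExecutionEnvironment): DataSet[(String, Int, String)] =
    env.fromElements(("a", 3, "x"), ("a", 8, "y"), ("b", 5, "z"), ("b", 1, "w"))

  def maxPerKey(): Seq[(String, Int, String)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    readings(env).groupBy(0).maxBy(1).collect().sortBy(_._1)
  }

  def minPerKey(): Seq[(String, Int, String)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    readings(env).groupBy(0).minBy(1).collect().sortBy(_._1)
  }
}
```

Because the result is an element of the input, the label in field 2 stays paired with the value it arrived with.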
Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.
Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
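The reduce variants above might be used as in the following sketch, which assumes the Flink Scala DataSet API is on the classpath and that `CombineHint` is the enum nested in `ReduceOperatorBase`. The sample data and names are hypothetical. The reduce function is associative, as the description requires.

```scala
import org.apache.flink.api.common.operators.base.ReduceOperatorBase.CombineHint
import org.apache.flink.api.scala._

object ReduceHintSketch {
  // Associative merge of two elements with the same key.
  private val add: ((String, Int), (String, Int)) => (String, Int) =
    (l, r) => (l._1, l._2 + r._2)

  def totals(): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val sales: DataSet[(String, Int)] =
      env.fromElements(("a", 1), ("b", 5), ("a", 3), ("b", 2))

    sales
      .groupBy(0)
      .reduce(add, CombineHint.HASH) // explicitly request a hash-based combine phase
      .collect()
      .sortBy(_._1)                  // collect() order is not guaranteed, so sort locally
  }
}
```

Calling `reduce(add)` without a hint leaves the choice of combine strategy to the optimizer.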
Creates a new DataSet by passing the list of elements of each group (elements with the same key) to the GroupReduceFunction.
Creates a new DataSet by passing the list of elements of each group (elements with the same key) to the group reduce function.
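A sketch of a group reduce function that receives all elements of a group at once, assuming the Flink Scala DataSet API is on the classpath. The sample data and names are hypothetical. This variant uses a Collector, so it can emit any number of results per group, here the distinct elements.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

object ReduceGroupSketch {
  def distinctPerKey(): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val events: DataSet[(String, Int)] =
      env.fromElements(("a", 1), ("a", 1), ("a", 2), ("b", 3), ("b", 3))

    events
      .groupBy(0)
      // The function sees the whole group and may emit zero, one, or many results.
      .reduceGroup { (in: Iterator[(String, Int)], out: Collector[(String, Int)]) =>
        for (e <- in.toList.distinct) out.collect(e)
      }
      .collect()
      .sortBy(t => (t._1, t._2)) // collect() order is not guaranteed, so sort locally
  }
}
```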
Adds a secondary sort key to this GroupedDataSet. This only has an effect if you use one of the group-at-a-time operations, i.e. reduceGroup. This works on any data type.
Adds a secondary sort key to this GroupedDataSet. This only has an effect if you use one of the group-at-a-time operations, i.e. reduceGroup. This only works on CaseClass DataSets.
Adds a secondary sort key to this GroupedDataSet. This only has an effect if you use one of the group-at-a-time operations, i.e. reduceGroup. This only works on Tuple DataSets.
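The effect of a secondary sort key on a group-at-a-time operation can be sketched as follows, assuming the Flink Scala DataSet API is on the classpath. The sample data and names are hypothetical, and the tuple-position variant of sortGroup is used.

```scala
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

object SortGroupSketch {
  def orderedValuesPerKey(): Seq[(String, String)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val events: DataSet[(String, Int)] =
      env.fromElements(("a", 3), ("a", 1), ("b", 2), ("a", 2), ("b", 1))

    events
      .groupBy(0)
      .sortGroup(1, Order.ASCENDING) // elements reach reduceGroup ordered by field 1
      .reduceGroup { (in: Iterator[(String, Int)]) =>
        val items = in.toList
        (items.head._1, items.map(_._2).mkString(","))
      }
      .collect()
      .sortBy(_._1)                  // collect() order is not guaranteed, so sort locally
  }
}
```

Each group's values are joined in the order in which the iterator delivers them, so the resulting string makes the sort order visible.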
Syntactic sugar for aggregate with SUM
Sets a custom partitioner for the grouping.
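A possible use of a custom partitioner on a grouping is sketched below. It assumes the Flink Scala DataSet API is on the classpath and a grouping on a single String key field. `FirstCharPartitioner`, the sample data, and the other names are hypothetical.

```scala
import org.apache.flink.api.common.functions.Partitioner
import org.apache.flink.api.scala._

object PartitionerSketch {
  // Hypothetical partitioner: routes keys by their first character.
  class FirstCharPartitioner extends Partitioner[String] {
    override def partition(key: String, numPartitions: Int): Int =
      if (key.isEmpty) 0 else key.charAt(0) % numPartitions
  }

  // Associative merge of two elements with the same key.
  private val add: ((String, Int), (String, Int)) => (String, Int) =
    (l, r) => (l._1, l._2 + r._2)

  def totals(): Seq[(String, Int)] = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val counts: DataSet[(String, Int)] =
      env.fromElements(("apple", 1), ("bean", 5), ("apple", 3), ("bean", 2))

    counts
      .groupBy(0)
      .withPartitioner(new FirstCharPartitioner) // controls where each key's group is processed
      .reduce(add)
      .collect()
      .sortBy(_._1)                              // collect() order is not guaranteed, so sort locally
  }
}
```

The partitioner only changes which parallel instance handles a key; the reduced result per key is the same as without it.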
A DataSet to which a grouping key was added. Operations work on groups of elements with the same key (aggregate, reduce, and reduceGroup). A secondary sort order can be added with sortGroup, but this is only used when using one of the group-at-a-time operations, i.e. reduceGroup.