T
- the type of the aggregation resultACC
- the type of the aggregation accumulator. The accumulator is used to keep the
aggregated values which are needed to compute an aggregation result.
AggregateFunction represents its state using accumulator, thereby the state of the
AggregateFunction must be put into the accumulator.@PublicEvolving public abstract class AggregateFunction<T,ACC> extends UserDefinedFunction
The behavior of an AggregateFunction
can be defined by implementing a series of custom
methods. An AggregateFunction
needs at least three methods:
- createAccumulator
,
- accumulate
, and
- getValue
.
There are a few other methods that can be optional to have:
- retract
,
- merge
, and
- resetAccumulator
.
All these methods must be declared publicly, not static, and named exactly as the names
mentioned above. The methods createAccumulator()
and getValue(ACC)
are defined in
the AggregateFunction
functions, while other methods are explained below.
Processes the input values and update the provided accumulator instance. The method
accumulate can be overloaded with different custom types and arguments. An AggregateFunction
requires at least one accumulate() method.
param: accumulator the accumulator which contains the current aggregated results
param: [user defined inputs] the input value (usually obtained from a new arrived data).
public void accumulate(ACC accumulator, [user defined inputs])
Retracts the input values from the accumulator instance. The current design assumes the
inputs are the values that have been previously accumulated. The method retract can be
overloaded with different custom types and arguments. This function must be implemented for
data stream bounded OVER aggregates.
param: accumulator the accumulator which contains the current aggregated results
param: [user defined inputs] the input value (usually obtained from a new arrived data).
public void retract(ACC accumulator, [user defined inputs])
Merges a group of accumulator instances into one accumulator instance. This function must be
implemented for data stream session window grouping aggregates and data set grouping aggregates.
param: accumulator the accumulator which will keep the merged aggregate results. It should
be noted that the accumulator may contain the previous aggregated
results. Therefore user should not replace or clean this instance in the
custom merge method.
param: its an java.lang.Iterable pointed to a group of accumulators that will be
merged.
public void merge(ACC accumulator, java.lang.Iterable<ACC> iterable)
Resets the accumulator for this AggregateFunction. This function must be implemented for
data set grouping aggregates.
param: accumulator the accumulator which needs to be reset
public void resetAccumulator(ACC accumulator)
Constructor and Description |
---|
AggregateFunction() |
Modifier and Type | Method and Description |
---|---|
abstract ACC |
createAccumulator()
Creates and initializes the accumulator for this
AggregateFunction . |
TypeInformation<ACC> |
getAccumulatorType()
Returns the
TypeInformation of the AggregateFunction 's accumulator. |
TypeInformation<T> |
getResultType()
Returns the
TypeInformation of the AggregateFunction 's result. |
abstract T |
getValue(ACC accumulator)
Called every time when an aggregation result should be materialized.
|
boolean |
requiresOver()
Returns
true if this AggregateFunction can only be applied in an
OVER window. |
close, functionIdentifier, isDeterministic, open, toString
public abstract ACC createAccumulator()
AggregateFunction
. The accumulator
is used to keep the aggregated values which are needed to compute an aggregation result.public abstract T getValue(ACC accumulator)
accumulator
- the accumulator which contains the current
aggregated resultspublic boolean requiresOver()
true
if this AggregateFunction
can only be applied in an
OVER window.true
if the AggregateFunction
requires an OVER window,
false
otherwise.public TypeInformation<T> getResultType()
TypeInformation
of the AggregateFunction
's result.TypeInformation
of the AggregateFunction
's result or
null
if the result type should be automatically inferred.public TypeInformation<ACC> getAccumulatorType()
TypeInformation
of the AggregateFunction
's accumulator.TypeInformation
of the AggregateFunction
's accumulator or
null
if the accumulator type should be automatically inferred.Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.