public class PythonStreamGroupAggregateOperator extends AbstractOneInputPythonFunctionOperator<RowData,RowData> implements Triggerable<RowData,VoidNamespace>, CleanupState
Modifier and Type | Field and Description |
---|---|
protected static String |
FLINK_AGGREGATE_FUNCTION_SCHEMA_CODER_URN |
protected RowType |
inputType
The input logical type.
|
protected static byte |
NORMAL_RECORD |
protected RowType |
outputType
The output logical type.
|
protected static String |
STREAM_GROUP_AGGREGATE_URN |
protected TypeSerializer<RowData> |
udfInputTypeSerializer
The TypeSerializer for udf input elements.
|
protected TypeSerializer<RowData> |
udfOutputTypeSerializer
The TypeSerializer for udf execution results.
|
protected RowType |
userDefinedFunctionInputType
The user-defined function input logical type.
|
elementCount, maxBundleSize, pythonFunctionRunner
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
PythonStreamGroupAggregateOperator(Configuration config,
RowType inputType,
RowType outputType,
PythonAggregateFunctionInfo[] aggregateFunctions,
DataViewUtils.DataViewSpec[][] dataViewSpecs,
int[] grouping,
int indexOfCountStar,
boolean countStarInserted,
boolean generateUpdateBefore,
long minRetentionTime,
long maxRetentionTime) |
Modifier and Type | Method and Description |
---|---|
PythonFunctionRunner |
createPythonFunctionRunner()
Creates the
PythonFunctionRunner which is responsible for Python user-defined
function execution. |
void |
emitResult(Tuple2<byte[],Integer> resultTuple)
Sends the execution result to the downstream operator.
|
Object |
getCurrentKey() |
protected TypeSerializer |
getKeySerializer() |
protected RowType |
getKeyType() |
PythonEnv |
getPythonEnv()
Returns the
PythonEnv used to create PythonEnvironmentManager.. |
FlinkFnApi.UserDefinedAggregateFunctions |
getUserDefinedFunctionsProto()
Gets the proto representation of the Python user-defined aggregate functions to be executed.
|
void |
onEventTime(InternalTimer<RowData,VoidNamespace> timer)
Invoked when an event-time timer fires.
|
void |
onProcessingTime(InternalTimer<RowData,VoidNamespace> timer)
Invoked when a processing-time timer fires.
|
void |
open()
This method is called immediately before any elements are processed, it should contain the
operator's initialization logic, e.g.
|
void |
processElement(StreamRecord<RowData> element)
Processes one element that arrived on this input of the
MultipleInputStreamOperator . |
void |
setCurrentKey(Object key)
As the beam state gRPC service will access the KeyedStateBackend in parallel with this
operator, we must override this method to prevent changing the current key of the
KeyedStateBackend while the beam service is handling requests.
|
endInput
checkInvokeFinishBundleByCount, close, createPythonEnvironmentManager, dispose, emitResults, getConfig, getFlinkMetricContainer, getPythonConfig, invokeFinishBundle, isBundleFinished, prepareSnapshotPreBarrier, processWatermark, setPythonConfig
getChainingStrategy, getContainingTask, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
registerProcessingCleanupTimer
setKeyContextElement
close, dispose, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
processLatencyMarker, processWatermark
@VisibleForTesting protected static final String FLINK_AGGREGATE_FUNCTION_SCHEMA_CODER_URN
@VisibleForTesting protected static final String STREAM_GROUP_AGGREGATE_URN
@VisibleForTesting protected static final byte NORMAL_RECORD
protected final RowType inputType
protected final RowType outputType
protected transient RowType userDefinedFunctionInputType
protected transient TypeSerializer<RowData> udfOutputTypeSerializer
protected transient TypeSerializer<RowData> udfInputTypeSerializer
public PythonStreamGroupAggregateOperator(Configuration config, RowType inputType, RowType outputType, PythonAggregateFunctionInfo[] aggregateFunctions, DataViewUtils.DataViewSpec[][] dataViewSpecs, int[] grouping, int indexOfCountStar, boolean countStarInserted, boolean generateUpdateBefore, long minRetentionTime, long maxRetentionTime)
public void setCurrentKey(Object key)
setCurrentKey
in interface KeyContext
setCurrentKey
in class AbstractStreamOperator<RowData>
public Object getCurrentKey()
getCurrentKey
in interface KeyContext
getCurrentKey
in class AbstractStreamOperator<RowData>
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<RowData>
open
in class AbstractPythonFunctionOperator<RowData>
Exception
- An exception in this method causes the operator to fail.public PythonFunctionRunner createPythonFunctionRunner() throws Exception
AbstractPythonFunctionOperator
PythonFunctionRunner
which is responsible for Python user-defined
function execution.createPythonFunctionRunner
in class AbstractPythonFunctionOperator<RowData>
Exception
public void processElement(StreamRecord<RowData> element) throws Exception
Input
MultipleInputStreamOperator
.
This method is guaranteed to not be called concurrently with other methods of the operator.processElement
in interface Input<RowData>
Exception
public void emitResult(Tuple2<byte[],Integer> resultTuple) throws Exception
AbstractPythonFunctionOperator
emitResult
in class AbstractPythonFunctionOperator<RowData>
Exception
public void onEventTime(InternalTimer<RowData,VoidNamespace> timer)
onEventTime
in interface Triggerable<RowData,VoidNamespace>
public void onProcessingTime(InternalTimer<RowData,VoidNamespace> timer) throws Exception
onProcessingTime
in interface Triggerable<RowData,VoidNamespace>
Exception
public PythonEnv getPythonEnv()
AbstractPythonFunctionOperator
PythonEnv
used to create PythonEnvironmentManager..getPythonEnv
in class AbstractPythonFunctionOperator<RowData>
@VisibleForTesting protected TypeSerializer getKeySerializer()
protected RowType getKeyType()
public FlinkFnApi.UserDefinedAggregateFunctions getUserDefinedFunctionsProto()
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.