@Internal public class PythonKeyedProcessOperator<OUT> extends AbstractOneInputPythonFunctionOperator<Row,OUT> implements Triggerable<Row,Object>
PythonKeyedProcessOperator is responsible for launching the Beam runner, which starts a Python harness to execute the user-defined Python function. It also handles the timer and state requests from the stateful Python user-defined function.
baos, baosWrapper
pythonFunctionRunner
bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
PythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo) |
PythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, TypeSerializer namespaceSerializer) |
Modifier and Type | Method and Description |
---|---|
<T> AbstractDataStreamPythonFunctionOperator<T> | copy(DataStreamPythonFunctionInfo pythonFunctionInfo, TypeInformation<T> outputTypeInfo) |
PythonFunctionRunner | createPythonFunctionRunner() Creates the PythonFunctionRunner which is responsible for Python user-defined function execution. |
Object | getCurrentKey() |
void | onEventTime(InternalTimer<Row,Object> timer) Invoked when an event-time timer fires. |
void | onProcessingTime(InternalTimer<Row,Object> timer) Invoked when a processing-time timer fires. |
void | open() This method is called immediately before any elements are processed, it should contain the operator's initialization logic, e.g. |
void | processElement(StreamRecord<Row> element) Processes one element that arrived on this input of the MultipleInputStreamOperator. |
void | setCurrentKey(Object key) As the beam state gRPC service will access the KeyedStateBackend in parallel with this operator, we must override this method to prevent changing the current key of the KeyedStateBackend while the beam service is handling requests. |
createInputCoderInfoDescriptor, createOutputCoderInfoDescriptor, emitResult, endInput, getInputTypeInfo, processElement
containsPartitionCustom, getInternalParameters, getProducedType, getPythonEnv, getPythonFunctionInfo, setContainsPartitionCustom, setNumPartitions
close, createPythonEnvironmentManager, emitResults, invokeFinishBundle
checkInvokeFinishBundleByCount, finish, getConfiguration, getFlinkMetricContainer, isBundleFinished, prepareSnapshotPreBarrier, processWatermark
getChainingStrategy, getContainingTask, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, registerCounterOnOutput, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
setKeyContextElement
close, finish, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
processLatencyMarker, processWatermark, processWatermarkStatus
public PythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo)
public PythonKeyedProcessOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, RowTypeInfo inputTypeInfo, TypeInformation<OUT> outputTypeInfo, TypeSerializer namespaceSerializer)
public void open() throws Exception
Description copied from class: AbstractStreamOperator
The default implementation does nothing.
open in interface StreamOperator<OUT>
open in class AbstractOneInputPythonFunctionOperator<Row,OUT>
Throws: Exception - An exception in this method causes the operator to fail.
public void onEventTime(InternalTimer<Row,Object> timer) throws Exception
Description copied from interface: Triggerable
Invoked when an event-time timer fires.
onEventTime in interface Triggerable<Row,Object>
Throws: Exception
public void onProcessingTime(InternalTimer<Row,Object> timer) throws Exception
Description copied from interface: Triggerable
Invoked when a processing-time timer fires.
onProcessingTime in interface Triggerable<Row,Object>
Throws: Exception
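onEventTime and onProcessingTime are callbacks: a timer service invokes them when a registered timer fires. The sketch below illustrates that callback pattern for event-time timers only, using plain JDK classes. SketchTimer, SketchTriggerable, SketchTimerQueue, and RecordingOperator are hypothetical stand-ins invented for this example; they are not Flink's InternalTimer, Triggerable, or timer service, and the firing rule (fire every timer whose timestamp is at or before the watermark) is stated here as the assumption the sketch implements.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for a timer: a timestamp plus the key it was registered for.
final class SketchTimer {
    final long timestamp;
    final String key;

    SketchTimer(long timestamp, String key) {
        this.timestamp = timestamp;
        this.key = key;
    }
}

// Hypothetical stand-in for the callback contract an operator implements.
interface SketchTriggerable {
    void onEventTime(SketchTimer timer);

    void onProcessingTime(SketchTimer timer);
}

// Hypothetical timer queue that calls back into the operator when timers are due.
final class SketchTimerQueue {
    private final PriorityQueue<SketchTimer> eventTimers =
            new PriorityQueue<>(Comparator.comparingLong((SketchTimer t) -> t.timestamp));
    private final SketchTriggerable target;

    SketchTimerQueue(SketchTriggerable target) {
        this.target = target;
    }

    void registerEventTimeTimer(long timestamp, String key) {
        eventTimers.add(new SketchTimer(timestamp, key));
    }

    // Fires, in timestamp order, every event-time timer whose timestamp is <= watermark.
    void advanceWatermark(long watermark) {
        while (!eventTimers.isEmpty() && eventTimers.peek().timestamp <= watermark) {
            target.onEventTime(eventTimers.poll());
        }
    }
}

// Example operator that only records which timers fired.
final class RecordingOperator implements SketchTriggerable {
    final List<String> fired = new ArrayList<>();

    @Override
    public void onEventTime(SketchTimer timer) {
        fired.add("event:" + timer.key + "@" + timer.timestamp);
    }

    @Override
    public void onProcessingTime(SketchTimer timer) {
        fired.add("processing:" + timer.key + "@" + timer.timestamp);
    }
}
```

In this sketch, registering timers at 10 and 20 and then advancing the watermark to 15 fires only the first one. A processing-time queue would look the same but would be driven by the wall clock instead of watermarks.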
public PythonFunctionRunner createPythonFunctionRunner() throws Exception
Description copied from class: AbstractExternalPythonFunctionOperator
Creates the PythonFunctionRunner which is responsible for Python user-defined function execution.
createPythonFunctionRunner in class AbstractExternalPythonFunctionOperator<OUT>
Throws: Exception
public void processElement(StreamRecord<Row> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
processElement in interface Input<Row>
Throws: Exception
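The inherited members listed earlier (elementCount, maxBundleSize, checkInvokeFinishBundleByCount, invokeFinishBundle) suggest that processed elements are counted and grouped into bundles. This page does not describe that logic, so the following is only a sketch of count-based bundling under that assumption. SketchBundlingOperator and all of its methods are hypothetical and do not reproduce Flink's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical operator that groups elements into bundles of at most maxBundleSize.
final class SketchBundlingOperator {
    private final int maxBundleSize;
    private int elementCount;
    private final List<String> buffer = new ArrayList<>();

    // Completed bundles, kept only so the example can be inspected.
    final List<List<String>> finishedBundles = new ArrayList<>();

    SketchBundlingOperator(int maxBundleSize) {
        if (maxBundleSize <= 0) {
            throw new IllegalArgumentException("maxBundleSize must be positive");
        }
        this.maxBundleSize = maxBundleSize;
    }

    // Buffers one element and finishes the bundle once the count limit is reached.
    void processElement(String element) {
        buffer.add(element);
        elementCount++;
        if (elementCount >= maxBundleSize) {
            finishBundle();
        }
    }

    // Flushes whatever is buffered; does nothing when the current bundle is empty.
    void finishBundle() {
        if (elementCount == 0) {
            return;
        }
        finishedBundles.add(new ArrayList<>(buffer));
        buffer.clear();
        elementCount = 0;
    }
}
```

With a maxBundleSize of 2, the first two elements form one bundle and a third stays buffered until finishBundle() is called explicitly, for example before a checkpoint or at end of input.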
public void setCurrentKey(Object key)
setCurrentKey in interface KeyContext
setCurrentKey in class AbstractPythonFunctionOperator<OUT>
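The method summary explains that setCurrentKey is overridden so that the operator does not change the current key of the KeyedStateBackend while the beam state service is handling requests. One way to get that effect, shown here as a sketch and not as Flink's actual implementation, is to keep the operator's key in an operator-local field instead of forwarding it to the shared backend. SketchKeyedBackend, SketchBaseOperator, and SketchGuardedOperator are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical keyed backend with a single mutable "current key" shared by all callers.
final class SketchKeyedBackend {
    private Object currentKey;
    private final Map<Object, Long> counts = new HashMap<>();

    synchronized void setCurrentKey(Object key) {
        currentKey = key;
    }

    synchronized Object getCurrentKey() {
        return currentKey;
    }

    // Increments the counter stored under whatever key is current right now.
    synchronized void increment() {
        counts.merge(currentKey, 1L, Long::sum);
    }

    synchronized long count(Object key) {
        return counts.getOrDefault(key, 0L);
    }
}

// Hypothetical base operator: forwards every key change to the shared backend.
class SketchBaseOperator {
    protected final SketchKeyedBackend backend;

    SketchBaseOperator(SketchKeyedBackend backend) {
        this.backend = backend;
    }

    public void setCurrentKey(Object key) {
        backend.setCurrentKey(key);
    }

    public Object getCurrentKey() {
        return backend.getCurrentKey();
    }
}

// Hypothetical override: remembers the key locally and leaves the backend's key alone,
// so a concurrent state service keeps working against the key it selected.
final class SketchGuardedOperator extends SketchBaseOperator {
    private Object operatorKey;

    SketchGuardedOperator(SketchKeyedBackend backend) {
        super(backend);
    }

    @Override
    public void setCurrentKey(Object key) {
        operatorKey = key;
    }

    @Override
    public Object getCurrentKey() {
        return operatorKey;
    }
}
```

In this sketch, a state service that has selected "service-key" on the backend still writes under that key after the operator calls setCurrentKey("operator-key"), while getCurrentKey() on the operator returns the operator's own key.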
public Object getCurrentKey()
getCurrentKey in interface KeyContext
getCurrentKey in class AbstractStreamOperator<OUT>
public <T> AbstractDataStreamPythonFunctionOperator<T> copy(DataStreamPythonFunctionInfo pythonFunctionInfo, TypeInformation<T> outputTypeInfo)
copy in class AbstractDataStreamPythonFunctionOperator<OUT>
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.