@Internal public abstract class TwoInputPythonFunctionOperator<IN1,IN2,RUNNER_OUT,OUT> extends AbstractTwoInputPythonFunctionOperator<IN1,IN2,OUT>
TwoInputPythonFunctionOperator
is responsible for launching beam runner which will start
a python harness to execute two-input user defined python function.Modifier and Type | Field and Description |
---|---|
protected ByteArrayInputStreamWithPos |
bais |
protected DataInputViewStreamWrapper |
baisWrapper |
protected ByteArrayOutputStreamWithPos |
baos |
protected DataOutputViewStreamWrapper |
baosWrapper |
protected TimestampedCollector |
collector |
protected Row |
reuseRow |
elementCount, maxBundleSize, pythonFunctionRunner
chainingStrategy, latencyStats, LOG, metrics, output, processingTimeService
Constructor and Description |
---|
TwoInputPythonFunctionOperator(Configuration config,
DataStreamPythonFunctionInfo pythonFunctionInfo,
String coderUrn,
TypeInformation<Row> runnerInputTypeInfo,
TypeInformation<RUNNER_OUT> runnerOutputTypeInfo) |
TwoInputPythonFunctionOperator(Configuration config,
TypeInformation<IN1> inputTypeInfo1,
TypeInformation<IN2> inputTypeInfo2,
TypeInformation<OUT> outputTypeInfo,
DataStreamPythonFunctionInfo pythonFunctionInfo,
String coderUrn) |
Modifier and Type | Method and Description |
---|---|
PythonFunctionRunner |
createPythonFunctionRunner()
Creates the
PythonFunctionRunner which is responsible for Python user-defined
function execution. |
protected String |
getCoderUrn() |
protected Map<String,String> |
getJobOptions() |
PythonEnv |
getPythonEnv()
Returns the
PythonEnv used to create PythonEnvironmentManager.. |
protected DataStreamPythonFunctionInfo |
getPythonFunctionInfo() |
protected TypeInformation<Row> |
getRunnerInputTypeInfo() |
protected TypeSerializer<Row> |
getRunnerInputTypeSerializer() |
protected TypeInformation<RUNNER_OUT> |
getRunnerOutputTypeInfo() |
protected TypeSerializer<RUNNER_OUT> |
getRunnerOutputTypeSerializer() |
void |
open()
This method is called immediately before any elements are processed, it should contain the
operator's initialization logic, e.g.
|
void |
processElement1(StreamRecord<IN1> element)
Processes one element that arrived on the first input of this two-input operator.
|
void |
processElement2(StreamRecord<IN2> element)
Processes one element that arrived on the second input of this two-input operator.
|
endInput
checkInvokeFinishBundleByCount, close, createPythonEnvironmentManager, dispose, emitResult, emitResults, getConfig, getFlinkMetricContainer, getPythonConfig, invokeFinishBundle, isBundleFinished, prepareSnapshotPreBarrier, processWatermark, setCurrentKey, setPythonConfig
getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getTimeServiceManager, getUserCodeClassloader, initializeState, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, reportOrForwardLatencyMarker, setChainingStrategy, setKeyContextElement1, setKeyContextElement2, setProcessingTimeService, setup, snapshotState, snapshotState
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2
close, dispose, getMetricGroup, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
notifyCheckpointAborted, notifyCheckpointComplete
getCurrentKey, setCurrentKey
protected transient ByteArrayInputStreamWithPos bais
protected transient DataInputViewStreamWrapper baisWrapper
protected transient ByteArrayOutputStreamWithPos baos
protected transient DataOutputViewStreamWrapper baosWrapper
protected transient TimestampedCollector collector
protected transient Row reuseRow
public TwoInputPythonFunctionOperator(Configuration config, DataStreamPythonFunctionInfo pythonFunctionInfo, String coderUrn, TypeInformation<Row> runnerInputTypeInfo, TypeInformation<RUNNER_OUT> runnerOutputTypeInfo)
public TwoInputPythonFunctionOperator(Configuration config, TypeInformation<IN1> inputTypeInfo1, TypeInformation<IN2> inputTypeInfo2, TypeInformation<OUT> outputTypeInfo, DataStreamPythonFunctionInfo pythonFunctionInfo, String coderUrn)
public void open() throws Exception
AbstractStreamOperator
The default implementation does nothing.
open
in interface StreamOperator<OUT>
open
in class AbstractPythonFunctionOperator<OUT>
Exception
- An exception in this method causes the operator to fail.public PythonFunctionRunner createPythonFunctionRunner() throws Exception
AbstractPythonFunctionOperator
PythonFunctionRunner
which is responsible for Python user-defined
function execution.createPythonFunctionRunner
in class AbstractPythonFunctionOperator<OUT>
Exception
public PythonEnv getPythonEnv()
AbstractPythonFunctionOperator
PythonEnv
used to create PythonEnvironmentManager..getPythonEnv
in class AbstractPythonFunctionOperator<OUT>
public void processElement1(StreamRecord<IN1> element) throws Exception
TwoInputStreamOperator
Exception
public void processElement2(StreamRecord<IN2> element) throws Exception
TwoInputStreamOperator
Exception
protected String getCoderUrn()
protected TypeInformation<Row> getRunnerInputTypeInfo()
protected TypeInformation<RUNNER_OUT> getRunnerOutputTypeInfo()
protected DataStreamPythonFunctionInfo getPythonFunctionInfo()
protected TypeSerializer<Row> getRunnerInputTypeSerializer()
protected TypeSerializer<RUNNER_OUT> getRunnerOutputTypeSerializer()
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.