public abstract class InputGate extends Object implements AsyncDataInput<BufferOrEvent>, AutoCloseable
Each intermediate result is partitioned over its producing parallel subtasks; each of these partitions is furthermore partitioned into one or more subpartitions.
As an example, consider a map-reduce program, where the map operator produces data and the reduce operator consumes the produced data.
+-----+              +---------------------+              +--------+
| Map | = produce => | Intermediate Result | <= consume = | Reduce |
+-----+              +---------------------+              +--------+
When deploying such a program in parallel, the intermediate result will be partitioned over its producing parallel subtasks; each of these partitions is furthermore partitioned into one or more subpartitions.
                            Intermediate result
               +-----------------------------------------+
               |                      +----------------+ |              +-----------------------+
 +-------+     | +-------------+  +=> | Subpartition 1 | | <=======+=== | Input Gate | Reduce 1 |
 | Map 1 | ==> | | Partition 1 | =|   +----------------+ |         |    +-----------------------+
 +-------+     | +-------------+  +=> | Subpartition 2 | | <==+    |
               |                      +----------------+ |    |    | Subpartition request
               |                                         |    |    |
               |                      +----------------+ |    |    |
 +-------+     | +-------------+  +=> | Subpartition 1 | | <==+====+
 | Map 2 | ==> | | Partition 2 | =|   +----------------+ |    |         +-----------------------+
 +-------+     | +-------------+  +=> | Subpartition 2 | | <==+======== | Input Gate | Reduce 2 |
               |                      +----------------+ |              +-----------------------+
               +-----------------------------------------+
In the above example, two map subtasks produce the intermediate result in parallel, resulting in two partitions (Partition 1 and 2). Each of these partitions is further partitioned into two subpartitions, one for each parallel reduce subtask. As shown in the figure, each reduce task has an input gate attached to it. The gate provides the task's input, which consists of one subpartition from each partition of the intermediate result.
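The assignment in the figure can be sketched in a few lines of Java. This is only an illustration of the layout described above, assuming each partition holds exactly one subpartition per consuming subtask; the class and method names are hypothetical and not part of Flink, and indices are zero-based whereas the figure counts from 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Flink. */
final class SubpartitionAssignmentSketch {

    /**
     * Lists the {partition, subpartition} index pairs read by one consuming subtask,
     * assuming every partition has one subpartition per consuming subtask.
     */
    static List<int[]> consumedSubpartitions(int numProducers, int consumerIndex) {
        List<int[]> result = new ArrayList<>();
        for (int partition = 0; partition < numProducers; partition++) {
            // The consumer reads "its" subpartition from every partition.
            result.add(new int[] {partition, consumerIndex});
        }
        return result;
    }

    public static void main(String[] args) {
        int numProducers = 2; // Map 1 and Map 2
        int numConsumers = 2; // Reduce 1 and Reduce 2
        for (int consumer = 0; consumer < numConsumers; consumer++) {
            for (int[] pair : consumedSubpartitions(numProducers, consumer)) {
                System.out.println("Reduce " + (consumer + 1)
                        + " reads Partition " + (pair[0] + 1)
                        + ", Subpartition " + (pair[1] + 1));
            }
        }
    }
}
```

With two producers and two consumers, each consumer ends up with two entries, one per partition, which matches the two arrows arriving at each input gate in the figure.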
Modifier and Type | Class and Description |
---|---|
protected static class | InputGate.InputWithData<INPUT,DATA>: Simple POJO for INPUT, DATA and moreAvailable. |
Modifier and Type | Field and Description |
---|---|
protected CompletableFuture<?> | isAvailable |
Fields inherited from interface AvailabilityListener: AVAILABLE
Constructor and Description |
---|
InputGate() |
Modifier and Type | Method and Description |
---|---|
abstract Optional<BufferOrEvent> | getNext(): Blocking call waiting for the next BufferOrEvent. |
abstract int | getNumberOfInputChannels() |
CompletableFuture<?> | isAvailable(): Check if this instance is available for further processing. |
abstract boolean | isFinished() |
abstract Optional<BufferOrEvent> | pollNext(): Poll the BufferOrEvent. |
protected void | resetIsAvailable() |
abstract void | sendTaskEvent(TaskEvent event) |
abstract void | setup(): Set up the gate; a potentially heavyweight, blocking operation compared to mere creation. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface AutoCloseable: close
protected CompletableFuture<?> isAvailable
public abstract int getNumberOfInputChannels()
public abstract boolean isFinished()
Specified by: isFinished in interface AvailabilityListener
public abstract Optional<BufferOrEvent> getNext() throws IOException, InterruptedException
Blocking call waiting for the next BufferOrEvent.
Returns: Optional.empty() if isFinished() returns true.
Throws: IOException, InterruptedException
public abstract Optional<BufferOrEvent> pollNext() throws IOException, InterruptedException
Poll the BufferOrEvent.
Specified by: pollNext in interface AsyncDataInput<BufferOrEvent>
Returns: Optional.empty() if there is no data to return or if isFinished() returns true.
Throws: IOException, InterruptedException
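Because pollNext() returns Optional.empty() both when no data is currently available and when the gate is finished, a caller typically needs isFinished() to tell the two cases apart. The following is a minimal sketch of that pattern, not Flink code: PollableGate and QueueGate are hypothetical stand-ins that expose only the two methods discussed here and, for brevity, omit the checked exceptions declared by the real method.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Queue;

/** Hypothetical illustration only; none of these names exist in Flink. */
final class PollLoopSketch {

    /** Stand-in exposing just the two methods used by the loop below. */
    interface PollableGate<T> {
        Optional<T> pollNext();
        boolean isFinished();
    }

    /** In-memory stand-in that counts as finished once its queue is drained. */
    static final class QueueGate<T> implements PollableGate<T> {
        private final Queue<T> queue;

        QueueGate(List<T> items) {
            this.queue = new ArrayDeque<>(items);
        }

        @Override
        public Optional<T> pollNext() {
            return Optional.ofNullable(queue.poll());
        }

        @Override
        public boolean isFinished() {
            return queue.isEmpty();
        }
    }

    /** Collects everything available right now and stops at the first empty poll. */
    static <T> List<T> drainAvailable(PollableGate<T> gate) {
        List<T> out = new ArrayList<>();
        Optional<T> next;
        while ((next = gate.pollNext()).isPresent()) {
            out.add(next.get());
        }
        return out;
    }

    public static void main(String[] args) {
        QueueGate<String> gate = new QueueGate<>(List.of("buffer-1", "buffer-2"));
        System.out.println(drainAvailable(gate));
        // An empty poll alone is ambiguous; isFinished() resolves it.
        System.out.println(gate.isFinished() ? "finished" : "no data yet");
    }
}
```

A caller of the blocking getNext() would not need the inner loop, since by the description above that call waits for the next element and only returns Optional.empty() once isFinished() is true.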
public abstract void sendTaskEvent(TaskEvent event) throws IOException
Throws: IOException
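public CompletableFuture<?> isAvailable()
Description copied from interface: AvailabilityListener
Check if this instance is available for further processing.
When hot looping, to avoid the volatile access in CompletableFuture.isDone(), users of this method should perform the following check: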
AvailabilityListener input = ...;
if (input.isAvailable() == AvailabilityListener.AVAILABLE || input.isAvailable().isDone()) {
// do something;
}
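Specified by: isAvailable in interface AvailabilityListener
Returns: a future that completes once more records are available. If records are available immediately, AvailabilityListener.AVAILABLE should be returned. Previously returned futures that have not yet completed should become completed once more records are available.
protected void resetIsAvailable()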
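The availability contract of isAvailable() can be tried out in isolation with a small self-contained sketch. It is an illustration under assumptions rather than Flink's implementation: Availability is a hypothetical stand-in for AvailabilityListener that keeps only the AVAILABLE constant and isAvailable(), and ToggleInput is an invented single-threaded test double.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical illustration only; none of these names exist in Flink. */
final class AvailabilityCheckSketch {

    /** Stand-in for the AVAILABLE constant and the isAvailable() method. */
    interface Availability {
        CompletableFuture<?> AVAILABLE = CompletableFuture.completedFuture(null);

        CompletableFuture<?> isAvailable();
    }

    /** Cheap identity check first; isDone() only as the fallback. */
    static boolean isAvailableNow(Availability input) {
        CompletableFuture<?> future = input.isAvailable();
        return future == Availability.AVAILABLE || future.isDone();
    }

    /** Single-threaded test double that starts out unavailable. */
    static final class ToggleInput implements Availability {
        private CompletableFuture<?> current = new CompletableFuture<>();

        @Override
        public CompletableFuture<?> isAvailable() {
            return current;
        }

        /** Switches to AVAILABLE and completes the future handed out earlier. */
        void markAvailable() {
            CompletableFuture<?> toNotify = current;
            current = AVAILABLE;
            toNotify.complete(null);
        }
    }

    public static void main(String[] args) {
        ToggleInput input = new ToggleInput();
        CompletableFuture<?> pending = input.isAvailable();
        System.out.println(isAvailableNow(input)); // false: nothing available yet
        input.markAvailable();
        System.out.println(isAvailableNow(input)); // true: AVAILABLE is returned
        System.out.println(pending.isDone());      // true: earlier future completed
    }
}
```

The sketch stores the result of isAvailable() in a local variable so the method is called once per check; the documented snippet calls it twice, which is equivalent as long as the returned future does not change in between.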
public abstract void setup() throws IOException, InterruptedException
Throws: IOException, InterruptedException
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.