public class ActiveResourceManager<WorkerType extends ResourceIDRetrievable> extends ResourceManager<WorkerType> implements ResourceEventHandler<WorkerType>
An active implementation of ResourceManager.
This resource manager actively requests and releases resources from and to external resource management frameworks. Given different ResourceManagerDriver implementations, it can work with various frameworks.
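The division of labor described above (a generic manager that delegates framework-specific calls to a pluggable driver) can be illustrated with a minimal sketch. Every type and method name below (`SketchDriver`, `SketchActiveManager`, `InMemoryDriver`, `requestWorker`, `releaseWorker`) is hypothetical and chosen only for illustration; none of them is Flink's actual `ResourceManagerDriver` or `ActiveResourceManager` API, and the real classes carry far more state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a framework driver; NOT Flink's ResourceManagerDriver API.
interface SketchDriver<W> {
    CompletableFuture<W> requestResource(String spec);
    void releaseResource(W worker);
}

// Hypothetical manager: owns the bookkeeping, delegates framework calls to the driver.
class SketchActiveManager<W> {
    private final SketchDriver<W> driver;
    private final List<W> workers = new ArrayList<>();

    SketchActiveManager(SketchDriver<W> driver) {
        this.driver = driver;
    }

    // Actively request a resource and record the worker once the framework delivers it.
    CompletableFuture<W> requestWorker(String spec) {
        return driver.requestResource(spec).thenApply(worker -> {
            workers.add(worker);
            return worker;
        });
    }

    // Release a known worker back to the framework; returns false for unknown workers.
    boolean releaseWorker(W worker) {
        if (!workers.remove(worker)) {
            return false;
        }
        driver.releaseResource(worker);
        return true;
    }

    int workerCount() {
        return workers.size();
    }
}

// Toy driver that hands out string "workers" immediately; a real driver would
// talk to an external resource management framework instead.
class InMemoryDriver implements SketchDriver<String> {
    final List<String> released = new ArrayList<>();
    private int next = 0;

    @Override
    public CompletableFuture<String> requestResource(String spec) {
        return CompletableFuture.completedFuture("worker-" + (next++) + "[" + spec + "]");
    }

    @Override
    public void releaseResource(String worker) {
        released.add(worker);
    }
}
```

Swapping `InMemoryDriver` for another `SketchDriver` implementation leaves the manager untouched, which is the point of the driver abstraction.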
RpcEndpoint.MainThreadExecutor
Modifier and Type | Field and Description |
---|---|
protected Configuration | flinkConfig |
ioExecutor, RESOURCE_MANAGER_NAME, resourceManagerMetricGroup
log, rpcServer
Constructor and Description |
---|
ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver, Configuration flinkConfig, RpcService rpcService, ResourceID resourceId, HighAvailabilityServices highAvailabilityServices, HeartbeatServices heartbeatServices, SlotManager slotManager, ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory, JobLeaderIdService jobLeaderIdService, ClusterInformation clusterInformation, FatalErrorHandler fatalErrorHandler, ResourceManagerMetricGroup resourceManagerMetricGroup, ThresholdMeter startWorkerFailureRater, java.time.Duration retryInterval, java.time.Duration workerRegistrationTimeout, Executor ioExecutor) |
Modifier and Type | Method and Description |
---|---|
protected CompletableFuture<Void> | clearStateAsync() - This method can be overridden to add a (non-blocking) state clearing routine to the ResourceManager that will be called when leadership is revoked. |
protected void | initialize() - Initializes the framework specific components. |
protected void | internalDeregisterApplication(ApplicationStatus finalStatus, String optionalDiagnostics) - The framework specific code to deregister the application. |
void | onError(Throwable exception) - Notifies that an error has occurred that the process cannot proceed. |
void | onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers) - Notifies that workers of previous attempt have been recovered from the external resource manager. |
protected void | onWorkerRegistered(WorkerType worker) |
void | onWorkerTerminated(ResourceID resourceId, String diagnostics) - Notifies that the worker has been terminated. |
protected CompletableFuture<Void> | prepareLeadershipAsync() - This method can be overridden to add a (non-blocking) initialization routine to the ResourceManager that will be called when leadership is granted but before leadership is confirmed. |
protected void | registerMetrics() |
boolean | startNewWorker(WorkerResourceSpec workerResourceSpec) - Allocates a resource using the worker resource specification. |
boolean | stopWorker(WorkerType worker) - Stops the given worker. |
protected void | terminate() - Terminates the framework specific components. |
protected WorkerType | workerStarted(ResourceID resourceID) - Callback when a worker was started. |
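The two leadership hooks in the table, prepareLeadershipAsync() and clearStateAsync(), both return a CompletableFuture<Void> so that the surrounding logic can wait for them without blocking. The following is a minimal sketch of that ordering only: `SketchLeaderService`, `RecoveringService`, and the event strings are hypothetical names invented for this example, and the way the real ResourceManager sequences leadership confirmation is more involved than shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical base class; models only the order in which the two hooks run.
class SketchLeaderService {
    final List<String> events = Collections.synchronizedList(new ArrayList<>());

    // Default hooks complete immediately, so subclasses override only what they need.
    protected CompletableFuture<Void> prepareLeadershipAsync() {
        return CompletableFuture.completedFuture(null);
    }

    protected CompletableFuture<Void> clearStateAsync() {
        return CompletableFuture.completedFuture(null);
    }

    // Leadership is confirmed only after the preparation future has completed.
    final CompletableFuture<Void> grantLeadership() {
        events.add("granted");
        return prepareLeadershipAsync().thenRun(() -> events.add("confirmed"));
    }

    // State clearing is triggered when leadership is revoked.
    final CompletableFuture<Void> revokeLeadership() {
        events.add("revoked");
        return clearStateAsync().thenRun(() -> events.add("standby"));
    }
}

// Hypothetical subclass that plugs non-blocking work into both hooks.
class RecoveringService extends SketchLeaderService {
    @Override
    protected CompletableFuture<Void> prepareLeadershipAsync() {
        return CompletableFuture.runAsync(() -> events.add("prepared"));
    }

    @Override
    protected CompletableFuture<Void> clearStateAsync() {
        return CompletableFuture.runAsync(() -> events.add("state-cleared"));
    }
}
```

Because the hooks return futures rather than blocking, the calling thread stays free while the subclass-specific work runs elsewhere.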
cancelSlotRequest, closeJobManagerConnection, closeTaskManagerConnection, declareRequiredResources, deregisterApplication, disconnectJobManager, disconnectTaskManager, getNumberOfRegisteredTaskManagers, getNumberRequiredTaskManagers, getRequiredResources, grantLeadership, handleError, hasLeadership, heartbeatFromJobManager, heartbeatFromTaskManager, jobLeaderLostLeadership, listDataSets, notifySlotAvailable, onFatalError, onLeadership, onStart, onStop, registerJobManager, registerTaskExecutor, releaseClusterPartitions, releaseResource, removeJob, requestResourceOverview, requestSlot, requestTaskExecutorThreadInfoGateway, requestTaskManagerDetailsInfo, requestTaskManagerFileUploadByName, requestTaskManagerFileUploadByType, requestTaskManagerInfo, requestTaskManagerLogList, requestTaskManagerMetricQueryServiceAddresses, requestThreadDump, revokeLeadership, sendSlotReport, setFailUnfulfillableRequest
callAsyncWithoutFencing, getFencingToken, getMainThreadExecutor, getUnfencedMainThreadExecutor, runAsyncWithoutFencing, setFencingToken
callAsync, closeAsync, getAddress, getEndpointId, getHostname, getRpcService, getSelfGateway, getTerminationFuture, internalCallOnStart, internalCallOnStop, isRunning, runAsync, scheduleRunAsync, scheduleRunAsync, start, stop, validateRunsInMainThread
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
getFencingToken
getAddress, getHostname
getDescription
close
protected final Configuration flinkConfig
public ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver, Configuration flinkConfig, RpcService rpcService, ResourceID resourceId, HighAvailabilityServices highAvailabilityServices, HeartbeatServices heartbeatServices, SlotManager slotManager, ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory, JobLeaderIdService jobLeaderIdService, ClusterInformation clusterInformation, FatalErrorHandler fatalErrorHandler, ResourceManagerMetricGroup resourceManagerMetricGroup, ThresholdMeter startWorkerFailureRater, java.time.Duration retryInterval, java.time.Duration workerRegistrationTimeout, Executor ioExecutor)
protected void initialize() throws ResourceManagerException
Description copied from class: ResourceManager
Initializes the framework specific components.
Specified by:
initialize in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Throws:
ResourceManagerException - which occurs during initialization and causes the resource manager to fail.
protected void terminate() throws ResourceManagerException
Description copied from class: ResourceManager
Terminates the framework specific components.
Specified by:
terminate in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Throws:
ResourceManagerException
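protected CompletableFuture<Void> prepareLeadershipAsync()
Description copied from class: ResourceManager
This method can be overridden to add a (non-blocking) initialization routine to the ResourceManager that will be called when leadership is granted but before leadership is confirmed.
Overrides:
prepareLeadershipAsync in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Returns:
CompletableFuture that completes when the computation is finished.
protected CompletableFuture<Void> clearStateAsync()
Description copied from class: ResourceManager
This method can be overridden to add a (non-blocking) state clearing routine to the ResourceManager that will be called when leadership is revoked.
Overrides:
clearStateAsync in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Returns:
CompletableFuture that completes when the state clearing routine is finished.
protected void internalDeregisterApplication(ApplicationStatus finalStatus, @Nullable String optionalDiagnostics) throws ResourceManagerException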
Description copied from class: ResourceManager
The framework specific code to deregister the application. This method also needs to make sure all pending containers that are not registered yet are returned.
Specified by:
internalDeregisterApplication in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Parameters:
finalStatus - The application status to report.
optionalDiagnostics - A diagnostics message or null.
Throws:
ResourceManagerException - if the application could not be shut down.
public boolean startNewWorker(WorkerResourceSpec workerResourceSpec)
Description copied from class: ResourceManager
Allocates a resource using the worker resource specification.
Specified by:
startNewWorker in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Parameters:
workerResourceSpec - specifies the size of the resource to be allocated
protected WorkerType workerStarted(ResourceID resourceID)
Description copied from class: ResourceManager
Callback when a worker was started.
Specified by:
workerStarted in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Parameters:
resourceID - The worker resource id
public boolean stopWorker(WorkerType worker)
Description copied from class: ResourceManager
Stops the given worker.
Specified by:
stopWorker in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Parameters:
worker - The worker.
protected void onWorkerRegistered(WorkerType worker)
Overrides:
onWorkerRegistered in class ResourceManager<WorkerType extends ResourceIDRetrievable>
protected void registerMetrics()
Overrides:
registerMetrics in class ResourceManager<WorkerType extends ResourceIDRetrievable>
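The remaining methods implement the ResourceEventHandler callbacks: recovered workers from a previous attempt, worker termination, and unrecoverable errors. As a rough illustration of how a receiver might react to those three notifications, here is a minimal sketch. `SketchEventHandler` and `TrackingHandler` are hypothetical names, worker and resource identifiers are plain strings for brevity, and the bookkeeping shown is an assumption for illustration rather than what ActiveResourceManager actually does.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical mirror of the three callbacks; NOT Flink's ResourceEventHandler.
interface SketchEventHandler<W> {
    void onPreviousAttemptWorkersRecovered(Collection<W> recoveredWorkers);
    void onWorkerTerminated(String resourceId, String diagnostics);
    void onError(Throwable exception);
}

// Hypothetical receiver that tracks live workers and records what happened to them.
class TrackingHandler implements SketchEventHandler<String> {
    final Set<String> liveWorkers = new HashSet<>();
    final List<String> terminationLog = new ArrayList<>();
    Throwable fatalError;

    // Workers that survived a previous attempt are adopted as live workers.
    @Override
    public void onPreviousAttemptWorkersRecovered(Collection<String> recoveredWorkers) {
        liveWorkers.addAll(recoveredWorkers);
    }

    // Only terminations of known workers are logged; unknown ids are ignored.
    @Override
    public void onWorkerTerminated(String resourceId, String diagnostics) {
        if (liveWorkers.remove(resourceId)) {
            terminationLog.add(resourceId + ": " + diagnostics);
        }
    }

    // An error reported here means the process cannot proceed; the sketch just stores it.
    @Override
    public void onError(Throwable exception) {
        fatalError = exception;
    }
}
```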
public void onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers)
Description copied from interface: ResourceEventHandler
Notifies that workers of previous attempt have been recovered from the external resource manager.
Specified by:
onPreviousAttemptWorkersRecovered in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
Parameters:
recoveredWorkers - Collection of worker nodes, in the deployment specific type.
public void onWorkerTerminated(ResourceID resourceId, String diagnostics)
Description copied from interface: ResourceEventHandler
Notifies that the worker has been terminated.
Specified by:
onWorkerTerminated in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
Parameters:
resourceId - Identifier of the terminated worker.
diagnostics - Diagnostic message about the worker termination.
public void onError(Throwable exception)
Description copied from interface: ResourceEventHandler
Notifies that an error has occurred that the process cannot proceed.
Specified by:
onError in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
Parameters:
exception - Exception that describes the error.
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.