public abstract class ResourceManager<WorkerType extends Serializable> extends RpcEndpoint<ResourceManagerGateway> implements LeaderContender

- registerJobManager(UUID, UUID, ResourceID, String, JobID) registers a JobMaster at the resource manager
- requestSlot(UUID, UUID, SlotRequest) requests a slot from the resource manager

Modifier and Type | Field and Description |
---|---|
static String | RESOURCE_MANAGER_NAME |

Fields inherited from class RpcEndpoint: log
Constructor and Description |
---|
ResourceManager(RpcService rpcService, String resourceManagerEndpointId, ResourceID resourceId, ResourceManagerConfiguration resourceManagerConfiguration, HighAvailabilityServices highAvailabilityServices, HeartbeatServices heartbeatServices, SlotManager slotManager, MetricRegistry metricRegistry, JobLeaderIdService jobLeaderIdService, FatalErrorHandler fatalErrorHandler) |
Modifier and Type | Method and Description |
---|---|
protected void | closeJobManagerConnection(JobID jobId, Exception cause) This method should be called by the framework once it detects that a currently registered job manager has failed. |
protected void | closeTaskManagerConnection(ResourceID resourceID, Exception cause) This method should be called by the framework once it detects that a currently registered task executor has failed. |
void | disconnectJobManager(JobID jobId, Exception cause) |
void | disconnectTaskManager(ResourceID resourceId, Exception cause) |
Integer | getNumberOfRegisteredTaskManagers(UUID requestLeaderSessionId) |
void | grantLeadership(UUID newLeaderSessionID) Callback method when the current resource manager is granted leadership. |
void | handleError(Exception exception) Handles errors occurring in the leader election service. |
void | heartbeatFromJobManager(ResourceID resourceID) |
void | heartbeatFromTaskManager(ResourceID resourceID) |
protected abstract void | initialize() Initializes the framework specific components. |
protected boolean | isValid(UUID resourceManagerLeaderId) Checks whether the given resource manager leader id matches the current leader id and is not null. |
protected void | jobLeaderLostLeadership(JobID jobId, UUID oldJobLeaderId) |
void | notifySlotAvailable(UUID resourceManagerLeaderId, InstanceID instanceID, SlotID slotId, AllocationID allocationId) Notification from a TaskExecutor that a slot has become available. |
protected void | onFatalErrorAsync(Throwable t) Notifies the ResourceManager that a fatal error has occurred and it cannot proceed. |
void | registerInfoMessageListener(String address) Registers an info message listener. |
Future<RegistrationResponse> | registerJobManager(UUID resourceManagerLeaderId, UUID jobManagerLeaderId, ResourceID jobManagerResourceId, String jobManagerAddress, JobID jobId) |
Future<RegistrationResponse> | registerTaskExecutor(UUID resourceManagerLeaderId, String taskExecutorAddress, ResourceID taskExecutorResourceId, SlotReport slotReport) Registers a TaskExecutor at the resource manager. |
protected void | removeJob(JobID jobId) |
Acknowledge | requestSlot(UUID jobMasterLeaderID, UUID resourceManagerLeaderID, SlotRequest slotRequest) Requests a slot from the resource manager. |
void | revokeLeadership() Callback method when the current resource manager loses leadership. |
void | sendInfoMessage(String message) |
void | shutDown() Shuts down the underlying RPC endpoint via the RPC service. |
protected abstract void | shutDownApplication(ApplicationStatus finalStatus, String optionalDiagnostics) The framework specific code for shutting down the application. |
void | shutDownCluster(ApplicationStatus finalStatus, String optionalDiagnostics) Cleans up the application and shuts down the cluster. |
void | start() Starts the rpc endpoint. |
abstract void | startNewWorker(ResourceProfile resourceProfile) Allocates a resource using the resource profile. |
abstract void | stopWorker(InstanceID instanceId) |
void | unRegisterInfoMessageListener(String address) Unregisters an info message listener. |
protected abstract WorkerType | workerStarted(ResourceID resourceID) Callback when a worker was started. |
Methods inherited from class RpcEndpoint: callAsync, getAddress, getEndpointId, getMainThreadExecutor, getRpcService, getSelf, getSelfGatewayType, getTerminationFuture, runAsync, scheduleRunAsync, scheduleRunAsync, validateRunsInMainThread
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface LeaderContender: getAddress
public static final String RESOURCE_MANAGER_NAME
public ResourceManager(RpcService rpcService, String resourceManagerEndpointId, ResourceID resourceId, ResourceManagerConfiguration resourceManagerConfiguration, HighAvailabilityServices highAvailabilityServices, HeartbeatServices heartbeatServices, SlotManager slotManager, MetricRegistry metricRegistry, JobLeaderIdService jobLeaderIdService, FatalErrorHandler fatalErrorHandler)
public void start() throws Exception
Description copied from class: RpcEndpoint
Starts the rpc endpoint.
Overrides:
start in class RpcEndpoint<ResourceManagerGateway>
Throws:
Exception - indicating that something went wrong while starting the RPC endpoint

public void shutDown() throws Exception
Description copied from class: RpcEndpoint
Shuts down the underlying RPC endpoint via the RPC service. After this call the endpoint is no longer reachable, including through its self gateway. It also no longer accepts executions in the main thread (via RpcEndpoint.callAsync(Callable, Time) and RpcEndpoint.runAsync(Runnable)).
This method can be overridden to add RPC endpoint specific shut down code. The overriding method should always call the parent shut down method.
Overrides:
shutDown in class RpcEndpoint<ResourceManagerGateway>
Throws:
Exception - indicating that something went wrong while shutting the RPC endpoint down

public Future<RegistrationResponse> registerJobManager(UUID resourceManagerLeaderId, UUID jobManagerLeaderId, ResourceID jobManagerResourceId, String jobManagerAddress, JobID jobId)
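The shutDown() description asks overriding methods to always call the parent shut down method. A minimal sketch of that pattern follows. It does not use the Flink classes: EndpointSketch and CleanupEndpointSketch are hypothetical stand-ins, and running the endpoint-specific cleanup inside try/finally is one possible design choice, not necessarily how ResourceManager itself does it.

```java
// Hypothetical base class standing in for an RPC endpoint; NOT Flink's RpcEndpoint.
class EndpointSketch {
    private boolean running = true;

    // Parent shut down logic: here it only flips a flag.
    public void shutDown() throws Exception {
        running = false;
    }

    public boolean isRunning() {
        return running;
    }
}

// Hypothetical subclass that adds endpoint-specific shut down code.
final class CleanupEndpointSketch extends EndpointSketch {
    private boolean resourcesReleased;

    @Override
    public void shutDown() throws Exception {
        try {
            // Endpoint-specific cleanup (assumed to be cheap and local here).
            resourcesReleased = true;
        } finally {
            // Delegate to the parent even if the cleanup above throws.
            super.shutDown();
        }
    }

    boolean resourcesReleased() {
        return resourcesReleased;
    }

    public static void main(String[] args) throws Exception {
        CleanupEndpointSketch endpoint = new CleanupEndpointSketch();
        endpoint.shutDown();
        System.out.println("released=" + endpoint.resourcesReleased()
                + ", running=" + endpoint.isRunning());
    }
}
```

Putting the super call in a finally block keeps the parent shut down from being skipped when the subclass cleanup fails.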
public Future<RegistrationResponse> registerTaskExecutor(UUID resourceManagerLeaderId, String taskExecutorAddress, ResourceID taskExecutorResourceId, SlotReport slotReport)
Registers a TaskExecutor at the resource manager.
Parameters:
resourceManagerLeaderId - The fencing token for the ResourceManager leader
taskExecutorAddress - The address of the TaskExecutor that registers
taskExecutorResourceId - The resource ID of the TaskExecutor that registers

public void heartbeatFromTaskManager(ResourceID resourceID)
public void heartbeatFromJobManager(ResourceID resourceID)
public void disconnectTaskManager(ResourceID resourceId, Exception cause)
public Acknowledge requestSlot(UUID jobMasterLeaderID, UUID resourceManagerLeaderID, SlotRequest slotRequest) throws ResourceManagerException, LeaderSessionIDException
Parameters:
slotRequest - Slot request
Throws:
ResourceManagerException
LeaderSessionIDException
public void notifySlotAvailable(UUID resourceManagerLeaderId, InstanceID instanceID, SlotID slotId, AllocationID allocationId)
Parameters:
resourceManagerLeaderId - TaskExecutor's resource manager leader id
instanceID - TaskExecutor's instance id
slotId - The slot id of the available slot

public void registerInfoMessageListener(String address)
Parameters:
address - address of the info message listener to register with this resource manager

public void unRegisterInfoMessageListener(String address)
Parameters:
address - address of the info message listener to unregister from this resource manager

public void shutDownCluster(ApplicationStatus finalStatus, String optionalDiagnostics)
Parameters:
finalStatus - status of the Flink application
optionalDiagnostics - diagnostics for the Flink application

public Integer getNumberOfRegisteredTaskManagers(UUID requestLeaderSessionId) throws LeaderIdMismatchException
Throws:
LeaderIdMismatchException
protected void closeJobManagerConnection(JobID jobId, Exception cause)
Parameters:
jobId - identifying the job whose leader shall be disconnected.
cause - The exception which caused the JobManager to fail.

protected void closeTaskManagerConnection(ResourceID resourceID, Exception cause)
Parameters:
resourceID - Id of the TaskManager that has failed.
cause - The exception which caused the TaskManager to fail.

protected boolean isValid(UUID resourceManagerLeaderId)
Parameters:
resourceManagerLeaderId - the resource manager leader id to check

protected void removeJob(JobID jobId)
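The isValid(UUID) contract (the given leader id must be non-null and match the current leader id) can be illustrated with a small stand-alone sketch. LeaderIdCheckSketch and its setter are hypothetical names introduced for illustration only; the actual ResourceManager implementation may differ in detail.

```java
import java.util.UUID;

// Hypothetical stand-in illustrating the documented isValid contract; NOT the Flink class.
final class LeaderIdCheckSketch {
    // Current leader session id; null while this instance holds no leadership (assumption).
    private UUID leaderSessionId;

    void setLeaderSessionId(UUID newLeaderSessionId) {
        this.leaderSessionId = newLeaderSessionId;
    }

    // True only if the given id is non-null and equals the current leader id.
    boolean isValid(UUID resourceManagerLeaderId) {
        return resourceManagerLeaderId != null
                && resourceManagerLeaderId.equals(leaderSessionId);
    }

    public static void main(String[] args) {
        LeaderIdCheckSketch rm = new LeaderIdCheckSketch();
        UUID token = UUID.randomUUID();
        rm.setLeaderSessionId(token);
        System.out.println("matching token accepted: " + rm.isValid(token));
        System.out.println("null token accepted: " + rm.isValid(null));
    }
}
```

Callers that pass a stale or missing token are rejected, which is the role the resourceManagerLeaderId parameters play as fencing tokens in the RPC methods above.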
public void sendInfoMessage(String message)
protected void onFatalErrorAsync(Throwable t)
Parameters:
t - The exception describing the fatal error

public void grantLeadership(UUID newLeaderSessionID)
Specified by:
grantLeadership in interface LeaderContender
Parameters:
newLeaderSessionID - unique leadershipID

public void revokeLeadership()
Specified by:
revokeLeadership in interface LeaderContender
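To show how the leadership callbacks relate to the fenced RPC methods, here is a hedged sketch of a contender that records the session id on grantLeadership, clears it on revokeLeadership, and rejects fenced queries with a non-matching id. ContenderSketch and FencedContenderSketch are hypothetical stand-ins, not Flink's LeaderContender or ResourceManager. Clearing the registered state on revocation and using IllegalStateException in place of LeaderIdMismatchException are assumptions made for the example.

```java
import java.util.UUID;

// Hypothetical stand-in for the callback contract; NOT Flink's LeaderContender.
interface ContenderSketch {
    void grantLeadership(UUID newLeaderSessionID);
    void revokeLeadership();
    void handleError(Exception exception);
}

final class FencedContenderSketch implements ContenderSketch {
    private UUID leaderSessionId;        // null while not leader
    private Exception lastError;         // last error reported by the election service
    private int registeredTaskManagers;  // illustrative state only

    @Override
    public void grantLeadership(UUID newLeaderSessionID) {
        leaderSessionId = newLeaderSessionID;
    }

    @Override
    public void revokeLeadership() {
        // Assumption: state tied to the old leader session is dropped.
        leaderSessionId = null;
        registeredTaskManagers = 0;
    }

    @Override
    public void handleError(Exception exception) {
        lastError = exception;
    }

    void registerTaskManager() {
        registeredTaskManagers++;
    }

    // Fenced query: rejects callers whose id does not match the current leader session.
    int getNumberOfRegisteredTaskManagers(UUID requestLeaderSessionId) {
        if (requestLeaderSessionId == null || !requestLeaderSessionId.equals(leaderSessionId)) {
            throw new IllegalStateException("leader id mismatch");
        }
        return registeredTaskManagers;
    }

    Exception lastError() {
        return lastError;
    }

    public static void main(String[] args) {
        FencedContenderSketch rm = new FencedContenderSketch();
        UUID session = UUID.randomUUID();
        rm.grantLeadership(session);
        rm.registerTaskManager();
        System.out.println("registered: " + rm.getNumberOfRegisteredTaskManagers(session));
    }
}
```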
public void handleError(Exception exception)
Specified by:
handleError in interface LeaderContender
Parameters:
exception - Exception being thrown in the leader election service

protected abstract void initialize() throws ResourceManagerException
Throws:
ResourceManagerException - which occurs during initialization and causes the resource manager to fail.

protected abstract void shutDownApplication(ApplicationStatus finalStatus, String optionalDiagnostics)
Parameters:
finalStatus - The application status to report.
optionalDiagnostics - An optional diagnostics message.

@VisibleForTesting public abstract void startNewWorker(ResourceProfile resourceProfile)
Parameters:
resourceProfile - The resource description

public abstract void stopWorker(InstanceID instanceId)
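The abstract hooks of this class (initialize, startNewWorker, stopWorker, workerStarted) are what a framework-specific subclass fills in. The sketch below mirrors their shape with plain Java types so it can run without Flink. WorkerHooksSketch and InMemoryWorkerHooks are hypothetical, String replaces ResourceProfile, InstanceID and ResourceID, and using one id for both stopWorker and workerStarted is a simplification.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified mirror of the abstract hooks; NOT the Flink ResourceManager.
abstract class WorkerHooksSketch<W> {
    protected abstract void initialize() throws Exception;
    abstract void startNewWorker(String resourceProfile);
    abstract void stopWorker(String instanceId);
    protected abstract W workerStarted(String resourceId);
}

// In-memory implementation that only does bookkeeping.
final class InMemoryWorkerHooks extends WorkerHooksSketch<String> {
    final List<String> pendingProfiles = new ArrayList<>();
    final Map<String, String> runningWorkers = new HashMap<>();
    boolean initialized;

    @Override
    protected void initialize() {
        initialized = true;
    }

    @Override
    void startNewWorker(String resourceProfile) {
        // Record the request; a real framework would ask its cluster manager here.
        pendingProfiles.add(resourceProfile);
    }

    @Override
    void stopWorker(String instanceId) {
        runningWorkers.remove(instanceId);
    }

    @Override
    protected String workerStarted(String resourceId) {
        // Assumption: the oldest pending request is the one that was satisfied.
        String profile = pendingProfiles.isEmpty() ? "unknown" : pendingProfiles.remove(0);
        String worker = resourceId + ":" + profile;
        runningWorkers.put(resourceId, worker);
        return worker;
    }

    public static void main(String[] args) {
        InMemoryWorkerHooks hooks = new InMemoryWorkerHooks();
        hooks.initialize();
        hooks.startNewWorker("small");
        System.out.println("started: " + hooks.workerStarted("worker-1"));
        hooks.stopWorker("worker-1");
        System.out.println("running: " + hooks.runningWorkers.size());
    }
}
```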
protected abstract WorkerType workerStarted(ResourceID resourceID)
Parameters:
resourceID - The worker resource id

Copyright © 2014–2018 The Apache Software Foundation. All rights reserved.