public interface RecoverableWriter
The RecoverableWriter creates and recovers RecoverableFsDataOutputStream instances.
It can be used to write data to a file system in a way that the writing can be
resumed consistently after a failure and recovery without loss of data or possible
duplication of bytes.
The streams do not make the files they write to immediately visible; instead, they write to temp files or other temporary storage. To publish the data atomically at the end, the stream offers the RecoverableFsDataOutputStream.closeForCommit() method, which creates a committer that publishes the result.
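The stage-then-publish idea described above can be sketched on a local file system with plain java.nio. This is only an illustration under assumptions: the class StagedFileSketch and its methods stage and commit are hypothetical and not part of Flink, and actual RecoverableWriter implementations rely on whatever primitives their target storage offers.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration only; not part of Flink. */
public class StagedFileSketch {

    /** Writes the data to a hidden temp file next to the target and returns the temp path. */
    public static Path stage(Path target, byte[] data) throws IOException {
        Path temp = target.resolveSibling("." + target.getFileName() + ".inprogress");
        try (OutputStream out = Files.newOutputStream(temp)) {
            out.write(data);
        }
        return temp;
    }

    /** Publishes the staged file under its final name in a single rename. */
    public static void commit(Path temp, Path target) throws IOException {
        // ATOMIC_MOVE fails with an exception if the file system cannot rename atomically.
        Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("staged-sketch");
        Path target = dir.resolve("result.txt");

        Path temp = stage(target, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("visible before commit: " + Files.exists(target));

        commit(temp, target);
        System.out.println("visible after commit: " + Files.exists(target));
    }
}
```

Readers of the target path never observe a partially written file: the data either is absent or appears complete after the rename.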
These writers are useful in the context of checkpointing. The example below illustrates how to use them:
// --------- initial run --------
RecoverableWriter writer = fileSystem.createRecoverableWriter();
RecoverableFsDataOutputStream out = writer.open(path);
out.write(...);
// persist intermediate state
ResumeRecoverable intermediateState = out.persist();
storeInCheckpoint(intermediateState);
// --------- recovery --------
ResumeRecoverable lastCheckpointState = ...; // get state from checkpoint
RecoverableWriter writer = fileSystem.createRecoverableWriter();
RecoverableFsDataOutputStream out = writer.recover(lastCheckpointState);
out.write(...); // append more data
out.closeForCommit().commit(); // close stream and publish all the data
// --------- recovery without appending --------
ResumeRecoverable lastCheckpointState = ...; // get state from checkpoint
RecoverableWriter writer = fileSystem.createRecoverableWriter();
Committer committer = writer.recoverForCommit(lastCheckpointState);
committer.commit(); // publish the state as of the last checkpoint
Recovery relies on data persistence in the target file system or object store. While the code itself works with the specific primitives that the target storage offers, recovery will fail if the data written so far was deleted by an external factor. For example, some implementations stage data in temp files or object parts. If these were deleted by someone or by an automated cleanup policy, then resuming may fail. This is expected behavior, but it is worth pointing out explicitly.
Specific care is needed for systems like S3, where the implementation uses Multipart Uploads to incrementally upload and persist parts of the result. Timeouts for Multipart Uploads and the lifetime of Parts in unfinished Multipart Uploads need to be set high enough in the bucket policy to accommodate the recovery. These values are typically on the order of days, so regular recovery is typically not a problem. An issue can arise when a Flink application is hard killed (all processes or containers removed) and someone then tries to manually recover the application from an externalized checkpoint some days later. In that case, systems like S3 may have removed uncommitted parts and recovery will not succeed.
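The timing concern can be expressed as a simple comparison between the age of the checkpoint and the retention configured for unfinished uploads. The sketch below is an assumption-laden illustration: RetentionCheckSketch and partsLikelyRetained are hypothetical names, the retention value must be supplied by the caller, and the check is optimistic because the upload may have been initiated before the checkpoint was taken.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper; not part of Flink or of any S3 client library. */
public class RetentionCheckSketch {

    /**
     * Returns true if the checkpoint is younger than the configured retention
     * of unfinished uploads, i.e. the staged parts have likely not been cleaned up.
     */
    public static boolean partsLikelyRetained(Instant checkpointTime, Instant now, Duration retention) {
        Duration age = Duration.between(checkpointTime, now);
        return age.compareTo(retention) < 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Duration retention = Duration.ofDays(7); // assumed bucket setting

        // Checkpoint taken one hour ago: well within the retention.
        System.out.println(partsLikelyRetained(now.minus(Duration.ofHours(1)), now, retention));

        // Checkpoint taken ten days ago: the parts may already be gone.
        System.out.println(partsLikelyRetained(now.minus(Duration.ofDays(10)), now, retention));
    }
}
```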
From the perspective of the implementer, it would be desirable to make this class generic with respect to the concrete types of 'CommitRecoverable' and 'ResumeRecoverable'. However, we found that this makes the code clumsier to use, so we dropped the generics at the cost of some explicit casts in the implementation that the compiler would otherwise have generated implicitly.
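The kind of explicit cast this trade-off implies might look like the following sketch. All types here are hypothetical stand-ins (ResumeHandle plays the role of the opaque recoverable, LocalResumeHandle that of one implementation's concrete type); they are not the Flink classes.

```java
/** Hypothetical stand-ins illustrating the non-generic design; not the Flink types. */
public class CastSketch {

    /** Opaque handle, playing the role of ResumeRecoverable. */
    public interface ResumeHandle {}

    /** Concrete handle of one hypothetical implementation. */
    public static final class LocalResumeHandle implements ResumeHandle {
        public final String tempFile;
        public final long offset;

        public LocalResumeHandle(String tempFile, long offset) {
            this.tempFile = tempFile;
            this.offset = offset;
        }
    }

    /** Non-generic entry point: accepts the opaque type and casts internally. */
    public static String recover(ResumeHandle handle) {
        if (!(handle instanceof LocalResumeHandle)) {
            throw new IllegalArgumentException("Handle was not created by this writer");
        }
        // The explicit cast that a generic signature would have made unnecessary.
        LocalResumeHandle local = (LocalResumeHandle) handle;
        return local.tempFile + "@" + local.offset;
    }

    public static void main(String[] args) {
        // prints "part-0.inprogress@128"
        System.out.println(recover(new LocalResumeHandle("part-0.inprogress", 128L)));
    }
}
```

Callers work only with the opaque type, while each implementation checks and casts to its own handle class when a handle is passed back in.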
Modifier and Type | Interface and Description |
---|---|
static interface | RecoverableWriter.CommitRecoverable: A handle to an in-progress stream with a defined and persistent amount of data. |
static interface | RecoverableWriter.ResumeRecoverable: A handle to an in-progress stream with a defined and persistent amount of data. |
Modifier and Type | Method and Description |
---|---|
SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> | getCommitRecoverableSerializer(): The serializer for the CommitRecoverable types created in this writer. |
SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> | getResumeRecoverableSerializer(): The serializer for the ResumeRecoverable types created in this writer. |
RecoverableFsDataOutputStream | open(Path path): Opens a new recoverable stream to write to the given path. |
RecoverableFsDataOutputStream | recover(RecoverableWriter.ResumeRecoverable resumable): Resumes a recoverable stream consistently at the point indicated by the given ResumeRecoverable. |
RecoverableFsDataOutputStream.Committer | recoverForCommit(RecoverableWriter.CommitRecoverable resumable): Recovers a recoverable stream consistently at the point indicated by the given CommitRecoverable for finalizing and committing. |
boolean | supportsResume(): Checks whether the writer and its streams support resuming (appending to) files after recovery (via the recover(ResumeRecoverable) method). |
RecoverableFsDataOutputStream open(Path path) throws IOException
Opens a new recoverable stream to write to the given path.
path - The path of the file/object to write to.
IOException - Thrown if the stream could not be opened/initialized.
RecoverableFsDataOutputStream recover(RecoverableWriter.ResumeRecoverable resumable) throws IOException
Resumes a recoverable stream consistently at the point indicated by the given ResumeRecoverable.
This method is optional; whether it is supported is indicated through the supportsResume() method.
resumable - The opaque handle with the recovery information.
IOException - Thrown if resuming fails.
UnsupportedOperationException - Thrown if this optional method is not supported.
RecoverableFsDataOutputStream.Committer recoverForCommit(RecoverableWriter.CommitRecoverable resumable) throws IOException
Recovers a recoverable stream consistently at the point indicated by the given CommitRecoverable for finalizing and committing.
resumable - The opaque handle with the recovery information.
IOException - Thrown if recovery fails.
SimpleVersionedSerializer<RecoverableWriter.CommitRecoverable> getCommitRecoverableSerializer()
The serializer for the CommitRecoverable types created in this writer.
SimpleVersionedSerializer<RecoverableWriter.ResumeRecoverable> getResumeRecoverableSerializer()
The serializer for the ResumeRecoverable types created in this writer.
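To make the role of these serializers concrete, the following sketch round-trips a recoverable handle through a versioned byte format, as would be needed to store it in a checkpoint. Everything here is a stand-in: VersionedSerializer is assumed to mirror the shape of SimpleVersionedSerializer (a version number plus serialize and deserialize methods), and Handle, with its temp file name and offset, is a hypothetical recoverable rather than any actual implementation's state.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of serializing a recoverable handle; not Flink code. */
public class SerializerSketch {

    /** Stand-in assumed to mirror the shape of SimpleVersionedSerializer. */
    public interface VersionedSerializer<E> {
        int getVersion();
        byte[] serialize(E obj) throws IOException;
        E deserialize(int version, byte[] serialized) throws IOException;
    }

    /** Hypothetical recoverable: a temp file name plus the persisted length. */
    public static final class Handle {
        public final String tempFile;
        public final long offset;

        public Handle(String tempFile, long offset) {
            this.tempFile = tempFile;
            this.offset = offset;
        }
    }

    public static final class HandleSerializer implements VersionedSerializer<Handle> {
        @Override
        public int getVersion() {
            return 1;
        }

        @Override
        public byte[] serialize(Handle h) {
            byte[] name = h.tempFile.getBytes(StandardCharsets.UTF_8);
            // Layout: offset (8 bytes), name length (4 bytes), name bytes.
            return ByteBuffer.allocate(Long.BYTES + Integer.BYTES + name.length)
                    .putLong(h.offset)
                    .putInt(name.length)
                    .put(name)
                    .array();
        }

        @Override
        public Handle deserialize(int version, byte[] serialized) throws IOException {
            if (version != 1) {
                throw new IOException("Unknown version: " + version);
            }
            ByteBuffer buf = ByteBuffer.wrap(serialized);
            long offset = buf.getLong();
            byte[] name = new byte[buf.getInt()];
            buf.get(name);
            return new Handle(new String(name, StandardCharsets.UTF_8), offset);
        }
    }

    public static void main(String[] args) throws IOException {
        HandleSerializer serializer = new HandleSerializer();
        byte[] bytes = serializer.serialize(new Handle("part-0.inprogress", 128L));
        Handle restored = serializer.deserialize(serializer.getVersion(), bytes);
        System.out.println(restored.tempFile + "@" + restored.offset);
    }
}
```

Storing the version alongside the bytes lets a later release change the layout while still reading handles written by an older one.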
boolean supportsResume()
Checks whether the writer and its streams support resuming (appending to) files after recovery (via the recover(ResumeRecoverable) method).
If true, then this writer supports the recover(ResumeRecoverable) method. If false, then that method may not be supported and streams can only be recovered via recoverForCommit(CommitRecoverable).
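A caller that wants to work with any writer might branch on this flag before choosing a recovery path. The sketch below uses a deliberately reduced stand-in: the Writer interface, its String-based handles and return values, and the CommitOnlyWriter class are all hypothetical and only model the decision, not the real stream types.

```java
/** Hypothetical sketch of choosing a recovery path; not Flink code. */
public class ResumeChoiceSketch {

    /** Minimal stand-in for the parts of a recoverable writer used here. */
    public interface Writer {
        boolean supportsResume();
        String recover(String handle);          // optional operation
        String recoverForCommit(String handle);
    }

    /** A writer that can only finalize and commit recovered state. */
    public static final class CommitOnlyWriter implements Writer {
        @Override
        public boolean supportsResume() {
            return false;
        }

        @Override
        public String recover(String handle) {
            throw new UnsupportedOperationException("This writer cannot resume streams");
        }

        @Override
        public String recoverForCommit(String handle) {
            return "commit:" + handle;
        }
    }

    /** Resumes the stream when possible, otherwise falls back to commit-only recovery. */
    public static String restore(Writer writer, String handle) {
        if (writer.supportsResume()) {
            return writer.recover(handle);
        }
        return writer.recoverForCommit(handle);
    }

    public static void main(String[] args) {
        // prints "commit:checkpoint-7"
        System.out.println(restore(new CommitOnlyWriter(), "checkpoint-7"));
    }
}
```

Checking the flag first avoids relying on the UnsupportedOperationException as control flow.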
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.