@Public public abstract class BinaryInputFormat<T> extends FileInputFormat<T>
Modifier and Type | Class and Description |
---|---|
protected class | BinaryInputFormat.BlockBasedInput: Writes a block info at the end of the blocks. Current implementation uses only int and not long. |
Nested classes/interfaces inherited from class FileInputFormat: FileInputFormat.FileBaseStatistics, FileInputFormat.InputSplitOpenThread
Modifier and Type | Field and Description |
---|---|
static String | BLOCK_SIZE_PARAMETER_KEY: The config parameter which defines the fixed length of a record. |
static long | NATIVE_BLOCK_SIZE |
Fields inherited from class FileInputFormat: currentSplit, ENUMERATE_NESTED_FILES_FLAG, enumerateNestedFiles, filePath, INFLATER_INPUT_STREAM_FACTORIES, minSplitSize, numSplits, openTimeout, READ_WHOLE_SPLIT_FLAG, splitLength, splitStart, stream, unsplittable
Constructor and Description |
---|
BinaryInputFormat() |
Modifier and Type | Method and Description |
---|---|
void | configure(Configuration parameters): Configures the file input format by reading the file path from the configuration. |
BlockInfo | createBlockInfo() |
FileInputSplit[] | createInputSplits(int minNumSplits): Computes the input splits for the file. |
protected org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics | createStatistics(List<FileStatus> files, FileInputFormat.FileBaseStatistics stats): Fill in the statistics. |
protected abstract T | deserialize(T reuse, DataInputView dataInput) |
protected List<FileStatus> | getFiles() |
protected FileInputSplit[] | getInputSplits() |
org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics | getStatistics(BaseStatistics cachedStats): Obtains basic file statistics containing only file size. |
T | nextRecord(T record): Reads the next record from the input. |
void | open(FileInputSplit split): Opens an input stream to the file defined in the input format. |
boolean | reachedEnd(): Method used to check if the end of the input is reached. |
Methods inherited from class FileInputFormat: acceptFile, close, decorateInputStream, extractFileExtension, getFilePath, getFileStats, getInflaterInputStreamFactory, getInputSplitAssigner, getMinSplitSize, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, registerInflaterInputStreamFactory, setFilePath, setFilePath, setMinSplitSize, setNumSplits, setOpenTimeout, testForUnsplittable, toString
Methods inherited from class RichInputFormat: getRuntimeContext, setRuntimeContext
public static final String BLOCK_SIZE_PARAMETER_KEY
public static final long NATIVE_BLOCK_SIZE
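public void configure(Configuration parameters)
Description copied from class: FileInputFormat
Configures the file input format by reading the file path from the configuration.
Specified by:
configure in interface InputFormat<T,FileInputSplit>
Overrides:
configure in class FileInputFormat<T>
Parameters:
parameters - The configuration with all parameters.
See Also:
InputFormat.configure(org.apache.flink.configuration.Configuration)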
public FileInputSplit[] createInputSplits(int minNumSplits) throws IOException
Description copied from class: FileInputFormat
Computes the input splits for the file.
Specified by:
createInputSplits in interface InputFormat<T,FileInputSplit>
Specified by:
createInputSplits in interface InputSplitSource<FileInputSplit>
Overrides:
createInputSplits in class FileInputFormat<T>
Parameters:
minNumSplits - The minimum desired number of file splits.
Throws:
IOException - Thrown, when the creation of the splits was erroneous.
See Also:
InputFormat.createInputSplits(int)
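This page does not say how the splits are derived. As a rough illustration only, the sketch below assumes a block-based format that cuts each file at fixed block boundaries, so that every split starts where a block starts. `BlockSplitSketch`, `Split`, and `splitByBlocks` are hypothetical names invented for this example; they are not Flink classes, and the real method may handle `minNumSplits` and file-system block sizes differently.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitSketch {
    /** Hypothetical stand-in for FileInputSplit: a byte range within one file. */
    public static final class Split {
        public final long start;
        public final long length;

        Split(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts a file of fileLength bytes into splits of at most blockSize bytes each. */
    public static List<Split> splitByBlocks(long fileLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long pos = 0; pos < fileLength; pos += blockSize) {
            // The last split may be shorter than a full block.
            splits.add(new Split(pos, Math.min(blockSize, fileLength - pos)));
        }
        return splits;
    }

    public static void main(String[] args) {
        for (Split s : splitByBlocks(2500, 1024)) {
            System.out.println(s.start + " +" + s.length);
        }
    }
}
```

Under this assumption a 2500-byte file with 1024-byte blocks yields three splits, the last one covering the remaining 452 bytes.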
protected List<FileStatus> getFiles() throws IOException
Throws:
IOException
public org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics getStatistics(BaseStatistics cachedStats)
Description copied from class: FileInputFormat
Obtains basic file statistics containing only file size.
Specified by:
getStatistics in interface InputFormat<T,FileInputSplit>
Overrides:
getStatistics in class FileInputFormat<T>
Parameters:
cachedStats - The statistics that were cached. May be null.
See Also:
InputFormat.getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics)
protected FileInputSplit[] getInputSplits() throws IOException
Throws:
IOException
public BlockInfo createBlockInfo()
protected org.apache.flink.api.common.io.BinaryInputFormat.SequentialStatistics createStatistics(List<FileStatus> files, FileInputFormat.FileBaseStatistics stats) throws IOException
Fill in the statistics.
Parameters:
files - The files that are associated with this block input format.
stats - The pre-filled statistics.
Throws:
IOException
public void open(FileInputSplit split) throws IOException
Description copied from class: FileInputFormat
Opens an input stream to the file defined in the input format. The stream is actually opened in an asynchronous thread to make sure any interruptions to the thread working on the input format do not reach the file system.
Specified by:
open in interface InputFormat<T,FileInputSplit>
Overrides:
open in class FileInputFormat<T>
Parameters:
split - The split to be opened.
Throws:
IOException - Thrown, if the split could not be opened due to an I/O problem.
public boolean reachedEnd() throws IOException
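Description copied from interface: InputFormat
Method used to check if the end of the input is reached. When this method is called, the input format is guaranteed to be opened.
Throws:
IOException - Thrown, if an I/O error occurred.
public T nextRecord(T record) throws IOException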
Description copied from interface: InputFormat
Reads the next record from the input. When this method is called, the input format is guaranteed to be opened.
Parameters:
record - Object that may be reused.
Throws:
IOException - Thrown, if an I/O error occurred.
protected abstract T deserialize(T reuse, DataInputView dataInput) throws IOException
Throws:
IOException
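The reuse-object contract of nextRecord(T record) and deserialize(T reuse, DataInputView dataInput) can be illustrated with a small stand-alone sketch. It is not Flink code: `java.io.DataInput` stands in for DataInputView, a fixed record count stands in for reachedEnd(), and `DeserializeSketch`, `Point`, `readAll`, and `sample` are hypothetical names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class DeserializeSketch {
    /** Hypothetical mutable record type. */
    public static final class Point {
        public int x;
        public int y;
    }

    /** Same shape as deserialize(T reuse, ...): fill the reuse object and return it. */
    static Point deserialize(Point reuse, DataInput in) throws IOException {
        reuse.x = in.readInt();
        reuse.y = in.readInt();
        return reuse;
    }

    /** Reads count records with a single reused Point and returns them as "x,y" strings. */
    public static List<String> readAll(byte[] data, int count) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        List<String> out = new ArrayList<>();
        Point reuse = new Point();
        try {
            for (int read = 0; read < count; read++) { // stands in for !reachedEnd()
                Point p = deserialize(reuse, in);      // stands in for nextRecord(reuse)
                // Copy the values out: the same object is overwritten on the next call.
                out.add(p.x + "," + p.y);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    /** Builds two sample records, (1,2) and (3,4), as raw bytes. */
    public static byte[] sample() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);
            out.writeInt(2);
            out.writeInt(3);
            out.writeInt(4);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(readAll(sample(), 2)); // prints [1,2, 3,4]
    }
}
```

Because the record object may be reused, a caller that needs to keep a record past the next call should copy its contents, as the loop does here.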
Copyright © 2014–2017 The Apache Software Foundation. All rights reserved.