public class JDBCInputFormat extends RichInputFormat<Row,InputSplit> implements ResultTypeQueryable<Row>
TypeInformation>[] fieldTypes = new TypeInformation>[] {
BasicTypeInfo.INT_TYPE_INFO,
BasicTypeInfo.STRING_TYPE_INFO,
BasicTypeInfo.STRING_TYPE_INFO,
BasicTypeInfo.DOUBLE_TYPE_INFO,
BasicTypeInfo.INT_TYPE_INFO
};
RowTypeInfo rowTypeInfo = new RowTypeInfo(fieldTypes);
JDBCInputFormat jdbcInputFormat = JDBCInputFormat.buildJDBCInputFormat()
.setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
.setDBUrl("jdbc:derby:memory:ebookshop")
.setQuery("select * from books")
.setRowTypeInfo(rowTypeInfo)
.finish();
In order to query the JDBC source in parallel, you need to provide a
parameterized query template (i.e. a valid PreparedStatement
) and
a ParameterValuesProvider
which provides binding values for the
query parameters. E.g.:
Serializable[][] queryParameters = new String[2][1];
queryParameters[0] = new String[]{"Kumar"};
queryParameters[1] = new String[]{"Tan Ah Teck"};
JDBCInputFormat jdbcInputFormat = JDBCInputFormat.buildJDBCInputFormat()
.setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
.setDBUrl("jdbc:derby:memory:ebookshop")
.setQuery("select * from books WHERE author = ?")
.setRowTypeInfo(rowTypeInfo)
.setParametersProvider(new GenericParameterValuesProvider(queryParameters))
.finish();
Row
,
ParameterValuesProvider
,
PreparedStatement
,
DriverManager
,
Serialized FormModifier and Type | Class and Description |
---|---|
static class |
JDBCInputFormat.JDBCInputFormatBuilder |
Constructor and Description |
---|
JDBCInputFormat() |
Modifier and Type | Method and Description |
---|---|
static JDBCInputFormat.JDBCInputFormatBuilder |
buildJDBCInputFormat()
A builder used to set parameters to the output format's configuration in a fluent way.
|
void |
close()
Closes all resources used.
|
void |
closeInputFormat()
Closes this InputFormat instance.
|
void |
configure(Configuration parameters)
Configures this input format.
|
InputSplit[] |
createInputSplits(int minNumSplits)
Creates the different splits of the input that can be processed in parallel.
|
InputSplitAssigner |
getInputSplitAssigner(InputSplit[] inputSplits)
Gets the type of the input splits that are processed by this input format.
|
RowTypeInfo |
getProducedType()
Gets the data type (as a
TypeInformation ) produced by this function or input format. |
BaseStatistics |
getStatistics(BaseStatistics cachedStatistics)
Gets the basic statistics from the input described by this format.
|
Row |
nextRecord(Row row)
Stores the next resultSet row in a tuple
|
void |
open(InputSplit inputSplit)
Connects to the source database and executes the query in a parallel
fashion if
this
InputFormat is built using a parameterized query (i.e. |
void |
openInputFormat()
Opens this InputFormat instance.
|
boolean |
reachedEnd()
Checks whether all data has been read.
|
getRuntimeContext, setRuntimeContext
public RowTypeInfo getProducedType()
ResultTypeQueryable
TypeInformation
) produced by this function or input format.getProducedType
in interface ResultTypeQueryable<Row>
public void configure(Configuration parameters)
InputFormat
This method is always called first on a newly instantiated input format.
configure
in interface InputFormat<Row,InputSplit>
parameters
- The configuration with all parameters (note: not the Flink config but the TaskConfig).public void openInputFormat()
RichInputFormat
openInputFormat
in class RichInputFormat<Row,InputSplit>
InputFormat
public void closeInputFormat()
RichInputFormat
RichInputFormat.openInputFormat()
should be closed in this method.closeInputFormat
in class RichInputFormat<Row,InputSplit>
InputFormat
public void open(InputSplit inputSplit) throws IOException
InputFormat
is built using a parameterized query (i.e. using
a PreparedStatement
)
and a proper ParameterValuesProvider
, in a non-parallel
fashion otherwise.open
in interface InputFormat<Row,InputSplit>
inputSplit
- which is ignored if this InputFormat is executed as a
non-parallel source,
a "hook" to the query parameters otherwise (using its
splitNumber)IOException
- if there's an error during the execution of the querypublic void close() throws IOException
close
in interface InputFormat<Row,InputSplit>
IOException
- Indicates that a resource could not be closed.public boolean reachedEnd() throws IOException
reachedEnd
in interface InputFormat<Row,InputSplit>
IOException
public Row nextRecord(Row row) throws IOException
nextRecord
in interface InputFormat<Row,InputSplit>
row
- row to be reused.Row
IOException
public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
getStatistics
in interface InputFormat<Row,InputSplit>
cachedStatistics
- The statistics that were cached. May be null.IOException
public InputSplit[] createInputSplits(int minNumSplits) throws IOException
InputFormat
When this method is called, the input format it guaranteed to be configured.
createInputSplits
in interface InputFormat<Row,InputSplit>
createInputSplits
in interface InputSplitSource<InputSplit>
minNumSplits
- The minimum desired number of splits. If fewer are created, some parallel
instances may remain idle.IOException
- Thrown, when the creation of the splits was erroneous.public InputSplitAssigner getInputSplitAssigner(InputSplit[] inputSplits)
InputFormat
getInputSplitAssigner
in interface InputFormat<Row,InputSplit>
getInputSplitAssigner
in interface InputSplitSource<InputSplit>
public static JDBCInputFormat.JDBCInputFormatBuilder buildJDBCInputFormat()
Copyright © 2014–2018 The Apache Software Foundation. All rights reserved.