Users can use their existing Hive User Defined Functions in Flink.
Supported UDF types include UDF, GenericUDF, GenericUDTF, UDAF, and GenericUDAFResolver2.
Upon query planning and execution, Hive’s UDF and GenericUDF are automatically translated into Flink’s ScalarFunction,
Hive’s GenericUDTF is automatically translated into Flink’s TableFunction,
and Hive’s UDAF and GenericUDAFResolver2 are translated into Flink’s AggregateFunction.
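The translation described above amounts to a simple lookup from Hive function type to Flink function type. The following is a minimal, self-contained sketch in plain Java that only restates that mapping. The class `HiveToFlinkFunctionMapping` and its method `flinkTypeFor` are hypothetical names introduced for illustration; they are not part of Flink or Hive, where the translation happens internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: Flink performs this translation internally.
final class HiveToFlinkFunctionMapping {

    // Hive function type -> Flink function type, as listed in the text above.
    private static final Map<String, String> MAPPING = new LinkedHashMap<>();

    static {
        MAPPING.put("UDF", "ScalarFunction");
        MAPPING.put("GenericUDF", "ScalarFunction");
        MAPPING.put("GenericUDTF", "TableFunction");
        MAPPING.put("UDAF", "AggregateFunction");
        MAPPING.put("GenericUDAFResolver2", "AggregateFunction");
    }

    // Returns the Flink function type for a Hive function type,
    // or null if the Hive type is not one of the supported types.
    static String flinkTypeFor(String hiveType) {
        return MAPPING.get(hiveType);
    }

    public static void main(String[] args) {
        // Print the mapping in the order it was declared.
        for (Map.Entry<String, String> entry : MAPPING.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

A type that is not in the supported list yields `null` in this sketch; how Flink itself reacts to an unsupported Hive function type is not covered here.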
To use a Hive User Defined Function, users have to:
set a HiveCatalog backed by the Hive Metastore that contains the function as the current catalog of the session
include a jar that contains the function in Flink’s classpath
use the Blink planner.
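For the classpath requirement in particular, a quick probe can help confirm that the class implementing the function is visible before a query is run. The following is a minimal sketch in plain Java, not a Flink or Hive API: the helper `UdfClasspathCheck` and the class name `com.example.hive.udf.MyUpper` are hypothetical placeholders. The result is only meaningful if the probe runs with the same classpath as the Flink process.

```java
// Hypothetical helper for illustration; not part of Flink or Hive.
final class UdfClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    // The class is loaded without being initialized, so its static initializers do not run.
    // Returns false if the class is missing or cannot be linked.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, UdfClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "com.example.hive.udf.MyUpper" is a placeholder; pass the real class name as an argument.
        String udfClass = args.length > 0 ? args[0] : "com.example.hive.udf.MyUpper";
        System.out.println(udfClass + " on classpath: " + isOnClasspath(udfClass));
    }
}
```

This checks only that the class can be found; it does not verify that the function is registered in the Hive Metastore or that the HiveCatalog is set as the current catalog.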
Using Hive User Defined Functions
Assuming we have the following Hive functions registered in Hive Metastore:
From Hive CLI, we can see they are registered:
Then, users can use them in SQL as:
Hive built-in functions are currently not supported out of the box in Flink. To use Hive built-in functions, users must first register them manually in Hive Metastore.
Support for Hive functions has only been tested for Flink batch jobs with the Blink planner.
Hive functions currently cannot be used across catalogs in Flink.
Please refer to Hive for data type limitations.
Use Hive Built-in Functions via HiveModule
The HiveModule provides Hive built-in functions as Flink system (built-in) functions to Flink SQL and Table API users.
NOTE that some Hive built-in functions in older versions have thread safety issues.
We recommend users patch their own Hive to fix them.