If third-party Java dependencies are used, you can specify the dependencies with the following Python Table APIs or through command line arguments directly when submitting the job.
If third-party Python dependencies are used, you can specify the dependencies with the following Python Table APIs or through command line arguments directly when submitting the job.
APIs | Description |
---|---|
add_python_file(file_path) | Adds Python file dependencies, which can be Python files, Python packages or local directories. They will be added to the PYTHONPATH of the Python UDF worker. |
set_python_requirements(requirements_file_path, requirements_cache_dir=None) | Specifies a requirements.txt file which defines the third-party dependencies. These dependencies will be installed to a temporary directory and added to the PYTHONPATH of the Python UDF worker. For dependencies that cannot be accessed in the cluster, a directory containing their installation packages can be specified using the parameter "requirements_cache_dir". It will be uploaded to the cluster to support offline installation. Please make sure the installation packages match the platform of the cluster and the Python version used. These packages are installed using pip, so also make sure that pip (version >= 7.1.0) and setuptools (version >= 37.0.0) are available. |
add_python_archive(archive_path, target_dir=None) | Adds a Python archive file dependency. The file will be extracted to the working directory of the Python UDF worker. If the parameter "target_dir" is specified, the archive file will be extracted to a directory named "target_dir". Otherwise, the archive file will be extracted to a directory with the same name as the archive file. Please make sure the uploaded Python environment matches the platform that the cluster is running on. Currently only zip-format archives are supported, e.g. zip, jar, whl, egg, etc. |
set_python_executable(python_exec) | Sets the path of the Python interpreter used to execute the Python UDF workers, e.g., "/usr/local/bin/python3". Please note that if the Python interpreter comes from the uploaded Python archive, the path specified in set_python_executable should be a relative path. Please make sure that the specified environment matches the platform that the cluster is running on. |
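
The four APIs in the table can be combined as in the following minimal sketch. The helper name `configure_python_dependencies`, its parameters, the `py_env` target directory and the `bin/python` layout inside the archive are hypothetical and only serve as illustration. The sketch also assumes that the first three methods are called on the table environment and that `set_python_executable` is called on the object returned by `get_config()`; please check this against the PyFlink version you use.

```python
def configure_python_dependencies(t_env, udf_module, requirements_file,
                                  env_archive, cache_dir=None,
                                  target_dir="py_env",
                                  interpreter_in_archive="bin/python"):
    """Register Python dependencies using the APIs listed in the table.

    `t_env` is any object exposing those APIs (assumed here to be a
    PyFlink table environment). All paths are caller-supplied placeholders.
    """
    # Ship a local file, package or directory; it is added to the
    # PYTHONPATH of the Python UDF worker.
    t_env.add_python_file(udf_module)

    # Declare third-party packages via requirements.txt. The optional
    # cache directory holds installation packages for offline installs.
    t_env.set_python_requirements(requirements_file, cache_dir)

    # Upload a zip-format archive; it is extracted into `target_dir`
    # inside the working directory of the Python UDF worker.
    t_env.add_python_archive(env_archive, target_dir)

    # The interpreter comes from the uploaded archive, so the path must be
    # relative. Assumption: the archive contains `bin/python` at its root.
    python_exec = target_dir + "/" + interpreter_in_archive
    # Assumption: set_python_executable is exposed by the table config.
    t_env.get_config().set_python_executable(python_exec)
    return python_exec
```

With a real table environment `t_env`, a call such as `configure_python_dependencies(t_env, "my_udfs.py", "requirements.txt", "venv.zip", cache_dir="cached_dir")` would register all four kinds of dependencies before the job is submitted. The file names in that call are placeholders as well.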