Debugging #

This page describes how to debug in PyFlink.

Logging Information #

Python UDFs can log contextual and debug information via the standard Python logging module.

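from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(input_types=[DataTypes.BIGINT(), DataTypes.BIGINT()], result_type=DataTypes.BIGINT())
def add(i, j):
    import logging
    logging.info("debug")  # log from inside the UDF
    return i + j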
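As a minimal sketch of logging contextual information rather than a fixed string, the function below uses a named logger and includes the argument values in each message. It is a plain-Python stand-in for a UDF body so that it can be run without Flink; the logger name `my_udfs` is a hypothetical choice, and whether DEBUG records actually appear in the logs depends on the configured log level.

```python
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("my_udfs")


def add(i, j):
    # In a PyFlink job this function would carry the @udf decorator.
    # Lazy %-style arguments avoid formatting the message when the level is disabled.
    logger.info("add called with i=%s, j=%s", i, j)
    result = i + j
    logger.debug("add returned %s", result)
    return result
```

Including the inputs and the result in the messages makes it easier to match a log line to the row that produced it.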

Accessing Logs #

If the environment variable FLINK_HOME is set, logs are written to the log directory under FLINK_HOME. Otherwise, they are placed in the log directory of the installed PyFlink module. You can run the following command to find the log directory of the PyFlink module:

$ python -c "import pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__file__))+'/log')"
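The lookup rule described above can be sketched as a small helper. This is an illustration of the rule as stated on this page, not code taken from PyFlink; the function name `pyflink_log_dir` is hypothetical, and it returns `None` when FLINK_HOME is unset and PyFlink cannot be located.

```python
import importlib.util
import os


def pyflink_log_dir(environ=os.environ):
    """Return the expected log directory, or None if it cannot be determined."""
    flink_home = environ.get("FLINK_HOME")
    if flink_home:
        # FLINK_HOME is set: logs go to its "log" subdirectory.
        return os.path.join(flink_home, "log")
    # Otherwise fall back to the "log" directory next to the pyflink package.
    spec = importlib.util.find_spec("pyflink")
    if spec is None or spec.origin is None:
        return None  # PyFlink is not installed in this interpreter
    return os.path.join(os.path.dirname(os.path.abspath(spec.origin)), "log")


if __name__ == "__main__":
    print(pyflink_log_dir())
```

Passing the environment as a parameter keeps the helper easy to check with different FLINK_HOME values without modifying the real environment.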

Debugging Python UDFs #

You can use the pydevd_pycharm tool of PyCharm to debug Python UDFs.

  1. Create a Python Remote Debug configuration in PyCharm

    Run -> Python Remote Debug -> + -> choose a port (e.g. 6789)

  2. Install the pydevd-pycharm tool

    $ pip install pydevd-pycharm

  3. Add the following code to your Python UDF, using the port chosen in step 1

    import pydevd_pycharm
    pydevd_pycharm.settrace('localhost', port=6789, stdoutToServer=True, stderrToServer=True)

  4. Start the previously created Python Remote Debug server

  5. Run your Python code
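Because the `settrace` call tries to connect to the debug server every time the UDF runs, it can be convenient to guard it so that it is only active while debugging. The following is a minimal sketch of one way to do that; the helper name `maybe_attach_debugger` and the environment variable `PYFLINK_UDF_DEBUG` are hypothetical and not part of PyFlink or PyCharm, and the variable is assumed to be visible to the process that executes the UDF.

```python
import os


def maybe_attach_debugger(environ=os.environ):
    """Connect to the PyCharm debug server only when explicitly enabled."""
    # PYFLINK_UDF_DEBUG is a made-up switch used for this illustration.
    if environ.get("PYFLINK_UDF_DEBUG") != "1":
        return False
    # Imported lazily so pydevd-pycharm is only required while debugging.
    import pydevd_pycharm
    pydevd_pycharm.settrace('localhost', port=6789,
                            stdoutToServer=True, stderrToServer=True)
    return True
```

Calling `maybe_attach_debugger()` at the top of the UDF body then behaves like step 3 when the switch is set to `1`, and does nothing otherwise.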