Debugging

This page describes how to debug in PyFlink.

Logging Information

Python UDFs can log contextual and debug information via the standard Python logging module.

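from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(input_types=[DataTypes.BIGINT(), DataTypes.BIGINT()], result_type=DataTypes.BIGINT())
def add(i, j):
    import logging
    # Emit a message through the standard logging module each time the UDF is called.
    logging.info("debug")
    return i + j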
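A message such as "debug" is hard to trace back to a specific call. The sketch below shows one way to make log output more useful, using only the standard logging module: a named logger, input values in the message, and a higher level for unusual cases. The logger name my_udfs and the function name add_with_logging are hypothetical, and the function is shown as a plain Python function so it can be tried locally; in a job it would be wrapped with @udf as in the example above. Whether INFO-level messages are emitted depends on the logging configuration in effect.

```python
import logging

# Hypothetical logger name, chosen for illustration; any name works.
logger = logging.getLogger("my_udfs")


def add_with_logging(i, j):
    """Plain-Python body of an `add` UDF that logs its inputs."""
    if i is None or j is None:
        # WARNING stands out from routine INFO output when scanning a log file.
        logger.warning("add received a null input: i=%r, j=%r", i, j)
        return None
    # %-style arguments are only formatted if the record is actually emitted.
    logger.info("add called with i=%r, j=%r", i, j)
    return i + j
```

Calling add_with_logging(1, 2) locally returns 3, which allows the function logic to be checked before it is registered as a UDF.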

Accessing Logs

If the environment variable FLINK_HOME is set, logs are written to the log directory under FLINK_HOME. Otherwise, logs are written to the log directory inside the installed PyFlink module. You can run the following command to find the log directory of the PyFlink module:

$ python -c "import pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__file__))+'/log')"
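The lookup rule described above can also be sketched as a small helper that checks FLINK_HOME first and falls back to the PyFlink module directory. This is an illustrative sketch, not part of PyFlink: the function name pyflink_log_dir and its environ parameter are hypothetical, and it locates the module with importlib rather than importing pyflink.

```python
import importlib.util
import os


def pyflink_log_dir(environ=None):
    """Return the expected log directory, or None if it cannot be determined.

    `environ` is a hypothetical parameter (defaults to os.environ) that makes
    the function easy to try with different settings.
    """
    environ = os.environ if environ is None else environ
    flink_home = environ.get("FLINK_HOME")
    if flink_home:
        # FLINK_HOME is set: logs go to its log subdirectory.
        return os.path.join(flink_home, "log")
    # Otherwise fall back to the log directory inside the PyFlink module.
    spec = importlib.util.find_spec("pyflink")
    if spec is None or spec.origin is None:
        return None  # PyFlink is not installed in this interpreter.
    return os.path.join(os.path.dirname(os.path.abspath(spec.origin)), "log")


if __name__ == "__main__":
    print(pyflink_log_dir())
```

Run it with the same Python interpreter that executes the PyFlink job, since the fallback path depends on where that interpreter finds the pyflink package.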

Debugging Python UDFs

You can use PyCharm's pydevd_pycharm package to debug Python UDFs remotely.

  1. Create a Python Remote Debug configuration in PyCharm

    run -> Python Remote Debug -> + -> choose a port (e.g. 6789)

  2. Install the pydevd-pycharm tool

     $ pip install pydevd-pycharm
    
  3. Add the following code to your Python UDF

     import pydevd_pycharm
     pydevd_pycharm.settrace('localhost', port=6789, stdoutToServer=True, stderrToServer=True)
    
  4. Start the previously created Python Remote Debug server

  5. Run your Python code
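Step 3 can be sketched as follows, with the settrace call wrapped in a guard so that the UDF only tries to reach the debug server when debugging is explicitly requested. The helper name maybe_attach_debugger and the environment variable PYFLINK_REMOTE_DEBUG are assumptions made for this illustration and are not part of PyFlink or PyCharm; the settrace arguments are the ones shown in step 3. The function is shown without the @udf decorator so that it can be run on its own.

```python
import os


def maybe_attach_debugger(host="localhost", port=6789):
    """Connect to the PyCharm debug server only when explicitly enabled.

    PYFLINK_REMOTE_DEBUG is a hypothetical variable used for this sketch.
    Returns True if a connection attempt was made, False otherwise.
    """
    if os.environ.get("PYFLINK_REMOTE_DEBUG") != "1":
        return False
    # Imported lazily so that pydevd-pycharm is only required when debugging.
    import pydevd_pycharm
    pydevd_pycharm.settrace(host, port=port, stdoutToServer=True, stderrToServer=True)
    return True


def add(i, j):
    # The debugger attaches here; breakpoints set below this line can then be hit.
    maybe_attach_debugger()
    return i + j
```

The guard avoids a failed connection attempt on every call when no debug server is listening. The variable must be visible to the Python process that actually executes the UDF, which may not be the process that submits the job.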