This documentation is for an unreleased version of Apache Flink Stateful Functions. We recommend you use the latest stable version.

Module Configuration #

An application’s module configuration contains all necessary runtime information to configure the Stateful Functions runtime for a specific application. It includes the endpoints where functions can be reached along with ingress and egress definitions.

version: "3.0"

module:
  meta: 
    type: remote
  spec:
    endpoints:
      meta:
        kind: http
      spec:
        functions: com.example/*
        urlPathTempalte: https://bar.foo.com/{function.name}
    ingresses:
      - ingress:
        meta:
          type: io.statefun.kafka/ingress
          id: com.example/my-ingress
        spec:
          address: kafka-broker:9092
          consumerGroupId: my-consumer-group
          startupPosition:
            type: earliest
          topics:
            - topic: message-topic
              valueType: io.statefun.types/string
              targets:
                - com.example/greeter
    egresses:
      - egress:
        meta:
          type: io.statefun.kafka/egress
          id: example/output-messages
        spec:
          address: kafka-broker:9092
          deliverySemantic:
            type: exactly-once
            transactionTimeout: 15min

Endpoint Definition #

module.spec.endpoints declares a list of endpoint objects containing all metadata the Stateful Function runtime needs to invoke a function. The following is an example of an endpoint definition for a single function.

endpoints:
  - endpoint:
    meta: 
      kind: http
    spec:
      functions: com.example/*
      urlPathTemplate: https://bar.foo.com/{function.name}

In this example, an endpoint for a function within the logical namespace com.example is declared. The runtime will invoke all functions under this namespace with the endpoint URL template.

URL Template #

The URL template name may contain template parameters that are filled in based on the function’s specific type. For example, a message sent to message type com.example/greeter will be sent to http://bar.foo.com/greeter.

endpoints:
  - endpoint:
    meta: 
      kind: http
    spec:
      functions: com.example/*
      urlPathTemplate: https://bar.foo.com/{function.name}

Templating parameterization works well with load balancers and service gateways. Suppose http://bar.foo.com was an NGINX server, you can use the different paths to physical systems. Users may now deploy some functions on Kubernetes, others AWS Lambda, while others still on physical servers.

Example Using NGINX #

http {
    server {
        listen 80;
        server_name proxy;

        location /greeter {
            proxy_pass http://greeter:8000/service;
        }

        location /emailsender {
            proxy_pass http://emailsender:8000/service;
        }
    }
}

events {}

This pattern makes the function’s physical deployment transparent to the Stateful Functions runtime. They can be deployed, upgraded, rolled back and scaled transparently from the cluster.

Full Options #

Protocol #

The RPC protocol used to communicate by the runtime to communicate with the function. http is currently the only supported value.

endpoint:
  meta:
    kind: http
  spec:

Typename #

The meta typename is the logical type name used to match a function invocation with a physical endpoint. Typenames are composed of a namespace (required) and name (optional). Endpoints containing a namespace but no name will match all function types in that namespace.

It is recommended to have endpoints only specified against a namespace to enable dynamic function registration.

endpoint:
  meta:
    kind: http
  spec:
    functions: com.example/*

Url Path Template #

The urlPathTemplate is the physical path to be resolved when an endpoint is matched. It may contain templated parameters for dynamic routing.

endpoint:
  meta:
  spec:
    urlPathTemplate: http://bar.foo.com/{function.name}

Supported schemes:

http
https

Transport via UNIX domain sockets is supported by using the schemes http+unix or https+unix. When using UNIX domain sockets, the endpoint format is: http+unix://<socket-file-path>/<serve-url-path>. For example, http+unix:///uds.sock/path/of/url. For example, http+unix:///uds.sock/path/of/url.

Max Batch Requests #

The maximum number of records that can be processed by a function for a particular address (typename + id) before invoking backpressure on the system. The default value is 1000.

endpoint:
  meta:
  spec:
    maxNumBatchRequests: 1000

Timeouts #

Timeouts is an optional object containing various timeouts for a request before a function is considered unreachable.

Call

The timeout for a complete function call. This configuration spans the entire call: resolving DNS, connecting, writing the request body, server processing, and reading the response body. If the call requires redirects or retries all must complete within one timeout period.

endpoint: 
  meta:
  spec:
    timeout:
      call: 1 min

Connect

The default connect timeout for new connections. The connect timeout is applied when connecting a TCP socket to the target host.

endpoint: 
  meta:
  spec:
    timeout:
      connect: 10 sec

Read

The default read timeout for new connections. The read timeout is applied to both the TCP socket and for individual read IO operations.

endpoint: 
  meta:
  spec:
    timeout:
      read: 10 sec

Write

The default write timeout for new connections. The write timeout is applied for individual write IO operations.

endpoint: 
  meta:
  spec:
    timeout:
      write: 10 sec

Ingress #

An ingress is an input point where data is consumed from an external system and forwarded to zero or more functions. It is defined via an identifier and specification.

An ingress identifier, similar to a function type, uniquely identifies an ingress by specifying its input type, a namespace, and a name.

The spec defines the details of how to connect to the external system, which is specific to each individual I/O module. Each identifier-spec pair is bound to the system inside an stateful function module.

See [IO Modules]//nightlies.apache.org/flink/flink-statefun-docs-master/docs/io-module/overview/ for more information on configuring an ingress.

version: "3.0"

module:
  meta:
    type: remote
  spec:
    ingresses:
      - ingress:
          meta:
            id: example/my-ingress
            type: # ingress type
          spec: # ingress specific configurations

Egress #

An egress is an output point where data is written to an external system and forwarded to zero or more functions. It is defined via an identifier and specification.

An egress identifier, similar to a function type, uniquely identifies an egress.

The spec defines the details of how to connect to the external system, which is specific to each individual I/O module. Each identifier-spec pair is bound to the system inside a stateful function module.

See [IO Modules]//nightlies.apache.org/flink/flink-statefun-docs-master/docs/io-module/overview/ for more information on configuring an egress.

version: "3.0"

module:
  meta:
    type: remote
  spec:
    egresses:
      - egress:
        meta:
          id: example/my-egress
          type: # egress type
        spec: # egress specific configurations