Modules

Stateful Function applications are composed of one or more Modules. A module is a bundle of functions that are loaded by the runtime and available to be messaged. Functions from all loaded modules are multiplexed and free to message each other arbitrarily.

Stateful Functions supports two types of modules: Embedded and Remote.

Embedded Module

Embedded modules are co-located with, and embedded within, the Apache Flink® runtime.

This module type only supports JVM based languages and are defined by implementing the StatefulFunctionModule interface. Embedded modules offer a single configuration method where stateful functions are bound to the system based on their function type. Runtime configurations are available through the globalConfiguration, which is the union of all configurations in the applications flink-conf.yaml under the prefix statefun.module.global-config and any command line arguments passed in the form --key value.

package org.apache.flink.statefun.docs;

import java.util.Map;
import org.apache.flink.statefun.sdk.spi.StatefulFunctionModule;

public class BasicFunctionModule implements StatefulFunctionModule {

	public void configure(Map<String, String> globalConfiguration, Binder binder) {

		// Declare the user function and bind it to its type
		binder.bindFunctionProvider(FnWithDependency.TYPE, new CustomProvider());

		// Stateful functions that do not require any configuration
		// can declare their provider using java 8 lambda syntax
		binder.bindFunctionProvider(Identifiers.HELLO_TYPE, unused -> new FnHelloWorld());
	}
}

Embedded modules leverage Java’s Service Provider Interfaces (SPI) for discovery. This means that every JAR should contain a file org.apache.flink.statefun.sdk.spi.StatefulFunctionModule in the META_INF/services resource directory that lists all available modules that it provides.

org.apache.flink.statefun.docs.BasicFunctionModule

Remote Module

Remote modules are run as external processes from the Apache Flink® runtime; in the same container, as a sidecar, or other external location.

This module type can support any number of language SDK’s. Remote modules are registered with the system via YAML configuration files.

Specification

A remote module configuration consists of a meta section and a spec section. meta contains auxillary information about the module. The spec describes the functions contained within the module and defines their persisted values.

Defining Functions

module.spec.functions declares a list of function objects that are implemented by the remote module. A function is described via a number of properties.

  • function.meta.kind
    • The protocol used to communicate with the remote function.
    • Supported Values - http
  • function.meta.type
    • The function type, defined as <namespace>/<name>.
  • function.spec.endpoint
    • The endpoint at which the function is reachable.
    • Supported schemes are: http, https.
    • Transport via UNIX domain sockets is supported by using the schemes http+unix or https+unix.
    • When using UNIX domain sockets, the endpoint format is: http+unix://<socket-file-path>/<serve-url-path>. For example, http+unix:///uds.sock/path/of/url.
  • function.spec.states
    • A list of the persisted values declared within the remote function.
    • Each entry consists of a name property and an optional expireAfter property.
    • Default for expireAfter - 0, meaning that state expiration is disabled.
  • function.spec.maxNumBatchRequests
    • The maximum number of records that can be processed by a function for a particular address before invoking backpressure on the system.
    • Default - 1000
  • function.spec.timeout
    • The maximum amount of time for the runtime to wait for the remote function to return before failing.
    • Default - 1 min

Full Example

version: "2.0"

module:
  meta:
    type: remote
  spec:
    functions:
      - function:
        meta:
          kind: http
          type: example/greeter
        spec:
          endpoint: http://<host-name>/statefun
          states:
            - name: seen_count
              expireAfter: 5min
          maxNumBatchRequests: 500
          timeout: 2min