Graph Algorithms

This documentation is for an out-of-date version of Apache Flink. We recommend you use the latest stable version.

The logic blocks with which the Graph API and top-level algorithms are assembled are accessible in Gelly as graph algorithms in the org.apache.flink.graph.asm package. These algorithms provide optimization and tuning through configuration parameters and may provide implicit runtime reuse when processing the same input with a similar configuration.

Algorithm	Description
degree.annotate.directed. VertexInDegree	Annotate vertices of a directed graph with the in-degree. `DataSet<Vertex<K, LongValue>> inDegree = graph .run(new VertexInDegree() .setIncludeZeroDegreeVertices(true));` Optional configuration: setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with an in-degree of zero setParallelism: override the operator parallelism
degree.annotate.directed. VertexOutDegree	Annotate vertices of a directed graph with the out-degree. `DataSet<Vertex<K, LongValue>> outDegree = graph .run(new VertexOutDegree() .setIncludeZeroDegreeVertices(true));` Optional configuration: setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with an out-degree of zero setParallelism: override the operator parallelism
degree.annotate.directed. VertexDegrees	Annotate vertices of a directed graph with the degree, out-degree, and in-degree. `DataSet<Vertex<K, Tuple2<LongValue, LongValue>>> degrees = graph .run(new VertexDegrees() .setIncludeZeroDegreeVertices(true));` Optional configuration: setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with out- and in-degree of zero setParallelism: override the operator parallelism
degree.annotate.directed. EdgeSourceDegrees	Annotate edges of a directed graph with the degree, out-degree, and in-degree of the source ID. `DataSet<Edge<K, Tuple2<EV, Degrees>>> sourceDegrees = graph .run(new EdgeSourceDegrees());` Optional configuration: setParallelism: override the operator parallelism
degree.annotate.directed. EdgeTargetDegrees	Annotate edges of a directed graph with the degree, out-degree, and in-degree of the target ID. `DataSet<Edge<K, Tuple2<EV, Degrees>>> targetDegrees = graph .run(new EdgeTargetDegrees();` Optional configuration: setParallelism: override the operator parallelism
degree.annotate.directed. EdgeDegreesPair	Annotate edges of a directed graph with the degree, out-degree, and in-degree of both the source and target vertices. `DataSet<Edge<K, Tuple2<EV, Degrees>>> degrees = graph .run(new EdgeDegreesPair());` Optional configuration: setParallelism: override the operator parallelism
degree.annotate.undirected. VertexDegree	Annotate vertices of an undirected graph with the degree. `DataSet<Vertex<K, LongValue>> degree = graph .run(new VertexDegree() .setIncludeZeroDegreeVertices(true) .setReduceOnTargetId(true));` Optional configuration: setIncludeZeroDegreeVertices: by default only the edge set is processed for the computation of degree; when this flag is set an additional join is performed against the vertex set in order to output vertices with a degree of zero setParallelism: override the operator parallelism setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
degree.annotate.undirected. EdgeSourceDegree	Annotate edges of an undirected graph with degree of the source ID. `DataSet<Edge<K, Tuple2<EV, LongValue>>> sourceDegree = graph .run(new EdgeSourceDegree() .setReduceOnTargetId(true));` Optional configuration: setParallelism: override the operator parallelism setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
degree.annotate.undirected. EdgeTargetDegree	Annotate edges of an undirected graph with degree of the target ID. `DataSet<Edge<K, Tuple2<EV, LongValue>>> targetDegree = graph .run(new EdgeTargetDegree() .setReduceOnSourceId(true));` Optional configuration: setParallelism: override the operator parallelism setReduceOnSourceId: the degree can be counted from either the edge source or target IDs. By default the target IDs are counted. Reducing on source IDs may optimize the algorithm if the input edge list is sorted by source ID.
degree.annotate.undirected. EdgeDegreePair	Annotate edges of an undirected graph with the degree of both the source and target vertices. `DataSet<Edge<K, Tuple3<EV, LongValue, LongValue>>> pairDegree = graph .run(new EdgeDegreePair() .setReduceOnTargetId(true));` Optional configuration: setParallelism: override the operator parallelism setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
degree.filter.undirected. MaximumDegree	Filter an undirected graph by maximum degree. `Graph<K, VV, EV> filteredGraph = graph .run(new MaximumDegree(5000) .setBroadcastHighDegreeVertices(true) .setReduceOnTargetId(true));` Optional configuration: setBroadcastHighDegreeVertices: join high-degree vertices using a broadcast-hash to reduce data shuffling when removing a relatively small number of high-degree vertices. setParallelism: override the operator parallelism setReduceOnTargetId: the degree can be counted from either the edge source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize the algorithm if the input edge list is sorted by target ID.
simple.directed. Simplify	Remove self-loops and duplicate edges from a directed graph. `graph.run(new Simplify());` Optional configuration: setParallelism: override the operator parallelism
simple.undirected. Simplify	Add symmetric edges and remove self-loops and duplicate edges from an undirected graph. `graph.run(new Simplify());` Optional configuration: setParallelism: override the operator parallelism
translate. TranslateGraphIds	Translate vertex and edge IDs using the given `TranslateFunction`. `graph.run(new TranslateGraphIds(new LongValueToStringValue()));` Required configuration: translator: implements type or value conversion Optional configuration: setParallelism: override the operator parallelism
translate. TranslateVertexValues	Translate vertex values using the given `TranslateFunction`. `graph.run(new TranslateVertexValues(new LongValueAddOffset(vertexCount)));` Required configuration: translator: implements type or value conversion Optional configuration: setParallelism: override the operator parallelism
translate. TranslateEdgeValues	Translate edge values using the given `TranslateFunction`. `graph.run(new TranslateEdgeValues(new Nullify()));` Required configuration: translator: implements type or value conversion Optional configuration: setParallelism: override the operator parallelism