Class

org.apache.flink.api.scala

GroupedDataSet

Related Doc: package scala

Permalink

class GroupedDataSet[T] extends AnyRef

A DataSet to which a grouping key was added. Operations work on groups of elements with the same key (aggregate, reduce, and reduceGroup).

A secondary sort order can be added with sortGroup, but this is only used when using one of the group-at-a-time operations, i.e. reduceGroup.

Annotations
@Public()
Linear Supertypes
AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. GroupedDataSet
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Instance Constructors

  1. new GroupedDataSet(set: DataSet[T], keys: Keys[T])(implicit arg0: ClassTag[T])

    Permalink

Value Members

  1. final def !=(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Permalink
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  4. def aggregate(agg: Aggregations, field: Int): AggregateDataSet[T]

    Permalink

    Creates a new DataSet by aggregating the specified field using the given aggregation function.

    Creates a new DataSet by aggregating the specified field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of elements with the same key.

    This only works on CaseClass DataSets.

  5. def aggregate(agg: Aggregations, field: String): AggregateDataSet[T]

    Permalink

    Creates a new DataSet by aggregating the specified tuple field using the given aggregation function.

    Creates a new DataSet by aggregating the specified tuple field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of tuples with the same key.

    This only works on Tuple DataSets.

  6. final def asInstanceOf[T0]: T0

    Permalink
    Definition Classes
    Any
  7. def clone(): AnyRef

    Permalink
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def combineGroup[R](combiner: GroupCombineFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

    Permalink

    Applies a CombineFunction on a grouped DataSet.

    Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the node for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other side, can have an arbitrary output type.

  9. def combineGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

    Permalink

    Applies a CombineFunction on a grouped DataSet.

    Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the node for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other side, can have an arbitrary output type.

  10. final def eq(arg0: AnyRef): Boolean

    Permalink
    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean

    Permalink
    Definition Classes
    AnyRef → Any
  12. def finalize(): Unit

    Permalink
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. def first(n: Int): DataSet[T]

    Permalink

    Creates a new DataSet containing the first n elements of each group of this DataSet.

  14. final def getClass(): Class[_]

    Permalink
    Definition Classes
    AnyRef → Any
  15. def getCustomPartitioner[K](): Partitioner[K]

    Permalink

    Gets the custom partitioner to be used for this grouping, or null, if none was defined.

    Gets the custom partitioner to be used for this grouping, or null, if none was defined.

    Annotations
    @Internal()
  16. def hashCode(): Int

    Permalink
    Definition Classes
    AnyRef → Any
  17. final def isInstanceOf[T0]: Boolean

    Permalink
    Definition Classes
    Any
  18. def max(field: String): AggregateDataSet[T]

    Permalink

    Syntactic sugar for aggregate with MAX

  19. def max(field: Int): AggregateDataSet[T]

    Permalink

    Syntactic sugar for aggregate with MAX

  20. def maxBy(fields: Int*): DataSet[T]

    Permalink

    Applies a special case of a reduce transformation maxBy on a grouped DataSet The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation.

    Applies a special case of a reduce transformation maxBy on a grouped DataSet The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.

  21. def min(field: String): AggregateDataSet[T]

    Permalink

    Syntactic sugar for aggregate with MIN

  22. def min(field: Int): AggregateDataSet[T]

    Permalink

    Syntactic sugar for aggregate with MIN

  23. def minBy(fields: Int*): DataSet[T]

    Permalink

    Applies a special case of a reduce transformation minBy on a grouped DataSet.

    Applies a special case of a reduce transformation minBy on a grouped DataSet. The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.

  24. final def ne(arg0: AnyRef): Boolean

    Permalink
    Definition Classes
    AnyRef
  25. final def notify(): Unit

    Permalink
    Definition Classes
    AnyRef
  26. final def notifyAll(): Unit

    Permalink
    Definition Classes
    AnyRef
  27. def reduce(reducer: ReduceFunction[T], strategy: CombineHint): DataSet[T]

    Permalink

    Special reduce operation for explicitly telling the system what strategy to use for the combine phase.

    Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.

    Annotations
    @PublicEvolving()
  28. def reduce(reducer: ReduceFunction[T]): DataSet[T]

    Permalink

    Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.

  29. def reduce(fun: (T, T) ⇒ T, strategy: CombineHint): DataSet[T]

    Permalink

    Special reduce operation for explicitly telling the system what strategy to use for the combine phase.

    Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.

    Annotations
    @PublicEvolving()
  30. def reduce(fun: (T, T) ⇒ T): DataSet[T]

    Permalink

    Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.

  31. def reduceGroup[R](reducer: GroupReduceFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

    Permalink

    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the GroupReduceFunction.

    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the GroupReduceFunction. The function can output zero or more elements. The concatenation of the emitted values will form the resulting DataSet.

  32. def reduceGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

    Permalink

    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function.

    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function. The function can output zero or more elements using the Collector. The concatenation of the emitted values will form the resulting DataSet.

  33. def reduceGroup[R](fun: (Iterator[T]) ⇒ R)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

    Permalink

    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function.

    Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function. The function must output one element. The concatenation of those will form the resulting DataSet.

  34. def sortGroup[K](fun: (T) ⇒ K, order: Order)(implicit arg0: TypeInformation[K]): GroupedDataSet[T]

    Permalink

    Adds a secondary sort key to this GroupedDataSet.

    Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e. reduceGroup.

    This works on any data type.

  35. def sortGroup(field: String, order: Order): GroupedDataSet[T]

    Permalink

    Adds a secondary sort key to this GroupedDataSet.

    Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e. reduceGroup.

    This only works on CaseClass DataSets.

  36. def sortGroup(field: Int, order: Order): GroupedDataSet[T]

    Permalink

    Adds a secondary sort key to this GroupedDataSet.

    Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e. reduceGroup.

    This only works on Tuple DataSets.

  37. def sum(field: String): AggregateDataSet[T]

    Permalink

    Syntactic sugar for aggregate with SUM

  38. def sum(field: Int): AggregateDataSet[T]

    Permalink

    Syntactic sugar for aggregate with SUM

  39. final def synchronized[T0](arg0: ⇒ T0): T0

    Permalink
    Definition Classes
    AnyRef
  40. def toString(): String

    Permalink
    Definition Classes
    AnyRef → Any
  41. final def wait(): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long, arg1: Int): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  43. final def wait(arg0: Long): Unit

    Permalink
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  44. def withPartitioner[K](partitioner: Partitioner[K])(implicit arg0: TypeInformation[K]): GroupedDataSet[T]

    Permalink

    Sets a custom partitioner for the grouping.

Inherited from AnyRef

Inherited from Any

Ungrouped