GroupedDataSet

Instance Constructors

new GroupedDataSet(set: DataSet[T], keys: Keys[T])(implicit arg0: ClassTag[T])

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def aggregate(agg: Aggregations, field: Int): AggregateDataSet[T]

Creates a new DataSet by aggregating the specified field using the given aggregation function.
Creates a new DataSet by aggregating the specified field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of elements with the same key.
This only works on CaseClass DataSets.
def aggregate(agg: Aggregations, field: String): AggregateDataSet[T]

Creates a new DataSet by aggregating the specified tuple field using the given aggregation function.
Creates a new DataSet by aggregating the specified tuple field using the given aggregation function. Since this is a keyed DataSet the aggregation will be performed on groups of tuples with the same key.
This only works on Tuple DataSets.
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def combineGroup[R](combiner: GroupCombineFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

Applies a CombineFunction on a grouped DataSet.
Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the node for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other side, can have an arbitrary output type.
def combineGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

Applies a CombineFunction on a grouped DataSet.
Applies a CombineFunction on a grouped DataSet. A CombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the CombineFunction calls the combine method once per partition for combining a group of results. This operator is suitable for combining values into an intermediate format before doing a proper groupReduce where the data is shuffled across the node for further reduction. The GroupReduce operator can also be supplied with a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function demands input and output type to be the same. The CombineFunction, on the other side, can have an arbitrary output type.
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def first(n: Int): DataSet[T]

Creates a new DataSet containing the first n elements of each group of this DataSet.
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def getCustomPartitioner[K](): Partitioner[K]

Gets the custom partitioner to be used for this grouping, or null, if none was defined.
Gets the custom partitioner to be used for this grouping, or null, if none was defined.

Annotations
@Internal()
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def max(field: String): AggregateDataSet[T]

Syntactic sugar for aggregate with MAX
def max(field: Int): AggregateDataSet[T]

Syntactic sugar for aggregate with MAX
def maxBy(fields: Int*): DataSet[T]

Applies a special case of a reduce transformation maxBy on a grouped DataSet The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation.
Applies a special case of a reduce transformation maxBy on a grouped DataSet The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.
def min(field: String): AggregateDataSet[T]

Syntactic sugar for aggregate with MIN
def min(field: Int): AggregateDataSet[T]

Syntactic sugar for aggregate with MIN
def minBy(fields: Int*): DataSet[T]

Applies a special case of a reduce transformation minBy on a grouped DataSet.
Applies a special case of a reduce transformation minBy on a grouped DataSet. The transformation consecutively calls a ReduceFunction until only a single element remains which is the result of the transformation. A ReduceFunction combines two elements into one new element of the same type.
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def reduce(reducer: ReduceFunction[T], strategy: CombineHint): DataSet[T]

Special reduce operation for explicitly telling the system what strategy to use for the combine phase.
Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.

Annotations
@PublicEvolving()
def reduce(reducer: ReduceFunction[T]): DataSet[T]

Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
def reduce(fun: (T, T) ⇒ T, strategy: CombineHint): DataSet[T]

Special reduce operation for explicitly telling the system what strategy to use for the combine phase.
Special reduce operation for explicitly telling the system what strategy to use for the combine phase. If null is given as the strategy, then the optimizer will pick the strategy.

Annotations
@PublicEvolving()
def reduce(fun: (T, T) ⇒ T): DataSet[T]

Creates a new DataSet by merging the elements of each group (elements with the same key) using an associative reduce function.
def reduceGroup[R](reducer: GroupReduceFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the GroupReduceFunction.
Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the GroupReduceFunction. The function can output zero or more elements. The concatenation of the emitted values will form the resulting DataSet.
def reduceGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function.
Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function. The function can output zero or more elements using the Collector. The concatenation of the emitted values will form the resulting DataSet.
def reduceGroup[R](fun: (Iterator[T]) ⇒ R)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function.
Creates a new DataSet by passing for each group (elements with the same key) the list of elements to the group reduce function. The function must output one element. The concatenation of those will form the resulting DataSet.
def sortGroup[K](fun: (T) ⇒ K, order: Order)(implicit arg0: TypeInformation[K]): GroupedDataSet[T]

Adds a secondary sort key to this GroupedDataSet.
Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e. reduceGroup.
This works on any data type.
def sortGroup(field: String, order: Order): GroupedDataSet[T]

Adds a secondary sort key to this GroupedDataSet.
Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e. reduceGroup.
This only works on CaseClass DataSets.
def sortGroup(field: Int, order: Order): GroupedDataSet[T]

Adds a secondary sort key to this GroupedDataSet.
Adds a secondary sort key to this GroupedDataSet. This will only have an effect if you use one of the group-at-a-time, i.e. reduceGroup.
This only works on Tuple DataSets.
def sum(field: String): AggregateDataSet[T]

Syntactic sugar for aggregate with SUM
def sum(field: Int): AggregateDataSet[T]

Syntactic sugar for aggregate with SUM
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
def withPartitioner[K](partitioner: Partitioner[K])(implicit arg0: TypeInformation[K]): GroupedDataSet[T]

Sets a custom partitioner for the grouping.

Related Doc: package scala

class GroupedDataSet[T] extends AnyRef

Instance Constructors

new GroupedDataSet(set: DataSet[T], keys: Keys[T])(implicit arg0: ClassTag[T])

Value Members

final def !=(arg0: Any): Boolean

final def ##(): Int

final def ==(arg0: Any): Boolean

def aggregate(agg: Aggregations, field: Int): AggregateDataSet[T]

def aggregate(agg: Aggregations, field: String): AggregateDataSet[T]

final def asInstanceOf[T0]: T0

def clone(): AnyRef

def combineGroup[R](combiner: GroupCombineFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

def combineGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

final def eq(arg0: AnyRef): Boolean

def equals(arg0: Any): Boolean

def finalize(): Unit

def first(n: Int): DataSet[T]

final def getClass(): Class[_]

def getCustomPartitioner[K](): Partitioner[K]

def hashCode(): Int

final def isInstanceOf[T0]: Boolean

def max(field: String): AggregateDataSet[T]

def max(field: Int): AggregateDataSet[T]

def maxBy(fields: Int*): DataSet[T]

def min(field: String): AggregateDataSet[T]

def min(field: Int): AggregateDataSet[T]

def minBy(fields: Int*): DataSet[T]

final def ne(arg0: AnyRef): Boolean

final def notify(): Unit

final def notifyAll(): Unit

def reduce(reducer: ReduceFunction[T], strategy: CombineHint): DataSet[T]

def reduce(reducer: ReduceFunction[T]): DataSet[T]

def reduce(fun: (T, T) ⇒ T, strategy: CombineHint): DataSet[T]

def reduce(fun: (T, T) ⇒ T): DataSet[T]

def reduceGroup[R](reducer: GroupReduceFunction[T, R])(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

def reduceGroup[R](fun: (Iterator[T], Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

def reduceGroup[R](fun: (Iterator[T]) ⇒ R)(implicit arg0: TypeInformation[R], arg1: ClassTag[R]): DataSet[R]

def sortGroup[K](fun: (T) ⇒ K, order: Order)(implicit arg0: TypeInformation[K]): GroupedDataSet[T]

def sortGroup(field: String, order: Order): GroupedDataSet[T]

def sortGroup(field: Int, order: Order): GroupedDataSet[T]

def sum(field: String): AggregateDataSet[T]

def sum(field: Int): AggregateDataSet[T]

final def synchronized[T0](arg0: ⇒ T0): T0

def toString(): String

final def wait(): Unit

final def wait(arg0: Long, arg1: Int): Unit

final def wait(arg0: Long): Unit

def withPartitioner[K](partitioner: Partitioner[K])(implicit arg0: TypeInformation[K]): GroupedDataSet[T]

Inherited from AnyRef

Inherited from Any

Ungrouped