This documentation is for an out-of-date version of Apache Flink. We recommend you use the latest stable version.

MinMax Scaler

Description

The MinMax scaler scales the given data set so that all values lie within a user-specified range [min, max]. If the user does not provide minimum and maximum values for the scaling range, the MinMax scaler transforms the features of the input data set to lie in the [0, 1] interval. Given a set of input data $x_1, x_2, \ldots, x_n$, with minimum value:

$$x_{\min} = \min(x_1, x_2, \ldots, x_n)$$

and maximum value:

$$x_{\max} = \max(x_1, x_2, \ldots, x_n)$$

The scaled data set $z_1, z_2, \ldots, z_n$ will be:

$$z_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \left(\text{max} - \text{min}\right) + \text{min}$$

where min and max are the user-specified minimum and maximum values of the target range.
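
The scaling formula can be sketched in plain Scala, independent of Flink. This is only an illustration: `MinMaxSketch` and `scale` are hypothetical names, not part of Flink ML, and the handling of constant input (where the observed maximum equals the observed minimum) is an assumption, since the text above does not cover that case.

```scala
// Hypothetical helper, not part of Flink ML: min-max scaling of a plain Scala collection.
object MinMaxSketch {

  // Scales xs into [min, max] using z = (x - xMin) / (xMax - xMin) * (max - min) + min.
  def scale(xs: Seq[Double], min: Double = 0.0, max: Double = 1.0): Seq[Double] = {
    require(xs.nonEmpty, "input must not be empty")
    val xMin = xs.min
    val xMax = xs.max
    val range = xMax - xMin
    // Assumption: constant input (range == 0) is mapped to the lower bound
    // to avoid a division by zero; the formula itself is undefined here.
    if (range == 0.0) xs.map(_ => min)
    else xs.map(x => (x - xMin) / range * (max - min) + min)
  }

  def main(args: Array[String]): Unit = {
    // Scale three values into [-1, 1].
    println(scale(Seq(2.0, 4.0, 10.0), min = -1.0, max = 1.0))
    // prints List(-1.0, -0.5, 1.0)
  }
}
```

With the inputs 2, 4 and 10, the observed range is 8, so the value 4 maps to (4 - 2) / 8 * 2 - 1 = -0.5, while the smallest and largest inputs land exactly on the bounds of the target range.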

Operations

MinMaxScaler is a Transformer. As such, it supports the fit and transform operations.

Fit

MinMaxScaler is trained on all subtypes of Vector or LabeledVector:

  • fit[T <: Vector]: DataSet[T] => Unit
  • fit: DataSet[LabeledVector] => Unit

Transform

MinMaxScaler transforms all subtypes of Vector or LabeledVector into the respective type:

  • transform[T <: Vector]: DataSet[T] => DataSet[T]
  • transform: DataSet[LabeledVector] => DataSet[LabeledVector]
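
The LabeledVector variants of fit and transform listed above might be used as in the following sketch. It assumes the Flink ML classes `LabeledVector`, `DenseVector` and `MinMaxScaler` under the `org.apache.flink.ml` packages of this Flink version; the object name, method name and sample values are made up for illustration.

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector
import org.apache.flink.ml.preprocessing.MinMaxScaler

// Hypothetical example object; package names assume the Flink ML library of this version.
object LabeledScalingSketch {

  // Fits a scaler on a small labeled data set and returns the scaled elements.
  def scaleLabeled(): Seq[LabeledVector] = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Illustrative data: a label plus two features per element.
    val training: DataSet[LabeledVector] = env.fromElements(
      LabeledVector(1.0, DenseVector(2.0, 30.0)),
      LabeledVector(2.0, DenseVector(4.0, 10.0)),
      LabeledVector(3.0, DenseVector(10.0, 20.0))
    )

    // No range set, so the default [0, 1] interval applies.
    val scaler = MinMaxScaler()

    // fit: DataSet[LabeledVector] => Unit
    scaler.fit(training)

    // transform: DataSet[LabeledVector] => DataSet[LabeledVector]
    scaler.transform(training).collect()
  }

  def main(args: Array[String]): Unit =
    scaleLabeled().foreach(println)
}
```

The result type stays DataSet[LabeledVector], matching the transform signature listed above.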

Parameters

The MinMax scaler implementation can be controlled by the following two parameters:

Parameter   Description
Min         The minimum value of the range for the scaled data set. (Default value: 0.0)
Max         The maximum value of the range for the scaled data set. (Default value: 1.0)

Examples

import org.apache.flink.api.scala._
import org.apache.flink.ml.math.Vector
import org.apache.flink.ml.preprocessing.MinMaxScaler

// Create MinMax scaler transformer; Max keeps its default value of 1.0
val minMaxscaler = MinMaxScaler()
  .setMin(-1.0)

// Obtain data set to be scaled
val dataSet: DataSet[Vector] = ...

// Learn the minimum and maximum values of the training data
minMaxscaler.fit(dataSet)

// Scale the provided data set to have min=-1.0 and max=1.0
val scaledDS = minMaxscaler.transform(dataSet)
