public class MultipleLinearRegression extends Object implements Predictor<MultipleLinearRegression>
Multiple linear regression fits the linear model
y = w_0 + w_1*x_1 + w_2*x_2 + ... + w_n*x_n = w_0 + w^T*x
such that the sum of squared residuals over the training data is minimized:
min_{w, w_0} \sum (y - w^T*x - w_0)^2
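As a rough illustration of the model and the objective, the sketch below evaluates w_0 + w^T*x and the sum of squared residuals on plain Java arrays. The class `LinearModelSketch` and its methods are hypothetical names chosen for this example; they are not part of this class or of the Flink ML API, which operates on `DataSet<LabeledVector>` instead.

```java
/** Hypothetical helper for illustration only; not part of the Flink ML API. */
final class LinearModelSketch {

    /** Computes w_0 + w^T*x for a single feature vector x. */
    static double predict(double[] w, double w0, double[] x) {
        double result = w0;
        for (int i = 0; i < w.length; i++) {
            result += w[i] * x[i];
        }
        return result;
    }

    /** Sums (y - w^T*x - w_0)^2 over all labeled examples (xs[j], ys[j]). */
    static double squaredResidualSum(double[] w, double w0, double[][] xs, double[] ys) {
        double sum = 0.0;
        for (int j = 0; j < xs.length; j++) {
            double residual = ys[j] - predict(w, w0, xs[j]);
            sum += residual * residual;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] xs = {{1.0, 2.0}, {3.0, 4.0}};
        double[] ys = {6.0, 12.0};
        double[] w = {1.0, 2.0};
        // The data was generated with w = (1, 2) and w_0 = 1, so the fit is exact.
        System.out.println(squaredResidualSum(w, 1.0, xs, ys)); // prints 0.0
    }
}
```

With the intercept set to 0 instead of 1, each of the two residuals becomes 1, so the sum of squared residuals is 2.0.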
The minimization problem is solved by (stochastic) gradient descent. For each labeled vector (x,y), the gradient is calculated. The weighted average of all gradients is subtracted from the current value of w, which gives the new value w_new. The weight applied to the averaged gradient is defined as stepsize/math.sqrt(iteration).
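The update described above might be sketched as follows on plain arrays. This is a minimal sketch under stated assumptions, not the class's actual implementation: `GradientStepSketch` and `step` are hypothetical names, the per-example gradient is taken from the loss 0.5*(w^T*x + w_0 - y)^2 (so the constant factor 2 is assumed to be absorbed into the step size), and the intercept is updated in the same way as the other weights.

```java
/** Hypothetical illustration of one update step; not part of the Flink ML API. */
final class GradientStepSketch {

    /**
     * Performs one update over all examples.
     * Returns an array of length w.length + 1: the new weights followed by the new intercept.
     * The iteration counter is assumed to start at 1.
     */
    static double[] step(double[] w, double w0, double[][] xs, double[] ys,
                         double stepsize, int iteration) {
        int n = w.length;
        double[] gradientSum = new double[n + 1];
        for (int j = 0; j < xs.length; j++) {
            double prediction = w0;
            for (int i = 0; i < n; i++) {
                prediction += w[i] * xs[j][i];
            }
            // Assumed per-example gradient of 0.5 * (prediction - y)^2.
            double error = prediction - ys[j];
            for (int i = 0; i < n; i++) {
                gradientSum[i] += error * xs[j][i];
            }
            gradientSum[n] += error; // gradient with respect to the intercept
        }
        // Weight applied to the averaged gradient: stepsize / sqrt(iteration).
        double weight = stepsize / Math.sqrt(iteration);
        double[] updated = new double[n + 1];
        for (int i = 0; i < n; i++) {
            updated[i] = w[i] - weight * gradientSum[i] / xs.length;
        }
        updated[n] = w0 - weight * gradientSum[n] / xs.length;
        return updated;
    }

    public static void main(String[] args) {
        double[][] xs = {{1.0}, {2.0}};
        double[] ys = {2.0, 4.0};
        double[] next = step(new double[]{0.0}, 0.0, xs, ys, 0.1, 1);
        // Averaged gradient is (-5, -3), so the new values are roughly w = 0.5 and w_0 = 0.3.
        System.out.println(next[0] + " " + next[1]);
    }
}
```

Because the weight shrinks with the square root of the iteration number, the same averaged gradient moves w half as far in iteration 4 as it does in iteration 1.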
The optimization runs for at most the configured maximum number of iterations or, if a convergence threshold has been set, until the convergence criterion is met. The convergence criterion is the relative change of the sum of squared residuals:
(S_{k-1} - S_k)/S_{k-1} < \rho
where S_k is the sum of squared residuals in iteration k and \rho is the convergence threshold.
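The stopping rule can be illustrated with a small, self-contained loop. The sketch below assumes a one-feature model without intercept to keep it short; `ConvergenceSketch`, `hasConverged`, and `fit` are hypothetical names for this example and do not correspond to methods of this class. An empty `OptionalDouble` stands in for "no convergence threshold has been set".

```java
import java.util.OptionalDouble;

/** Hypothetical illustration of the stopping rule; not part of the Flink ML API. */
final class ConvergenceSketch {

    /** Relative-change test: (S_{k-1} - S_k) / S_{k-1} < rho. */
    static boolean hasConverged(double previousSum, double currentSum, double rho) {
        return (previousSum - currentSum) / previousSum < rho;
    }

    /** Sum of squared residuals for the one-feature model y = w * x. */
    static double squaredResidualSum(double w, double[] xs, double[] ys) {
        double sum = 0.0;
        for (int j = 0; j < xs.length; j++) {
            double residual = ys[j] - w * xs[j];
            sum += residual * residual;
        }
        return sum;
    }

    /** Runs the optimization loop and returns how many iterations were executed. */
    static int fit(double[] xs, double[] ys, double stepsize,
                   int maxIterations, OptionalDouble threshold) {
        double w = 0.0;
        double previousSum = squaredResidualSum(w, xs, ys);
        for (int k = 1; k <= maxIterations; k++) {
            double gradientSum = 0.0;
            for (int j = 0; j < xs.length; j++) {
                gradientSum += (w * xs[j] - ys[j]) * xs[j];
            }
            w -= stepsize / Math.sqrt(k) * (gradientSum / xs.length);
            double currentSum = squaredResidualSum(w, xs, ys);
            // Stop early only if a threshold was supplied and the criterion holds.
            if (threshold.isPresent()
                    && hasConverged(previousSum, currentSum, threshold.getAsDouble())) {
                return k;
            }
            previousSum = currentSum;
        }
        return maxIterations;
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 2.0};
        double[] ys = {2.0, 4.0};
        // S_0 = 20 and S_1 = 11.25, a relative change of 0.4375, which is below 0.5.
        System.out.println(fit(xs, ys, 0.1, 100, OptionalDouble.of(0.5))); // prints 1
        // Without a threshold the loop always runs the maximum number of iterations.
        System.out.println(fit(xs, ys, 0.1, 5, OptionalDouble.empty())); // prints 5
    }
}
```

Note that if S_{k-1} is exactly 0 the ratio is not a number and the comparison is false, so in this sketch the loop would then stop only at the iteration limit.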
At the moment, the whole partition is used in each SGD step, which makes the algorithm effectively batch gradient descent. Once a sampling operator has been introduced, the algorithm can be optimized.
Modifier and Type | Class and Description |
---|---|
static class | MultipleLinearRegression.ConvergenceThreshold$ |
static class | MultipleLinearRegression.Iterations$ |
static class | MultipleLinearRegression.LearningRateMethodValue$ |
static class | MultipleLinearRegression.Stepsize$ |
Constructor and Description |
---|
MultipleLinearRegression() |
Modifier and Type | Method and Description |
---|---|
static MultipleLinearRegression | apply() |
static <Testing,PredictionValue> DataSet<scala.Tuple2<PredictionValue,PredictionValue>> | evaluate(DataSet<Testing> testing, ParameterMap evaluateParameters, EvaluateDataSetOperation<Self,Testing,PredictionValue> evaluator) |
static <Testing,PredictionValue> ParameterMap | evaluate$default$2() |
static <Training> void | fit(DataSet<Training> training, ParameterMap fitParameters, FitOperation<Self,Training> fitOperation) |
static <Training> ParameterMap | fit$default$2() |
static Object | fitMLR() Trains the linear model to fit the training data. |
static GenericLossFunction | lossFunction() |
static ParameterMap | parameters() |
static <Testing,Prediction> DataSet<Prediction> | predict(DataSet<Testing> testing, ParameterMap predictParameters, PredictDataSetOperation<Self,Testing,Prediction> predictor) |
static <Testing,Prediction> ParameterMap | predict$default$2() |
static <T extends Vector> | predictVectors() |
MultipleLinearRegression | setConvergenceThreshold(double convergenceThreshold) |
MultipleLinearRegression | setIterations(int iterations) |
MultipleLinearRegression | setLearningRateMethod(LearningRateMethod.LearningRateMethodTrait learningRateMethod) |
MultipleLinearRegression | setStepsize(double stepsize) |
DataSet<Object> | squaredResidualSum(DataSet<LabeledVector> input) |
scala.Option<DataSet<WeightVector>> | weightsOption() |
static String | WEIGHTVECTOR_BROADCAST() |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
parameters
public static String WEIGHTVECTOR_BROADCAST()
public static GenericLossFunction lossFunction()
public static MultipleLinearRegression apply()
public static Object fitMLR()
Trains the linear model to fit the training data. The resulting weight vector is stored in the MultipleLinearRegression instance.
public static ParameterMap parameters()
public static <Training> void fit(DataSet<Training> training, ParameterMap fitParameters, FitOperation<Self,Training> fitOperation)
public static <Training> ParameterMap fit$default$2()
public static <Testing,Prediction> DataSet<Prediction> predict(DataSet<Testing> testing, ParameterMap predictParameters, PredictDataSetOperation<Self,Testing,Prediction> predictor)
public static <Testing,PredictionValue> DataSet<scala.Tuple2<PredictionValue,PredictionValue>> evaluate(DataSet<Testing> testing, ParameterMap evaluateParameters, EvaluateDataSetOperation<Self,Testing,PredictionValue> evaluator)
public static <Testing,Prediction> ParameterMap predict$default$2()
public static <Testing,PredictionValue> ParameterMap evaluate$default$2()
public scala.Option<DataSet<WeightVector>> weightsOption()
public MultipleLinearRegression setIterations(int iterations)
public MultipleLinearRegression setStepsize(double stepsize)
public MultipleLinearRegression setConvergenceThreshold(double convergenceThreshold)
public MultipleLinearRegression setLearningRateMethod(LearningRateMethod.LearningRateMethodTrait learningRateMethod)
public DataSet<Object> squaredResidualSum(DataSet<LabeledVector> input)
Copyright © 2014–2018 The Apache Software Foundation. All rights reserved.