KfoldRLS - RLS with K-fold cross-validation regularization parameter selection
class rlscore.learner.rls.KfoldRLS(X, Y, folds, kernel='LinearKernel', basis_vectors=None, regparams=None, measure=None, save_predictions=False, **kwargs)

Bases: rlscore.predictor.predictor.PredictorInterface
Regularized least-squares regression/classification. Wrapper code that selects the regularization parameter automatically based on K-fold cross-validation. A usage sketch follows the parameter list below.
Parameters:

- X : {array-like, sparse matrix}, shape = [n_samples, n_features]
  Data matrix
- Y : {array-like}, shape = [n_samples] or [n_samples, n_labels]
  Training set labels
- folds : list of index lists, shape = [n_folds]
  Each list within folds contains the indices of the samples in one fold; indices must be in the range [0, n_samples-1]
- kernel : {'LinearKernel', 'GaussianKernel', 'PolynomialKernel', 'PrecomputedKernel', ...}
  kernel function name, imported dynamically from rlscore.kernel
- basis_vectors : {array-like, sparse matrix}, shape = [n_bvectors, n_features], optional
  basis vectors (typically a randomly chosen subset of the training data)
- regparams : {array-like}, shape = [grid_size], optional
  regularization parameter values to be tested, default = [2^-15, ..., 2^15]
- measure : function(Y, P), optional
  a performance measure from rlscore.measure used for model selection, default sqerror (squared error)
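A minimal usage sketch (the data and fold construction below are illustrative assumptions; the constructor arguments and attributes are the ones documented on this page, and training plus grid search are assumed to happen in the constructor, as with other RLScore learners):

    import numpy as np
    from rlscore.learner.rls import KfoldRLS

    # Toy regression data (illustrative only).
    np.random.seed(0)
    X = np.random.randn(100, 10)
    Y = np.random.randn(100)

    # Split the sample indices 0..99 into 5 disjoint folds, as the
    # 'folds' argument requires.
    indices = np.arange(100)
    np.random.shuffle(indices)
    folds = [list(fold) for fold in np.array_split(indices, 5)]

    # Construction runs K-fold CV over the default grid [2^-15, ..., 2^15]
    # and retains the predictor trained with the best regparam.
    learner = KfoldRLS(X, Y, folds=folds, kernel='LinearKernel')
    print(learner.regparam)         # selected regularization parameter
    print(learner.cv_performances)  # CV performance at each grid point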
Other Parameters:

Typical kernel parameters include (see the sketch after this list):

- bias : float, optional
  LinearKernel: the model is w*x + bias*w0 (default=1.0)
- gamma : float, optional
  GaussianKernel: k(xi,xj) = e^(-gamma*<xi-xj, xi-xj>) (default=1.0)
  PolynomialKernel: k(xi,xj) = (gamma * <xi, xj> + coef0)**degree (default=1.0)
- coef0 : float, optional
  PolynomialKernel: k(xi,xj) = (gamma * <xi, xj> + coef0)**degree (default=0.0)
- degree : int, optional
  PolynomialKernel: k(xi,xj) = (gamma * <xi, xj> + coef0)**degree (default=2)
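Kernel parameters are passed through **kwargs and forwarded to the kernel constructor; a short sketch, assuming a Gaussian kernel with a hand-picked width:

    # gamma is forwarded to the GaussianKernel in rlscore.kernel.
    learner = KfoldRLS(X, Y, folds=folds, kernel='GaussianKernel', gamma=2.0**-3)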
Notes
Uses fast solve and holdout algorithms; the complexity depends on the sizes of the folds. Writing m = n_samples, d = n_features, l = n_labels, b = n_bvectors, r = grid_size and k = n_folds, the complexity of K-fold cross-validation is:

O(m^3 + dm^2 + r*(m^3/k^2 + lm^2)): basic case
O(md^2 + r*(min(m^3/k^2 + lm^2/k, kd^3 + kld^2) + ldm)): linear kernel, d < m
O(mb^2 + r*(min(m^3/k^2 + lm^2/k, kb^3 + klb^2) + lbm)): sparse approximation with basis vectors
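The sparse approximation case corresponds to passing basis_vectors; a sketch, assuming (as the parameter description above suggests) that a random subset of the training data is an acceptable choice:

    # Sparse approximation: a random subset of the training rows serves
    # as basis vectors (the subset size 20 is arbitrary here).
    rng = np.random.RandomState(0)
    bvecs = X[rng.choice(X.shape[0], 20, replace=False)]
    learner = KfoldRLS(X, Y, folds=folds, kernel='GaussianKernel',
                       gamma=2.0**-3, basis_vectors=bvecs)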
Basic information about RLS can be found in [1]. The K-fold algorithm is based on results published in [2] and [3].
References
[1] Ryan Rifkin, Ross Lippert. Notes on Regularized Least Squares. Technical Report, MIT, 2007.
[2] Tapio Pahikkala, Jorma Boberg, and Tapio Salakoski. Fast n-Fold Cross-Validation for Regularized Least-Squares. Proceedings of the Ninth Scandinavian Conference on Artificial Intelligence, 83-90, Otamedia Oy, 2006.
[3] Tapio Pahikkala, Hanna Suominen, and Jorma Boberg. Efficient cross-validation for kernelized least-squares regression with sparse basis expansions. Machine Learning, 87(3):381–407, June 2012.
Attributes:

- predictor : {LinearPredictor, KernelPredictor}
  trained predictor
- cv_performances : array, shape = [grid_size]
  K-fold cross-validation performance for each grid point
- cv_predictions : list of 1D or 2D arrays, shape = [grid_size, n_folds]
  predictions for each fold, with shapes [fold_size] or [fold_size, n_labels] (see the sketch after this list)
- regparam : float
  the regularization parameter value from the grid with the best cross-validation performance
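These attributes allow the grid search to be inspected after training; a sketch, assuming the learner was constructed with save_predictions=True so that cv_predictions is populated, and noting that the best grid point is an argmax or an argmin depending on whether the chosen measure is a gain or a loss:

    # Locate the best grid point; use argmin instead if the selected
    # measure is an error (loss) rather than a gain.
    best = int(np.argmax(learner.cv_performances))
    fold_predictions = learner.cv_predictions[best]  # one array per fold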
predict(X)

Predicts outputs for new inputs.

Parameters:

- X : {array-like, sparse matrix}, shape = [n_samples, n_features]
  input data matrix

Returns:

- P : array, shape = [n_samples] or [n_samples, n_labels]
  predictions
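A short prediction sketch; X_new is an assumed input matrix with the same number of features as the training data:

    # Predict outputs for previously unseen inputs with the selected model.
    X_new = np.random.randn(5, 10)
    P = learner.predict(X_new)
    print(P.shape)  # (5,) here, since the training labels were one-dimensional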