LeavePairOutRLS - RLS with leave-pair-out regularization parameter selection

class rlscore.learner.rls.LeavePairOutRLS(X, Y, kernel='LinearKernel', basis_vectors=None, regparams=None, **kwargs)

Bases: rlscore.predictor.predictor.PredictorInterface

Regularized least-squares regression/classification. Wrapper code that selects the regularization parameter automatically, based on ranking accuracy (area under the ROC curve for binary classification tasks) in leave-pair-out cross-validation.
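A minimal usage sketch (the synthetic data, random seed, and parameter grid below are illustrative, not part of the API):

    import numpy as np
    from rlscore.learner.rls import LeavePairOutRLS

    # Illustrative toy data: 100 examples, 10 features, binary +1/-1 labels
    np.random.seed(0)
    X = np.random.randn(100, 10)
    Y = np.sign(np.random.randn(100))

    # Grid of candidate regularization parameters; the best one is chosen
    # by leave-pair-out cross-validation during construction
    regparams = [2.0 ** i for i in range(-5, 6)]
    learner = LeavePairOutRLS(X, Y, regparams=regparams)
    print(learner.regparam)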

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Data matrix

Y : {array-like}, shape = [n_samples] or [n_samples, n_labels]

Training set labels

kernel : {‘LinearKernel’, ‘GaussianKernel’, ‘PolynomialKernel’, ‘PrecomputedKernel’, …}

kernel function name, imported dynamically from rlscore.kernel

basis_vectors : {array-like, sparse matrix}, shape = [n_bvectors, n_features], optional

basis vectors (typically a randomly chosen subset of the training data)

regparams : {array-like}, shape = [grid_size] (optional)

regularization parameter values to be tested, default = [2^-15,…,2^15]

measure : function(Y, P) (optional)

a performance measure from rlscore.measure used for model selection, default sqerror (squared error)

Other Parameters:

Typical kernel parameters include:
bias : float, optional

LinearKernel: the model is w*x + bias*w0 (default=1.0)

gamma : float, optional

GaussianKernel: k(xi,xj) = e^(-gamma*<xi-xj,xi-xj>) (default=1.0)

PolynomialKernel: k(xi,xj) = (gamma * <xi, xj> + coef0)**degree (default=1.0)

coef0 : float, optional

PolynomialKernel: k(xi,xj) = (gamma * <xi, xj> + coef0)**degree (default=0.)

degree : int, optional

PolynomialKernel: k(xi,xj) = (gamma * <xi, xj> + coef0)**degree (default=2)
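For illustration, kernel parameters are passed as extra keyword arguments; a sketch, reusing the X and Y arrays from the example above (the parameter values are arbitrary):

    # Gaussian kernel with a custom width
    learner = LeavePairOutRLS(X, Y, kernel="GaussianKernel", gamma=0.01)

    # Polynomial kernel k(xi,xj) = (gamma * <xi, xj> + coef0)**degree
    learner = LeavePairOutRLS(X, Y, kernel="PolynomialKernel",
                              gamma=1.0, coef0=1.0, degree=3)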

Notes

Computational complexity of training and model selection, where m = n_samples, d = n_features, l = n_labels, b = n_bvectors, r = grid_size:

O(rlm^2 + dm^2 + rm^3): basic case

O(rlm^2 + rdm^2): Linear Kernel, d < m

O(rlm^2 + rbm^2): Sparse approximation with basis vectors
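As a sketch of the sparse-approximation case above, a random subset of the training examples can serve as the basis vectors (the subset size of 20 is arbitrary):

    # Pick b << m random training examples as basis vectors
    indices = np.random.choice(X.shape[0], size=20, replace=False)
    learner = LeavePairOutRLS(X, Y, kernel="GaussianKernel", gamma=0.01,
                              basis_vectors=X[indices])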

Basic information about RLS can be found in [1]. The leave-pair-out algorithm is an adaptation of the method published in [2]. The use of leave-pair-out cross-validation for AUC estimation has been analyzed in [3].

References

[1] Ryan Rifkin, Ross Lippert. Notes on Regularized Least Squares. Technical Report, MIT, 2007.

[2] Tapio Pahikkala, Antti Airola, Jorma Boberg, and Tapio Salakoski. Exact and efficient leave-pair-out cross-validation for ranking RLS. In Proceedings of the 2nd International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR'08), pages 1-8, Espoo, Finland, 2008.

[3] Antti Airola, Tapio Pahikkala, Willem Waegeman, Bernard De Baets, and Tapio Salakoski. An experimental comparison of cross-validation techniques for estimating the area under the ROC curve. Computational Statistics & Data Analysis, 55(4):1828–1844, April 2011.

Attributes:
predictor : {LinearPredictor, KernelPredictor}

trained predictor

cv_performances : array, shape = [grid_size]

leave-pair-out performances for each grid point

regparam : float

regularization parameter value from the grid with the best leave-pair-out performance
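The selection results can then be inspected; a sketch assuming a learner constructed with the regparams grid from the first example:

    # cv_performances[i] is the leave-pair-out performance for regparams[i]
    for rp, perf in zip(regparams, learner.cv_performances):
        print("regparam %g: LPO performance %f" % (rp, perf))
    print("selected regparam: %g" % learner.regparam)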

predict(X)

Predicts outputs for new inputs

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

input data matrix

Returns:
P : array, shape = [n_samples, n_labels]

predictions
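A sketch of prediction on held-out data (X_test is illustrative and must have the same number of features as the training data):

    X_test = np.random.randn(20, 10)
    P = learner.predict(X_test)   # predicted outputs, one per test example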