Pairwise (dyadic) data, transfer- and zero-shot learning

In this tutorial, we consider learning from pairwise (dyadic) data. We assume that the training data consist of a sequence of paired inputs and correct labels ((u, v), y), and the task is to learn a function f(u,v) = y that, given a new paired input, correctly predicts the label. The problem appears in a wide variety of applications, such as learning interactions for drug-target pairs, protein-protein interactions, rankings for query-document pairs, customer-product ratings, etc.

Pair-input prediction is often considered under the framework of network inference, where the inputs correspond to vertices of a graph, pairs (u,v) to directed edges, and labels y to edge weights. Here, one can distinguish between two types of graphs. If the start and end nodes u and v belong to different sets, the problem corresponds to predicting edges in a bipartite network. Typical examples would be drug-target interaction or customer-product rating prediction. On the other hand, if u and v belong to the same set, the problem corresponds to edge prediction in a homogeneous network. A typical example would be protein-protein interaction prediction.

For the bipartite case, four settings are commonly recognized. Assume we are making a prediction for a new paired input (u,v):

  A. Both u and v are present in the training set, as parts of different labeled pairs, and the label of the pair (u,v) must be predicted.
  B. Pairs containing v are present in the training set, while u is not observed in any training pair, and the label of the pair (u,v) must be predicted.
  C. Pairs containing u are present in the training set, while v is not observed in any training pair, and the label of the pair (u,v) must be predicted.
  D. Neither u nor v occurs in any training pair, and the label of the pair (u,v) must be predicted.

Setting A is the standard matrix completion setting, which has been considered especially in the context of recommender systems and matrix factorization methods. Settings B and C are examples of multi-target learning problems, which can also be solved using regular RLS. Setting D can be seen as an example of the zero-shot learning problem, where we need to generalize from related learning tasks to a new problem for which we have no training data. For the homogeneous network case, settings B and C become equivalent. For a more detailed overview of these four settings, as well as an analysis and comparison of different Kronecker kernel RLS methods, see [1] and [2]. Terminology varies between the articles; the settings considered in [2] map to this tutorial as follows: I=A, R=B, C=C, B=D (bipartite case); E=A, V=B/C (homogeneous case).
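
To make the four settings concrete, the following plain-Python sketch (a hypothetical helper, not part of any library) classifies a new test pair into one of the settings, given the sets of drugs and targets that occur in the training pairs:

def pair_setting(u, v, train_drugs, train_targets):
    #train_drugs / train_targets: objects occurring in the training pairs
    if u in train_drugs and v in train_targets:
        return "A" #both objects observed in training pairs
    elif v in train_targets:
        return "B" #new drug, known target
    elif u in train_drugs:
        return "C" #known drug, new target
    else:
        return "D" #neither object observed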

We assume that the feature representations of the inputs are given either as two data matrices X1 and X2, or as two kernel matrices K1 and K2. The feature representation of a pair can then be formed as the Kronecker product of these matrices. The Kronecker RLS (KronRLS) method corresponds to training an RLS on the Kronecker product data or kernel matrices, but is much faster to train due to the computational shortcuts it uses [3] [4]. Alternatively, a related variant of the method known as TwoStepRLS may be used [5]. The main advantage of TwoStepRLS is that it implements fast cross-validation algorithms for settings A-D, as well as related settings for homogeneous networks [2].
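
To illustrate the connection, here is a minimal NumPy sketch (tiny synthetic data, no RLScore) that trains RLS directly on the explicitly formed Kronecker product kernel; KronRLS computes the same solution without ever materializing this large matrix:

import numpy as np

np.random.seed(0)
#Tiny synthetic problem: 3 "drugs", 4 "targets"
X1 = np.random.randn(3, 5) #drug features
X2 = np.random.randn(4, 6) #target features
Y = np.random.randn(3, 4)  #one label per (drug, target) pair

K1 = X1.dot(X1.T) #drug kernel
K2 = X2.dot(X2.T) #target kernel
#Kronecker pairwise kernel; this ordering matches Y.ravel(order='F'),
#where the drug index varies fastest
K = np.kron(K2, K1)
y = Y.ravel(order='F')

regparam = 1.0
#dual RLS solution on the explicit pairwise kernel
a = np.linalg.solve(K + regparam * np.eye(K.shape[0]), y)
P = K.dot(a) #in-sample predictions for all pairs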

In the following, we experiment on two drug-target binding affinity data sets. Let us assume that we have n_drugs drugs and n_targets targets in our training data. In the complete data setting, we assume that the correct labels are known for each drug-target combination in the training data. That is, the labels can be represented as an n_drugs x n_targets sized Y-matrix that has no missing entries. In practice, even if a small number of the values are missing, this setting can be realized by imputing the missing entries, for example with row and/or column means. In the incomplete data setting, we assume that many of the values are unknown. These settings are considered separately, since they lead to different learning algorithms.
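
As a concrete example of the imputation mentioned above, here is a small NumPy sketch of one possible scheme, filling each missing entry with the average of its row and column means:

import numpy as np

def impute_missing(Y):
    #replace each NaN entry with the average of its row and column means
    Y = Y.copy()
    row_means = np.nanmean(Y, axis=1)
    col_means = np.nanmean(Y, axis=0)
    rows, cols = np.where(np.isnan(Y))
    Y[rows, cols] = 0.5 * (row_means[rows] + col_means[cols])
    return Y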

Tutorial 1: KronRLS

In the first tutorial, we consider learning from complete data and compare settings A-D. The experimental setup is similar to that of [6] (results in Table 2, column KdQ). The main difference is that, in order to keep the examples simple and understandable, we use only a single training/test set split rather than proper nested cross-validation.

Data set

For these experiments, we need to download, from the drug-target binding affinity data sets page, the Davis et al. data: the drug-target interaction affinities (Y), the drug-drug 2D similarities (X1), and the WS target-target similarities (X2). In the following, we use the similarity scores directly as features for the linear kernel, since the similarity matrices themselves are not valid positive semi-definite kernel matrices.

We can load the data set as follows:

import numpy as np

def load_davis():
    Y = np.loadtxt("drug-target_interaction_affinities_Kd__Davis_et_al.2011.txt")
    XD = np.loadtxt("drug-drug_similarities_2D.txt")
    XT = np.loadtxt("target-target_similarities_WS.txt")    
    return XD, XT, Y

def settingB_split():
    np.random.seed(1)
    XD, XT, Y = load_davis()
    drug_ind = list(range(Y.shape[0]))
    np.random.shuffle(drug_ind)
    train_drug_ind = drug_ind[:40]
    test_drug_ind = drug_ind[40:]
    #Setting B: split according to drugs
    Y_train = Y[train_drug_ind]
    Y_test = Y[test_drug_ind]
    Y_train = Y_train.ravel(order='F')
    Y_test = Y_test.ravel(order='F')
    XD_train = XD[train_drug_ind]
    XT_train = XT
    XD_test = XD[test_drug_ind]
    XT_test = XT
    return XD_train, XT_train, Y_train, XD_test, XT_test, Y_test   

def settingC_split():
    np.random.seed(1)
    XD, XT, Y = load_davis()
    drug_ind = list(range(Y.shape[0]))
    target_ind = list(range(Y.shape[1]))
    np.random.shuffle(target_ind)
    train_target_ind = target_ind[:300]
    test_target_ind = target_ind[300:]
    #Setting C: split according to targets
    Y_train = Y[:, train_target_ind]
    Y_test = Y[:, test_target_ind]
    Y_train = Y_train.ravel(order='F')
    Y_test = Y_test.ravel(order='F')
    XD_train = XD
    XT_train = XT[train_target_ind]
    XD_test = XD
    XT_test = XT[test_target_ind]
    return XD_train, XT_train, Y_train, XD_test, XT_test, Y_test  

def settingD_split():
    np.random.seed(1)
    XD, XT, Y = load_davis()
    drug_ind = list(range(Y.shape[0]))
    target_ind = list(range(Y.shape[1]))
    np.random.shuffle(drug_ind)
    np.random.shuffle(target_ind)
    train_drug_ind = drug_ind[:40]
    test_drug_ind = drug_ind[40:]
    train_target_ind = target_ind[:300]
    test_target_ind = target_ind[300:]
    #Setting D: ensure that (drug, target) pairs do not overlap between
    #training and test set
    Y_train = Y[np.ix_(train_drug_ind, train_target_ind)]
    Y_test = Y[np.ix_(test_drug_ind, test_target_ind)]
    Y_train = Y_train.ravel(order='F')
    Y_test = Y_test.ravel(order='F')
    XD_train = XD[train_drug_ind]
    XT_train = XT[train_target_ind]
    XD_test = XD[test_drug_ind]
    XT_test = XT[test_target_ind]
    return XD_train, XT_train, Y_train, XD_test, XT_test, Y_test    

if __name__=="__main__":
    XD, XT, Y = load_davis()
    print("Y dimensions %d %d" %Y.shape)
    print("XD dimensions %d %d" %XD.shape)
    print("XT dimensions %d %d" %XT.shape)
    print("drug-target pairs: %d" %(Y.shape[0]*Y.shape[1]))
Y dimensions 68 442
XD dimensions 68 68
XT dimensions 442 442
drug-target pairs: 30056

There are 68 drugs, 442 targets and 30056 drug-target pairs. The benefit of using KronRLS or TwoStepRLS is that they operate only on these small matrices, rather than explicitly forming the Kronecker product matrices, whose sizes correspond to the number of pairs. The settingB_split(), settingC_split() and settingD_split() functions generate training/test splits compatible with the corresponding settings (splitting on the level of drugs, targets, or both).
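
To see why this matters, a quick back-of-the-envelope calculation of the memory the explicit Kronecker kernel matrix would require for this data, stored as 8-byte floats:

n_pairs = 68 * 442
print("%.1f GB" % (n_pairs**2 * 8 / 1e9)) #about 7.2 GB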

Setting A

First, we consider setting A, imputing missing values inside the Y-matrix. This is not the most interesting problem for this data set, since we assume a complete training set with no missing entries in Y in the first place. Still, we can check the in-sample leave-one-out predictions: if the label of one (u,v) pair is left out of Y, how well can we predict it from the remaining data? In practice, the fast leave-one-out algorithm could be used to impute missing values, for example by first replacing them with mean values and then computing more accurate estimates via leave-one-out.

from rlscore.learner import KronRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1, X2, Y = davis_data.load_davis()
    Y = Y.ravel(order='F')
    learner = KronRLS(X1 = X1, X2 = X2, Y = Y)
    log_regparams = range(15, 35)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam)
        P = learner.in_sample_loo()
        perf = cindex(Y, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))

if __name__=="__main__":
    main()
regparam 2**15, cindex 0.866841
regparam 2**16, cindex 0.875637
regparam 2**17, cindex 0.882923
regparam 2**18, cindex 0.888907
regparam 2**19, cindex 0.893437
regparam 2**20, cindex 0.896402
regparam 2**21, cindex 0.897650
regparam 2**22, cindex 0.897185
regparam 2**23, cindex 0.894893
regparam 2**24, cindex 0.890913
regparam 2**25, cindex 0.885419
regparam 2**26, cindex 0.878818
regparam 2**27, cindex 0.871421
regparam 2**28, cindex 0.863367
regparam 2**29, cindex 0.854289
regparam 2**30, cindex 0.843720
regparam 2**31, cindex 0.831136
regparam 2**32, cindex 0.816192
regparam 2**33, cindex 0.798639
regparam 2**34, cindex 0.777252

The best results are for regparam 2**21, with concordance index (i.e. pairwise ranking accuracy, a generalization of AUC to real-valued data) around 0.90.
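
The imputation idea mentioned at the start of this section could be sketched as follows: initialize the missing entries with the mean of the known labels, and then iteratively refine them with the fast leave-one-out estimates. The helper below is hypothetical (not a ready-made RLScore function), built on the same KronRLS interface used above:

import numpy as np
from rlscore.learner import KronRLS

def impute_loo(X1, X2, Y, missing_mask, regparam=2.**21, iterations=3):
    #Y: n_drugs x n_targets labels, missing_mask: boolean matrix of missing entries
    y = Y.ravel(order='F').copy()
    mask = missing_mask.ravel(order='F')
    y[mask] = np.mean(y[~mask]) #initial guess: mean of the known labels
    for i in range(iterations):
        learner = KronRLS(X1 = X1, X2 = X2, Y = y)
        learner.solve(regparam)
        #replace the imputed entries with their leave-one-out estimates
        y[mask] = learner.in_sample_loo()[mask]
    return y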

Setting B

Next, we consider setting B, generalizing to predictions for new drugs that were not observed in the training set.

from rlscore.learner import KronRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingB_split()
    learner = KronRLS(X1 = X1_train, X2 = X2_train, Y = Y_train)
    log_regparams = range(15, 35)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam)
        P = learner.predict(X1_test, X2_test)
        perf = cindex(Y_test, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))

if __name__=="__main__":
    main()
regparam 2**15, cindex 0.625615
regparam 2**16, cindex 0.626731
regparam 2**17, cindex 0.628061
regparam 2**18, cindex 0.629595
regparam 2**19, cindex 0.631256
regparam 2**20, cindex 0.633273
regparam 2**21, cindex 0.635672
regparam 2**22, cindex 0.638943
regparam 2**23, cindex 0.643901
regparam 2**24, cindex 0.651453
regparam 2**25, cindex 0.661095
regparam 2**26, cindex 0.670401
regparam 2**27, cindex 0.677799
regparam 2**28, cindex 0.685400
regparam 2**29, cindex 0.693695
regparam 2**30, cindex 0.702920
regparam 2**31, cindex 0.711303
regparam 2**32, cindex 0.715542
regparam 2**33, cindex 0.713162
regparam 2**34, cindex 0.704249

The results show that this problem is quite different from setting A: the best test performance (cindex around 0.72, at regparam 2**32) is much lower than the best leave-one-out performance in setting A. The required amount of regularization is also quite different, suggesting that selecting the regularization parameter with leave-one-out may be a bad idea if the goal is to generalize to the other settings.

Setting C

Next, we consider setting C, generalizing to predictions for new targets that were not observed in the training set.

from rlscore.learner import KronRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingC_split()
    learner = KronRLS(X1 = X1_train, X2 = X2_train, Y = Y_train)
    log_regparams = range(15, 35)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam)
        P = learner.predict(X1_test, X2_test)
        perf = cindex(Y_test, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))

if __name__=="__main__":
    main()
regparam 2**15, cindex 0.701616
regparam 2**16, cindex 0.719695
regparam 2**17, cindex 0.742733
regparam 2**18, cindex 0.771481
regparam 2**19, cindex 0.803546
regparam 2**20, cindex 0.835486
regparam 2**21, cindex 0.853427
regparam 2**22, cindex 0.858628
regparam 2**23, cindex 0.860152
regparam 2**24, cindex 0.860020
regparam 2**25, cindex 0.858415
regparam 2**26, cindex 0.855578
regparam 2**27, cindex 0.851229
regparam 2**28, cindex 0.844872
regparam 2**29, cindex 0.836173
regparam 2**30, cindex 0.824684
regparam 2**31, cindex 0.809855
regparam 2**32, cindex 0.791903
regparam 2**33, cindex 0.770734
regparam 2**34, cindex 0.745637

Again, the results are quite different: the best performance (cindex around 0.86, at regparam 2**23) falls between those of settings A and B.

Setting D

Finally we consider the most demanding setting D, generalizing to new (u,v) pairs such that neither have been observed in the training set.

from rlscore.learner import KronRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingD_split()
    learner = KronRLS(X1 = X1_train, X2 = X2_train, Y = Y_train)
    log_regparams = range(15, 35)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam)
        P = learner.predict(X1_test, X2_test)
        perf = cindex(Y_test, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))

if __name__=="__main__":
    main()
regparam 2**15, cindex 0.565452
regparam 2**16, cindex 0.568766
regparam 2**17, cindex 0.573612
regparam 2**18, cindex 0.579559
regparam 2**19, cindex 0.585241
regparam 2**20, cindex 0.588507
regparam 2**21, cindex 0.590205
regparam 2**22, cindex 0.593595
regparam 2**23, cindex 0.600126
regparam 2**24, cindex 0.610286
regparam 2**25, cindex 0.622301
regparam 2**26, cindex 0.633359
regparam 2**27, cindex 0.643321
regparam 2**28, cindex 0.653964
regparam 2**29, cindex 0.665960
regparam 2**30, cindex 0.677539
regparam 2**31, cindex 0.685681
regparam 2**32, cindex 0.687180
regparam 2**33, cindex 0.681317
regparam 2**34, cindex 0.669595

The results are noticeably lower than in any other setting, yet still clearly better than the random baseline of 0.5, meaning that the model retains predictive power.

Using kernels

By default, KronRLS assumes a linear kernel for both input domains. However, KronRLS also supports the use of precomputed kernel matrices. In the following, we repeat the experiment for setting D, this time first computing the kernel matrices before passing them to the learner. Any proper kernel function can be used, and the instances from the first and second domains may have different types of kernel functions. Some basic kernel functions are implemented in the module rlscore.kernel.

from rlscore.learner import KronRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingD_split()
    K1_train = X1_train.dot(X1_train.T)
    K2_train = X2_train.dot(X2_train.T)
    K1_test = X1_test.dot(X1_train.T)
    K2_test = X2_test.dot(X2_train.T)
    learner = KronRLS(K1 = K1_train, K2 = K2_train, Y = Y_train)
    log_regparams = range(15, 35)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam)
        P = learner.predict(K1_test, K2_test)
        perf = cindex(Y_test, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))

if __name__=="__main__":
    main()
regparam 2**15, cindex 0.565452
regparam 2**16, cindex 0.568766
regparam 2**17, cindex 0.573612
regparam 2**18, cindex 0.579559
regparam 2**19, cindex 0.585241
regparam 2**20, cindex 0.588507
regparam 2**21, cindex 0.590205
regparam 2**22, cindex 0.593595
regparam 2**23, cindex 0.600126
regparam 2**24, cindex 0.610286
regparam 2**25, cindex 0.622301
regparam 2**26, cindex 0.633359
regparam 2**27, cindex 0.643321
regparam 2**28, cindex 0.653964
regparam 2**29, cindex 0.665960
regparam 2**30, cindex 0.677539
regparam 2**31, cindex 0.685681
regparam 2**32, cindex 0.687180
regparam 2**33, cindex 0.681317
regparam 2**34, cindex 0.669595

The results are the same as before, since the explicitly computed linear kernel is equivalent to passing the data matrices directly.
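
As an example of a nonlinear kernel, the same experiment could be run with Gaussian kernels from rlscore.kernel. The following is a sketch: the gamma and regularization values are arbitrary and untuned, so the resulting accuracy will differ from the above.

from rlscore.kernel import GaussianKernel
from rlscore.learner import KronRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingD_split()
    kernel1 = GaussianKernel(X1_train, gamma=1e-5)
    kernel2 = GaussianKernel(X2_train, gamma=1e-5)
    learner = KronRLS(K1 = kernel1.getKM(X1_train), K2 = kernel2.getKM(X2_train), Y = Y_train)
    learner.solve(2.**30)
    P = learner.predict(kernel1.getKM(X1_test), kernel2.getKM(X2_test))
    print("cindex %f" %cindex(Y_test, P))

if __name__=="__main__":
    main()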

Tutorial 2: TwoStepRLS, cross-validation with a bipartite network

For complete data, another method that can be used is the TwoStepRLS algorithm [5]. The algorithm has two separate regularization parameters; in our experiments, one regularizes the drugs and the other the targets. The main advantage of the method is that it admits fast cross-validation algorithms [2]. In the following experiments, we consider cross-validation for settings A-D. We use the data set introduced in the previous example.

Setting A

Leave-pair-out cross-validation. On each round of CV, one (drug, target) pair is left out of the training set as the test pair.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1, X2, Y = davis_data.load_davis()
    Y = Y.ravel(order='F')
    learner = TwoStepRLS(X1 = X1, X2 = X2, Y = Y, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.in_sample_loo()
            perf = cindex(Y, P)
            print("regparam 2**%d 2**%d, cindex %f" %(log_regparam1, log_regparam2, perf))



if __name__=="__main__":
    main()
regparam 2**-8 2**20, cindex 0.856107
regparam 2**-8 2**21, cindex 0.870316
regparam 2**-8 2**22, cindex 0.879996
regparam 2**-8 2**23, cindex 0.884644
regparam 2**-8 2**24, cindex 0.884816
regparam 2**-7 2**20, cindex 0.845540
regparam 2**-7 2**21, cindex 0.861627
regparam 2**-7 2**22, cindex 0.873693
regparam 2**-7 2**23, cindex 0.880633
regparam 2**-7 2**24, cindex 0.882209
regparam 2**-6 2**20, cindex 0.834259
regparam 2**-6 2**21, cindex 0.850631
regparam 2**-6 2**22, cindex 0.864345
regparam 2**-6 2**23, cindex 0.873417
regparam 2**-6 2**24, cindex 0.876857
regparam 2**-5 2**20, cindex 0.822796
regparam 2**-5 2**21, cindex 0.838143
regparam 2**-5 2**22, cindex 0.852142
regparam 2**-5 2**23, cindex 0.862541
regparam 2**-5 2**24, cindex 0.867615

K-fold cross-validation. Same as above, but several (drug, target) pairs are left out at once.

import numpy as np

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data
from rlscore.utilities.cross_validation import random_folds

def main():
    X1, X2, Y = davis_data.load_davis()
    n = X1.shape[0]
    m = X2.shape[0]
    Y = Y.ravel(order='F')
    learner = TwoStepRLS(X1 = X1, X2 = X2, Y = Y, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    #Create random split to 5 folds for the drug-target pairs
    folds = random_folds(n*m, 5, seed=12345)
    #Map the indices back to (drug_indices, target_indices)
    folds = [np.unravel_index(fold, (n,m)) for fold in folds]
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.in_sample_kfoldcv(folds)
            perf = cindex(Y, P)
            print("regparam 2**%d 2**%d, cindex %f" %(log_regparam1, log_regparam2, perf))



if __name__=="__main__":
    main()
regparam 2**-8 2**20, cindex 0.847951
regparam 2**-8 2**21, cindex 0.861524
regparam 2**-8 2**22, cindex 0.870746
regparam 2**-8 2**23, cindex 0.875478
regparam 2**-8 2**24, cindex 0.875747
regparam 2**-7 2**20, cindex 0.836888
regparam 2**-7 2**21, cindex 0.852547
regparam 2**-7 2**22, cindex 0.864314
regparam 2**-7 2**23, cindex 0.871177
regparam 2**-7 2**24, cindex 0.872904
regparam 2**-6 2**20, cindex 0.825043
regparam 2**-6 2**21, cindex 0.841144
regparam 2**-6 2**22, cindex 0.854672
regparam 2**-6 2**23, cindex 0.863623
regparam 2**-6 2**24, cindex 0.867058
regparam 2**-5 2**20, cindex 0.813032
regparam 2**-5 2**21, cindex 0.828317
regparam 2**-5 2**22, cindex 0.842215
regparam 2**-5 2**23, cindex 0.852473
regparam 2**-5 2**24, cindex 0.857356

Setting B

Leave-drug-out cross-validation. On each CV round, a single holdout drug is left out, and all (drug, target) pairs this drug belongs to are used as the test fold.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingB_split()
    learner = TwoStepRLS(X1 = X1_train, X2 = X2_train, Y = Y_train, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.predict(X1_test, X2_test)
            perf = cindex(Y_test, P)
            print("regparam 2**%d 2**%d, test set cindex %f" %(log_regparam1, log_regparam2, perf))
            P = learner.leave_x1_out()
            perf = cindex(Y_train, P)
            print("regparam 2**%d 2**%d, leave-row-out cindex %f" %(log_regparam1, log_regparam2, perf))


if __name__=="__main__":
    main()
regparam 2**-8 2**20, test set cindex 0.663481
regparam 2**-8 2**20, leave-row-out cindex 0.652126
regparam 2**-8 2**21, test set cindex 0.662677
regparam 2**-8 2**21, leave-row-out cindex 0.652187
regparam 2**-8 2**22, test set cindex 0.661160
regparam 2**-8 2**22, leave-row-out cindex 0.651892
regparam 2**-8 2**23, test set cindex 0.658421
regparam 2**-8 2**23, leave-row-out cindex 0.650985
regparam 2**-8 2**24, test set cindex 0.653732
regparam 2**-8 2**24, leave-row-out cindex 0.648959
regparam 2**-7 2**20, test set cindex 0.675787
regparam 2**-7 2**20, leave-row-out cindex 0.665417
regparam 2**-7 2**21, test set cindex 0.675215
regparam 2**-7 2**21, leave-row-out cindex 0.665552
regparam 2**-7 2**22, test set cindex 0.673975
regparam 2**-7 2**22, leave-row-out cindex 0.665345
regparam 2**-7 2**23, test set cindex 0.671551
regparam 2**-7 2**23, leave-row-out cindex 0.664527
regparam 2**-7 2**24, test set cindex 0.667291
regparam 2**-7 2**24, leave-row-out cindex 0.662678
regparam 2**-6 2**20, test set cindex 0.689898
regparam 2**-6 2**20, leave-row-out cindex 0.677247
regparam 2**-6 2**21, test set cindex 0.689480
regparam 2**-6 2**21, leave-row-out cindex 0.677388
regparam 2**-6 2**22, test set cindex 0.688509
regparam 2**-6 2**22, leave-row-out cindex 0.677212
regparam 2**-6 2**23, test set cindex 0.686348
regparam 2**-6 2**23, leave-row-out cindex 0.676372
regparam 2**-6 2**24, test set cindex 0.682283
regparam 2**-6 2**24, leave-row-out cindex 0.674487
regparam 2**-5 2**20, test set cindex 0.704264
regparam 2**-5 2**20, leave-row-out cindex 0.687138
regparam 2**-5 2**21, test set cindex 0.703962
regparam 2**-5 2**21, leave-row-out cindex 0.687201
regparam 2**-5 2**22, test set cindex 0.703055
regparam 2**-5 2**22, leave-row-out cindex 0.686857
regparam 2**-5 2**23, test set cindex 0.700970
regparam 2**-5 2**23, leave-row-out cindex 0.685875
regparam 2**-5 2**24, test set cindex 0.696871
regparam 2**-5 2**24, leave-row-out cindex 0.683746

K-fold cross-validation. Same as above, but several drugs are left out at once.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data
from rlscore.utilities.cross_validation import random_folds

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingB_split()
    n = X1_train.shape[0]
    learner = TwoStepRLS(X1 = X1_train, X2 = X2_train, Y = Y_train, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    #Create random split to 5 folds for the drugs
    folds = random_folds(n, 5, seed=12345)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.predict(X1_test, X2_test)
            perf = cindex(Y_test, P)
            print("regparam 2**%d 2**%d, test set cindex %f" %(log_regparam1, log_regparam2, perf))
            P = learner.x1_kfold_cv(folds)
            perf = cindex(Y_train, P)
            print("regparam 2**%d 2**%d, K-fold cindex %f" %(log_regparam1, log_regparam2, perf))


if __name__=="__main__":
    main()
regparam 2**-8 2**20, test set cindex 0.663481
regparam 2**-8 2**20, K-fold cindex 0.659214
regparam 2**-8 2**21, test set cindex 0.662677
regparam 2**-8 2**21, K-fold cindex 0.659419
regparam 2**-8 2**22, test set cindex 0.661160
regparam 2**-8 2**22, K-fold cindex 0.659421
regparam 2**-8 2**23, test set cindex 0.658421
regparam 2**-8 2**23, K-fold cindex 0.658657
regparam 2**-8 2**24, test set cindex 0.653732
regparam 2**-8 2**24, K-fold cindex 0.656883
regparam 2**-7 2**20, test set cindex 0.675787
regparam 2**-7 2**20, K-fold cindex 0.671587
regparam 2**-7 2**21, test set cindex 0.675215
regparam 2**-7 2**21, K-fold cindex 0.671923
regparam 2**-7 2**22, test set cindex 0.673975
regparam 2**-7 2**22, K-fold cindex 0.672029
regparam 2**-7 2**23, test set cindex 0.671551
regparam 2**-7 2**23, K-fold cindex 0.671496
regparam 2**-7 2**24, test set cindex 0.667291
regparam 2**-7 2**24, K-fold cindex 0.669824
regparam 2**-6 2**20, test set cindex 0.689898
regparam 2**-6 2**20, K-fold cindex 0.683680
regparam 2**-6 2**21, test set cindex 0.689480
regparam 2**-6 2**21, K-fold cindex 0.684026
regparam 2**-6 2**22, test set cindex 0.688509
regparam 2**-6 2**22, K-fold cindex 0.684179
regparam 2**-6 2**23, test set cindex 0.686348
regparam 2**-6 2**23, K-fold cindex 0.683625
regparam 2**-6 2**24, test set cindex 0.682283
regparam 2**-6 2**24, K-fold cindex 0.681945
regparam 2**-5 2**20, test set cindex 0.704264
regparam 2**-5 2**20, K-fold cindex 0.694867
regparam 2**-5 2**21, test set cindex 0.703962
regparam 2**-5 2**21, K-fold cindex 0.695190
regparam 2**-5 2**22, test set cindex 0.703055
regparam 2**-5 2**22, K-fold cindex 0.695232
regparam 2**-5 2**23, test set cindex 0.700970
regparam 2**-5 2**23, K-fold cindex 0.694596
regparam 2**-5 2**24, test set cindex 0.696871
regparam 2**-5 2**24, K-fold cindex 0.692804

Setting C

Leave-target-out cross-validation. On each CV round, a single holdout target is left out, and all (drug, target) pairs this target belongs to are used as the test fold.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingC_split()
    learner = TwoStepRLS(X1 = X1_train, X2 = X2_train, Y = Y_train, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.predict(X1_test, X2_test)
            perf = cindex(Y_test, P)
            print("regparam 2**%d 2**%d, test set cindex %f" %(log_regparam1, log_regparam2, perf))
            P = learner.leave_x2_out()
            perf = cindex(Y_train, P)
            print("regparam 2**%d 2**%d, leave-column-out cindex %f" %(log_regparam1, log_regparam2, perf))


if __name__=="__main__":
    main()
regparam 2**-8 2**20, test set cindex 0.859528
regparam 2**-8 2**20, leave-column-out cindex 0.855517
regparam 2**-8 2**21, test set cindex 0.864690
regparam 2**-8 2**21, leave-column-out cindex 0.860196
regparam 2**-8 2**22, test set cindex 0.868333
regparam 2**-8 2**22, leave-column-out cindex 0.863492
regparam 2**-8 2**23, test set cindex 0.870697
regparam 2**-8 2**23, leave-column-out cindex 0.864836
regparam 2**-8 2**24, test set cindex 0.871736
regparam 2**-8 2**24, leave-column-out cindex 0.863748
regparam 2**-7 2**20, test set cindex 0.858433
regparam 2**-7 2**20, leave-column-out cindex 0.853880
regparam 2**-7 2**21, test set cindex 0.863401
regparam 2**-7 2**21, leave-column-out cindex 0.858446
regparam 2**-7 2**22, test set cindex 0.866956
regparam 2**-7 2**22, leave-column-out cindex 0.861664
regparam 2**-7 2**23, test set cindex 0.869128
regparam 2**-7 2**23, leave-column-out cindex 0.862892
regparam 2**-7 2**24, test set cindex 0.869852
regparam 2**-7 2**24, leave-column-out cindex 0.861579
regparam 2**-6 2**20, test set cindex 0.856016
regparam 2**-6 2**20, leave-column-out cindex 0.850710
regparam 2**-6 2**21, test set cindex 0.860732
regparam 2**-6 2**21, leave-column-out cindex 0.855091
regparam 2**-6 2**22, test set cindex 0.864151
regparam 2**-6 2**22, leave-column-out cindex 0.858253
regparam 2**-6 2**23, test set cindex 0.866186
regparam 2**-6 2**23, leave-column-out cindex 0.859377
regparam 2**-6 2**24, test set cindex 0.866570
regparam 2**-6 2**24, leave-column-out cindex 0.857775
regparam 2**-5 2**20, test set cindex 0.851254
regparam 2**-5 2**20, leave-column-out cindex 0.845166
regparam 2**-5 2**21, test set cindex 0.855569
regparam 2**-5 2**21, leave-column-out cindex 0.849251
regparam 2**-5 2**22, test set cindex 0.858841
regparam 2**-5 2**22, leave-column-out cindex 0.852248
regparam 2**-5 2**23, test set cindex 0.860764
regparam 2**-5 2**23, leave-column-out cindex 0.853174
regparam 2**-5 2**24, test set cindex 0.860787
regparam 2**-5 2**24, leave-column-out cindex 0.851221

K-fold cross-validation. Same as above, but several targets are left out at once.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data
from rlscore.utilities.cross_validation import random_folds

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingC_split()
    m = X2_train.shape[0]
    learner = TwoStepRLS(X1 = X1_train, X2 = X2_train, Y = Y_train, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    #Create random split to 5 folds for the targets
    folds = random_folds(m, 5, seed=12345)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.predict(X1_test, X2_test)
            perf = cindex(Y_test, P)
            print("regparam 2**%d 2**%d, test set cindex %f" %(log_regparam1, log_regparam2, perf))
            P = learner.x2_kfold_cv(folds)
            perf = cindex(Y_train, P)
            print("regparam 2**%d 2**%d, K-fold cindex %f" %(log_regparam1, log_regparam2, perf))


if __name__=="__main__":
    main()
regparam 2**-8 2**20, test set cindex 0.859528
regparam 2**-8 2**20, K-fold cindex 0.848338
regparam 2**-8 2**21, test set cindex 0.864690
regparam 2**-8 2**21, K-fold cindex 0.852689
regparam 2**-8 2**22, test set cindex 0.868333
regparam 2**-8 2**22, K-fold cindex 0.855798
regparam 2**-8 2**23, test set cindex 0.870697
regparam 2**-8 2**23, K-fold cindex 0.857111
regparam 2**-8 2**24, test set cindex 0.871736
regparam 2**-8 2**24, K-fold cindex 0.856163
regparam 2**-7 2**20, test set cindex 0.858433
regparam 2**-7 2**20, K-fold cindex 0.846538
regparam 2**-7 2**21, test set cindex 0.863401
regparam 2**-7 2**21, K-fold cindex 0.850740
regparam 2**-7 2**22, test set cindex 0.866956
regparam 2**-7 2**22, K-fold cindex 0.853785
regparam 2**-7 2**23, test set cindex 0.869128
regparam 2**-7 2**23, K-fold cindex 0.854982
regparam 2**-7 2**24, test set cindex 0.869852
regparam 2**-7 2**24, K-fold cindex 0.853804
regparam 2**-6 2**20, test set cindex 0.856016
regparam 2**-6 2**20, K-fold cindex 0.843186
regparam 2**-6 2**21, test set cindex 0.860732
regparam 2**-6 2**21, K-fold cindex 0.847180
regparam 2**-6 2**22, test set cindex 0.864151
regparam 2**-6 2**22, K-fold cindex 0.850133
regparam 2**-6 2**23, test set cindex 0.866186
regparam 2**-6 2**23, K-fold cindex 0.851215
regparam 2**-6 2**24, test set cindex 0.866570
regparam 2**-6 2**24, K-fold cindex 0.849806
regparam 2**-5 2**20, test set cindex 0.851254
regparam 2**-5 2**20, K-fold cindex 0.837417
regparam 2**-5 2**21, test set cindex 0.855569
regparam 2**-5 2**21, K-fold cindex 0.841089
regparam 2**-5 2**22, test set cindex 0.858841
regparam 2**-5 2**22, K-fold cindex 0.843835
regparam 2**-5 2**23, test set cindex 0.860764
regparam 2**-5 2**23, K-fold cindex 0.844748
regparam 2**-5 2**24, test set cindex 0.860787
regparam 2**-5 2**24, K-fold cindex 0.843035

Setting D

Out-of-sample leave-pair-out. On each CV round, a single (drug, target) pair is used as the test pair (similar to setting A). However, all pairs in which either the drug or the target appears are left out of the training set.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingD_split()
    learner = TwoStepRLS(X1 = X1_train, X2 = X2_train, Y = Y_train, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.predict(X1_test, X2_test)
            perf = cindex(Y_test, P)
            print("regparam 2**%d 2**%d, test set cindex %f" %(log_regparam1, log_regparam2, perf))
            P = learner.out_of_sample_loo()
            perf = cindex(Y_train, P)
            print("regparam 2**%d 2**%d, out-of-sample loo cindex %f" %(log_regparam1, log_regparam2, perf))

if __name__=="__main__":
    main()
regparam 2**-8 2**20, test set cindex 0.624309
regparam 2**-8 2**20, out-of-sample loo cindex 0.624975
regparam 2**-8 2**21, test set cindex 0.625197
regparam 2**-8 2**21, out-of-sample loo cindex 0.625858
regparam 2**-8 2**22, test set cindex 0.625455
regparam 2**-8 2**22, out-of-sample loo cindex 0.626316
regparam 2**-8 2**23, test set cindex 0.624484
regparam 2**-8 2**23, out-of-sample loo cindex 0.625976
regparam 2**-8 2**24, test set cindex 0.622206
regparam 2**-8 2**24, out-of-sample loo cindex 0.624525
regparam 2**-7 2**20, test set cindex 0.636002
regparam 2**-7 2**20, out-of-sample loo cindex 0.637527
regparam 2**-7 2**21, test set cindex 0.637207
regparam 2**-7 2**21, out-of-sample loo cindex 0.638591
regparam 2**-7 2**22, test set cindex 0.637822
regparam 2**-7 2**22, out-of-sample loo cindex 0.639295
regparam 2**-7 2**23, test set cindex 0.637217
regparam 2**-7 2**23, out-of-sample loo cindex 0.639258
regparam 2**-7 2**24, test set cindex 0.635080
regparam 2**-7 2**24, out-of-sample loo cindex 0.638165
regparam 2**-6 2**20, test set cindex 0.648428
regparam 2**-6 2**20, out-of-sample loo cindex 0.648411
regparam 2**-6 2**21, test set cindex 0.650012
regparam 2**-6 2**21, out-of-sample loo cindex 0.649594
regparam 2**-6 2**22, test set cindex 0.650829
regparam 2**-6 2**22, out-of-sample loo cindex 0.650417
regparam 2**-6 2**23, test set cindex 0.650597
regparam 2**-6 2**23, out-of-sample loo cindex 0.650526
regparam 2**-6 2**24, test set cindex 0.648906
regparam 2**-6 2**24, out-of-sample loo cindex 0.649560
regparam 2**-5 2**20, test set cindex 0.659964
regparam 2**-5 2**20, out-of-sample loo cindex 0.656937
regparam 2**-5 2**21, test set cindex 0.661793
regparam 2**-5 2**21, out-of-sample loo cindex 0.658125
regparam 2**-5 2**22, test set cindex 0.663059
regparam 2**-5 2**22, out-of-sample loo cindex 0.658953
regparam 2**-5 2**23, test set cindex 0.663037
regparam 2**-5 2**23, out-of-sample loo cindex 0.659027
regparam 2**-5 2**24, test set cindex 0.661705
regparam 2**-5 2**24, out-of-sample loo cindex 0.657928

K-fold cross-validation. Same as above, but several (drug, target) pairs are left out at once.

from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex
import davis_data
from rlscore.utilities.cross_validation import random_folds

def main():
    X1_train, X2_train, Y_train, X1_test, X2_test, Y_test = davis_data.settingD_split()
    n = X1_train.shape[0]
    m = X2_train.shape[0]
    learner = TwoStepRLS(X1 = X1_train, X2 = X2_train, Y = Y_train, regparam1=1.0, regparam2=1.0)
    log_regparams1 = range(-8, -4)
    log_regparams2 = range(20,25)
    #Create random split to 5 folds for both drugs and targets
    drug_folds = random_folds(n, 5, seed=123)
    target_folds = random_folds(m, 5, seed=456)
    for log_regparam1 in log_regparams1:
        for log_regparam2 in log_regparams2:
            learner.solve(2.**log_regparam1, 2.**log_regparam2)
            P = learner.predict(X1_test, X2_test)
            perf = cindex(Y_test, P)
            print("regparam 2**%d 2**%d, test set cindex %f" %(log_regparam1, log_regparam2, perf))
            P = learner.out_of_sample_kfold_cv(drug_folds, target_folds)
            perf = cindex(Y_train, P)
            print("regparam 2**%d 2**%d, out-of-sample loo cindex %f" %(log_regparam1, log_regparam2, perf))

if __name__=="__main__":
    main()
regparam 2**-8 2**20, test set cindex 0.624309
regparam 2**-8 2**20, out-of-sample k-fold cindex 0.618318
regparam 2**-8 2**21, test set cindex 0.625197
regparam 2**-8 2**21, out-of-sample k-fold cindex 0.618970
regparam 2**-8 2**22, test set cindex 0.625455
regparam 2**-8 2**22, out-of-sample k-fold cindex 0.619301
regparam 2**-8 2**23, test set cindex 0.624484
regparam 2**-8 2**23, out-of-sample k-fold cindex 0.618913
regparam 2**-8 2**24, test set cindex 0.622206
regparam 2**-8 2**24, out-of-sample k-fold cindex 0.617366
regparam 2**-7 2**20, test set cindex 0.636002
regparam 2**-7 2**20, out-of-sample k-fold cindex 0.626475
regparam 2**-7 2**21, test set cindex 0.637207
regparam 2**-7 2**21, out-of-sample k-fold cindex 0.627182
regparam 2**-7 2**22, test set cindex 0.637822
regparam 2**-7 2**22, out-of-sample k-fold cindex 0.627465
regparam 2**-7 2**23, test set cindex 0.637217
regparam 2**-7 2**23, out-of-sample k-fold cindex 0.627089
regparam 2**-7 2**24, test set cindex 0.635080
regparam 2**-7 2**24, out-of-sample k-fold cindex 0.625512
regparam 2**-6 2**20, test set cindex 0.648428
regparam 2**-6 2**20, out-of-sample k-fold cindex 0.635430
regparam 2**-6 2**21, test set cindex 0.650012
regparam 2**-6 2**21, out-of-sample k-fold cindex 0.636122
regparam 2**-6 2**22, test set cindex 0.650829
regparam 2**-6 2**22, out-of-sample k-fold cindex 0.636423
regparam 2**-6 2**23, test set cindex 0.650597
regparam 2**-6 2**23, out-of-sample k-fold cindex 0.635981
regparam 2**-6 2**24, test set cindex 0.648906
regparam 2**-6 2**24, out-of-sample k-fold cindex 0.634308
regparam 2**-5 2**20, test set cindex 0.659964
regparam 2**-5 2**20, out-of-sample k-fold cindex 0.644468
regparam 2**-5 2**21, test set cindex 0.661793
regparam 2**-5 2**21, out-of-sample k-fold cindex 0.645096
regparam 2**-5 2**22, test set cindex 0.663059
regparam 2**-5 2**22, out-of-sample k-fold cindex 0.645339
regparam 2**-5 2**23, test set cindex 0.663037
regparam 2**-5 2**23, out-of-sample k-fold cindex 0.644775
regparam 2**-5 2**24, test set cindex 0.661705
regparam 2**-5 2**24, out-of-sample k-fold cindex 0.642933

Tutorial 3: TwoStepRLS, cross-validation with a homogeneous network

Here, our goal is to predict one type of drug-drug similarity matrix (ECFP4 similarities) from another (2D similarities). For these experiments, we need to download, from the drug-target binding affinity data sets page, the Metz et al. data: the drug-drug ECFP4 similarities (Y) and the drug-drug 2D similarities (X).

These implementations are a work in progress, and the interface may still change. Currently, only the kernel versions are implemented, and only symmetric (f(u,v) = f(v,u)) or antisymmetric (f(u,v) = -f(v,u)) labels are supported.
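
A quick sanity check that the labels satisfy the symmetry requirement (assuming the similarity file downloaded from the data page):

import numpy as np
Y = np.loadtxt("drug-drug_similarities_ECFP4.txt")
print(np.allclose(Y, Y.T)) #expected True: f(u,v) = f(v,u)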

Setting A

Leave-edge-out cross-validation. On each round of CV, a single (drug_i, drug_j) pair is left out of the training set as the test pair.

import numpy as np
from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex

def main():
    X = np.loadtxt("drug-drug_similarities_2D.txt")
    Y = np.loadtxt("drug-drug_similarities_ECFP4.txt")
    Y = Y.ravel(order='F')
    K = np.dot(X, X) #linear kernel; X is a symmetric similarity matrix, so this equals np.dot(X, X.T)
    learner = TwoStepRLS(K1 = K, K2 = K, Y = Y, regparam1=1.0, regparam2=1.0)
    log_regparams = range(-10,0)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam, 2.**log_regparam)
        P = learner.in_sample_loo_symmetric()
        perf = cindex(Y, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))



if __name__=="__main__":
    main()
regparam 2**-10, cindex 0.718990
regparam 2**-9, cindex 0.725494
regparam 2**-8, cindex 0.732533
regparam 2**-7, cindex 0.738109
regparam 2**-6, cindex 0.742912
regparam 2**-5, cindex 0.743215
regparam 2**-4, cindex 0.742478
regparam 2**-3, cindex 0.743173
regparam 2**-2, cindex 0.741882
regparam 2**-1, cindex 0.733905

Setting B/C

Leave-vertex-out cross-validation. On each CV round, a single holdout drug d_i is left out, and all (drug_i, drug_j) pairs this drug belongs to are used as the test fold.

import numpy as np
from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex

def main():
    X = np.loadtxt("drug-drug_similarities_2D.txt")
    Y = np.loadtxt("drug-drug_similarities_ECFP4.txt")
    Y = Y.ravel(order='F')
    K = np.dot(X, X)
    learner = TwoStepRLS(K1 = K, K2 = K, Y = Y, regparam1=1.0, regparam2=1.0)
    log_regparams = range(-10, 0)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam, 2.**log_regparam)
        P = learner.leave_vertex_out()
        perf = cindex(Y, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))



if __name__=="__main__":
    main()
regparam 2**-10, cindex 0.571253
regparam 2**-9, cindex 0.579307
regparam 2**-8, cindex 0.593338
regparam 2**-7, cindex 0.619724
regparam 2**-6, cindex 0.669924
regparam 2**-5, cindex 0.737458
regparam 2**-4, cindex 0.794501
regparam 2**-3, cindex 0.815712
regparam 2**-2, cindex 0.806867
regparam 2**-1, cindex 0.780328

Setting D

Out-of-sample leave-one-out. On each CV round, a single (drug_i, drug_j) pair is used as the test pair (similar to setting A). However, all other pairs in which either of these drugs appears are left out of the training set.

import numpy as np
from rlscore.learner import TwoStepRLS
from rlscore.measure import cindex

def main():
    X = np.loadtxt("drug-drug_similarities_2D.txt")
    Y = np.loadtxt("drug-drug_similarities_ECFP4.txt")
    Y = Y.ravel(order='F')
    K = np.dot(X, X)
    learner = TwoStepRLS(K1 = K, K2 = K, Y = Y, regparam1=1.0, regparam2=1.0)
    log_regparams = range(-10, 0)
    for log_regparam in log_regparams:
        learner.solve(2.**log_regparam, 2.**log_regparam)
        P = learner.out_of_sample_loo_symmetric()
        perf = cindex(Y, P)
        print("regparam 2**%d, cindex %f" %(log_regparam, perf))



if __name__=="__main__":
    main()
regparam 2**-10, cindex 0.614197
regparam 2**-9, cindex 0.625297
regparam 2**-8, cindex 0.637933
regparam 2**-7, cindex 0.652413
regparam 2**-6, cindex 0.665824
regparam 2**-5, cindex 0.678009
regparam 2**-4, cindex 0.687217
regparam 2**-3, cindex 0.692244
regparam 2**-2, cindex 0.691334
regparam 2**-1, cindex 0.683918

Tutorial 4: CGKronRLS, incomplete data

In many applications the correct labels are not available for all (u,v) pairs in the training set; rather, only a (possibly small) fraction of them are known. Next, we consider learning from such data using an iterative Kronecker RLS training algorithm, based on a generalization of the classical vec trick shortcut for Kronecker products and a conjugate gradient type of optimization approach [7]. CGKronRLS is an iterative training algorithm; in the following experiments, we use a callback function to check how the test set predictive accuracy behaves as a function of the number of training iterations. Early stopping of optimization has a regularizing effect, which is seen to be beneficial especially in setting D.
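
The key computational idea can be illustrated with a simplified NumPy sketch: a matrix-vector product with the Kronecker kernel restricted to the labeled (row, column) index pairs can be computed without ever forming the full Kronecker matrix. (RLScore's actual implementation uses a more refined algorithm; this dense version only conveys the principle.)

import numpy as np

def sampled_kron_matvec(K1, K2, v, rows, cols):
    #computes u[k] = sum_l K1[rows[k], rows[l]] * K2[cols[k], cols[l]] * v[l]
    #without forming the len(v) x len(v) Kronecker kernel matrix
    V = np.zeros((K1.shape[0], K2.shape[0]))
    np.add.at(V, (rows, cols), v) #scatter v onto the label grid
    M = K1.dot(V).dot(K2.T) #n1 x n2 intermediate
    return M[rows, cols]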

Data set

For these experiments, we need to download, from the drug-target binding affinity data sets page, the Metz et al. data: the drug-target interaction affinities (Y), the drug-drug 2D similarities (X1), and the WS normalized target-target similarities (X2). In the following, we use the similarity scores directly as features for the linear kernel, since the similarity matrices themselves are not valid positive semi-definite kernel matrices.

We can load the data set as follows:

import numpy as np

def load_metz():
    Y = np.loadtxt("known_drug-target_interaction_affinities_pKi__Metz_et_al.2011.txt")
    XD = np.loadtxt("drug-drug_similarities_2D__Metz_et_al.2011.txt")
    XT = np.loadtxt("target-target_similarities_WS_normalized__Metz_et_al.2011.txt")
    drug_inds, target_inds = np.where(np.isnan(Y)==False)
    Y = Y[drug_inds, target_inds]
    return XD, XT, Y, drug_inds, target_inds

def settingA_split():
    XD, XT, Y, drug_inds, target_inds = load_metz()
    np.random.seed(77)
    #random split to train/test, corresponds to setting A
    ind = list(range(len(Y)))
    np.random.shuffle(ind)
    train_ind = ind[:50000]
    test_ind = ind[50000:]
    train_drug_inds = drug_inds[train_ind]
    train_target_inds = target_inds[train_ind]
    Y_train = Y[train_ind]
    test_drug_inds = drug_inds[test_ind]
    test_target_inds = target_inds[test_ind]
    Y_test = Y[test_ind]
    return XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test

def settingB_split():
    XD, XT, Y, drug_inds, target_inds = load_metz()
    np.random.seed(77)
    #random split to train/test, corresponds to setting B
    drows = list(range(XD.shape[0]))
    np.random.shuffle(drows)
    train_drows = set(drows[:800])
    train_ind = []
    test_ind = []
    for i in range(len(drug_inds)):
        if drug_inds[i] in train_drows:
            train_ind.append(i)
        else:
            test_ind.append(i)
    train_drug_inds = drug_inds[train_ind]
    train_target_inds = target_inds[train_ind]
    Y_train = Y[train_ind]
    test_drug_inds = drug_inds[test_ind]
    test_target_inds = target_inds[test_ind]
    Y_test = Y[test_ind]
    return XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test

def settingC_split():
    XD, XT, Y, drug_inds, target_inds = load_metz()
    np.random.seed(77)
    #random split to train/test, corresponds to setting C
    trows = list(range(XT.shape[0]))
    np.random.shuffle(trows)
    train_trows = set(trows[:80])
    train_ind = []
    test_ind = []
    for i in range(len(target_inds)):
        if target_inds[i] in train_trows:
            train_ind.append(i)
        else:
            test_ind.append(i)
    train_drug_inds = drug_inds[train_ind]
    train_target_inds = target_inds[train_ind]
    Y_train = Y[train_ind]
    test_drug_inds = drug_inds[test_ind]
    test_target_inds = target_inds[test_ind]
    Y_test = Y[test_ind]
    return XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test


def settingD_split():
    XD, XT, Y, drug_inds, target_inds = load_metz()
    np.random.seed(77)
    #random split to train/test, corresponds to setting D
    drows = list(range(XD.shape[0]))
    np.random.shuffle(drows)
    train_drows = set(drows[:800])
    trows = list(range(XT.shape[0]))
    np.random.shuffle(trows)
    train_trows = set(trows[:80])
    train_ind = []
    test_ind = []
    for i in range(len(target_inds)):
        if drug_inds[i] in train_drows and target_inds[i] in train_trows:
            train_ind.append(i)
        elif drug_inds[i] not in train_drows and target_inds[i] not in train_trows:
            test_ind.append(i)
    train_drug_inds = drug_inds[train_ind]
    train_target_inds = target_inds[train_ind]
    Y_train = Y[train_ind]
    test_drug_inds = drug_inds[test_ind]
    test_target_inds = target_inds[test_ind]
    Y_test = Y[test_ind]
    return XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test

if __name__=="__main__":
    XD, XT, Y, drug_inds, target_inds = load_metz()
    print("XD dimensions %d %d" %XD.shape)
    print("XT dimensions %d %d" %XT.shape)
    print("Labeled pairs %d, all possible pairs %d" %(len(Y), XD.shape[0]*XT.shape[0]))

The code includes four functions to split the data into a training and test set according to the four different settings.

Setting A

First, we consider setting A, imputing missing values inside the Y-matrix.

from rlscore.learner import CGKronRLS
from rlscore.measure import cindex
import metz_data

class CallBack(object):

    def __init__(self, X1, X2, Y, row_inds, col_inds):
        self.X1 = X1
        self.X2 = X2
        self.Y = Y
        self.row_inds = row_inds
        self.col_inds = col_inds
        self.iter = 1

    def callback(self, learner):
        if self.iter%10 == 0:
            P = learner.predict(self.X1, self.X2, self.row_inds, self.col_inds)
            perf = cindex(self.Y, P)
            print("iteration %d cindex %f" %(self.iter, perf))
        self.iter += 1

    def finished(self, learner):
        pass
    
def main():
    XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test = metz_data.settingA_split()
    cb = CallBack(XD, XT, Y_test, test_drug_inds, test_target_inds)
    learner = CGKronRLS(X1 = XD, X2 = XT, Y=Y_train, label_row_inds = train_drug_inds, label_col_inds = train_target_inds, callback = cb, maxiter=1000)
    

if __name__=="__main__":
    main()
iteration 10 cindex 0.663269
iteration 20 cindex 0.711649
iteration 30 cindex 0.723401
iteration 40 cindex 0.731723
iteration 50 cindex 0.736525
iteration 60 cindex 0.742051
iteration 70 cindex 0.745895
iteration 80 cindex 0.748916
iteration 90 cindex 0.752246
iteration 100 cindex 0.754948
iteration 110 cindex 0.757870
iteration 120 cindex 0.759562
iteration 130 cindex 0.761514
iteration 140 cindex 0.763529
iteration 150 cindex 0.765115
iteration 160 cindex 0.766211
iteration 170 cindex 0.767283
iteration 180 cindex 0.768554
iteration 190 cindex 0.769795
iteration 200 cindex 0.770929
iteration 210 cindex 0.772193
iteration 220 cindex 0.773187
iteration 230 cindex 0.774213
iteration 240 cindex 0.775022
iteration 250 cindex 0.775675
iteration 260 cindex 0.776379
iteration 270 cindex 0.776835
iteration 280 cindex 0.777696
iteration 290 cindex 0.778162
iteration 300 cindex 0.778763
iteration 310 cindex 0.779539
iteration 320 cindex 0.780013
iteration 330 cindex 0.780932
iteration 340 cindex 0.781495
iteration 350 cindex 0.781950
iteration 360 cindex 0.782539
iteration 370 cindex 0.783041
iteration 380 cindex 0.783477
iteration 390 cindex 0.783882
iteration 400 cindex 0.784308
iteration 410 cindex 0.784765
iteration 420 cindex 0.785084
iteration 430 cindex 0.785388
iteration 440 cindex 0.785864
iteration 450 cindex 0.786244
iteration 460 cindex 0.786477
iteration 470 cindex 0.786903
iteration 480 cindex 0.787255
iteration 490 cindex 0.787576
iteration 500 cindex 0.787963
iteration 510 cindex 0.788213
iteration 520 cindex 0.788503
iteration 530 cindex 0.788884
iteration 540 cindex 0.789157
iteration 550 cindex 0.789457
iteration 560 cindex 0.789799
iteration 570 cindex 0.790070
iteration 580 cindex 0.790305
iteration 590 cindex 0.790562
iteration 600 cindex 0.790768
iteration 610 cindex 0.790995
iteration 620 cindex 0.791228
iteration 630 cindex 0.791398
iteration 640 cindex 0.791631
iteration 650 cindex 0.791762
iteration 660 cindex 0.791995
iteration 670 cindex 0.792214
iteration 680 cindex 0.792352
iteration 690 cindex 0.792546
iteration 700 cindex 0.792718
iteration 710 cindex 0.792898
iteration 720 cindex 0.793073
iteration 730 cindex 0.793210
iteration 740 cindex 0.793375
iteration 750 cindex 0.793516
iteration 760 cindex 0.793625
iteration 770 cindex 0.793804
iteration 780 cindex 0.793949
iteration 790 cindex 0.794091
iteration 800 cindex 0.794209
iteration 810 cindex 0.794301
iteration 820 cindex 0.794446
iteration 830 cindex 0.794525
iteration 840 cindex 0.794616
iteration 850 cindex 0.794711
iteration 860 cindex 0.794802
iteration 870 cindex 0.794900
iteration 880 cindex 0.795020
iteration 890 cindex 0.795096
iteration 900 cindex 0.795205
iteration 910 cindex 0.795314
iteration 920 cindex 0.795385
iteration 930 cindex 0.795489
iteration 940 cindex 0.795559
iteration 950 cindex 0.795663
iteration 960 cindex 0.795753
iteration 970 cindex 0.795841
iteration 980 cindex 0.795920
iteration 990 cindex 0.796021
iteration 1000 cindex 0.796096

Even at 1000 iterations the test set cindex is still slowly increasing; it might be beneficial to continue optimization even further.

Setting B

Next, we consider setting B, generalizing to predictions for new drugs that were not observed in the training set.

from rlscore.learner import CGKronRLS
from rlscore.measure import cindex
import metz_data

class CallBack(object):

    def __init__(self, X1, X2, Y, row_inds, col_inds):
        self.X1 = X1
        self.X2 = X2
        self.Y = Y
        self.row_inds = row_inds
        self.col_inds = col_inds
        self.iter = 1

    def callback(self, learner):
        if self.iter%10 == 0:
            P = learner.predict(self.X1, self.X2, self.row_inds, self.col_inds)
            perf = cindex(self.Y, P)
            print("iteration %d cindex %f" %(self.iter, perf))
        self.iter += 1

    def finished(self, learner):
        pass
    
def main():
    XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test = metz_data.settingB_split()
    cb = CallBack(XD, XT, Y_test, test_drug_inds, test_target_inds)
    learner = CGKronRLS(X1 = XD, X2 = XT, Y=Y_train, label_row_inds = train_drug_inds, label_col_inds = train_target_inds, callback = cb, maxiter=1000)
    

if __name__=="__main__":
    main()
iteration 10 cindex 0.662735
iteration 20 cindex 0.705899
iteration 30 cindex 0.711650
iteration 40 cindex 0.718579
iteration 50 cindex 0.722185
iteration 60 cindex 0.724653
iteration 70 cindex 0.725704
iteration 80 cindex 0.726423
iteration 90 cindex 0.727315
iteration 100 cindex 0.727875
iteration 110 cindex 0.727771
iteration 120 cindex 0.727416
iteration 130 cindex 0.726778
iteration 140 cindex 0.726266
iteration 150 cindex 0.725675
iteration 160 cindex 0.725386
iteration 170 cindex 0.725221
iteration 180 cindex 0.725049
iteration 190 cindex 0.724844
iteration 200 cindex 0.724400
iteration 210 cindex 0.723917
iteration 220 cindex 0.723213
iteration 230 cindex 0.722629
iteration 240 cindex 0.722286
iteration 250 cindex 0.721923
iteration 260 cindex 0.721169
iteration 270 cindex 0.720778
iteration 280 cindex 0.720153
iteration 290 cindex 0.719602
iteration 300 cindex 0.719064
iteration 310 cindex 0.718131
iteration 320 cindex 0.717484
iteration 330 cindex 0.716514
iteration 340 cindex 0.715976
iteration 350 cindex 0.714997
iteration 360 cindex 0.714572
iteration 370 cindex 0.713973
iteration 380 cindex 0.713348
iteration 390 cindex 0.712987
iteration 400 cindex 0.712495
iteration 410 cindex 0.712041
iteration 420 cindex 0.711481
iteration 430 cindex 0.711024
iteration 440 cindex 0.710647
iteration 450 cindex 0.710086
iteration 460 cindex 0.709655
iteration 470 cindex 0.709208
iteration 480 cindex 0.708799
iteration 490 cindex 0.708517
iteration 500 cindex 0.708192
iteration 510 cindex 0.707744
iteration 520 cindex 0.707395
iteration 530 cindex 0.707147
iteration 540 cindex 0.706883
iteration 550 cindex 0.706510
iteration 560 cindex 0.706215
iteration 570 cindex 0.705977
iteration 580 cindex 0.705643
iteration 590 cindex 0.705311
iteration 600 cindex 0.705032
iteration 610 cindex 0.704709
iteration 620 cindex 0.704457
iteration 630 cindex 0.704233
iteration 640 cindex 0.703959
iteration 650 cindex 0.703630
iteration 660 cindex 0.703400
iteration 670 cindex 0.703020
iteration 680 cindex 0.702779
iteration 690 cindex 0.702472
iteration 700 cindex 0.702195
iteration 710 cindex 0.701954
iteration 720 cindex 0.701726
iteration 730 cindex 0.701455
iteration 740 cindex 0.701199
iteration 750 cindex 0.700996
iteration 760 cindex 0.700655
iteration 770 cindex 0.700362
iteration 780 cindex 0.700122
iteration 790 cindex 0.699881
iteration 800 cindex 0.699625
iteration 810 cindex 0.699392
iteration 820 cindex 0.699111
iteration 830 cindex 0.698917
iteration 840 cindex 0.698721
iteration 850 cindex 0.698546
iteration 860 cindex 0.698384
iteration 870 cindex 0.697999
iteration 880 cindex 0.697805
iteration 890 cindex 0.697622
iteration 900 cindex 0.697288
iteration 910 cindex 0.697141
iteration 920 cindex 0.696962
iteration 930 cindex 0.696693
iteration 940 cindex 0.696498
iteration 950 cindex 0.696361
iteration 960 cindex 0.696202
iteration 970 cindex 0.696019
iteration 980 cindex 0.695773
iteration 990 cindex 0.695513
iteration 1000 cindex 0.695348

Now the performance peaks at around a hundred iterations and then starts to slowly decrease. In this setting, regularization by early stopping appears to be beneficial.
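
Based on this learning curve, early stopping can be implemented simply by limiting the number of iterations. Below is a minimal sketch that re-runs the setting B experiment with the optimization terminated at 100 iterations; note that in a real application the stopping point should be chosen on a separate validation set, whereas here the test set is used purely for illustration.

from rlscore.learner import CGKronRLS
from rlscore.measure import cindex
import metz_data

def main():
    XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test = metz_data.settingB_split()
    #Terminate the optimization close to where the test c-index peaked above
    learner = CGKronRLS(X1 = XD, X2 = XT, Y=Y_train, label_row_inds = train_drug_inds, label_col_inds = train_target_inds, maxiter=100)
    P = learner.predict(XD, XT, test_drug_inds, test_target_inds)
    print("cindex after 100 iterations %f" %cindex(Y_test, P))

if __name__=="__main__":
    main()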

Setting C

Next, we consider setting C, generalizing to predictions for new targets that were not observed in the training set.

from rlscore.learner import CGKronRLS
from rlscore.measure import cindex
import metz_data

class CallBack(object):

    def __init__(self, X1, X2, Y, row_inds, col_inds):
        self.X1 = X1
        self.X2 = X2
        self.Y = Y
        self.row_inds = row_inds
        self.col_inds = col_inds
        self.iter = 1

    def callback(self, learner):
        if self.iter%10 == 0:
            P = learner.predict(self.X1, self.X2, self.row_inds, self.col_inds)
            perf = cindex(self.Y, P)
            print("iteration %d cindex %f" %(self.iter, perf))
        self.iter += 1

    def finished(self, learner):
        pass
    
def main():
    XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test = metz_data.settingC_split()
    cb = CallBack(XD, XT, Y_test, test_drug_inds, test_target_inds)
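    #CGKronRLS trains the model during construction; the callback reports the test set c-index every 10 iterations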
    learner = CGKronRLS(X1 = XD, X2 = XT, Y=Y_train, label_row_inds = train_drug_inds, label_col_inds = train_target_inds, callback = cb, maxiter=1000)
    

if __name__=="__main__":
    main()
iteration 10 cindex 0.608983
iteration 20 cindex 0.610994
iteration 30 cindex 0.611172
iteration 40 cindex 0.608488
iteration 50 cindex 0.605859
iteration 60 cindex 0.602215
iteration 70 cindex 0.600122
iteration 80 cindex 0.600734
iteration 90 cindex 0.602477
iteration 100 cindex 0.605751
iteration 110 cindex 0.608898
iteration 120 cindex 0.610367
iteration 130 cindex 0.610609
iteration 140 cindex 0.609669
iteration 150 cindex 0.608698
iteration 160 cindex 0.607995
iteration 170 cindex 0.608126
iteration 180 cindex 0.608752
iteration 190 cindex 0.609594
iteration 200 cindex 0.610229
iteration 210 cindex 0.610807
iteration 220 cindex 0.611119
iteration 230 cindex 0.611249
iteration 240 cindex 0.611444
iteration 250 cindex 0.612123
iteration 260 cindex 0.612627
iteration 270 cindex 0.613632
iteration 280 cindex 0.614712
iteration 290 cindex 0.615403
iteration 300 cindex 0.615942
iteration 310 cindex 0.616248
iteration 320 cindex 0.616590
iteration 330 cindex 0.616789
iteration 340 cindex 0.617239
iteration 350 cindex 0.617785
iteration 360 cindex 0.618310
iteration 370 cindex 0.618666
iteration 380 cindex 0.618812
iteration 390 cindex 0.618748
iteration 400 cindex 0.618785
iteration 410 cindex 0.618773
iteration 420 cindex 0.618913
iteration 430 cindex 0.619156
iteration 440 cindex 0.619458
iteration 450 cindex 0.619847
iteration 460 cindex 0.620143
iteration 470 cindex 0.620308
iteration 480 cindex 0.620430
iteration 490 cindex 0.620664
iteration 500 cindex 0.621065
iteration 510 cindex 0.621354
iteration 520 cindex 0.621889
iteration 530 cindex 0.622320
iteration 540 cindex 0.622623
iteration 550 cindex 0.623032
iteration 560 cindex 0.623328
iteration 570 cindex 0.623535
iteration 580 cindex 0.623727
iteration 590 cindex 0.623989
iteration 600 cindex 0.624302
iteration 610 cindex 0.624554
iteration 620 cindex 0.624809
iteration 630 cindex 0.624984
iteration 640 cindex 0.625122
iteration 650 cindex 0.625272
iteration 660 cindex 0.625394
iteration 670 cindex 0.625607
iteration 680 cindex 0.625824
iteration 690 cindex 0.626180
iteration 700 cindex 0.626362
iteration 710 cindex 0.626692
iteration 720 cindex 0.627064
iteration 730 cindex 0.627259
iteration 740 cindex 0.627514
iteration 750 cindex 0.627848
iteration 760 cindex 0.628185
iteration 770 cindex 0.628377
iteration 780 cindex 0.628581
iteration 790 cindex 0.628831
iteration 800 cindex 0.628954
iteration 810 cindex 0.629140
iteration 820 cindex 0.629274
iteration 830 cindex 0.629441
iteration 840 cindex 0.629645
iteration 850 cindex 0.629729
iteration 860 cindex 0.629801
iteration 870 cindex 0.629992
iteration 880 cindex 0.630097
iteration 890 cindex 0.630285
iteration 900 cindex 0.630523
iteration 910 cindex 0.630679
iteration 920 cindex 0.630889
iteration 930 cindex 0.631041
iteration 940 cindex 0.631347
iteration 950 cindex 0.631545
iteration 960 cindex 0.631827
iteration 970 cindex 0.632046
iteration 980 cindex 0.632301
iteration 990 cindex 0.632556
iteration 1000 cindex 0.632638

The behaviour is similar to that in setting A: the performance is still slowly increasing when the optimization is terminated.
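
Since the test performance is still improving when the optimization stops, one option is simply to allow more iterations. The sketch below repeats the setting C experiment with a larger, arbitrarily chosen iteration budget, re-using the CallBack class defined above; whether the improvement continues that far would need to be verified from the resulting learning curve.

def main():
    XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test = metz_data.settingC_split()
    cb = CallBack(XD, XT, Y_test, test_drug_inds, test_target_inds)
    #Same experiment as above, but with three times the iteration budget
    learner = CGKronRLS(X1 = XD, X2 = XT, Y=Y_train, label_row_inds = train_drug_inds, label_col_inds = train_target_inds, callback = cb, maxiter=3000)

if __name__=="__main__":
    main()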

Setting D

Finally, we consider the most demanding setting D: generalizing to new (u,v) pairs where neither u nor v has been observed in the training set.

from rlscore.learner import CGKronRLS
from rlscore.measure import cindex
import metz_data

class CallBack(object):

    def __init__(self, X1, X2, Y, row_inds, col_inds):
        self.X1 = X1
        self.X2 = X2
        self.Y = Y
        self.row_inds = row_inds
        self.col_inds = col_inds
        self.iter = 1

    def callback(self, learner):
        if self.iter%10 == 0:
            P = learner.predict(self.X1, self.X2, self.row_inds, self.col_inds)
            perf = cindex(self.Y, P)
            print("iteration %d cindex %f" %(self.iter, perf))
        self.iter += 1

    def finished(self, learner):
        pass
    
def main():
    XD, XT, train_drug_inds, train_target_inds, Y_train, test_drug_inds, test_target_inds, Y_test = metz_data.settingD_split()
    cb = CallBack(XD, XT, Y_test, test_drug_inds, test_target_inds)
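    #CGKronRLS trains the model during construction; the callback reports the test set c-index every 10 iterations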
    learner = CGKronRLS(X1 = XD, X2 = XT, Y=Y_train, label_row_inds = train_drug_inds, label_col_inds = train_target_inds, callback = cb, maxiter=1000)
    

if __name__=="__main__":
    main()
iteration 10 cindex 0.593853
iteration 20 cindex 0.617019
iteration 30 cindex 0.617731
iteration 40 cindex 0.613917
iteration 50 cindex 0.611996
iteration 60 cindex 0.607733
iteration 70 cindex 0.606607
iteration 80 cindex 0.606964
iteration 90 cindex 0.608529
iteration 100 cindex 0.610105
iteration 110 cindex 0.610333
iteration 120 cindex 0.610189
iteration 130 cindex 0.609194
iteration 140 cindex 0.608503
iteration 150 cindex 0.608736
iteration 160 cindex 0.609599
iteration 170 cindex 0.610378
iteration 180 cindex 0.611034
iteration 190 cindex 0.611145
iteration 200 cindex 0.610896
iteration 210 cindex 0.610558
iteration 220 cindex 0.610450
iteration 230 cindex 0.610506
iteration 240 cindex 0.610654
iteration 250 cindex 0.611199
iteration 260 cindex 0.611325
iteration 270 cindex 0.611274
iteration 280 cindex 0.611280
iteration 290 cindex 0.611414
iteration 300 cindex 0.611622
iteration 310 cindex 0.611768
iteration 320 cindex 0.612021
iteration 330 cindex 0.612064
iteration 340 cindex 0.612003
iteration 350 cindex 0.612003
iteration 360 cindex 0.611893
iteration 370 cindex 0.611777
iteration 380 cindex 0.611682
iteration 390 cindex 0.611499
iteration 400 cindex 0.611408
iteration 410 cindex 0.611271
iteration 420 cindex 0.611089
iteration 430 cindex 0.611000
iteration 440 cindex 0.610876
iteration 450 cindex 0.610804
iteration 460 cindex 0.610717
iteration 470 cindex 0.610668
iteration 480 cindex 0.610627
iteration 490 cindex 0.610565
iteration 500 cindex 0.610413
iteration 510 cindex 0.610250
iteration 520 cindex 0.610163
iteration 530 cindex 0.610087
iteration 540 cindex 0.609982
iteration 550 cindex 0.609893
iteration 560 cindex 0.609822
iteration 570 cindex 0.609755
iteration 580 cindex 0.609656
iteration 590 cindex 0.609555
iteration 600 cindex 0.609480
iteration 610 cindex 0.609375
iteration 620 cindex 0.609312
iteration 630 cindex 0.609225
iteration 640 cindex 0.609144
iteration 650 cindex 0.608971
iteration 660 cindex 0.608851
iteration 670 cindex 0.608780
iteration 680 cindex 0.608694
iteration 690 cindex 0.608648
iteration 700 cindex 0.608624
iteration 710 cindex 0.608544
iteration 720 cindex 0.608499
iteration 730 cindex 0.608437
iteration 740 cindex 0.608342
iteration 750 cindex 0.608211
iteration 760 cindex 0.608114
iteration 770 cindex 0.608014
iteration 780 cindex 0.607899
iteration 790 cindex 0.607814
iteration 800 cindex 0.607718
iteration 810 cindex 0.607660
iteration 820 cindex 0.607621
iteration 830 cindex 0.607547
iteration 840 cindex 0.607434
iteration 850 cindex 0.607325
iteration 860 cindex 0.607187
iteration 870 cindex 0.607089
iteration 880 cindex 0.607000
iteration 890 cindex 0.606932
iteration 900 cindex 0.606840
iteration 910 cindex 0.606729
iteration 920 cindex 0.606597
iteration 930 cindex 0.606499
iteration 940 cindex 0.606428
iteration 950 cindex 0.606392
iteration 960 cindex 0.606309
iteration 970 cindex 0.606156
iteration 980 cindex 0.606022
iteration 990 cindex 0.605862
iteration 1000 cindex 0.605803

Now the best results are reached already within the first few tens of iterations (the c-index peaks around iteration 30), after which the model starts to overfit. Again, regularization by early stopping would be beneficial.
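
When the best iteration number is not known in advance, the same callback mechanism can be used to keep track of the best result encountered during optimization. Below is a minimal sketch of such a callback; the class name and the best_perf / best_iter attributes are illustrative choices, not part of the rlscore API, and as before the monitored data should in practice be a validation set rather than the final test set.

from rlscore.measure import cindex

class BestIterationCallBack(object):

    def __init__(self, X1, X2, Y, row_inds, col_inds):
        self.X1 = X1
        self.X2 = X2
        self.Y = Y
        self.row_inds = row_inds
        self.col_inds = col_inds
        self.iter = 1
        self.best_perf = 0.0
        self.best_iter = 0

    def callback(self, learner):
        #Evaluate after every iteration, remembering the best c-index seen so far
        P = learner.predict(self.X1, self.X2, self.row_inds, self.col_inds)
        perf = cindex(self.Y, P)
        if perf > self.best_perf:
            self.best_perf = perf
            self.best_iter = self.iter
        self.iter += 1

    def finished(self, learner):
        print("best cindex %f reached at iteration %d" %(self.best_perf, self.best_iter))

An instance of this class can be passed to CGKronRLS via the callback argument, in place of the CallBack class used in the experiments above.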

References

[1] Michiel Stock, Tapio Pahikkala, Antti Airola, Bernard De Baets, and Willem Waegeman. A comparative study of pairwise learning methods based on kernel ridge regression. Neural Computation, 30(8):2245–2283, 2018.
[2] Michiel Stock, Tapio Pahikkala, Antti Airola, Willem Waegeman, and Bernard De Baets. Algebraic shortcuts for leave-one-out cross-validation in supervised network inference. bioRxiv, 242321, 2018.
[3] Tapio Pahikkala, Willem Waegeman, Antti Airola, Tapio Salakoski, and Bernard De Baets. Conditional ranking on relational data. In José L. Balcázar, Francesco Bonchi, Aristides Gionis, and Michèle Sebag, editors, Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2010), volume 6322 of Lecture Notes in Computer Science, pages 499–514. Springer, 2010.
[4] Tapio Pahikkala, Antti Airola, Michiel Stock, Bernard De Baets, and Willem Waegeman. Efficient regularized least-squares algorithms for conditional ranking on relational data. Machine Learning, 93(2-3):321–356, 2013.
[5] Tapio Pahikkala, Michiel Stock, Antti Airola, Tero Aittokallio, Bernard De Baets, and Willem Waegeman. A two-step learning approach for solving full and almost full cold start problems in dyadic prediction. In Toon Calders, Floriana Esposito, Eyke Hüllermeier, and Rosa Meo, editors, Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2014), volume 8725 of Lecture Notes in Computer Science, pages 517–532. Springer, 2014.
[6] Tapio Pahikkala, Antti Airola, Sami Pietilä, Sushil Shakyawar, Agnieszka Szwajda, Jing Tang, and Tero Aittokallio. Toward more realistic drug-target interaction predictions. Briefings in Bioinformatics, 16(2):325–337, 2015.
[7] Antti Airola and Tapio Pahikkala. Fast Kronecker product kernel methods via generalized vec trick. IEEE Transactions on Neural Networks and Learning Systems, 99:1–14, 2017.