Partial Least Squares cross-decomposition (PLS regression)


Finds the fundamental relations between two matrices X and Y, i.e. it finds the (multidimensional) direction in X that explains the maximum multidimensional variance direction in Y. See also PCA-analysis.

Documentation

Partial Least Squares (PLS) cross-decomposition is a statistical method used to find the fundamental relations between two matrices, typically predictors (X) and responses (Y). It projects both X and Y into a lower-dimensional subspace such that the covariance between transformed(X) and transformed(Y) is maximal.

PLS is similar to Principal Component Regression (PCR), where the samples are first projected into a lower-dimensional subspace and the targets y are then predicted from transformed(X). One issue with PCR is that its dimensionality reduction is unsupervised and may discard important information: PCR keeps the directions with the most variance, but directions with small variance may still be relevant for predicting the target. PLS performs the same kind of dimensionality reduction, but takes the targets y into account when choosing the subspace.

Attributes

coef_

The coefficients of the linear model such that Y is approximated as Y = X @ coef_.T + intercept_.

n_iter_

Number of iterations of the power method, for each component.

x_loadings_

The loadings of X.

x_rotations_

The projection matrix used to transform X.

x_scores_

The transformed training samples.

x_weights_

The left singular vectors of the cross-covariance matrices of each iteration.

y_loadings_

The loadings of Y.

y_rotations_

The projection matrix used to transform Y.

y_scores_

The transformed training targets.

y_weights_

The right singular vectors of the cross-covariance matrices of each iteration.

Definition

Output ports

model
Type: model
Description: Model

Configuration

Max iterations (max_iter)

The maximum number of iterations of the power method when algorithm='nipals'. Ignored otherwise.

Number of components to keep (n_components)

Number of components to keep. Should be in [1, n_features].

Scale the data (scale)

Whether to scale X and Y.

Tolerance (tol)

The tolerance used as the convergence criterion in the power method: the algorithm stops whenever the squared norm of u_i - u_{i-1} is less than tol, where u corresponds to the left singular vector.
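The roles of max_iter and tol can be sketched with a simplified power iteration for the first pair of weight vectors. This is a minimal NumPy illustration of the stopping rule described above, not the node's or the underlying library's actual implementation; the function name `first_pls_weights` and the test data are hypothetical.

```python
import numpy as np


def first_pls_weights(X, Y, max_iter=500, tol=1e-6):
    """Estimate the leading singular vectors of X.T @ Y by power iteration.

    X and Y are assumed to be centered already. Returns (u, v, n_iter),
    where u and v approximate the leading left and right singular vectors.
    """
    C = X.T @ Y                      # cross-covariance, up to a constant factor
    u = C[:, 0] / np.linalg.norm(C[:, 0])  # simple non-zero starting vector
    for n_iter in range(1, max_iter + 1):
        v = C.T @ u
        v /= np.linalg.norm(v)
        u_new = C @ v
        u_new /= np.linalg.norm(u_new)
        diff = u_new - u
        u = u_new
        # Stop once the squared norm of u_i - u_{i-1} drops below tol.
        if diff @ diff < tol:
            break
    return u, v, n_iter


rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
Y = np.column_stack([
    3.0 * X[:, 0] + 0.1 * rng.normal(size=200),
    0.5 * X[:, 1] + 0.1 * rng.normal(size=200),
])
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

u, v, n_iter = first_pls_weights(X, Y, max_iter=500, tol=1e-12)

# Compare with the leading left singular vector from a direct SVD
# (singular vectors are only defined up to sign, hence the absolute value).
U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
alignment = abs(u @ U[:, 0])
print("iterations used:", n_iter)
print("alignment with SVD vector:", alignment)
```

A smaller tol demands a tighter match between successive iterates and therefore more iterations, while max_iter caps the work when the leading singular values are close together and convergence is slow.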

Implementation

class node_decomposition.PLSRegressionCrossDecomposition