# SVD Vectors

Member Posts: 3 Contributor I
edited December 2018 in Help

Hello,

I would be really thankful if someone could help me. I cannot understand how the "svd vectors" table is generated (not how eigenvalues are calculated). It would be perfect if someone could suggest a reference to study!

Thank you

Marianna

Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist
Solution Accepted

Hi @mariannita,

This may call for a review of college linear algebra, in particular matrix computations and transformations ^_^

Singular value decomposition (SVD) and principal component analysis (PCA) are two eigenvalue methods used to reduce a high-dimensional dataset into fewer dimensions while retaining important information.

The open-source Java code for SVD and PCA can be found on GitHub, under feature transformations:

https://github.com/rapidprom/source/tree/master/RapidMiner_Unuk/src/com/rapidminer/operator/features/transformation

Simply put, the PCA viewpoint requires that one compute the eigenvalues and eigenvectors of the covariance matrix, which (for mean-centered data and up to a constant scaling factor) is the product XX' (X' is the transpose of matrix X, often also written X^T), where X is the data matrix. Since the covariance matrix is symmetric, it is diagonalizable, and the eigenvectors can be normalized such that they are orthonormal. With the eigenvectors as the columns of W and the eigenvalues on the diagonal of D:

XX'=WDW'
On the other hand, applying SVD to the data matrix X as follows:

X=UΣV'
and attempting to construct the covariance matrix from this decomposition gives

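XX'=(UΣV')(UΣV')'=(UΣV')(VΣ'U')
and since V is an orthogonal matrix (V'V=I),

XX'=UΣΣ'U'
and the correspondence is easily seen: ΣΣ' is diagonal and holds the squared singular values, so W=U and D=ΣΣ'. In other words, the eigenvectors of XX' are the left singular vectors of X, and the square roots of the eigenvalues of XX' are the singular values of X.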
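
To make the correspondence concrete, here is a small NumPy sketch (not the RapidMiner implementation). The random data matrix, its shape (3 features as rows, 50 samples as columns) and the variable names are assumptions for illustration only. It compares the eigendecomposition of XX' with the SVD of X:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data matrix: 3 features (rows) x 50 samples (columns),
# mean-centered per feature so that XX' is a scaled covariance matrix.
X = rng.normal(size=(3, 50))
X = X - X.mean(axis=1, keepdims=True)

# Route 1: eigendecomposition XX' = W D W'
eigvals, W = np.linalg.eigh(X @ X.T)
order = np.argsort(eigvals)[::-1]          # sort by decreasing eigenvalue
eigvals, W = eigvals[order], W[:, order]

# Route 2: SVD X = U S V'
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Eigenvalues of XX' should match the squared singular values of X.
same_values = np.allclose(eigvals, s**2)

# Eigenvectors should match the left singular vectors up to sign,
# so |W'U| should be (numerically) the identity matrix.
same_vectors = np.allclose(np.abs(W.T @ U), np.eye(3), atol=1e-8)

print(same_values, same_vectors)
```

The sign ambiguity is expected: an eigenvector or singular vector multiplied by -1 is still a valid one, so different routines may return opposite signs.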

Since you already know how eigenvalues and eigenvectors are calculated, please refer to the figure in https://intoli.com/blog/pca-and-svd/ for the geometric picture:

• The unit vectors u_i along the semi-axes of the ellipse are called the “left” singular vectors of X.
• The unit vectors v_i such that X v_i = σ_i u_i are called the “right” singular vectors of X.
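
The defining relation between the left and right singular vectors can be checked numerically. The following is a minimal sketch with a made-up 2 x 3 matrix; the values and variable names are illustrative assumptions and say nothing about how the RapidMiner operator lays out its output table:

```python
import numpy as np

# Hypothetical small data matrix (2 x 3), chosen only for illustration.
X = np.array([[3.0, 1.0, 2.0],
              [1.0, 4.0, 0.5]])

# Columns of U are the left singular vectors u_i,
# rows of Vt are the right singular vectors v_i,
# s holds the singular values sigma_i in decreasing order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Check X v_i = sigma_i u_i for every singular triplet.
residuals = [np.linalg.norm(X @ Vt[i] - s[i] * U[:, i]) for i in range(len(s))]
max_residual = max(residuals)

# Both sets of singular vectors are orthonormal.
u_orthonormal = np.allclose(U.T @ U, np.eye(2))
v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(2))

print(max_residual, u_orthonormal, v_orthonormal)
```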

The end result is that the first k principal components of X correspond exactly to the eigenvectors of the covariance matrix, ordered by decreasing eigenvalue. Moreover, the eigenvalues are equal to the variance of the dataset along the corresponding eigenvectors (provided the covariance matrix carries the usual 1/(n-1) normalization, with n the number of samples).
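
The variance statement can be verified with a short sketch. It assumes a synthetic dataset with 2 correlated features (rows) and n = 200 samples (columns), and a covariance matrix normalized by 1/(n-1); all names and numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical correlated data: 2 features (rows) x n samples (columns).
Z = rng.normal(size=(2, n))
X = np.array([[2.0, 0.0],
              [1.5, 0.5]]) @ Z
X = X - X.mean(axis=1, keepdims=True)      # mean-center each feature

# Covariance matrix with the usual 1/(n-1) normalization.
C = X @ X.T / (n - 1)
eigvals, W = np.linalg.eigh(C)             # eigenvalues in ascending order

# Project the data onto each eigenvector and measure the sample variance.
scores = W.T @ X
proj_var = scores.var(axis=1, ddof=1)

# The same variances follow from the singular values of X.
s = np.linalg.svd(X, compute_uv=False)
var_from_svd = np.sort(s**2 / (n - 1))

print(eigvals, proj_var, var_from_svd)
```

All three printed arrays should agree up to rounding error.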

In fact, using the SVD to perform PCA makes much better sense numerically than forming the covariance matrix to begin with, since the formation of XX' can cause loss of precision. This is detailed in books on numerical linear algebra.
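
A classic way to see this loss of precision is a matrix with one very small singular value. The sketch below uses a made-up 2 x 3 matrix with a tiny parameter eps (an assumption for illustration); its exact singular values are sqrt(2 + eps^2) and eps. Forming XX' in double precision rounds 1 + eps^2 to 1, so the small singular value can no longer be recovered from the eigenvalues, while the SVD of X itself typically still finds it:

```python
import numpy as np

eps = 1e-10

# Hypothetical ill-conditioned matrix; exact singular values are
# sqrt(2 + eps**2) and eps.
X = np.array([[1.0, eps, 0.0],
              [1.0, 0.0, eps]])

# Route 1: SVD applied directly to X.
s = np.linalg.svd(X, compute_uv=False)

# Route 2: eigenvalues of XX'. The diagonal entries 1 + eps**2 round to 1.0
# in double precision, so the information about eps is already lost here.
gram = X @ X.T
eigvals = np.linalg.eigvalsh(gram)                        # ascending order
s_from_gram = np.sqrt(np.clip(eigvals, 0.0, None))[::-1]  # descending order

# Relative error of the small singular value under each route.
rel_err_svd = abs(s[1] - eps) / eps
rel_err_gram = abs(s_from_gram[1] - eps) / eps

print(rel_err_svd, rel_err_gram)
```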

"A Tutorial on Principal Component Analysis" by Jonathon Shlens is a good introduction to PCA and its relation to SVD; see specifically Section VI, "A More General Solution Using SVD". https://arxiv.org/pdf/1404.1100.pdf

YY

Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,525 RM Data Scientist

Hi @mariannita,

Which operator are you talking about? SVM? SVM Linear?

Best,

Martin

- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Member Posts: 3 Contributor I

Hello Martin,

SVD (Singular Value Decomposition), not SVM, which is in the "Dimensionality Reduction" folder.

Thank you!

Marianna
