Why SVD "cumulative variance plot" is not scaled to 100%

jacobcybulski Member, University Professor Posts: 391   Unicorn
In PCA, the cumulative variance plot lets you judge, among other things, whether a visualisation in PC1xPC2 reliably depicts your data (that is, whether it captures a large part of the variance). In SVD the corresponding plot is called "Cumulative Proportion of Single Values", and it is not scaled to 100%. Is there a reason SVD does not represent variance here? Is it not variance that the plot depicts?
Jacob

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,625   Unicorn
    @jacobcybulski interesting, since PCA is a special case of SVD, but I am not sure what it is being scaled to in the exhibits presented in RapidMiner.  @mschmitz any idea what the denominator is?

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • jacobcybulski Member, University Professor Posts: 391   Unicorn
    Thanks @Telcontar120, I agree that there is some discrepancy between PCA and SVD. If the SVD plot indeed shows cumulative variance, the units would not need to scale to 100. However, a scaled cumulative variance is the expected norm, especially since analytic decisions are made around the chart.
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,128  RM Data Scientist
    edited May 2020
    I am honored that you think I know these things, but I don't. What I can say is:
    /**
    * This operator performs a Singular Value Decomposition (SVD) of the data The user can specify the
    * number of target dimensions operator outputs a {@link SVDModel}. With the
    * <code>ModelApplier</code> you can transform the features.
    *
    * @author Sebastian Land
    */
    So this is more of a @land thing.

    Best,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
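
For readers following the question about the denominator, here is a minimal Python sketch of how a cumulative curve built from raw singular values differs from one built from variances. It is not RapidMiner's implementation, and the thread does not confirm which denominator the operator uses. The singular values and the helper `cumulative_proportion` are hypothetical and chosen only for illustration. The sketch relies on the standard fact that, for centred data, the variance along each principal component is proportional to the square of the corresponding singular value.

```python
from itertools import accumulate


def cumulative_proportion(values):
    """Return the running sums of `values` divided by their total."""
    total = sum(values)
    return [running / total for running in accumulate(values)]


# Hypothetical singular values of a centred data matrix (illustration only).
singular_values = [3.0, 2.0, 1.0]

# Curve built from the raw singular values.
sv_curve = cumulative_proportion(singular_values)

# Curve built from squared singular values, which are proportional to
# the per-component variances that PCA reports.
var_curve = cumulative_proportion([s ** 2 for s in singular_values])

print([round(p, 3) for p in sv_curve])   # [0.5, 0.833, 1.0]
print([round(p, 3) for p in var_curve])  # [0.643, 0.929, 1.0]
```

Both curves reach 1.0 when the denominator is the sum over all components, but they disagree along the way, so a proportion of singular values should not be read as a proportion of variance. A curve that stops short of 100% could arise if fewer components are plotted than are counted in the denominator; whether that happens in RapidMiner is an assumption the thread leaves open.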