"Bug in PCA operator output"

Legacy User (Member)
edited May 2019 in Help
Input file:

X1 X2 X3
1. 1. 0.
1. 1. 1.
1. 1. 2.
0. 0. 0.
0. 0. 1.

Experiment:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="C:\Documents and Settings\Victor\My Documents\test_pca\input.aml"/>
    </operator>
    <operator name="PCA" class="PCA">
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
        <parameter key="keep_model" value="true"/>
    </operator>
    <operator name="CSVExampleSetWriter" class="CSVExampleSetWriter">
        <parameter key="column_separator" value=","/>
        <parameter key="csv_file" value="C:\Documents and Settings\Victor\My Documents\test_pca\output.csv"/>
    </operator>
</operator>

The log shows:

Principal Components:
Variance Threshold: 0.95
PC 1:  + 0.439 * X1 - 0.554 * X2 + 0.707 * X3
PC 2:  + 0.439 * X1 - 0.554 * X2 - 0.707 * X3
PC 3:  + 0.784 * X1 + 0.621 * X2 - 0.000 * X3
(created by PCA)

I think the correct components are the transpose of this, so the log should show:

PC1 = 0.439*X1 + 0.439*X2 + 0.784*X3
PC2 = -0.554*X1 - 0.554*X2 ...
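
One way to sanity-check the transposition claim is to measure how much variance of the input data each candidate direction captures. The sketch below uses only the five input rows and the loadings printed in the log. The helper names are made up for illustration, and it assumes a sample covariance (dividing by n-1) on mean-centered data, which may or may not be what the PCA operator does internally. The ordering of the variances does not depend on that choice.

```python
# Loadings exactly as printed in the log (rows are "PC 1", "PC 2", "PC 3").
logged = [
    [0.439, -0.554, 0.707],
    [0.439, -0.554, -0.707],
    [0.784, 0.621, -0.000],
]

# The five input rows (X1, X2, X3) from the post.
data = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


def covariance(rows):
    """Sample covariance matrix (assumption: n-1 denominator)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    d = len(means)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def variance_along(cov, v):
    """Variance of the data projected onto direction v, i.e. v^T C v."""
    d = len(v)
    return sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))


cov = covariance(data)
as_logged = [variance_along(cov, v) for v in logged]
as_transposed = [variance_along(cov, v) for v in transpose(logged)]

print("rows as logged:  ", [round(x, 3) for x in as_logged])
print("rows transposed: ", [round(x, 3) for x in as_transposed])
```

Principal components should come out in order of decreasing variance, and because X1 and X2 are identical in this data set, the last component should capture no variance at all. Only the transposed rows behave that way; the rows as logged give increasing, non-zero variances.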




Answers

  • Legacy User (Member)
    Just to clarify - the incorrect (transposed) values are printed only in the log.

    The PCA matrix on the screen (the Eigenvectors view tab) is displayed correctly.
    The next step, the ModelApplier, also uses the correct values.

  • IngoRM (Administrator, RM Founder)
    Hi Victor,

    thanks for sending this in, and especially for the additional note. I got a slight shock until you mentioned that the values are only transposed in the log output. Phew... :D

    I have changed the logging output, and the fixed version will be available via CVS in a few hours.

    Cheers,
    Ingo








