RapidMiner

Bug in PCA operator output

Input file:

X1 X2 X3
1. 1. 0.
1. 1. 1.
1. 1. 2.
0. 0. 0.
0. 0. 1.

Experiment:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="C:\Documents and Settings\Victor\My Documents\test_pca\input.aml"/>
    </operator>
    <operator name="PCA" class="PCA">
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
        <parameter key="keep_model" value="true"/>
    </operator>
    <operator name="CSVExampleSetWriter" class="CSVExampleSetWriter">
        <parameter key="column_separator" value=","/>
        <parameter key="csv_file" value="C:\Documents and Settings\Victor\My Documents\test_pca\output.csv"/>
    </operator>
</operator>

The log shows:

Principal Components:
Variance Threshold: 0.95
PC 1:  + 0.439 * X1 - 0.554 * X2 + 0.707 * X3
PC 2:  + 0.439 * X1 - 0.554 * X2 - 0.707 * X3
PC 3:  + 0.784 * X1 + 0.621 * X2 - 0.000 * X3
(created by PCA)

I think the correct components are the transpose of this matrix, so the log should show:

PC1 = 0.439*X1 + 0.439*X2 + 0.784*X3
PC2 = -0.554*X1 - 0.554*X2 ...
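
A quick way to check the transposition claim outside RapidMiner is sketched below. It assumes the PCA is computed on the covariance matrix of the input (an assumption, not something the log states) and uses only the Python standard library. The helper names `covariance` and `eigen_residual` are made up for this illustration. For each candidate vector v it measures how far C*v is from a scalar multiple of v. A true eigenvector gives a residual near zero, up to the three-decimal rounding of the log.

```python
import math

# Rows exactly as printed in the log (PC 1, PC 2, PC 3).
logged = [
    [0.439, -0.554, 0.707],
    [0.439, -0.554, -0.707],
    [0.784, 0.621, -0.000],
]

# Transposed reading: columns of the logged matrix become the components.
transposed = [list(col) for col in zip(*logged)]

# Input file from the post (columns X1, X2, X3).
data = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]


def covariance(rows):
    """Sample covariance matrix (divides by n - 1)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def eigen_residual(cov, v):
    """Norm of C*v - lambda*v, with lambda taken as the Rayleigh quotient."""
    cv = [sum(c * x for c, x in zip(row, v)) for row in cov]
    lam = sum(a * b for a, b in zip(v, cv)) / sum(x * x for x in v)
    return math.sqrt(sum((a - lam * b) ** 2 for a, b in zip(cv, v)))


cov = covariance(data)
for label, vectors in (("as logged", logged), ("transposed", transposed)):
    for i, v in enumerate(vectors, start=1):
        print(f"{label} PC {i}: residual = {eigen_residual(cov, v):.4f}")
```

Since X1 and X2 are identical in the input, any component with non-zero variance has to weight them equally, which the rows as logged do not do. That agrees with the transposed reading.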

2 REPLIES

Re: Bug in PCA operator output

Just to clarify - the incorrect (transposed) values are printed only in the log.

The PCA matrix shown on screen (in the Eigenvectors view tab) is displayed correctly.
The next step, the ModelApplier, also uses the correct values.

RMStaff

Re: Bug in PCA operator output

Hi Victor,

thanks for sending this in, and especially for the additional note. I got a slight shock before you mentioned that the values are only transposed in the log output. Phew...

I have changed the logging output, and the fixed version will be available via CVS in a few hours.

Cheers,
Ingo