"Calculating PCA scores"

frankiefrankie Member Posts: 26 Contributor II
edited May 2019 in Help
Hi,

I'm looking at example #13 in the RM help. How does one calculate the PCA scores based on the dataset values and the selected components?

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.000" expanded="true" name="Root">
    <description>The calculation of principal components is often used as a feature transforming preprocessing step. It can reduce the dimensionality of the data set at hand while the major data variance is preserved. Perform the process and check out the plot view of the Iris data set loaded and transformed by this process.</description>
    <process expanded="true" height="494" width="433">
      <operator activated="true" class="retrieve" compatibility="5.0.000" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="../../data/Iris"/>
      </operator>
      <operator activated="true" class="normalize" compatibility="5.0.000" expanded="true" height="94" name="Normalization" width="90" x="180" y="30"/>
      <operator activated="true" class="principal_component_analysis" compatibility="5.0.000" expanded="true" height="94" name="PrincipalComponents" width="90" x="313" y="30"/>
      <connect from_op="Retrieve" from_port="output" to_op="Normalization" to_port="example set input"/>
      <connect from_op="Normalization" from_port="example set output" to_op="PrincipalComponents" to_port="example set input"/>
      <connect from_op="PrincipalComponents" from_port="example set output" to_port="result 1"/>
      <connect from_op="PrincipalComponents" from_port="preprocessing model" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="18"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • earmijoearmijo Member Posts: 265   Unicorn
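    PCA is considered a "model". To get the predictions or scores, you "apply" the model to the example set: feed the PCA operator's preprocessing model and the example set into an Apply Model operator, as in this process.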
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Root">
        <description>The calculation of principal components is often used as a feature transforming preprocessing step. It can reduce the dimensionality of the data set at hand while the major data variance is preserved. Perform the process and check out the plot view of the Iris data set loaded and transformed by this process.</description>
        <process expanded="true" height="494" width="748">
          <operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve (2)" width="90" x="45" y="165">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="normalize" compatibility="5.1.008" expanded="true" height="94" name="Normalization" width="90" x="246" y="30"/>
          <operator activated="true" class="principal_component_analysis" compatibility="5.1.008" expanded="true" height="94" name="PrincipalComponents" width="90" x="447" y="30">
            <parameter key="dimensionality_reduction" value="fixed number"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="648" y="75">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Retrieve (2)" from_port="output" to_op="Normalization" to_port="example set input"/>
          <connect from_op="Normalization" from_port="example set output" to_op="PrincipalComponents" to_port="example set input"/>
          <connect from_op="PrincipalComponents" from_port="original" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="PrincipalComponents" from_port="preprocessing model" to_op="Apply Model" to_port="model"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="18"/>
        </process>
      </operator>
    </process>
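
For the arithmetic behind "applying" the model, here is a minimal sketch in plain Python. It assumes the usual PCA definition, where each score is the dot product of a mean-centered row with a component's eigenvector. The function name `pca_scores` and the toy data are made up for illustration and are not part of RapidMiner; the two component vectors are the unit eigenvectors of this toy data's covariance matrix. In RapidMiner the eigenvectors would be read from the PCA model's result view instead.

```python
import math

def pca_scores(rows, means, components):
    """Return one list of scores per row: (row - means) dotted with each component."""
    scores = []
    for row in rows:
        # Center the row by subtracting the attribute means.
        centered = [x - m for x, m in zip(row, means)]
        # One score per component: dot product of centered row and eigenvector.
        scores.append([sum(c * w for c, w in zip(centered, comp))
                       for comp in components])
    return scores

# Hypothetical two-attribute example set (not the Iris data).
rows = [[2.0, 0.0], [0.0, 2.0], [4.0, 4.0]]

# Attribute means, computed column by column.
means = [sum(col) / len(col) for col in zip(*rows)]

# Unit eigenvectors of this toy data's covariance matrix [[4, 2], [2, 4]].
# The sign of each eigenvector is arbitrary, so scores may come out negated.
s = 1.0 / math.sqrt(2.0)
components = [[s, s], [s, -s]]

scores = pca_scores(rows, means, components)
for row, sc in zip(rows, scores):
    print(row, "->", [round(v, 4) for v in sc])
```

Keeping only the first k entries of `components` gives the scores for the k selected components. If the data was normalized before PCA, as in the process above, the same normalization has to be applied to the rows before centering and projecting.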