
"Calculating PCA scores"

frankie Member Posts: 26 Contributor II
edited May 2019 in Help
Hi,

I'm looking at example #13 in the RM help. How does one calculate the PCA scores based on the dataset values and the selected components?

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.000" expanded="true" name="Root">
    <description>The calculation of principal components is often used as a feature transforming preprocessing step. It can reduce the dimensionality of the data set at hand while the major data variance is preserved. Perform the process and check out the plot view of the Iris data set loaded and transformed by this process.</description>
    <process expanded="true" height="494" width="433">
      <operator activated="true" class="retrieve" compatibility="5.0.000" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
        <parameter key="repository_entry" value="../../data/Iris"/>
      </operator>
      <operator activated="true" class="normalize" compatibility="5.0.000" expanded="true" height="94" name="Normalization" width="90" x="180" y="30"/>
      <operator activated="true" class="principal_component_analysis" compatibility="5.0.000" expanded="true" height="94" name="PrincipalComponents" width="90" x="313" y="30"/>
      <connect from_op="Retrieve" from_port="output" to_op="Normalization" to_port="example set input"/>
      <connect from_op="Normalization" from_port="example set output" to_op="PrincipalComponents" to_port="example set input"/>
      <connect from_op="PrincipalComponents" from_port="example set output" to_port="result 1"/>
      <connect from_op="PrincipalComponents" from_port="preprocessing model" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="18"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • earmijo Member Posts: 271 Unicorn
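    PCA is treated as a "model" in RapidMiner. To get the predictions (the scores), you "apply" that model to the example set: connect the PCA operator's preprocessing model output to an Apply Model operator, as in this process: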
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Root">
        <description>The calculation of principal components is often used as a feature transforming preprocessing step. It can reduce the dimensionality of the data set at hand while the major data variance is preserved. Perform the process and check out the plot view of the Iris data set loaded and transformed by this process.</description>
        <process expanded="true" height="494" width="748">
          <operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve (2)" width="90" x="45" y="165">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="normalize" compatibility="5.1.008" expanded="true" height="94" name="Normalization" width="90" x="246" y="30"/>
          <operator activated="true" class="principal_component_analysis" compatibility="5.1.008" expanded="true" height="94" name="PrincipalComponents" width="90" x="447" y="30">
            <parameter key="dimensionality_reduction" value="fixed number"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="648" y="75">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Retrieve (2)" from_port="output" to_op="Normalization" to_port="example set input"/>
          <connect from_op="Normalization" from_port="example set output" to_op="PrincipalComponents" to_port="example set input"/>
          <connect from_op="PrincipalComponents" from_port="original" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="PrincipalComponents" from_port="preprocessing model" to_op="Apply Model" to_port="model"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="18"/>
        </process>
      </operator>
    </process>
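
For anyone who wants to see the arithmetic behind the scores rather than only the operator wiring, here is a minimal sketch in plain Python. It assumes the usual PCA convention: each attribute is mean-centered, then every row is projected onto each selected component with a dot product. The function name `pca_scores`, the toy data, and the two component vectors are all hypothetical and chosen only for illustration; they are not taken from the Iris process above, and RapidMiner's internal implementation may differ in details such as sign or scaling.

```python
def pca_scores(rows, components):
    """Project each row onto the given component vectors after mean-centering.

    rows:       list of numeric rows (one list per example)
    components: list of component vectors (eigenvectors), one per kept component
    """
    n_rows = len(rows)
    n_cols = len(rows[0])

    # Mean of every attribute (column).
    means = [sum(row[j] for row in rows) / n_rows for j in range(n_cols)]

    # Subtract the column mean from every value.
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in rows]

    # Score = dot product of the centered row with each component vector.
    return [
        [sum(x * w for x, w in zip(row, comp)) for comp in components]
        for row in centered
    ]


# Hypothetical toy data: three examples, two attributes.
rows = [[2.0, 0.0], [0.0, 2.0], [4.0, 4.0]]

# Hypothetical orthonormal components (not computed from the data above).
components = [[0.6, 0.8], [-0.8, 0.6]]

scores = pca_scores(rows, components)
for original, score in zip(rows, scores):
    print(original, "->", score)
```

In the process above the data is normalized before the PCA operator, so the attribute means are already close to zero and the centering step changes little. Keeping fewer components simply means passing fewer vectors in `components`.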