YAGGA - attribute constructions

danjeharry Member Posts: 20 Contributor II
edited November 2018 in Help
Hey,

How do I recreate the attribute constructions generated by YAGGA on a new test dataset when some of the constructions depend on generated attributes that are no longer in the example set? (For example, gensym100 = Attribute1 + gensym99, but gensym99 is not defined in the attribute construction data.)

Thanks.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    Below is a small sample process that first generates some attributes, stores their constructions in a file, reads them back, and applies them to another dataset. Be sure to adjust the paths in the Read Constructions and Write Constructions operators.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.011">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.011" expanded="true" name="Process">
        <process expanded="true" height="505" width="547">
          <operator activated="true" class="generate_data" compatibility="5.1.011" expanded="true" height="60" name="Generate Data" width="90" x="45" y="75">
            <parameter key="target_function" value="random classification"/>
          </operator>
          <operator activated="true" class="optimize_by_generation_yagga" compatibility="5.1.011" expanded="true" height="94" name="Generate" width="90" x="179" y="75">
            <parameter key="reciprocal_value" value="false"/>
            <process expanded="true" height="527" width="725">
              <operator activated="true" class="naive_bayes" compatibility="5.1.011" expanded="true" height="76" name="Naive Bayes" width="90" x="112" y="30"/>
              <operator activated="true" class="apply_model" compatibility="5.1.011" expanded="true" height="76" name="Apply Model" width="90" x="313" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.1.011" expanded="true" height="76" name="Performance" width="90" x="514" y="30"/>
              <connect from_port="example set source" to_op="Naive Bayes" to_port="training set"/>
              <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_op="Naive Bayes" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="performance sink"/>
              <portSpacing port="source_example set source" spacing="0"/>
              <portSpacing port="sink_performance sink" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="write_constructions" compatibility="5.1.011" expanded="true" height="60" name="Write Constructions" width="90" x="380" y="30">
            <parameter key="attribute_constructions_file" value="C:\Users\mhelf\Documents\tmp\constructions"/>
          </operator>
          <operator activated="true" class="generate_data" compatibility="5.1.011" expanded="true" height="60" name="Generate Data (2)" width="90" x="45" y="255">
            <parameter key="target_function" value="random classification"/>
          </operator>
          <operator activated="true" class="read_constructions" compatibility="5.1.011" expanded="true" height="60" name="Read Constructions" width="90" x="380" y="255">
            <parameter key="attribute_constructions_file" value="C:\Users\mhelf\Documents\tmp\constructions"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Generate" to_port="example set in"/>
          <connect from_op="Generate" from_port="example set out" to_op="Write Constructions" to_port="input"/>
          <connect from_op="Generate" from_port="attribute weights out" to_port="result 2"/>
          <connect from_op="Write Constructions" from_port="through" to_port="result 1"/>
          <connect from_op="Generate Data (2)" from_port="output" to_op="Read Constructions" to_port="example set"/>
          <connect from_op="Read Constructions" from_port="example set" to_port="result 3"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="0"/>
        </process>
      </operator>
    </process>
  • danjeharry Member Posts: 20 Contributor II
    Thanks for the info, Marius, but my process is already set up exactly like yours. The issue is that I have an attribute gensym100 that is constructed from att1 and gensym99. The generated example set contains att1 but not gensym99, which appears to have been generated earlier in the evolutionary process. So when I save the constructions, gensym99 is not defined, and I can no longer generate the correct attributes on a new test dataset.
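
The missing-definition problem described above can be checked outside RapidMiner. The following is a minimal Python sketch, not a RapidMiner API. It assumes the constructions have already been loaded into a hypothetical dict that maps each generated attribute name to its expression string, and that expressions contain only attribute names, numbers, and operators (the actual constructions file format is not shown in this thread, so parsing it is left out). `undefined_references` reports names such as gensym99 that are referenced but never defined, and `expand` inlines nested definitions down to base attributes when every definition is available.

```python
import re

# Assumption: attribute names look like identifiers (att1, gensym99).
IDENT = re.compile(r"\b[A-Za-z_]\w*\b")


def undefined_references(constructions, base_attributes):
    """Map each construction to the names it uses that are defined nowhere."""
    known = set(base_attributes) | set(constructions)
    missing = {}
    for name, expr in constructions.items():
        refs = set(IDENT.findall(expr)) - known
        if refs:
            missing[name] = sorted(refs)
    return missing


def expand(name, constructions, base_attributes, _stack=()):
    """Inline nested constructions until only base attributes remain."""
    if name in base_attributes:
        return name
    if name not in constructions:
        raise KeyError(f"no definition for {name!r}")
    if name in _stack:
        raise ValueError(f"cyclic definition involving {name!r}")
    inner = IDENT.sub(
        lambda m: expand(m.group(0), constructions, base_attributes,
                         _stack + (name,)),
        constructions[name],
    )
    return f"({inner})"


# Hypothetical data mirroring the situation in this thread.
saved = {"gensym100": "att1 + gensym99"}
print(undefined_references(saved, {"att1"}))

# If the intermediate definition were available, it could be inlined.
# The expression for gensym99 here is made up for illustration.
complete = {"gensym99": "att2 * att3", "gensym100": "att1 + gensym99"}
print(expand("gensym100", complete, {"att1", "att2", "att3"}))
```

A check like this only diagnoses the gap. The definition of gensym99 itself still has to come from somewhere, for example from constructions saved while the intermediate attribute was still present in the example set.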