
Generate multiple new attribute names for a single attribute

Bigblackchair Member Posts: 3 Contributor I
edited November 2018 in Help

Hello Community:

I want to create a new attribute by recoding several values of an existing attribute.  Data set:

 

StudentID Major1
1 Studio Arts
2 Cinematic Arts
3 Museum Studies
4 Business
5 Creative Writing
6 Liberal Studies

I want to create a new attribute, Major2, in which Studio Arts, Cinematic Arts, and Museum Studies are listed as Arts, and Business, Creative Writing, and Liberal Studies are listed as Nonarts.  I wrote six entries in Generate Attributes, but only the last change is applied.

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve GenerateAttributeDataset" width="90" x="112" y="187">
<parameter key="repository_entry" value="//F Drive Repository/NASAD HEADS Survey/GenerateAttributeDataset"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="187">
<list key="function_descriptions">
<parameter key="Major2" value="replaceAll(Major1,&quot;Studio Arts&quot;,&quot;Arts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major1,&quot;Cinematic Arts&quot;,&quot;Arts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major1,&quot;Museum Studies&quot;,&quot;Arts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major1,&quot;Business&quot;,&quot;Nonarts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major1,&quot;Creative Writing&quot;,&quot;Nonarts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major1,&quot;Liberal Studies&quot;,&quot;Nonarts&quot;)"/>
</list>
</operator>
<connect from_op="Retrieve GenerateAttributeDataset" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

 

But if I don't create a new attribute and just recode Major1 in place, it works.

 

<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve GenerateAttributeDataset" width="90" x="112" y="187">
<parameter key="repository_entry" value="//F Drive Repository/NASAD HEADS Survey/GenerateAttributeDataset"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="187">
<list key="function_descriptions">
<parameter key="Major1" value="replaceAll(Major1,&quot;Studio Arts&quot;,&quot;Arts&quot;)"/>
<parameter key="Major1" value="replaceAll(Major1,&quot;Cinematic Arts&quot;,&quot;Arts&quot;)"/>
<parameter key="Major1" value="replaceAll(Major1,&quot;Museum Studies&quot;,&quot;Arts&quot;)"/>
<parameter key="Major1" value="replaceAll(Major1,&quot;Business&quot;,&quot;Nonarts&quot;)"/>
<parameter key="Major1" value="replaceAll(Major1,&quot;Creative Writing&quot;,&quot;Nonarts&quot;)"/>
<parameter key="Major1" value="replaceAll(Major1,&quot;Liberal Studies&quot;,&quot;Nonarts&quot;)"/>
</list>
</operator>
<connect from_op="Retrieve GenerateAttributeDataset" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

 

I'd like to create the new attribute and keep the old one, rather than overwrite it.  How can I do that?

Thanks!

 

Best Answer

    sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Solution Accepted

    Hi @Bigblackchair (OK, that's the best username I've seen in a while :) ) - I'd do it with a joined lookup table rather than Generate Attributes:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve lookup data set" width="90" x="45" y="85">
    <parameter key="repository_entry" value="//RapidMiner OneDrive/random community stuff/lookup data set"/>
    </operator>
    <operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve lookup table" width="90" x="45" y="187">
    <parameter key="repository_entry" value="//RapidMiner OneDrive/random community stuff/lookup table"/>
    </operator>
    <operator activated="true" class="join" compatibility="8.0.001" expanded="true" height="82" name="Join" width="90" x="179" y="85">
    <parameter key="join_type" value="left"/>
    <parameter key="use_id_attribute_as_key" value="false"/>
    <list key="key_attributes">
    <parameter key="Major1" value="Major1"/>
    </list>
    </operator>
    <connect from_op="Retrieve lookup data set" from_port="output" to_op="Join" to_port="left"/>
    <connect from_op="Retrieve lookup table" from_port="output" to_op="Join" to_port="right"/>
    <connect from_op="Join" from_port="join" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

     


     

    ExampleSets attached.


    Scott
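
For readers who would rather stay inside Generate Attributes, here is a minimal sketch. It is not part of the accepted answer. It assumes the entries in function_descriptions are evaluated in order, which the working Major1 version in the question suggests. In the failing version every entry reads from the unchanged Major1 and writes Major2, so each entry discards the previous result and only the last replacement survives. Copying Major1 into Major2 first and then reading from Major2 lets the replacements accumulate while Major1 stays untouched. Only the Generate Attributes operator is shown; the operator name and coordinates are copied from the question.

```xml
<!-- Sketch only: assumes function_descriptions entries run top to bottom. -->
<!-- First entry copies Major1 into Major2; every later entry reads and rewrites Major2. -->
<operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" height="82" name="Generate Attributes" width="90" x="313" y="187">
<list key="function_descriptions">
<parameter key="Major2" value="Major1"/>
<parameter key="Major2" value="replaceAll(Major2,&quot;Studio Arts&quot;,&quot;Arts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major2,&quot;Cinematic Arts&quot;,&quot;Arts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major2,&quot;Museum Studies&quot;,&quot;Arts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major2,&quot;Business&quot;,&quot;Nonarts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major2,&quot;Creative Writing&quot;,&quot;Nonarts&quot;)"/>
<parameter key="Major2" value="replaceAll(Major2,&quot;Liberal Studies&quot;,&quot;Nonarts&quot;)"/>
</list>
</operator>
```

The lookup-table join in the accepted answer is likely easier to maintain as the list of majors grows, since adding a major means adding a row rather than editing the process.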

     
