generate attribute

michaelgloven RapidMiner Certified Analyst, Member Posts: 46 Guru
edited December 2018 in Help

Probably a simple question, but I could not find relevant documentation. What is the correct syntax in Generate Attributes for combining multiple attributes into one? For example, how do I combine a numerical, nominal and date field to give me a new attribute field that looks like this: "2015 - Excellent - March 2015" as the new concatenated field? The attribute fields are 2015 (integer [field 1]), Excellent (nominal [field 2]) and March 2015 (date [field 3])...

 

Thanks in advance!

Best Answer

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi again @michaelgloven,

     

    Have you tried the Generate Concatenation operator?

    Here is an example:

    Regards, 

     

    Lionel

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_nominal_data" compatibility="8.0.001" expanded="true" height="68" name="Generate Nominal Data" width="90" x="112" y="120">
    <parameter key="number_examples" value="9800"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="8.0.001" expanded="true" height="124" name="Multiply" width="90" x="246" y="136"/>
    <operator activated="true" class="generate_concatenation" compatibility="8.0.001" expanded="true" height="82" name="Generate Concatenation" width="90" x="514" y="340">
    <parameter key="first_attribute" value="att1"/>
    <parameter key="second_attribute" value="att3"/>
    <parameter key="separator" value="-"/>
    </operator>
    <operator activated="true" class="generate_concatenation" compatibility="8.0.001" expanded="true" height="82" name="Generate Concatenation (2)" width="90" x="648" y="340">
    <parameter key="first_attribute" value="att2"/>
    <parameter key="second_attribute" value="att1-att3"/>
    <parameter key="separator" value="-"/>
    </operator>
    <operator activated="true" class="rename" compatibility="8.0.001" expanded="true" height="82" name="Rename" width="90" x="782" y="340">
    <parameter key="old_name" value="att2-att1-att3"/>
    <parameter key="new_name" value="concat"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <connect from_op="Generate Nominal Data" from_port="output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Generate Concatenation" to_port="example set input"/>
    <connect from_op="Multiply" from_port="output 3" to_port="result 2"/>
    <connect from_op="Generate Concatenation" from_port="example set output" to_op="Generate Concatenation (2)" to_port="example set input"/>
    <connect from_op="Generate Concatenation (2)" from_port="example set output" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_port="result 3"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    <portSpacing port="sink_result 4" spacing="0"/>
    </process>
    </operator>
    </process>

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @michaelgloven,

     

    Here is an example of a process that may meet your needs:

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_nominal_data" compatibility="8.0.001" expanded="true" height="68" name="Generate Nominal Data" width="90" x="112" y="120">
    <parameter key="number_examples" value="9800"/>
    </operator>
    <operator activated="true" class="write_csv" compatibility="8.0.001" expanded="true" height="82" name="Write CSV" width="90" x="308" y="114">
    <parameter key="column_separator" value=" -"/>
    <parameter key="write_attribute_names" value="false"/>
    <parameter key="quote_nominal_values" value="false"/>
    <parameter key="format_date_attributes" value="false"/>
    </operator>
    <operator activated="true" class="read_csv" compatibility="6.0.003" expanded="true" height="68" name="Read CSV" width="90" x="447" y="165">
    <parameter key="column_separators" value=","/>
    <parameter key="use_quotes" value="false"/>
    <parameter key="parse_numbers" value="false"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations"/>
    <list key="data_set_meta_data_information"/>
    </operator>
    <operator activated="true" class="generate_id" compatibility="8.0.001" expanded="true" height="82" name="Generate ID" width="90" x="581" y="136"/>
    <operator activated="true" class="generate_nominal_data" compatibility="8.0.001" expanded="true" height="68" name="Generate Nominal Data (2)" width="90" x="112" y="289">
    <parameter key="number_examples" value="9800"/>
    </operator>
    <operator activated="true" class="generate_id" compatibility="8.0.001" expanded="true" height="82" name="Generate ID (2)" width="90" x="581" y="289"/>
    <operator activated="true" class="join" compatibility="8.0.001" expanded="true" height="82" name="Join" width="90" x="782" y="238">
    <parameter key="remove_double_attributes" value="false"/>
    <parameter key="join_type" value="outer"/>
    <list key="key_attributes"/>
    </operator>
    <connect from_op="Generate Nominal Data" from_port="output" to_op="Write CSV" to_port="input"/>
    <connect from_op="Write CSV" from_port="file" to_op="Read CSV" to_port="file"/>
    <connect from_op="Read CSV" from_port="output" to_op="Generate ID" to_port="example set input"/>
    <connect from_op="Generate ID" from_port="example set output" to_op="Join" to_port="left"/>
    <connect from_op="Generate Nominal Data (2)" from_port="output" to_op="Generate ID (2)" to_port="example set input"/>
    <connect from_op="Generate ID (2)" from_port="example set output" to_op="Join" to_port="right"/>
    <connect from_op="Join" from_port="join" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

    Regards, 

     

    Lionel

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    There is a concat() function in Generate Attributes, but it works only with nominals, so you first need to turn your other attributes (or copies of them, if you want to keep the originals) into nominal form, using Date to Nominal and Numerical to Polynominal.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
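
    As a rough illustration of the concat() approach described above, the type conversions can also be attempted inside the expression itself rather than with separate operators. This is only a sketch under assumptions: the attribute names "field 1", "field 2" and "field 3" come from the question, the new attribute name "combined" is hypothetical, and the expression functions str() and date_str_custom() are assumed to be available in your RapidMiner version. How str() renders an integer attribute is worth checking on your own data.

    ```xml
    <!-- Sketch only: one Generate Attributes operator, to be placed inside a <process> element.
         Assumed names: "field 1" (integer), "field 2" (nominal), "field 3" (date).
         "combined" is a hypothetical name for the new attribute. -->
    <operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" name="Generate Attributes">
      <list key="function_descriptions">
        <!-- str() turns the number into text; date_str_custom() formats the date as month and year -->
        <parameter key="combined" value="concat(str([field 1]), &quot; - &quot;, [field 2], &quot; - &quot;, date_str_custom([field 3], &quot;MMMM yyyy&quot;))"/>
      </list>
    </operator>
    ```

    Attribute names that contain spaces are wrapped in square brackets in the expression, and the quotation marks are escaped as &quot; because the expression is stored in an XML attribute value.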
  • michaelgloven RapidMiner Certified Analyst, Member Posts: 46 Guru

    Ah, I was not aware of this specific operator... Thanks, it solves my issue!

     

    Mike

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Neither was I, and it's always amazing to find these little nuggets after using RapidMiner for so many years!

    Brian T.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi,

     

    I have been using RapidMiner for a few months and, I assure you, I only discovered this operator today too.

    When I saw this post, I decided to type "concat" in the operator search box, and the "Generate Concatenation" operator appeared "as if by magic"...

     

    Best regards, 

     

    Lionel

     

     

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Stay tuned - we're about to put that search box on steroids. :)


    Scott

     

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @sgenzer @lionelderkrikor @Telcontar120, I had to use Generate Concatenation once and boy was it handy. It's like that weird tool in your toolbox that just sits there forever while you wonder why you bought it in the first place. Then you need it one day and boy are you glad it was there.

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hmm, I have not found that operator extremely handy: it only handles two attributes at a time. It would be FAR more useful if you could expand it to more. I almost always use concat() in Generate Attributes.

     

    Scott
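
    To make the comparison concrete, here is a minimal sketch of how the two chained Generate Concatenation operators plus the Rename from the accepted answer might collapse into a single concat() call. It assumes the nominal attributes att1, att2 and att3 produced by Generate Nominal Data in that process; the new attribute name "combined" is hypothetical.

    ```xml
    <!-- Sketch only: one Generate Attributes operator, to be placed inside a <process> element.
         Assumes nominal attributes att1, att2 and att3; "combined" is a hypothetical name. -->
    <operator activated="true" class="generate_attributes" compatibility="8.0.001" expanded="true" name="Generate Attributes">
      <list key="function_descriptions">
        <!-- concat() accepts more than two arguments, so the separator is simply repeated -->
        <parameter key="combined" value="concat(att2, &quot;-&quot;, att1, &quot;-&quot;, att3)"/>
      </list>
    </operator>
    ```

    Because the new attribute is named directly in the expression list, no separate Rename step should be needed.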

     
