RapidMiner

columns with the same role will be conflicting?

SOLVED
Regular Contributor


Hi, I want to set some columns to custom roles, but it seems RM allows only one column per role, except for the regular role `attribute`.

 

Here's an example. I generate the following table:
Untitled11.png

 

Then I set att1, att2, and att3 to the role 'something', and only att3 shows up in the output:

 

 Untitled11.png

 

I used `execute R` to set the role by changing the metadata. Inside `execute R`, the metadata and the column count are correct (printed in the log):

 

metaData before changing role:

Oct 27, 2016 3:49:16 PM INFO: data col_name role type

Oct 27, 2016 3:49:16 PM INFO: 1 table_in att1 attribute real

Oct 27, 2016 3:49:16 PM INFO: 2 table_in att2 attribute real

Oct 27, 2016 3:49:16 PM INFO: 3 table_in att3 attribute real

Oct 27, 2016 3:49:16 PM INFO: 4 table_in att4 attribute real

Oct 27, 2016 3:49:16 PM INFO: 5 table_in att5 attribute real

Oct 27, 2016 3:49:16 PM INFO: 6 table_in label label real

 

How many columns:

Oct 27, 2016 3:49:16 PM INFO: [1] 6

 

metaData after changing role:

Oct 27, 2016 3:49:16 PM INFO: data col_name role type

Oct 27, 2016 3:49:16 PM INFO: 1 table_in att1 something real

Oct 27, 2016 3:49:16 PM INFO: 2 table_in att2 something real

Oct 27, 2016 3:49:16 PM INFO: 3 table_in att3 something real

Oct 27, 2016 3:49:16 PM INFO: 4 table_in att4 attribute real

Oct 27, 2016 3:49:16 PM INFO: 5 table_in att5 attribute real

Oct 27, 2016 3:49:16 PM INFO: 6 table_in label label real

 

How many columns:

Oct 27, 2016 3:49:16 PM INFO: [1] 6

 

Could somebody help, please? Thanks!

Code is here:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.4.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" breakpoints="after" class="generate_data" compatibility="6.4.000" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
        <parameter key="number_examples" value="20"/>
      </operator>
      <operator activated="true" class="r_scripting:execute_r" compatibility="6.4.000" expanded="true" height="76" name="Execute R 1" width="90" x="246" y="30">
        <parameter key="script" value="library(reshape2)&#10;&#10;organize_metadata = function(metaData){&#10;        &#10;        meta = melt(metaData)&#10;        names(meta) = c('value', 'variable', 'col_name', 'data')&#10;        meta = dcast(meta, formula = data + col_name ~ variable)&#10;        &#10;        meta&#10;}&#10;&#10;&#10;rm_main = function(table_in){&#10;        &#10;        print(organize_metadata(metaData))&#10;        print(ncol(table_in))&#10;        &#10;        metaData$table_in[[1]]$role &lt;&lt;- 'something'&#10;        metaData$table_in[[2]]$role &lt;&lt;- 'something'&#10;        metaData$table_in[[3]]$role &lt;&lt;- 'something'&#10;        &#10;        print(organize_metadata(metaData))&#10;        print(ncol(table_in))&#10;        &#10;        table_in&#10;}&#10;"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Execute R 1" to_port="input 1"/>
      <connect from_op="Execute R 1" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>

 

 

 

 

 

2 REPLIES
Regular Contributor

Re: columns with the same role will be conflicting?

As far as I know, it's not possible for more than one attribute to have the same special role. I tried (I expect you did too) using Set Role to give several regular attributes the same role, and only the first one is set. Allowing this would be quite a nice enhancement, and it would build on one of the greatest strengths of RapidMiner, namely the ability to easily exclude metadata from analysis. I use R a lot, and it is a constant source of frustration having to remember to exclude attributes from analysis. It's all too easy to forget to exclude some attribute like an ID, only to find that it has silently polluted your analyses.

 

My only suggestion would be to make the roles Something1, Something2, etc. I expect you thought of that already, however.
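
A minimal sketch of that renaming for the Execute R script, assuming the `metaData$table_in[[i]]$role` layout used in the script from the original post. `assign_unique_roles` is a hypothetical helper, not part of RapidMiner or its R extension, and the `meta` list below is only a stand-in for the real metadata:

```r
# Hypothetical helper (not a RapidMiner function): gives each selected
# column its own special role name, e.g. something1, something2, something3.
assign_unique_roles <- function(meta, idx, prefix = "something") {
  for (k in seq_along(idx)) {
    meta[[idx[k]]]$role <- paste0(prefix, k)
  }
  meta
}

# Stand-in for metaData$table_in: five regular attributes and one label,
# mirroring the roles printed in the log of the original post.
meta <- lapply(1:6, function(i) {
  list(role = if (i == 6) "label" else "attribute", type = "real")
})

meta <- assign_unique_roles(meta, 1:3)
roles <- vapply(meta, function(col) col$role, character(1))
print(roles)
```

Inside `rm_main`, the same call could presumably be written as `metaData$table_in <<- assign_unique_roles(metaData$table_in, 1:3)`, following the `<<-` assignments in the original script. Whether all columns then survive at the operator output is untested here.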

 

Andrew

Regular Contributor

Re: columns with the same role will be conflicting?

Thank you so much, Andrew~

 

Just to add some detail: in my experiment, setting two columns to the same special role (anything other than `attribute`) causes one of the columns to be dropped entirely, not just to lose its role.
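
Since the column disappears silently, a guard inside the R script can flag the situation before the table is returned. This is only a sketch under the same assumption about the `metaData$table_in[[i]]$role` layout. `find_shared_roles` is a hypothetical helper, and `meta` is a stand-in for the real metadata:

```r
# Hypothetical helper: returns the special role names that are assigned
# to more than one column. The regular role is allowed to repeat.
find_shared_roles <- function(meta, regular = "attribute") {
  roles <- vapply(meta, function(col) col$role, character(1))
  special <- roles[roles != regular]
  unique(special[duplicated(special)])
}

# Stand-in metadata reproducing the "after changing role" log:
# att1..att3 share the role 'something', att4..att5 are regular, then a label.
meta <- lapply(1:6, function(i) {
  role <- if (i <= 3) "something" else if (i == 6) "label" else "attribute"
  list(role = role, type = "real")
})

shared <- find_shared_roles(meta)
if (length(shared) > 0) {
  message("Roles shared by several columns: ", paste(shared, collapse = ", "))
}
```

With the roles from my experiment, this reports `something` as shared, which is exactly the case where a column went missing at the output.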