Merging examples
Hello,
My example set is similar to the one generated by this process:
Basically, I have something like this:
<operator name="Root" class="Process" expanded="yes">
    <operator name="OperatorChain" class="OperatorChain" expanded="no">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="target_function" value="random"/>
        </operator>
        <operator name="label is regular" class="ChangeAttributeRole">
            <parameter key="name" value="label"/>
        </operator>
        <operator name="BinDiscretization - 50" class="BinDiscretization">
            <parameter key="number_of_bins" value="50"/>
            <parameter key="range_name_type" value="short"/>
        </operator>
        <operator name="label is label" class="ChangeAttributeRole">
            <parameter key="name" value="label"/>
            <parameter key="target_role" value="label"/>
        </operator>
        <operator name="Nominal2Numerical" class="Nominal2Numerical">
        </operator>
        <operator name="BinDiscretization - 2" class="BinDiscretization">
            <parameter key="range_name_type" value="short"/>
        </operator>
        <operator name="Nominal2Numerical (2)" class="Nominal2Numerical">
        </operator>
        <operator name="Sorting" class="Sorting">
            <parameter key="attribute_name" value="label"/>
        </operator>
    </operator>
</operator>
I would like to merge all the "rangeX" examples so that, for each attribute, the maximum across all examples with the same ID is kept. For example, given:
label    att1  att2  att3  att4  att5
range1   1.0   0.0   0.0   0.0   1.0
range1   0.0   0.0   1.0   1.0   0.0
range10  1.0   1.0   1.0   0.0   0.0
range11  1.0   0.0   0.0   1.0   0.0
range11  1.0   0.0   0.0   1.0   1.0
range11  1.0   0.0   0.0   1.0   1.0
....
I want:
label    att1  att2  att3  att4  att5
range1   1.0   0.0   1.0   1.0   1.0
range10  1.0   1.0   1.0   0.0   0.0
range11  1.0   0.0   0.0   1.0   1.0
....
I hope I'm clear here... Unfortunately, I don't have access to the data format, so I must do this crazy trick. I guess I could always write my own operator to do this, but I'm sure RapidMiner already has all the necessary operators available for this!
Thanks for any pointers

- R
Answers
Thank you for this excellent post. Solving a problem described in such a detailed manner is fun. So I will not only point you to the Aggregation operator, but I also have an example process for you:
Greetings,
Sebastian
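For readers who want to check what the requested merge should produce: it amounts to grouping the examples by label and keeping the per-attribute maximum within each group. The following is only a minimal sketch of that logic in plain Python, not a RapidMiner process; the function name `merge_by_label_max` is hypothetical, and the rows are the sample data from the question.

```python
def merge_by_label_max(rows):
    """Group rows by their first field (the label) and keep,
    for every remaining attribute, the maximum seen in that group."""
    merged = {}  # label -> list of per-attribute maxima (insertion-ordered)
    for label, *values in rows:
        if label in merged:
            # Element-wise maximum against what we have so far for this label.
            merged[label] = [max(a, b) for a, b in zip(merged[label], values)]
        else:
            merged[label] = list(values)
    return [(label, *values) for label, values in merged.items()]


# Sample rows copied from the question: (label, att1, ..., att5).
rows = [
    ("range1", 1.0, 0.0, 0.0, 0.0, 1.0),
    ("range1", 0.0, 0.0, 1.0, 1.0, 0.0),
    ("range10", 1.0, 1.0, 1.0, 0.0, 0.0),
    ("range11", 1.0, 0.0, 0.0, 1.0, 0.0),
    ("range11", 1.0, 0.0, 0.0, 1.0, 1.0),
    ("range11", 1.0, 0.0, 0.0, 1.0, 1.0),
]

for row in merge_by_label_max(rows):
    print(*row)
```

On the sample rows this yields one row per label (range1, range10, range11), matching the desired table in the question. Inside RapidMiner, the same grouping is what the Aggregation operator mentioned above is meant to do.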
Many years later, I had a similar (if not exactly the same) problem as the OP.
I found the solution in this post.