Join Columns Different Processes

Contributor I emanuelmcruz

Hi, how can I join columns from different processes into one process?

I've built four processes, each of which calculates one column. Now I want to show the columns I created together on a single results page. How can I do that?


Re: Join Columns Different Processes

If your different example sets share an ID attribute, you can apply the JOIN operator repeatedly, adding one example set at a time.
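
As a rough sketch of the JOIN approach, the process below joins two example sets on their ID attribute. The two generator operators are only stand-ins for the outputs of your own processes, and the Generate ID steps are an assumption that is needed only because this generated data has no ID of its own. The operator classes, port names and parameter keys are written from memory of the standard RapidMiner operators, so check them against your installed version. To combine four example sets, the same pattern could be chained with further JOIN operators.

```xml
<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
  <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
      <!-- Stand-in for the result of the first process -->
      <operator activated="true" class="generate_direct_mailing_data" compatibility="8.2.000" expanded="true" height="68" name="Generate Direct Mailing Data" width="90" x="45" y="34">
        <parameter key="number_examples" value="100"/>
      </operator>
      <!-- Adds an ID attribute (assumed necessary only because the generated data has none) -->
      <operator activated="true" class="generate_id" compatibility="8.2.000" expanded="true" height="82" name="Generate ID" width="90" x="179" y="34"/>
      <!-- Stand-in for the result of the second process, with the same number of rows -->
      <operator activated="true" class="generate_churn_data" compatibility="8.2.000" expanded="true" height="68" name="Generate Churn Data" width="90" x="45" y="187">
        <parameter key="number_examples" value="100"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="8.2.000" expanded="true" height="82" name="Generate ID (2)" width="90" x="179" y="187"/>
      <!-- Inner join on the ID attribute: rows with matching IDs are combined into one row -->
      <operator activated="true" class="join" compatibility="8.2.000" expanded="true" height="82" name="Join" width="90" x="380" y="85">
        <parameter key="join_type" value="inner"/>
        <parameter key="use_id_attribute_as_key" value="true"/>
        <list key="key_attributes"/>
      </operator>
      <connect from_op="Generate Direct Mailing Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Join" to_port="left"/>
      <connect from_op="Generate Churn Data" from_port="output" to_op="Generate ID (2)" to_port="example set input"/>
      <connect from_op="Generate ID (2)" from_port="example set output" to_op="Join" to_port="right"/>
      <connect from_op="Join" from_port="join" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
```

Joining on an ID matches rows by key rather than by position, so it should still work when the example sets are sorted differently, whereas a row-by-row merge relies on identical row order.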


If you don't have an ID but the example sets have the same number of rows, you could use MERGE ATTRIBUTES. For this one you have to install both the Operator Toolbox extension and the Text Mining extension.


Here's the sample process that accompanies the MERGE ATTRIBUTES operator:


<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
  <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_direct_mailing_data" compatibility="8.2.000" expanded="true" height="68" name="Generate Direct Mailing Data" width="90" x="45" y="34">
        <parameter key="number_examples" value="9000"/>
        <description align="center" color="transparent" colored="false" width="126">The direct mailing data is the primary data source.</description>
      </operator>
      <operator activated="true" class="generate_churn_data" compatibility="8.2.000" expanded="true" height="68" name="Generate Churn Data" width="90" x="45" y="238">
        <parameter key="number_examples" value="9000"/>
        <description align="center" color="transparent" colored="false" width="126">For demonstration the churn data is assumed to be additional information about the 9000 examples of the direct mailing data.</description>
      </operator>
      <operator activated="true" class="operator_toolbox:merge" compatibility="1.0.000" expanded="true" height="103" name="Merge" width="90" x="447" y="34">
        <description align="center" color="transparent" colored="false" width="126">Both ExampleSets are merged together, without any join operation. The attributes of the second set are just appended to the first set.</description>
      </operator>
      <connect from_op="Generate Direct Mailing Data" from_port="output" to_op="Merge" to_port="example set 1"/>
      <connect from_op="Generate Churn Data" from_port="output" to_op="Merge" to_port="example set 2"/>
      <connect from_op="Merge" from_port="merged set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <description align="center" color="yellow" colored="false" height="222" resized="true" width="357" x="575" y="92">The parameter 'handling of duplicate attributes' is set to rename. Thus the attribute 'label' of the second ExampleSet is renamed to label_2, because the first ExampleSet already has the attribute 'label'.&lt;br&gt;&lt;br&gt;The parameter 'handling of special attributes' is set to keep_first_special_other_regular. Thus the attribute 'label' which has also the role 'label' in the first ExampleSet is kept with this role.&lt;br&gt;For the attribute 'label_2' (which was renamed, see above) the role is changed from label to regular.</description>
    </process>
  </operator>
</process>