Combine datasets with more than one 'label' attribute while retaining both 'label' attributes

Emily_123 Member Posts: 1 Learner I
edited November 2018 in Help

Hi!

I was wondering whether RapidMiner has an operator that can combine two data sets that each have a different 'label' attribute while retaining both 'label' attributes in the new data set. Most blending operators only retain the 'label' attribute of the first data set. Both data sets share the same ID attribute.

 

My data looks like this:

ID   Qty A (label)   Prediction Qty A
A    2               3
B    4               3.5

 

I want to combine it with this:

ID   Qty B (label)   Prediction Qty B
B    7               6.5
A    6               6.5

 

I want the new data set to look like this:

ID   Qty A (label)   Prediction Qty A   Qty B (label)   Prediction Qty B
A    2               3                  6               6.5
B    4               3.5                7               6.5

 

 

 

Thank you!

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist

    Hey,

     

    Sure you can. Since roles need to be unique, you would need to set the roles of your two label attributes to something like label_1 and label_2, and then join afterwards.

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • jczogalla Employee, Member Posts: 144 RM Engineering

    Hi Emily!

    If you use the Join operator to combine the two data sets by ID, RapidMiner will remove the second label attribute, because label is a special role, i.e. it can only be assigned to one attribute. If you want to keep the second label attribute, assign it another special role: use the Set Role operator on the second data set before joining and give the label the role "label2", for example. The only thing to keep in mind is that label is not just a special role but also a predefined one, meaning that most learning operators use it.

     

    Cheers

    Jan
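
To make the role-uniqueness point in the answers above concrete, here is a small toy model in plain Python. It is not RapidMiner's API: the dict-based data layout and the helpers `set_role` and `join_by_id` are hypothetical, written only to mimic the behaviour described (a join drops a right-hand attribute whose special role is already taken, unless that attribute is given a different role first). The prediction columns are treated as regular attributes here for simplicity, which is an assumption.

```python
# Toy model only: NOT RapidMiner's API. Data sets are plain dicts, and
# set_role / join_by_id are hypothetical helpers for illustration.

def set_role(roles, attribute, new_role):
    """Return a copy of the role mapping with `attribute` assigned `new_role`."""
    if new_role in roles.values():
        raise ValueError(f"role {new_role!r} is already taken")
    updated = dict(roles)
    updated[attribute] = new_role
    return updated


def join_by_id(left, right):
    """Inner join on 'ID'; right-hand attributes whose special role is
    already used on the left are dropped, as described in the thread."""
    taken = {role for attr, role in left["roles"].items() if attr != "ID"}
    dropped = {
        attr
        for attr, role in right["roles"].items()
        if attr != "ID" and role in taken
    }
    right_by_id = {row["ID"]: row for row in right["rows"]}
    rows = []
    for row in left["rows"]:
        match = right_by_id.get(row["ID"])
        if match is None:
            continue  # inner join: skip IDs missing on the right
        merged = dict(row)
        merged.update({k: v for k, v in match.items() if k not in dropped})
        rows.append(merged)
    roles = dict(left["roles"])
    roles.update({a: r for a, r in right["roles"].items() if a not in dropped})
    return {"rows": rows, "roles": roles}


data_a = {
    "rows": [
        {"ID": "A", "Qty A": 2, "Prediction Qty A": 3},
        {"ID": "B", "Qty A": 4, "Prediction Qty A": 3.5},
    ],
    "roles": {"ID": "id", "Qty A": "label"},
}
data_b = {
    "rows": [
        {"ID": "B", "Qty B": 7, "Prediction Qty B": 6.5},
        {"ID": "A", "Qty B": 6, "Prediction Qty B": 6.5},
    ],
    "roles": {"ID": "id", "Qty B": "label"},
}

# Joining as-is: both sides claim the role "label", so "Qty B" is lost.
naive = join_by_id(data_a, data_b)

# Re-role the second label first (the Set Role step), then join.
data_b_fixed = {
    "rows": data_b["rows"],
    "roles": set_role(data_b["roles"], "Qty B", "label2"),
}
fixed = join_by_id(data_a, data_b_fixed)
```

In this sketch `naive` has no "Qty B" column, while `fixed` keeps both quantity columns, with "Qty A" still carrying the role label and "Qty B" carrying label2. In RapidMiner itself the equivalent steps are the Set Role and Join operators mentioned in the answers.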
