RapidMiner

Combine datasets with more than one 'label' attribute while retaining both 'label' attributes

Newbie Emily_123

Hi!

I was wondering whether RapidMiner has an operator that can combine two datasets that each have their own 'label' attribute and retain both 'label' attributes in the new dataset. Most blending operators only retain the 'label' attribute of the first dataset. Both datasets have an ID attribute with the same values.

My data looks like this:

ID   Qty A (label)   Prediction Qty A
A    2               3
B    4               3.5

I want to combine it with this:

ID   Qty B (label)   Prediction Qty B
B    7               6.5
A    6               6.5

I want the new dataset to look like this:

ID   Qty A (label)   Prediction Qty A   Qty B (label)   Prediction Qty B
A    2               3                  6               6.5
B    4               3.5                7               6.5
Thank you!
2 REPLIES
RM Staff

Hey,

Sure you can. Since roles need to be unique, set the roles of your two label attributes to something like label_1 and label_2 first, and then join the data sets.

~Martin

--------------------------------------------------------------------------
Head of Data Science Services at RapidMiner
RM Staff

Hi Emily!

If you use the Join operator to combine the two data sets by ID, RapidMiner will remove the second label attribute because label is a special role, i.e. it can only be assigned to one attribute. If you want to keep the second label attribute, use the Set Role operator on the second data set before joining and give its label another special role, for example "label2". The only thing to keep in mind is that label is not just any special role but a predefined one that most learning operators rely on, so an attribute with the role "label2" will not be picked up as the label by those operators.
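
To make the idea concrete, here is a rough toy model in plain Python. It is not RapidMiner code: the row/role dictionaries and the functions set_role and join_on_id are hypothetical names made up for this sketch, and the prediction columns are assumed to have a regular role. It only mimics the behaviour described above, where a second attribute with an already-used special role is dropped on join unless its role is changed first.

```python
# Toy model only: these structures and function names are hypothetical
# and are not part of RapidMiner's API.

def set_role(roles, attribute, new_role):
    """Return a copy of the role mapping with one attribute's role changed."""
    updated = dict(roles)
    updated[attribute] = new_role
    return updated


def join_on_id(left_rows, left_roles, right_rows, right_roles, key="ID"):
    """Inner-join two row lists on `key`.

    Right-hand attributes whose special role is already used on the left
    are dropped, mimicking the behaviour described in this thread.
    """
    taken = {role for role in left_roles.values() if role != "regular"}
    kept = [
        name for name, role in right_roles.items()
        if name != key and (role == "regular" or role not in taken)
    ]
    right_by_id = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_id.get(row[key])
        if match is not None:
            joined.append({**row, **{name: match[name] for name in kept}})
    return joined


# The two data sets from the question (predictions assumed to be regular).
set_a = [
    {"ID": "A", "Qty A": 2, "Prediction Qty A": 3},
    {"ID": "B", "Qty A": 4, "Prediction Qty A": 3.5},
]
roles_a = {"ID": "id", "Qty A": "label", "Prediction Qty A": "regular"}

set_b = [
    {"ID": "B", "Qty B": 7, "Prediction Qty B": 6.5},
    {"ID": "A", "Qty B": 6, "Prediction Qty B": 6.5},
]
roles_b = {"ID": "id", "Qty B": "label", "Prediction Qty B": "regular"}

# Joining directly loses "Qty B" because the label role is already taken.
without_rename = join_on_id(set_a, roles_a, set_b, roles_b)

# Changing the role on the second data set first keeps both columns.
roles_b_renamed = set_role(roles_b, "Qty B", "label2")
result = join_on_id(set_a, roles_a, set_b, roles_b_renamed)

for row in result:
    print(row)
```

Rows are matched by ID, so the order of the rows in the second data set does not matter.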

 

Cheers

Jan