Roles and Labels - A Quick Guide
The ID role
A typical data set has rows and columns (known as "examples" and "attributes" in RapidMiner Studio; columns are also sometimes called "features"). You can do most things with these attributes just the way they are:
But sometimes you want to give some attributes special "roles" for various reasons. For example, the most common role is an "id":
You see here that there is a NEW, BLUE attribute on the left that I called "sonarID". I made it an 'id' attribute because its values are unique identifiers and should NOT be used in modeling. An id column is also useful if you want to join this data set with other data sets.
How did I make this column an 'id'? I used the SET ROLE operator:
The Set Role operator is very easy to use. You select the attribute whose role you want to change and pick its new role. An attribute usually starts out as "regular", which means it has no special role at all. I set it to 'id' in this case. Easy!
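If it helps to see the idea outside of Studio, here is a minimal plain-Python sketch of what setting a role amounts to. This is only an analogy and not RapidMiner's API: the `set_role` function, the role dictionary, and the `attribute_1` / `attribute_2` names are all made up for illustration.

```python
# Plain-Python analogy of the Set Role idea; NOT RapidMiner's API.
# The function and the attribute_1 / attribute_2 names are hypothetical.

def set_role(roles, attribute, role):
    """Return a copy of the role mapping with one attribute's role changed."""
    if attribute not in roles:
        raise KeyError(f"unknown attribute: {attribute}")
    updated = dict(roles)
    updated[attribute] = role
    return updated

# Every attribute starts out as "regular" (no special role).
roles = {"sonarID": "regular", "attribute_1": "regular", "attribute_2": "regular"}

# Give the identifier column the 'id' role.
roles = set_role(roles, "sonarID", "id")

# Only the attributes that are still regular would be used as model inputs.
model_inputs = [name for name, role in roles.items() if role == "regular"]
print(model_inputs)  # ['attribute_1', 'attribute_2']
```

The point of the sketch is simply that a role is metadata attached to a column: the values in "sonarID" do not change, but the column drops out of the set of modeling inputs.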
The Label Role
The 'label' role is one of the most important roles in RapidMiner. It indicates which attribute is the predicted class when used in any modeling operator:
This ExampleSet has a new attribute called "Prediction", but you will get an error if you try to use it for modeling because it does not have the 'label' role:
Now I use the Set Role operator to change this attribute to 'label':
And voilà! My ExampleSet is ready for modeling.
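To make the "error without a label" point concrete, here is a small plain-Python sketch of the check a modeling step conceptually performs. Again, this is an analogy under my own assumptions, not RapidMiner code: `require_label` and the `attribute_1` name are hypothetical.

```python
# Plain-Python analogy of a learner's label check; NOT RapidMiner's API.
# require_label and attribute_1 are hypothetical names.

def require_label(roles):
    """Return the name of the label attribute, or raise if none is set."""
    labels = [name for name, role in roles.items() if role == "label"]
    if not labels:
        raise ValueError("no attribute has the 'label' role")
    return labels[0]

# "Prediction" exists, but it is still a regular attribute.
roles = {"attribute_1": "regular", "Prediction": "regular"}

try:
    require_label(roles)
    failed = False
except ValueError:
    failed = True  # no label yet, so there is nothing to predict

# After changing the role to 'label', the same check succeeds.
roles["Prediction"] = "label"
target = require_label(roles)
print(failed, target)  # True Prediction
```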
So what are the other roles and what are they used for? Let's just create a master list....
label: the attribute that will be the predicted class for a modeling operator later on, i.e. the dependent variable or the column of values you want to predict
Don't forget that if an attribute has a role, it is now considered a 'special attribute' and hence must be manually included in many operators such as Select Attributes or Filter Examples:
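Here is a rough plain-Python sketch of that behavior: an operator that only looks at regular attributes unless you explicitly ask it to include the special ones. The `attributes_in_scope` function, its `include_special` flag, and the `attribute_1` name are hypothetical and only illustrate the idea; they are not the actual operator parameters.

```python
# Plain-Python analogy of operators skipping special attributes by default;
# NOT RapidMiner's API. All names here are hypothetical.

def attributes_in_scope(roles, include_special=False):
    """List the attributes a hypothetical operator would act on."""
    return [
        name
        for name, role in roles.items()
        if include_special or role == "regular"
    ]

roles = {"sonarID": "id", "attribute_1": "regular", "Prediction": "label"}

default_scope = attributes_in_scope(roles)
full_scope = attributes_in_scope(roles, include_special=True)

print(default_scope)  # ['attribute_1']
print(full_scope)     # ['sonarID', 'attribute_1', 'Prediction']
```

So if an attribute with a role seems to be ignored by an operator, the first thing to check is whether that operator has been told to include special attributes.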