Accounting for number of observations / evidence

User23400 Member Posts: 3 Contributor I
edited December 2019 in Help
Dear RM-Enthusiasts,

Working on an online advertising dataset I have a list with Product-IDs. Every product has a number of attributes and I want to predict one of them. A fairly basic Decision Tree model is already yielding acceptable results.

However, I still have one source of predictive potential that is not yet used: the number of observations. The data for some Product-IDs are based on 1 observation, while others are based on 20 or more. Obviously, I would like to weight the data for IDs with many observations more heavily than the data for IDs with few.

Can anybody direct me to a way of handling this? Maybe a tutorial or YouTube video?

Any advice would be greatly appreciated. Thanks in advance!



  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,453 RM Data Scientist
    You can use Aggregate to generate this count and then set the role of this attribute to weight. Examples with a higher weight then count for more in the learners.

    Be a bit careful with it, though. It may bias the model towards well-known things, i.e. the IDs with many observations.

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • User23400 Member Posts: 3 Contributor I

    Dear Martin,

    Thanks a lot for your response. I understand the concept and I found 3 “Aggregate” operators: Generate Aggregation, Aggregate and Extract aggregates. I chose “Aggregate”.

    Next, I chose number_observations as “aggregation attribute”. When selecting the corresponding “aggregation_function” (average, concatenation, count etc.) though, I could not find “weight”.

    Do you have any idea where I’m going wrong?



  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User23400,

    You have to choose count as the aggregation function in the parameters of the Aggregate operator.
    Then put a Set Role operator in your process and, in the parameters of this operator, select the attribute you just created as attribute name and set weight as target role.


  • User23400 Member Posts: 3 Contributor I
    Thanks Lionel,

    Clear. It has worked so far, but I now only have the aggregated attribute on the output port of the Aggregate operator. The other attributes are not passed through. I tried a few things but couldn't get it to work. Any idea?
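
For readers following along outside RapidMiner, the idea discussed in this thread can be sketched in plain Python. This is only an illustration under assumptions, not the RapidMiner process itself: the column names (`product_id`, `price`, `clicked`) and both helper functions are hypothetical. It counts the observations per ID, attaches that count to every original row as a weight so that the other attributes are kept, and then shows how a weight changes a simple majority vote.

```python
from collections import Counter

# Hypothetical example rows; the attribute names are made up for illustration.
rows = [
    {"product_id": "A", "price": 10.0, "clicked": 1},
    {"product_id": "A", "price": 10.0, "clicked": 0},
    {"product_id": "A", "price": 10.0, "clicked": 1},
    {"product_id": "B", "price": 25.0, "clicked": 0},
]


def add_count_weight(rows, key="product_id", weight_name="weight"):
    """Count rows per key and join the count back onto every row."""
    # Aggregation step: number of observations per ID.
    counts = Counter(row[key] for row in rows)
    # Join step: copy each row and add its ID's count, so no attribute is lost.
    return [{**row, weight_name: counts[row[key]]} for row in rows]


def weighted_majority(rows, label="clicked", weight_name="weight"):
    """Return the label value with the largest total weight."""
    totals = Counter()
    for row in rows:
        totals[row[label]] += row[weight_name]
    return totals.most_common(1)[0][0]


weighted_rows = add_count_weight(rows)
majority = weighted_majority(weighted_rows)
```

The point of the join step is that the count is computed per ID but stored next to the untouched original attributes, which is the part that goes missing when only the aggregated attribute is passed on. The weighted vote also illustrates the caution raised earlier in the thread: rows from IDs with many observations dominate the result.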
