Accounting for number of observations / evidence

User23400 Member Posts: 3 Contributor I
edited December 2019 in Help
Dear RM-Enthusiasts,

I am working on an online advertising dataset that contains a list of Product-IDs. Every product has a number of attributes, and I want to predict one of them. A fairly basic Decision Tree model already yields acceptable results.

However, I still have one source of predictive potential that is not yet used: the number of observations. The data for some Product-IDs is based on a single observation, while for others it is based on 20 or more. Obviously, I would like to weight the data for IDs with many observations more heavily than the data for IDs with few.

Can anybody point me to a way of handling this? Maybe a tutorial or a YouTube video?

Any advice would be greatly appreciated. Thanks in advance!

Best,
Marc

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,517 RM Data Scientist
    Hi,
    you can use the Aggregate operator to generate this count and then set the role of the resulting attribute to weight. Examples with a higher count then carry more weight in the learner.

    Be a bit careful with this, though: it may bias the model towards well-known items.

    Best,
    martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • User23400 Member Posts: 3 Contributor I

    Dear Martin,

    Thanks a lot for your response. I understand the concept and found 3 “Aggregate” operators: Generate Aggregation, Aggregate and Extract aggregates. I chose “Aggregate”.

    Next, I chose number_observations as “aggregation attribute”. When selecting the corresponding “aggregation_function” (average, concatenation, count etc.) though, I could not find “weight”.

    Do you have any idea where I’m going wrong?


    Best,

    Marc

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User23400,

    In the parameters of the Aggregate operator, choose count as the aggregation function.
    Then add a Set Role operator to your process and, in its parameters, select the attribute you just created as attribute name and set weight as target role.

    Regards,

    Lionel
  • User23400 Member Posts: 3 Contributor I
    Thanks Lionel,

    Clear. That worked so far, but I now only have the aggregated attribute on the output port of the Aggregate operator. The other attributes are not passed through. I tried a few things but couldn't get it to work. Any idea?

    Thanks,
    Marc
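
The steps discussed in this thread (count observations per Product-ID, keep the other attributes, use the count as a weight) can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions: the observation log, the product table, the column names and the `weighted_majority` helper are all hypothetical, and the weighted vote is a simplified stand-in for how a learner might use example weights, not RapidMiner's actual implementation. Attaching the counts back to the product rows by ID plays the role of a join, which is one plausible way to keep the attributes that an aggregation drops.

```python
from collections import Counter

# Hypothetical observation log: one entry per observation, keyed by product ID.
observations = ["A", "A", "A", "A", "B", "C", "C"]

# Hypothetical product table: one row per product, with its other attributes.
products = [
    {"product_id": "A", "category": "shoes", "label": "high"},
    {"product_id": "B", "category": "shoes", "label": "low"},
    {"product_id": "C", "category": "bags", "label": "low"},
]

# Step 1: count the observations per product ID (the "count" aggregation).
counts = Counter(observations)

# Step 2: attach the count to each product row by ID, so that all other
# attributes are kept alongside the new weight column (a join on product_id).
weighted_products = [
    {**row, "weight": counts[row["product_id"]]} for row in products
]


def weighted_majority(rows):
    """Return the label with the largest total weight (simplified weighted vote)."""
    totals = Counter()
    for row in rows:
        totals[row["label"]] += row["weight"]
    return totals.most_common(1)[0][0]


def unweighted_majority(rows):
    """Return the most frequent label, ignoring weights."""
    return Counter(row["label"] for row in rows).most_common(1)[0][0]


weighted_prediction = weighted_majority(weighted_products)
unweighted_prediction = unweighted_majority(weighted_products)
```

In this toy data the unweighted vote follows the two thinly observed products, while the weighted vote follows product A with its four observations. That is the intended effect of weighting, and also the bias towards well-known items that Martin warns about.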