Filtering examples based on number of occurrences in attribute

Baski · January 2017

Hi,

I have examples that contain information about visits. Every visit is assigned to a visitor_id. I want to filter the examples (rows) where the visitor_id occurs more than 5 times, so that there will be no more than 4 rows for every visitor_id. I tried Filter Examples, but that was not helpful.

Any idea how to do this in RapidMiner?
Thanks.

IngoRM · January 2017

Hi,

While I am pretty sure that the answer to this question will involve the operators "Aggregate", "Pivot", and "Filter Examples", I am unfortunately not sure that I fully understood the problem. Can you give us a small data sample (original data) and show what the desired output for this sample should look like?

Merci,

Ingo

sgenzer · January 2017

Hi... no, the Filter Examples operator on its own is not going to help you here (as you saw). The way I see it, you first need to create an attribute that holds the number of occurrences, and then you can filter on n > 5 or whatever threshold you need. Personally I would use the Aggregate operator, grouping by visitor_id and counting visitor_id as the aggregation. Then join the result with your original data set on the visitor_id attribute.
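
For anyone who wants to see the count, join, filter idea spelled out, here is a rough sketch in plain Python. It is only an illustration of the logic, not a RapidMiner process: the sample rows, the `count` attribute name, the two helper functions and the threshold of 4 are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical sample data: one dict per visit (one example/row).
visits = (
    [{"visitor_id": "a", "page": f"p{i}"} for i in range(6)]    # "a" occurs 6 times
    + [{"visitor_id": "b", "page": f"p{i}"} for i in range(2)]  # "b" occurs 2 times
)


def add_visit_counts(rows, key="visitor_id"):
    """Count rows per key (the aggregate step) and attach the count
    to every original row (the join step)."""
    counts = Counter(row[key] for row in rows)
    return [{**row, "count": counts[row[key]]} for row in rows]


def filter_by_count(rows, max_count):
    """Keep only rows whose key occurs at most max_count times (the filter step)."""
    return [row for row in rows if row["count"] <= max_count]


# Assumed threshold: keep visitors with no more than 4 rows.
kept = filter_by_count(add_visit_counts(visits), max_count=4)
```

With this sample, every row of visitor "a" is dropped and both rows of visitor "b" are kept. Change `max_count` to whatever cut-off you actually need.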

Scott
