Group-based sampling
chrisleong
Member Posts: 4 Contributor I
I have a data set of users, each of whom attends a school. I want to filter the data so that only the rows for a random sample of schools are kept. My current method uses `Select attribute` to get the list of schools (one row per attending student), applies `Remove duplicates`, samples the schools, and then joins the result with the original dataset.
Four operators seem rather complicated for such a simple task, so I was wondering if there is a better way.
Answers
There are so many "easy operations" around that we decided not to implement each of them as its own operator, as long as the same result can be achieved with a combination of other operators. So what you are doing is exactly the right way.
Best regards,
Marius
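
For readers who want to see the logic outside RapidMiner, the four steps discussed above (select the school column, remove duplicates, sample the schools, join back to the original data) can be sketched in plain Python. This is only an illustration, not a RapidMiner process: the function name `sample_by_group`, the column names, and the example rows are all made up for this sketch.

```python
import random


def sample_by_group(rows, key, n_groups, seed=None):
    """Keep every row whose `key` value belongs to a random sample of groups.

    Hypothetical helper for illustration; `rows` is a list of dicts.
    """
    # Steps 1-2: select the group column and remove duplicates
    # (dict.fromkeys keeps the first occurrence of each value, in order).
    groups = list(dict.fromkeys(row[key] for row in rows))

    # Step 3: sample the distinct groups without replacement.
    rng = random.Random(seed)
    chosen = set(rng.sample(groups, min(n_groups, len(groups))))

    # Step 4: "join" back to the original data by keeping only rows
    # whose group was sampled.
    return [row for row in rows if row[key] in chosen]


# Made-up example data: five users spread over three schools.
users = [
    {"user": "u1", "school": "A"},
    {"user": "u2", "school": "A"},
    {"user": "u3", "school": "B"},
    {"user": "u4", "school": "C"},
    {"user": "u5", "school": "C"},
]

subset = sample_by_group(users, "school", n_groups=2, seed=0)
```

Because the sampling is done on the distinct schools rather than on the rows, each school is either kept with all of its users or dropped entirely, which is the behaviour the operator chain produces.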