How to balance examples?
Hello everybody,
I have a classification problem with two classes and one of those classes is in large excess in my data set.
I would like to use roughly equal numbers of the two classes for my learner, so I wonder whether
there is a way to select only a subset of the examples whose class is in excess?
I looked at the Sampling operator, but that samples the same fraction from all classes.
Many thanks,
axel
Answers
There is probably a much smarter way of doing this, but I'm too wrecked to think of it ;D, so you'll have to make do with the following... You'd better test it as well, as I haven't!
Have fun...
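For readers who want to see the idea from the question (keeping only a subset of the over-represented class) outside RapidMiner, here is a minimal sketch in plain Python. It is only an illustration, not a RapidMiner process: the function name `undersample`, the `label_of` callback, and the `(features, label)` tuple layout are all assumptions made for this example.

```python
import random
from collections import defaultdict


def undersample(examples, label_of, seed=0):
    """Return a class-balanced subset of `examples`.

    Every class is cut down to the size of the rarest class by
    random sampling without replacement. `label_of` is a
    hypothetical callback that extracts the class label.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group the examples by their class label.
    by_label = defaultdict(list)
    for example in examples:
        by_label[label_of(example)].append(example)
    if not by_label:
        return []

    # The rarest class determines how many examples each class keeps.
    target = min(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid blocks of identical labels
    return balanced


# Made-up data: 90 examples of one class, 10 of the other.
data = [({"x": i}, "common") for i in range(90)]
data += [({"x": i}, "rare") for i in range(10)]
balanced = undersample(data, label_of=lambda example: example[1])
```

Note that this throws away most of the majority class, so it is only sensible when the rare class still has enough examples to learn from.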
If your learner supports weighted examples, you could use the equal label weighting operator. It assigns example weights so that every label receives the same total weight.
But I guess we should add some sort of balancing operator in the future...
Greetings,
Sebastian
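To make the weighting idea concrete, here is a small sketch in plain Python of what "the same total weight per label" means. It is an illustration under assumptions, not the operator's actual implementation: the function name `equal_label_weights` and the `total_weight` parameter are invented for this example, and the real operator may scale the weights differently.

```python
from collections import Counter


def equal_label_weights(labels, total_weight=1.0):
    """Return one weight per example so each label sums to the same total.

    `total_weight` (an assumed parameter) is split evenly across the
    distinct labels, then evenly across the examples of each label.
    """
    counts = Counter(labels)
    if not counts:
        return []
    per_label = total_weight / len(counts)
    return [per_label / counts[label] for label in labels]


# Made-up label column: 90 examples of one class, 10 of the other.
labels = ["common"] * 90 + ["rare"] * 10
weights = equal_label_weights(labels)
```

With these numbers, each "rare" example ends up with nine times the weight of a "common" one, so both classes contribute equally to a learner that honours example weights, and no data is discarded.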
That's not very nice, but it works!
Many thanks,
Axel
P.S. But I think RapidMiner really needs a special operator for this...
Thanks
Alejandro
Well, I think you either have to install RM 4.x, load it there, store it, and import the file, or you could take another valid RapidMiner 4.x process file, insert the code there, and import it with RapidMiner 5.0.
Or you simply build the process manually from scratch...
Greetings,
Sebastian