Problem with an imbalanced dataset
hanaabdalrahman
Member Posts: 9 Learner III
Hello, I am new to data mining and RapidMiner, and I have a problem with an imbalanced data set. I work with decision tree, naïve Bayes, and random forest. The accuracy of DT and NB is very good, but it is not realistic. My question is: what are the best operators to use with these three techniques? My data set contains 1031 samples.
hana mohamed
student
Best Answer
MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,528 RM Data Scientist
Hi @hanaabdalrahman,
As mentioned earlier, this operator is part of an extension and not of RM core. The extension can be found in our marketplace.
Best,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Answers
@hanaabdalrahman you will need to use the Sample operator and toggle on the 'balance data' option. Then enter the classes and # of samples for each class.
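To illustrate what balanced sampling with a fixed number of samples per class does, here is a minimal Python sketch. It is not RapidMiner code and does not reproduce the Sample operator's internals. The function name `balanced_sample`, the toy rows, and the per-class counts are assumptions made for the example; the class sizes (44 false, 986 true) are the ones reported later in this thread.

```python
import random
from collections import defaultdict

def balanced_sample(rows, label_of, per_class, seed=0):
    """Draw up to per_class[label] rows from each class, without replacement."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for label, n in per_class.items():
        pool = by_class.get(label, [])
        # Never ask for more rows than the class actually has.
        sample.extend(rng.sample(pool, min(n, len(pool))))
    rng.shuffle(sample)
    return sample

# Toy data (hypothetical): 44 "false" rows and 986 "true" rows.
rows = [(i, "false") for i in range(44)] + [(i, "true") for i in range(986)]
balanced = balanced_sample(rows, lambda r: r[1], {"false": 44, "true": 44})
```

With these counts the majority class is cut down to the size of the minority class, so the result has 88 rows. That discards most of the "true" examples, which is the usual trade-off of downsampling.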
Hi Hana,
I recommend using the SMOTE operator, which is part of the Operator Toolbox extension.
Best,
Martin
Dortmund, Germany
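For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines of Python: pick a minority example, pick one of its nearest minority neighbours, and create a synthetic point somewhere on the line between them. This is only an illustration of the idea, not the Operator Toolbox implementation. The function name `smote_sketch`, the parameter `k`, and the toy points are assumptions, and the sketch handles numeric attributes only.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position between base and neighbour, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical minority class with two numeric attributes.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(points, n_new=10, k=2)
```

Because every synthetic point lies between two real minority examples, the new points stay inside the region the minority class already occupies rather than being exact copies.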
Thanks for the reply...
But how do I use it? The class false has only 44 examples and the class true has about 986.
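These class counts also explain why the accuracy looked very good without being meaningful. As a rough check, assuming exactly 44 false and 986 true examples, a model that always predicts "true" already reaches about 96% accuracy while finding none of the false cases:

```python
# Class counts taken from this thread (the "true" count is approximate).
n_false, n_true = 44, 986
total = n_false + n_true

# A trivial model that always predicts the majority class "true".
accuracy = n_true / total      # every "true" example is classified correctly
recall_false = 0 / n_false     # no "false" example is ever found

print(f"accuracy     = {accuracy:.3f}")      # accuracy     = 0.957
print(f"recall false = {recall_false:.3f}")  # recall false = 0.000
```

So on data like this, per-class recall or precision says more about the model than overall accuracy does.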
thanks...
I work with version 8.0.001, and this operator is not found in it. What is the best one to use instead, and how does it work?
Thanks very much. The upsampling operator solved the problem.
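For anyone finding this thread later, upsampling can be pictured as repeating minority examples at random until every class is as large as the biggest one. The Python sketch below shows that idea only. It is not the RapidMiner operator, and the function name `upsample` and the toy rows are assumptions made for the example.

```python
import random
from collections import Counter

def upsample(rows, label_of, seed=0):
    """Randomly duplicate rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    out = []
    for members in by_class.values():
        out.extend(members)
        # Draw the missing rows with replacement from the same class.
        out.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(out)
    return out

# Toy data (hypothetical) with the class sizes from this thread.
rows = [(i, "false") for i in range(44)] + [(i, "true") for i in range(986)]
upsampled = upsample(rows, lambda r: r[1])
counts = Counter(label for _, label in upsampled)
```

Upsampling should be applied to the training data only. If duplicated rows end up in the test data as well, the measured performance will look better than it really is.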