How to perform oversampling to this

nathaliejoy Member Posts: 3 Learner I
I want to apply oversampling in my data analysis with RapidMiner. I believe my classes are not balanced. The label is NAT-Grade-Remarks, which has the values VLM, MTM, LM, and AM. I tried using the Sample operator, but it does not work; it keeps giving me an error.

The error always says that I have only one label. I believe my data is merely unbalanced: I have more than 700 rows, so I do not have a very small amount of data. Please help me with this.

I can't paste my XML, but here is a screen capture of the process:


Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,497   Unicorn
    You can try weighting or sampling if your data is imbalanced.
    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
  • tftemme Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 141  RM Research
    Hi @nathaliejoy

    You can also insert a breakpoint before the Sample (Stratified) operator to inspect the data that goes directly into the operator (to insert a breakpoint, right-click the operator and select the corresponding option).

    If the error message says there is only one label, it seems that the label attribute (I assume this is the NAT-Grade-Remarks attribute in your case) has only one value. Maybe an earlier preprocessing step accidentally reduces it to a single value, or something similar.

    Without a look at the data that goes into the Sample operator, we cannot do much more than guess for now.

    Hope this helps
    Best regards,
    Fabain
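
The two suggestions above (check how many distinct label values actually reach the sampling step, then rebalance by sampling) can be sketched outside RapidMiner in plain Python. This is only an illustration of the idea, not how the Sample operator is implemented. The attribute name NAT-Grade-Remarks and its values come from the question; the row counts, the `rows` list, and the helper names `label_counts` and `random_oversample` are made up for the example.

```python
import random
from collections import Counter

LABEL = "NAT-Grade-Remarks"  # label attribute named in the question


def label_counts(rows, label=LABEL):
    """Count how often each label value occurs (the class distribution)."""
    return Counter(row[label] for row in rows)


def random_oversample(rows, label=LABEL, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the largest class."""
    counts = label_counts(rows, label)
    if len(counts) < 2:
        # Mirrors the "only one label" situation: nothing to balance.
        raise ValueError(f"label '{label}' has only one value: {list(counts)}")
    target = max(counts.values())
    rng = random.Random(seed)
    balanced = list(rows)
    for value, n in counts.items():
        pool = [row for row in rows if row[label] == value]
        # Sample with replacement; k is 0 for the largest class.
        balanced.extend(rng.choices(pool, k=target - n))
    return balanced


# Hypothetical, invented class sizes just to exercise the functions.
rows = (
    [{LABEL: "VLM"} for _ in range(50)]
    + [{LABEL: "MTM"} for _ in range(20)]
    + [{LABEL: "LM"} for _ in range(8)]
    + [{LABEL: "AM"} for _ in range(2)]
)

print("before:", dict(label_counts(rows)))
balanced = random_oversample(rows)
print("after: ", dict(label_counts(balanced)))
```

If the first count shows a single value, the problem lies in a step before sampling, as suggested above, and no amount of oversampling will fix it.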
