
How to perform oversampling to this

nathaliejoy Member Posts: 7 Contributor II
I want to apply oversampling in my data analysis with RapidMiner. I believe my categories are not balanced. The category attribute is NAT-Grade-Remarks, and its values are VLM, MTM, LM, and AM. I tried using the Sample operator, but it does not work; it keeps giving me an error.

It always tells me that I have only one label. I believe my data is just unbalanced: I have more than 700 rows, so I do not have a very small amount of data. Please help me with this. The following is my XML:

I can't paste my XML, but here is a screenshot of the process:


Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can try weighting or sampling if your data is imbalanced.
    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
  • tftemme Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 164 RM Research
    Hi @nathaliejoy

    You can also insert a breakpoint before the Sample (Stratified) operator to inspect the data that goes directly into the operator (insert a breakpoint by right-clicking the operator and selecting the corresponding option).

    When the error message says that there is only one label, it seems that the label attribute (I assume this is the NAT-Grade-Remarks attribute in your case) has only one value. Maybe it is accidentally reduced to a single value in the preprocessing before the operator, or something similar.

    Without a look at the data that goes into the Sample operator, we cannot do much more than guess for now.

    Hope this helps.
    Best regards,
    Fabain
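
To illustrate the two suggestions in the answers above outside of RapidMiner, here is a minimal plain-Python sketch. It is not the RapidMiner Sample operator and not taken from the original process. It first counts the label values (the same check a breakpoint would let you do by eye), then duplicates minority-class rows at random until every class matches the largest one. The helper names `label_counts` and `random_oversample` and the toy rows are hypothetical; only the attribute name and its four values come from the question.

```python
import random
from collections import Counter


def label_counts(rows, label):
    """Count how many rows carry each value of the label attribute."""
    return Counter(row[label] for row in rows)


def random_oversample(rows, label, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    counts = label_counts(rows, label)
    if len(counts) < 2:
        # Mirrors the "only one label" situation described in the thread.
        raise ValueError(f"label {label!r} has only one value: {list(counts)}")
    target = max(counts.values())
    rng = random.Random(seed)
    balanced = list(rows)
    for value, n in counts.items():
        pool = [row for row in rows if row[label] == value]
        # Draw with replacement; k is 0 for the largest class.
        balanced.extend(rng.choices(pool, k=target - n))
    return balanced


# Hypothetical toy data; the real data set has more than 700 rows.
rows = (
    [{"score": i, "NAT-Grade-Remarks": "AM"} for i in range(6)]
    + [{"score": i, "NAT-Grade-Remarks": "MTM"} for i in range(3)]
    + [{"score": i, "NAT-Grade-Remarks": "LM"} for i in range(2)]
    + [{"score": i, "NAT-Grade-Remarks": "VLM"} for i in range(1)]
)

print(label_counts(rows, "NAT-Grade-Remarks"))
balanced = random_oversample(rows, "NAT-Grade-Remarks")
print(label_counts(balanced, "NAT-Grade-Remarks"))
```

If the first count already shows a single value, the problem lies in the preprocessing before sampling, not in the sampling step itself. Oversampling is normally applied to the training portion only, so that duplicated rows do not leak into the data used for evaluation.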
