How to control the proportion of predicted examples

stephen5huo Member Posts: 5 Contributor II
edited November 2018 in Help
I need help understanding how to control the proportion of predicted examples in a binomial classification problem. My data has a binomial class label, either 1 or 0. The training set has 20,000 records, of which around 2,600 have class label 1; all the others are class 0. The test set has around 5,000 records, but only around 100 of them belong to class 1. When I run the prediction with different algorithms (logistic regression, linear regression, support vector machine, neural net…), I find it very difficult to control the proportion of records predicted as class 1. Ideally I would like to control the mining process so that only around 3% to 5% of the whole dataset is predicted as class 1 and around 95% of the records are classified as class 0. Any ideas? Many thanks.
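
To make the goal concrete, here is a minimal Python sketch of one way to fix the predicted share: rank the records by the model's confidence for class 1 and label only the top fraction as class 1. This is not a RapidMiner process; the function name `top_fraction_labels` and the random scores are hypothetical stand-ins for a real model's confidence output.

```python
import random


def top_fraction_labels(scores, fraction):
    """Label the highest-scoring `fraction` of records as 1 and the rest as 0."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_positive = round(len(scores) * fraction)
    # Record indices ordered from highest score to lowest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    labels = [0] * len(scores)
    for i in ranked[:n_positive]:
        labels[i] = 1
    return labels


# Hypothetical class-1 confidences; a real run would use the model's scores.
random.seed(0)
scores = [random.random() for _ in range(5000)]

labels = top_fraction_labels(scores, 0.04)  # aim for about 4% predicted as 1
predicted_share = sum(labels) / len(labels)
print(f"share predicted as class 1: {predicted_share:.2%}")
```

The cut-off is derived from the ranking rather than fixed at 0.5, so the predicted proportion stays at the chosen value whichever learner produced the scores.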



    hagen85 Member Posts: 18 Contributor II
    Hi Stephen,

    Have you tried using a Sample operator with the "balance data" option? Doing so can help the learner reach good accuracy on both classes.
    Hope I got your problem right.
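
For readers outside RapidMiner, the idea of balancing the training data can be sketched in plain Python as downsampling the larger class to the size of the smaller one. This is only an illustration of the general technique, not the Sample operator's actual implementation; the helper `balance_by_downsampling` and the toy records (sized after the approximate counts in the question) are assumptions.

```python
import random


def balance_by_downsampling(records, label_of, seed=0):
    """Return a shuffled sample with equally many class-0 and class-1 records."""
    rng = random.Random(seed)
    positives = [r for r in records if label_of(r) == 1]
    negatives = [r for r in records if label_of(r) == 0]
    # Keep every record of the smaller class, sample the same count from the larger.
    minority, majority = sorted([positives, negatives], key=len)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Toy (id, label) records: about 2,600 ones and 17,400 zeros, as in the question.
records = [(i, 1) for i in range(2600)] + [(i, 0) for i in range(2600, 20000)]

balanced = balance_by_downsampling(records, label_of=lambda r: r[1])
print(len(balanced), sum(label for _, label in balanced))
```

Balancing changes what the learner sees during training, but it does not by itself fix the share of records predicted as class 1, so the predicted confidences may still need a cut-off chosen for the desired proportion.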
