
Why are the accuracy and AUC values so different?

Dhiii12 Member Posts: 8 Learner I
Please help me,

Why does a model with 65% accuracy get an AUC of 0.727, while a model with 73.54% accuracy gets an AUC of only 0.711? What affects both measures, and why does the model with higher accuracy have the lower AUC?

I thank you in advance for your help! 

Best regards, 

Dhiii

Answers

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi!

    This often happens with an imbalanced label distribution, e.g. if one value of the label covers 80% of the data and the other only 20%. But there can be other reasons, too.

    AUC and accuracy measure different things. 

    Accuracy is just the percentage of correct predictions. In an extreme case, the simplest possible model just predicts the majority class and is right as often as that class occurs in the data. But this is not a good model (not one you'd want to use).
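    To illustrate the point above, here is a minimal sketch using scikit-learn (an assumption on my side; the thread itself is about RapidMiner, but the arithmetic is the same): with an 80/20 label split, a model that always predicts the majority class already scores 80% accuracy while learning nothing.

```python
# Toy example: majority-class prediction on imbalanced labels.
from sklearn.metrics import accuracy_score

y_true = [1] * 80 + [0] * 20   # imbalanced labels: 80% class 1, 20% class 0
y_majority = [1] * 100         # trivial "model": always predict class 1

# Accuracy equals the majority-class share of the data.
print(accuracy_score(y_true, y_majority))  # 0.8
```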

    AUC measures the true positive rate against the false positive rate at each confidence level. The red line you see in the chart goes up for correct predictions and to the right for wrong ones. The line traces a "curve", and the AUC is the area under that curve. So the best AUC (1.0) is achieved by giving every correct prediction the highest confidence.

    So AUC also takes the confidence levels into account, not just the (more or less arbitrary) decision boundary applied by Apply Model. It is a more complex but more reliable measure, and it is not affected by imbalanced label distributions.
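    The effect the original question asks about can be reproduced with made-up numbers (a sketch with my own toy data and scikit-learn, not the poster's actual models): model B wins on accuracy at the 0.5 threshold, yet model A ranks the examples better by confidence and therefore gets the higher AUC.

```python
# Two score vectors for the same labels: accuracy only sees which side of
# the threshold a score falls on, AUC sees the whole ranking.
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = [0] * 5 + [1] * 5

# Confidence scores for the positive class (first 5 entries are negatives).
scores_a = [0.55, 0.60, 0.30, 0.20, 0.10,
            0.65, 0.70, 0.75, 0.80, 0.45]
scores_b = [0.45, 0.40, 0.48, 0.20, 0.10,
            0.55, 0.95, 0.60, 0.52, 0.05]

def acc_at_threshold(scores, thr=0.5):
    """Accuracy after thresholding confidences, like Apply Model does."""
    return accuracy_score(y_true, [1 if s > thr else 0 for s in scores])

# Model A: lower accuracy, but better ranking -> higher AUC.
print(acc_at_threshold(scores_a), roc_auc_score(y_true, scores_a))
# Model B: higher accuracy, but worse ranking -> lower AUC.
print(acc_at_threshold(scores_b), roc_auc_score(y_true, scores_b))
```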

    Check the confusion matrix. Do you have a situation where one class is being overpredicted (produces many false positives)? That would be a case where AUC is a better indicator of model performance than accuracy.
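    A quick sketch of that check (hypothetical counts, again using scikit-learn rather than RapidMiner): accuracy looks respectable, but the confusion matrix shows the model overpredicting the majority class and missing most of the minority class.

```python
# Hypothetical imbalanced problem: 80 negatives, 20 positives.
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [0] * 80 + [1] * 20
# The model predicts class 0 for most of the actual positives.
y_pred = [0] * 78 + [1] * 2 + [0] * 15 + [1] * 5

# Rows = actual class, columns = predicted class:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))
print(accuracy_score(y_true, y_pred))  # 0.83 despite missing 15 of 20 positives
```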

    Regards,
    Balázs
