Predicting rare events with Auto Model

niall_dempster Member Posts: 1 Newbie
edited April 2020 in Help

I am trying to develop two models that predict relatively rare events (the F3/4 column and the F4 column in the attached file). I am a physician and not very familiar with machine learning, so I am still getting up to speed. I used Turbo Prep to impute missing data in the attached training/validation database, and I have a separate, independent testing database that I would like to use once the models have been generated.

Initially, Auto Model seemed to prioritise accuracy: every case was predicted to be index 1, which was almost always correct because index 2 is infrequent. For this problem, however, the model needs to be sensitive so that it picks up cases of the rare event (index 2). Is it possible to optimise the AUC or the Youden Index rather than accuracy?

So far I have tried adding custom settings for costs and benefits, so that predicting range 1 when the true range is 2 is penalised and correctly predicting a true range 2 is rewarded. Are there recommended values for these costs/benefits?

Many thanks for your help




    hbajpai Member Posts: 102 Unicorn
    Hey @niall_dempster,

    The costs/benefits are typically based on domain knowledge. Think of it like this: what profit do you make for each correct prediction, and how much money do you lose for each incorrect one? You can then use those exact values in the matrix.
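
To make the cost/benefit idea concrete, here is a minimal Python sketch of how such a matrix turns predictions into a single total. It is not RapidMiner code, and the numbers in `GAIN`, the toy labels, and the function name `total_gain` are all made-up assumptions for illustration; the real values have to come from your own domain knowledge.

```python
# Hypothetical cost/benefit values (assumptions, not recommendations).
# Keys are (true_class, predicted_class); positive = benefit, negative = cost.
GAIN = {
    (1, 1): 0.0,    # common class correctly predicted
    (1, 2): -1.0,   # false alarm on a common case
    (2, 1): -10.0,  # rare event missed (the expensive mistake)
    (2, 2): 5.0,    # rare event correctly caught
}


def total_gain(true_labels, predicted_labels, gain=GAIN):
    """Sum the gain-matrix entry for every (true, predicted) pair."""
    return sum(gain[(t, p)] for t, p in zip(true_labels, predicted_labels))


# Toy data: 18 common cases (class 1) and 2 rare cases (class 2).
true_labels = [1] * 18 + [2] * 2

# Model A always predicts the common class (high accuracy, zero sensitivity).
always_common = [1] * 20

# Model B raises 3 false alarms but catches both rare cases.
sensitive = [1] * 15 + [2] * 3 + [2] * 2

gain_always = total_gain(true_labels, always_common)
gain_sensitive = total_gain(true_labels, sensitive)
print(gain_always, gain_sensitive)
```

With these assumed values, the model that always predicts class 1 is 90% accurate but scores worse overall than the more sensitive model, which is the behaviour the cost/benefit settings are meant to reward.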

    Auto Model's main criterion is set to classification error. However, you can open the process of your best-performing model and change the main criterion. It is in the section (4) - SCORING, VALIDATION, EXPLANATIONS, WEIGHTS & SIMULATOR. There you can open the Validate Model subprocess and evaluate different options with the Performance operator.
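
For reference, the Youden Index mentioned in the question is sensitivity + specificity - 1. The sketch below, in plain Python rather than RapidMiner, shows one way to compute it and to pick the score threshold that maximises it. The function names, the toy labels, and the confidence scores are hypothetical; class 2 is assumed to be the rare (positive) class as in the question.

```python
def confusion_counts(true_labels, scores, threshold, positive=2):
    """Count TP, FP, TN, FN when score >= threshold means 'predict positive'."""
    tp = fp = tn = fn = 0
    for label, score in zip(true_labels, scores):
        predicted_positive = score >= threshold
        if label == positive:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def youden_j(true_labels, scores, threshold, positive=2):
    """Youden's J = sensitivity + specificity - 1 at the given threshold."""
    tp, fp, tn, fn = confusion_counts(true_labels, scores, threshold, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def best_threshold(true_labels, scores, positive=2):
    """Try every observed score as a threshold and keep the one with the highest J."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda th: youden_j(true_labels, scores, th, positive))


# Toy data: six common cases (class 1) and two rare cases (class 2),
# with made-up model confidences for class 2.
true_labels = [1, 1, 1, 1, 1, 1, 2, 2]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.80]

chosen = best_threshold(true_labels, scores)
print(chosen, youden_j(true_labels, scores, chosen))
print(youden_j(true_labels, scores, 0.5))  # J at the usual 0.5 cut-off
```

On this toy data, lowering the cut-off below 0.5 catches both rare cases at the price of one false alarm, so the Youden Index goes up even though nothing about the model itself has changed.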