Predicting rare events with Auto Model

niall_dempster Member Posts: 1 Learner I
edited April 2020 in Help
Hi, 

I am trying to develop two models that predict relatively rare events (the F3/4 column and the F4 column in the attached file). I am a physician and not very familiar with machine learning, so I am trying to get up to speed. I used Turbo Prep to impute missing data in the attached training/validation database, and I have a separate, independent testing database that I would like to use once the models have been generated.

Initially, using Auto Model, accuracy seemed to be prioritised: every case was predicted to be index 1, which was almost always correct since index 2 is infrequent. However, for this problem it is important to have a sensitive model, so that I pick up cases of the rare event (index 2). Is it possible to optimise the AUC or Youden index rather than accuracy?

So far I have tried adding custom settings for costs and benefits, so that predicting range 1 when the true range is 2 is penalised and correctly predicting true range 2 is rewarded. Are there recommended values to enter for these costs/benefits?

Many thanks for your help

BW,

Niall
Answers

  • hbajpai Member Posts: 102 Unicorn
    Hey @niall_dempster,

    The costs/benefits are typically based on domain knowledge. Think of it like this: what profit do you make for every correct prediction, and how much money do you lose for every incorrect one? You can then use those exact values in the matrix.

    Auto Model's main criterion is set to classification error. However, you can open the process of your best-performing model and change the main criterion. It is in the (4) - SCORING, VALIDATION, EXPLANATIONS, WEIGHTS & SIMULATOR section. There you can open the Validate Model subprocess and evaluate different options with the Performance operator.
    Best,
    Harshit
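
The two ideas in this thread (choosing an operating point by Youden index instead of accuracy, and weighing errors with a cost matrix) can be illustrated outside RapidMiner with a minimal Python sketch. It is not what Auto Model does internally. The labels, confidence scores, and the cost values (10 for a missed rare case, 1 for a false alarm) are made-up assumptions for illustration only, not recommended settings; real costs should come from domain knowledge as described above.

```python
from typing import Callable, Sequence, Tuple

Counts = Tuple[int, int, int, int]  # (tp, fn, fp, tn)


def confusion_counts(y_true: Sequence[int], scores: Sequence[float],
                     threshold: float, positive: int = 2) -> Counts:
    """Count TP, FN, FP, TN when score >= threshold is predicted as the rare class."""
    tp = fn = fp = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == positive:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def youden_j(tp: int, fn: int, fp: int, tn: int) -> float:
    """Youden index J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


def total_cost(tp: int, fn: int, fp: int, tn: int,
               cost_fn: float = 10.0, cost_fp: float = 1.0) -> float:
    """Total misclassification cost; the default costs are illustrative assumptions."""
    return cost_fn * fn + cost_fp * fp


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   objective: Callable[..., float]) -> float:
    """Return the candidate threshold (taken from the observed scores) that maximises the objective."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: objective(*confusion_counts(y_true, scores, t)))


# Hypothetical data: label 2 is the rare event, scores are confidences for label 2.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.45, 0.80]

j_threshold = best_threshold(y_true, scores, youden_j)
cost_threshold = best_threshold(y_true, scores, lambda *c: -total_cost(*c))

print("Threshold maximising Youden index:", j_threshold)
print("Threshold minimising total cost:", cost_threshold)
print("Counts (tp, fn, fp, tn) at 0.5:", confusion_counts(y_true, scores, 0.5))
```

On this toy data the default 0.5 cut-off misses one of the two rare cases, while the threshold chosen by either criterion catches both at the price of one false alarm. Thresholds should be chosen on the training/validation data and only then applied to the independent testing database.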