ROC threshold blue line

damir_imamovic Member Posts: 2 Contributor I
edited April 2020 in Help
Can anyone explain what the blue threshold line means and how to interpret it? Is there any way to disable it or remove it from the chart in RapidMiner?
Thanks in advance.
Answers

  • bernardo_pagnon Member, University Professor Posts: 64 University Professor
    Hello,

    The ROC curve is built by decreasing the threshold from 1 to 0. At each value we get a confusion matrix, which gives one point in the graph (FPR versus TPR). The blue curve shows the thresholds that were used to build the ROC. I am not sure whether it can be disabled, but if you use the Compare ROC operator (even with one model) the blue curve does not appear.

    Regards,
    Bernardo
  • damir_imamovic Member Posts: 2 Contributor I
    @bernardo_pagnon thanks for your quick response. I wish I could delete the blue line. Even if I use the Compare ROC operator with one model, the blue ROC threshold line is still there. I found this explanation: "The ROC (thresholds) curve just shows this confidence threshold (sometimes also called confidence cut)." The best solution would be to remove it. I'll make a ROC chart in Excel instead.
  • amitd Member, University Professor Posts: 49 Maven
    @jacobcybulski, can you please clarify your comment - "Typically people want to find out the "optimum" threshold as the ROC point closest to left-upper corner (which is not quite correct) and you could find the threshold that way"? Why is this point not an optimum threshold? If not, what would be an optimum threshold (based on what criteria) and how do you find it?
  • jacobcybulski Member, University Professor Posts: 391 Unicorn
    @avd , you have asked the right question, which really addresses my qualification of the general belief that the optimum threshold is the ROC point closest to the upper-left corner. The key, of course, is your observation that it all depends on the selected criteria. The simplistic approach is to find the threshold that maximises TPR and minimises FPR, but that cannot be done unless you define some cost function involving TPR and FPR, or some other metrics. There are many approaches to doing this, e.g. looking at the ROC gradient (under the assumption that TPR and FPR have equal costs), looking at the geometric mean of sensitivity and specificity, using Youden's J index, tuning the threshold according to some other statistic (e.g. threshold vs kappa or F1), or relying on the precision-recall curve.
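
For anyone who wants to try this outside RapidMiner, here is a minimal Python sketch of the two ideas discussed in this thread: sweeping the confidence threshold from high to low to get the ROC points, and then picking a threshold under different criteria. It is not how RapidMiner computes its chart. The function names, the toy labels and scores, and the false-positive weight of 3 are all made up for illustration.

```python
from math import sqrt

def roc_points(labels, scores):
    """Sweep the threshold from the highest score down and return
    (threshold, FPR, TPR) triples, one per distinct score value."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in sorted(set(scores), reverse=True):
        # Predict positive when the confidence is at least the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((t, fp / negatives, tp / positives))
    return points

def best_threshold(points, criterion):
    """Return the threshold whose (FPR, TPR) point maximises the criterion."""
    return max(points, key=lambda p: criterion(p[1], p[2]))[0]

def youden_j(fpr, tpr):
    # Youden's J = sensitivity + specificity - 1 = TPR - FPR
    return tpr - fpr

def neg_corner_distance(fpr, tpr):
    # Negated distance to the upper-left corner (0, 1), so larger is better.
    return -sqrt(fpr ** 2 + (1 - tpr) ** 2)

def g_mean(fpr, tpr):
    # Geometric mean of sensitivity (TPR) and specificity (1 - FPR)
    return sqrt(tpr * (1 - fpr))

def fp_weighted(fpr, tpr, fp_weight=3.0):
    # Hypothetical cost function: a false positive hurts 3x as much.
    return tpr - fp_weight * fpr

# Toy data, invented for this example: 5 positives, 3 negatives.
labels = [1, 1, 1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
for name, criterion in [("Youden's J", youden_j),
                        ("closest to corner", neg_corner_distance),
                        ("G-mean", g_mean),
                        ("FP-weighted cost", fp_weighted)]:
    print(name, best_threshold(points, criterion))
```

On this toy data the three equal-cost criteria pick the same threshold, while the cost-weighted one picks a higher threshold, which is the sense in which the "optimum" depends on the criterion you choose.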