"Compare the performance of various models (accuracy)"

a64863 Member Posts: 1 Contributor I
edited June 2019 in Help

Hi,

 

I'm new to data mining, and I'm stuck on a project that my teacher assigned.

I have a data set and 6 models, and I want to generate a report that compares the accuracy of the models.

The illustration of my work is here:

 

[Image: Process] [Image: Cross Validation]

I'm sorry if this is not the right place to post this question, but as I said, I'm new here.

 

Thanks for the attention!

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,312   Unicorn

    I am not totally certain what output you are trying to achieve, but you might try the operator "Compare ROCs."  You simply place your individual models into that subprocess, which is similar to the Cross Validation process you are using.  It will produce an exhibit that shows the performance of all of the models together. 

     

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,316  RM Data Scientist

    Hi,

     

    I usually use Performance to Data. The result can then be joined, filtered, or appended however you like.
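As a rough illustration of the idea (plain Python, not RapidMiner itself): once each model's performance has been turned into a data row, the rows can be appended into one table and sorted. The model names and accuracy values below are hypothetical placeholders, and `comparison_table` is a helper name made up for this sketch.

```python
# Hypothetical per-model results, for illustration only.
# In RapidMiner these rows would come from converting each model's
# performance vector to data and appending the results.
results = [
    {"model": "Decision Tree", "accuracy": 0.78},
    {"model": "Naive Bayes", "accuracy": 0.81},
    {"model": "k-NN", "accuracy": 0.74},
]


def comparison_table(rows):
    """Return a plain-text table of models, best accuracy first."""
    ranked = sorted(rows, key=lambda r: r["accuracy"], reverse=True)
    lines = [f"{'model':<15}{'accuracy':>10}"]
    for r in ranked:
        lines.append(f"{r['model']:<15}{r['accuracy']:>10.1%}")
    return "\n".join(lines)


print(comparison_table(results))
```

The same pattern extends to six models: one row per model, appended into a single table.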

     

    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,312   Unicorn

    @mschmitz I have a related question: when running cross-validation, if performance vectors are output, there are 3 summary values returned, such as the following example:

    "accuracy: 77.896% +/- 3.824% (mikro: 77.912%)"

    Can you clarify the calculation of these 3 metrics? My assumption is as follows:

    1. The first is the simple arithmetic mean of the chosen performance metric across the k folds of the cross-validation.
    2. The +/- margin is simply the standard deviation of the recorded performance vector across the same k folds of the validation and it is computed using the typical sample standard deviation formula.
    3. I don't know, however, what the third value reported in parentheses represents. Is it some kind of adjusted average performance? (mikro = German for micro?) In my experience it is often identical or very close to the first average reported, but I don't know its meaning or derivation.
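Points 1 and 2 above can be sketched in plain Python, assuming the margin really is the sample standard deviation (n - 1 in the denominator). The fold accuracies are hypothetical values chosen for illustration, and `summarize_folds` is a name invented for this sketch.

```python
from statistics import mean, stdev


def summarize_folds(fold_scores):
    """Return (arithmetic mean, sample standard deviation) of per-fold scores."""
    # statistics.stdev uses the sample formula (divides by n - 1).
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical accuracies from a 5-fold cross-validation (illustrative only).
fold_accuracies = [0.80, 0.75, 0.78, 0.82, 0.75]

avg, spread = summarize_folds(fold_accuracies)
print(f"accuracy: {avg:.3%} +/- {spread:.3%}")
```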

    Thanks for the help!  

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,316  RM Data Scientist

    Brian,

     

    You got it right. Mikro is German for micro :). The rest is correct too. One value is a simple average. The other is an average that uses the number of examples on the testing side of each fold as a weight. I always mix up which one is which.

     

    With enough examples, micro = macro, because the folds are then (almost) the same size and all weights are equal.
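A minimal sketch of the two averages in plain Python. The thread does not settle which of them RapidMiner labels "mikro", so the function names here are deliberately neutral and invented for this example. The fold accuracies and fold sizes are hypothetical.

```python
def unweighted_average(fold_accuracies):
    """Simple arithmetic mean over the folds."""
    return sum(fold_accuracies) / len(fold_accuracies)


def size_weighted_average(fold_accuracies, fold_sizes):
    """Mean over the folds, weighted by the number of test examples per fold."""
    total = sum(fold_sizes)
    return sum(a * n for a, n in zip(fold_accuracies, fold_sizes)) / total


# Hypothetical per-fold accuracies (illustrative only).
accs = [0.80, 0.70, 0.60]

# Equal test-fold sizes: both averages coincide.
equal_sizes = [100, 100, 100]
# Unequal test-fold sizes: the weighted average shifts toward the larger folds.
unequal_sizes = [100, 100, 50]

print(unweighted_average(accs))
print(size_weighted_average(accs, equal_sizes))
print(size_weighted_average(accs, unequal_sizes))
```

In a k-fold split the fold sizes differ by at most one example, which is why the two values are usually identical or very close.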


    ~Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,312   Unicorn

    Thanks @mschmitz!  

    @IngoRM do you remember which average is which (i.e., what the mikro version is, weighted or unweighted)?

    Happy Thanksgiving!

     

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,733  RM Founder

    Man, people are too fast here. I never get a chance to answer myself :P

     

    Thanks,

    Ingo
