Auto Model lost data

DrGintoki2021 Member Posts: 3 Contributor I
edited January 2021 in Help
Hi guys,
I was using Auto Model in RapidMiner 9.8 to predict the values of a column.
The results show only 65 rows of data, and the confusion matrix covers fewer than 50 data points,
but my dataset actually has 162 rows.
So why is that? How can I see the prediction performance and the confusion matrix for all 162 rows?
Thanks!

Best Answer

    lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    edited January 2021 Solution Accepted
    Hi @DrGintoki2021,

    This happens because AutoModel uses a multiple hold-out validation method.
    AutoModel sets aside 40% of your initial dataset to test/evaluate the performance of your model (the remaining 60% is used to train it).
    It then splits this 40% hold-out set into 7 folds
    and calculates the performance on each of the 7 folds.
    Next it removes the highest and the lowest of the 7 performances,
    which leaves 5 performances, and it displays the confusion matrix for these remaining 5 folds.
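
    As a rough illustration of the trimming step described above, here is a minimal Python sketch. It is not AutoModel's actual code: the function name and the accuracy values are hypothetical, and averaging the 5 kept performances is an assumption made only for this example.

```python
from statistics import mean


def trim_fold_performances(fold_scores):
    """Drop the single lowest and single highest score, keep the rest.

    Hypothetical helper for illustration only; it mirrors the
    "remove max and min of the 7 performances" step, not RapidMiner's API.
    """
    if len(fold_scores) < 3:
        raise ValueError("need at least 3 fold scores to trim both ends")
    ordered = sorted(fold_scores)
    kept = ordered[1:-1]  # discard one minimum and one maximum
    # Assumption: the kept scores are summarized by their mean.
    return kept, mean(kept)


# Made-up accuracies for the 7 folds of the hold-out set.
scores = [0.70, 0.75, 0.80, 0.85, 0.90, 0.60, 0.95]
kept, average = trim_fold_performances(scores)
print(len(kept))  # 5 folds remain after trimming
```

    Dropping the two extreme folds makes the reported performance less sensitive to one unusually easy or unusually hard fold, at the cost of reporting on fewer rows.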

    In other words, your confusion matrix contains 162 data points x 0.4 (40%) x 5/7 ≈ 46 data points.
    This matches the picture you shared: the confusion matrix there sums to 20 + 21 + 5 = 46.
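
    The same arithmetic can be checked in a few lines of Python. This is only a sketch: the function and parameter names are made up for the example, and the 40% / 7 / 5 values are the ones quoted in this answer.

```python
def expected_row_counts(total_rows, holdout_fraction=0.4,
                        folds_total=7, folds_kept=5):
    """Return (hold-out rows, rows in the confusion matrix), both rounded."""
    holdout_rows = total_rows * holdout_fraction          # 162 * 0.4 = 64.8
    matrix_rows = holdout_rows * folds_kept / folds_total  # 64.8 * 5/7 ≈ 46.3
    return round(holdout_rows), round(matrix_rows)


holdout_rows, matrix_rows = expected_row_counts(162)
print(holdout_rows, matrix_rows)  # prints: 65 46
```

    The first number presumably explains the 65 rows mentioned in the question (the full 40% hold-out set), and the second is the 46 data points that end up in the confusion matrix.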

    EDIT:

    Take a look at the "information" panel in the AutoModel results screen: under Model -> Performance you will find a description of how the performance is calculated in AutoModel.

