Auto Model predicts values for all records, not only the empty ones?

yavuzkaya Member Posts: 1 Newbie
edited June 2019 in Help
Hi there, 

When I use Auto Model for prediction, RapidMiner processes all of the data and predicts every value, including the ones that are already filled in, instead of predicting only the empty ones, which is all I want. This takes a huge amount of time. Before the update, it predicted only the empty cells, not all of them!

Any solution?




    IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Auto Model is actually not creating predictions for all the data, only for the hold-out validation set plus those rows with actual missing labels (as before).  This was one of the most frequently requested changes for Auto Model, by the way, since many people also wanted to see the predictions on the validation set in order to check the result and see for themselves how the model performed.  Makes sense in my opinion.
    To be honest, it is not actually taking that much time.  All the models supported in Auto Model can perform the scoring in linear time.  The explanation of predictions slows this down a bit, but it is typically also relatively fast compared to the actual model building.  I would estimate that the total runtime could be brought down by only about 5% if we removed this (again).  The reason why Auto Model may "feel" a bit slower after the update is (a) feature engineering (if turned on) and (b) an improved validation approach which now delivers more robust performance estimates, including standard deviations.
    So in short: you cannot turn the scoring off.  Many people wanted it, and it does not increase the runtime by much due to the linear nature of the scoring algorithms.
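For readers who want to see which rows end up with predictions, here is a minimal Python sketch of the behavior described above. It is not Auto Model's actual implementation: the helper names (`split_rows`, `fit_mean`, `score`), the 40% hold-out fraction, and the toy "predict the mean" model are all assumptions made purely for illustration.

```python
import random

def split_rows(rows, holdout_fraction=0.4, seed=0):
    """Split rows into (train, holdout, unlabeled); a label of None means missing."""
    labeled = [r for r in rows if r["label"] is not None]
    unlabeled = [r for r in rows if r["label"] is None]
    random.Random(seed).shuffle(labeled)
    n_holdout = int(len(labeled) * holdout_fraction)
    return labeled[n_holdout:], labeled[:n_holdout], unlabeled

def fit_mean(train):
    """Toy stand-in for a real learner: always predict the mean training label."""
    return sum(r["label"] for r in train) / len(train)

def score(rows, model):
    """One pass over the rows, so the cost grows linearly with len(rows)."""
    return [{**r, "prediction": model} for r in rows]

# Hypothetical data: seven labeled rows and three rows with a missing label.
rows = [{"id": i, "label": float(i)} for i in range(7)]
rows += [{"id": i, "label": None} for i in range(7, 10)]

train, holdout, unlabeled = split_rows(rows)
model = fit_mean(train)

# Only the hold-out rows and the unlabeled rows are scored, not the training rows.
scored = score(holdout + unlabeled, model)
print(len(train), len(holdout), len(scored))
```

With these ten rows, five are used for training and the other five (two hold-out rows plus three with missing labels) receive a prediction, so the filled-in rows that show up in the output are the validation rows rather than the whole data set.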
    Hope this helps,
