Hi there, I am currently training a model for sales predictions based on a data set of existing products with a range of attributes and sales data. The best model so far seems to be linear regression. However, for some products I am getting negative sales predictions. Is there a way to convert those into zeros? And can I also insert this operator into the cross-validation process to recalculate the performance?
The fastest way to do this is to generate a new attribute from the existing prediction attribute, using an expression that returns zero whenever the prediction is negative and the prediction itself otherwise (assume the column name is 'prediction').
This keeps the old predictions and adds a new column in which all negative values are replaced by zeros.
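For readers who want to see the logic spelled out, here is a minimal sketch in plain Python. It is not RapidMiner expression syntax; the function name `clip_negative` and the sample values are made up purely for illustration.

```python
def clip_negative(predictions):
    """Return a new list in which every negative prediction is replaced by 0."""
    return [max(0.0, p) for p in predictions]


# Hypothetical raw predictions from a regression model (illustrative values).
prediction = [12.5, -3.2, 0.0, 7.1, -0.4]

# The original list is left untouched; the clipped values go into a new list,
# mirroring the "keep the old column, add a new one" approach described above.
prediction_clipped = clip_negative(prediction)

print(prediction_clipped)  # [12.5, 0.0, 0.0, 7.1, 0.0]
```

Keeping the raw predictions alongside the clipped ones makes it easy to compare the two versions later.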
Then you can simply feed the new version of "prediction" into a new "Performance" operator and recalculate your performance (you don't need to do this inside cross-validation, but you can). Note that your aggregate performance may actually decrease, depending on the performance metric you are using. But you may still prefer the modified version of your prediction for other reasons.
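To get a feel for how the recalculated performance can move, here is a rough sketch in plain Python that scores the raw and the clipped predictions with two different metrics. The data and the helper names (`rmse`, `pearson`) are assumptions made for this example, not RapidMiner operators. With non-negative actual sales, an error-based metric such as RMSE cannot get worse after clipping, while a correlation-based metric can go either way depending on the data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long lists."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical actual sales (never negative) and raw model predictions.
actual = [10.0, 0.0, 2.0, 8.0, 1.0]
raw = [12.5, -3.2, 0.0, 7.1, -0.4]

# Replace negative predictions by zero, keeping the raw list for comparison.
clipped = [max(0.0, p) for p in raw]

for name, preds in (("raw", raw), ("clipped", clipped)):
    print(name, "RMSE:", rmse(actual, preds), "correlation:", pearson(actual, preds))
```

Comparing both versions on the metric you actually care about is the safest way to decide which prediction column to keep.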