
Build a regression model FOR EACH example

leviavihay Member Posts: 5 Learner I
Hi,
Following my previous question (https://community.rapidminer.com/discussion/55089), I'm posting a different question regarding my next step.

I have a data set in which each row is a series of [value, date] points.
My goal is to build a linear regression model for each row.

Is it possible?...

Thanks,
Avihay

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Of course it is possible. You can use the Loop Examples operator and simply put your ML algorithm inside it (there is no need for a split or cross-validation if you are fitting one example at a time). The related question, though, is why this is necessary. The variance of models built on a single example is extraordinarily high, so they would probably not be robust. You would also have a large number of models to manage.
    If you want something similar, you can also check out the "leave one out" cross-validation approach. This builds a model on n-1 examples (where n is your total example count), validates it on the one example held out, and repeats this for each example in turn.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
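For readers working outside RapidMiner, the "one model per example" loop described above can be sketched in plain Python. This is only an illustration, not the Loop Examples operator itself: the `rows` data, the device names, and the `fit_line` helper are all hypothetical, and dates are assumed to be already converted to day numbers.

```python
from typing import Dict, List, Tuple


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: one row per example, each a series of (day, value) points.
rows: Dict[str, List[Tuple[float, float]]] = {
    "device_a": [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)],
    "device_b": [(0, 10.0), (2, 9.0), (4, 8.0)],
}

# The loop: fit one independent model per row and keep it under the row's key.
models = {name: fit_line(series) for name, series in rows.items()}

for name, (slope, intercept) in models.items():
    print(f"{name}: value = {slope:.3f} * day + {intercept:.3f}")
```

Each fit sees only the points of its own row, which is why the variance concern raised above applies: a row with three or four points gives a line that can swing widely with a single noisy reading.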
  • leviavihay Member Posts: 5 Learner I
    Hi @Telcontar120
    I will go over the "Loop Examples" operator info, thanks.

    Regarding your comment about whether it's even necessary: in this case each row is a different device. For each device I have different readings on different dates. I wish to build a linear regression model (for now) for each one, to predict when it will go over a certain threshold (a different one for each device).
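The threshold prediction described here amounts to fitting a line to one device's readings and solving it for the date at which the fitted value equals the threshold. A minimal sketch follows, assuming readings trend upward toward an upper threshold; the function name `predict_crossing` and the sample readings are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple


def predict_crossing(readings: List[Tuple[date, float]],
                     threshold: float) -> Optional[date]:
    """Fit value = slope * days + intercept and return the date on which
    the fitted line reaches the threshold.

    Returns None when there are too few readings or the fitted trend is
    not rising (assumption: the threshold is an upper limit).
    """
    if len(readings) < 2:
        return None
    origin = min(d for d, _ in readings)
    xs = [(d - origin).days for d, _ in readings]
    ys = [v for _, v in readings]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # all readings share one date, so no trend can be fitted
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    if slope <= 0:
        return None  # flat or falling trend never reaches an upper threshold
    intercept = mean_y - slope * mean_x
    days_to_threshold = (threshold - intercept) / slope
    return origin + timedelta(days=days_to_threshold)


# Hypothetical readings for two devices.
rising = [(date(2020, 1, 1), 10.0), (date(2020, 1, 11), 20.0), (date(2020, 1, 21), 30.0)]
falling = [(date(2020, 1, 1), 30.0), (date(2020, 1, 11), 20.0)]

print(predict_crossing(rising, threshold=50.0))
print(predict_crossing(falling, threshold=50.0))
```

Note that the returned date can lie in the past if the fitted line has already passed the threshold, and that extrapolating a straight line far beyond the last reading is only as good as the assumption that the trend stays linear.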
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    So an alternative approach to building separate models would be to include device type as a potential predictor, and then use LOO cross-validation as noted above.  Basically, if you believe that similar predictive patterns should hold across devices, then you could use a combined model to make your prediction.
    I would certainly at least check the performance of such a combined model before I went down the road of building and managing many separate models. 
    Another significant problem with your approach is that it will be very difficult to assess its accuracy over time, since you will have only one record per model to validate against in the future. (Presumably, that is; if you have multiple time periods from the same device, you might be able to increase your sample that way.)
    Brian T.
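One way to try the combined-model idea is sketched below in plain Python. It makes several assumptions that are not in the thread: the data is in long format with one (device, day, value) reading per record, the device enters the model as a separate intercept per device with a single shared slope, and leave-one-out is run over individual readings. The names `fit_pooled` and `loo_mae` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Reading = Tuple[str, float, float]  # (device, day, value)


def fit_pooled(data: List[Reading]) -> Tuple[float, Dict[str, float]]:
    """Least-squares fit with one shared slope and one intercept per device.

    This equals a linear regression on day plus device dummy variables.
    """
    groups: Dict[str, List[Tuple[float, float]]] = {}
    for device, x, y in data:
        groups.setdefault(device, []).append((x, y))
    means = {
        d: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for d, pts in groups.items()
    }
    # Centre each reading on its own device's means before pooling.
    sxx = sum((x - means[d][0]) ** 2 for d, x, _ in data)
    sxy = sum((x - means[d][0]) * (y - means[d][1]) for d, x, y in data)
    slope = sxy / sxx if sxx else 0.0
    intercepts = {d: my - slope * mx for d, (mx, my) in means.items()}
    return slope, intercepts


def loo_mae(data: List[Reading]) -> float:
    """Leave-one-out mean absolute error of the pooled model."""
    errors = []
    for i, (device, x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        slope, intercepts = fit_pooled(train)
        if device not in intercepts:
            continue  # the device's only reading was held out
        errors.append(abs(slope * x + intercepts[device] - y))
    if not errors:
        raise ValueError("no reading could be validated")
    return sum(errors) / len(errors)


# Hypothetical readings: both devices rise by 2 per day from different baselines.
data: List[Reading] = [
    ("a", 0, 1.0), ("a", 1, 3.0), ("a", 2, 5.0), ("a", 3, 7.0),
    ("b", 0, 10.0), ("b", 1, 12.0), ("b", 2, 14.0),
]

print("leave-one-out MAE:", loo_mae(data))
```

Comparing this error against the error of separate per-device fits gives a rough check of whether the devices really share a pattern. If they do, the combined model pools far more points per estimate than any single device can offer.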