Applying prediction model to numerical values

akselerator Member Posts: 3 Learner I
edited April 2021 in Help
Hi Rapid Miner Community
New here, so this is my first question. Hope to take part in this awesome community!
I'm trying to dig a bit further into understanding and predicting the causes of cost escalation in my job. My problem is somewhat similar to the Titanic prediction exercise.
Now to the problem:
I have a data set containing categorized cost overruns in the transport portfolio (think transporting huge vessels) of my company, along with relevant variables that could explain why these overruns happen (POD/POL/Destination/type/size etc). The problem is that the target is not a binary Cost overrun=Yes/No label but a numerical value that represents the size/severity of the overrun, and I cannot work out how to create a prediction model that takes this into account. In addition, I would like an output that explains why the model predicts what it does, so that I can make sure to eliminate these mistakes.
Thanks to anyone taking their time to help me!

Edit: I only have data for about 65 projects right now. The purpose is to build it and keep feeding it information as projects finish. Cannot go further back in time. This means that AutoModel does not work.

Kind regards
Aksel

Best Answer

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,231 RM Data Scientist
    Solution Accepted
    Hi,
    you can do a few things:
    • You can frame it as a regression problem and predict the amount of overrun.
    • You can frame it as a classification problem and then define your own performance metric, such as the average of \sum OverRunCostsCaptured.
    • You can use the costs as a weight in your analysis.
    There are possibly even more options.
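
To make the second and third options a bit more concrete, here is a minimal Python sketch (outside RapidMiner, standard library only). It is one possible reading of "overrun costs captured", not an official metric: the function names `overrun_cost_captured` and `cost_weights`, the `floor` parameter, and the toy numbers are all hypothetical and only serve as illustration.

```python
def overrun_cost_captured(actual_overruns, flagged):
    """Share of the total overrun cost that sits in projects flagged as overruns.

    actual_overruns: overrun amount per project (0 means no overrun).
    flagged: classifier output per project (True means "predicted overrun").
    """
    if len(actual_overruns) != len(flagged):
        raise ValueError("inputs must have the same length")
    total = sum(actual_overruns)
    if total == 0:
        return 0.0  # nothing to capture
    captured = sum(cost for cost, flag in zip(actual_overruns, flagged) if flag)
    return captured / total


def cost_weights(actual_overruns, floor=1.0):
    """Example weights that grow with overrun size, scaled to an average of 1.

    `floor` (an assumption) keeps projects without an overrun from getting weight 0.
    """
    raw = [max(cost, 0.0) + floor for cost in actual_overruns]
    mean = sum(raw) / len(raw)
    return [value / mean for value in raw]


# Hypothetical toy data: five projects, two of them flagged by a classifier.
overruns = [0.0, 120.0, 30.0, 0.0, 850.0]
flags = [False, True, False, False, True]

share = overrun_cost_captured(overruns, flags)
weights = cost_weights(overruns)
print(f"share of overrun cost captured: {share:.2f}")
print("example weights:", [round(w, 3) for w in weights])
```

A metric like this rewards a classifier for catching the expensive overruns rather than for raw accuracy, and the weights could be passed to any learner that accepts example weights. With only about 65 projects, either number should be read from cross-validation rather than a single split.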

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany