Need some insights on some analysis

iissiiiissii Member Posts: 1 Contributor I
edited November 2018 in Help

First of all, thanks for the great software. I am doing my final-year project (Bachelor's degree) and have used RapidMiner as the tool to analyse my data.
In my project, I captured enough data under certain parameters and then used it as input to a neural network and also to a Bayesian network (Naïve Bayes). Then I made some predictions for new evidence. For example, I can predict the value of a certain parameter for any new data that is imported into RapidMiner. Based on my observations, the predictions are really accurate.

I don't want to go into every detail of my project, but this is the big picture. I showed my work to my supervisor and he believes it is not some great achievement. He wants to make sure the predictions I have made on new data are accurate enough, and that is why he wants some analysis of the results. Honestly, I really don't get him and don't know what he means by analysis. He just mentioned once that I need to do a Mean Square Error (MSE) analysis and let him know the accuracy of my method.

I need help with this kind of analysis. My problem is: now that I have made my predictions, what comes next? How can I prove that the predictions are fine and reliable? Can I use some other tool like RapidAnalytics to do this? I am not only talking about the MSE analysis; I need some insight into any kind of analysis that I could perform.

Please shed some light :)



    MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn

    It sounds as if, to verify your model, you put in some data by hand and compared the results by hand. Comparing the predicted values with the correct values is of course the right way to go, but there are more sophisticated ways to do it than by hand. Basically you have two choices (which may be combined):

    1. Hold-out validation / split validation
    Train your model on, say, 70% of the data, then check on the remaining 30% whether the model works well.

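    2. Cross-validation
    See e.g. http://en.wikipedia.org/wiki/Cross-validation_(statistics)
    Cross-validation is a common means of validation and has some advantages over simple hold-out validation: every example is used for testing exactly once, so the error estimate depends less on one particular split.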
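
To make the two schemes above concrete outside of RapidMiner, here is a minimal sketch in plain Python. It only shows how the data is divided, not how a model is trained. The function names `holdout_split` and `k_fold_splits`, the 70/30 ratio, k=10, and the stand-in data are all assumptions for illustration and are not RapidMiner operators.

```python
import random

def holdout_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_splits(rows, k=10, seed=0):
    """Yield k (train, test) pairs; each row lands in a test set exactly once."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    # Fold i takes every k-th row starting at position i.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test

# Made-up stand-in for 100 labelled examples.
rows = list(range(100))

train, test = holdout_split(rows)          # 70 rows for training, 30 for testing
splits = list(k_fold_splits(rows, k=10))   # 10 rounds of 90 train / 10 test
```

In each round you would train on `train`, predict on `test`, and then average the error over all rounds.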

    In RapidMiner, use the operator Split Validation for split validation and X-Validation for cross-validation.

    To calculate the RMSE (the square root of the MSE your supervisor asked for), you can use the performance operator.
    The video tutorials on our rapid-i.com website should also contain some info on model validation.
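
As a rough illustration of what such an error measure computes, here is a small Python sketch of MSE and RMSE. The function name `mean_squared_error` and the four value pairs are made up for the example; in practice the actual and predicted values would come from the test part of your validation.

```python
import math

def mean_squared_error(actual, predicted):
    """Average of the squared differences between true and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up values, for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mse = mean_squared_error(actual, predicted)  # (1 + 0 + 4 + 1) / 4 = 1.5
rmse = math.sqrt(mse)                        # same unit as the predicted parameter
```

The RMSE is often easier to report because it is in the same unit as the value you are predicting.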

    Best regards,