
Cross validation

Papad Member Posts: 68 Guru
Hello,
Can anybody help me with this problem?
In the first picture I have this:

Here I measure the performance on the same data the model was trained on, and the accuracy is 87.44%.

When I run the same procedure inside cross validation, like this:


(inside cross validation)



The accuracy I get here is 82.11%.
It is the same procedure, just inside a Cross Validation operator.
Why is there a difference between the two cases?
What I understand is that in the second case the model is trained and then its performance is measured in the testing section, so it is more accurate.
So more training doesn't always mean greater accuracy?
I hope my question is clear.
Thanks in advance.

Best Answers

  • varunm1 Member Posts: 1,207 Unicorn
    Solution Accepted
    Hello @Papad

    As Martin said, in the first case you are training and testing the model on the same data, which is not useful for validating your model. In the second case you are cross-validating the model, which means you train on one part of the data and test on another part that the model has never seen. This is the best way to validate your model.

    To understand cross-validation, here is an excellent post from Scott.

    https://community.rapidminer.com/discussion/55112/cross-validation-and-its-outputs-in-rm-studio

    Thanks
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing
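
The difference described in the accepted answer can be sketched outside RapidMiner in a few lines of plain Python. This is only an illustration under stated assumptions: the synthetic two-class data, the 1-nearest-neighbour learner, and all function names are hypothetical and are not taken from the process in the question. It scores one learner twice, once on the data it was trained on and once with 10-fold cross validation.

```python
import random


def make_data(n, seed=0):
    """Two overlapping 2-D Gaussian classes (synthetic, for illustration only)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 1.5
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    rng.shuffle(data)
    return data


def predict_1nn(train, point):
    """Return the label of the training example closest to `point`."""
    nearest = min(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return nearest[1]


def accuracy(train, test):
    """Fraction of `test` examples that a 1-NN model built on `train` gets right."""
    correct = sum(predict_1nn(train, p) == y for p, y in test)
    return correct / len(test)


def cross_val_accuracy(data, k=10):
    """Average accuracy over k folds; each fold is tested on data not used for training."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        scores.append(accuracy(train, test))
    return sum(scores) / k


data = make_data(200)
resub = accuracy(data, data)          # train and test on the same examples
cv = cross_val_accuracy(data, k=10)   # test only on held-out examples

print(f"same-data accuracy:  {resub:.2%}")
print(f"10-fold CV accuracy: {cv:.2%}")
```

With this learner the same-data score is perfect, because every example is its own nearest neighbour, while the cross-validated score is lower because the classes overlap. The lower number is the more honest estimate of how the model would behave on new examples.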

Answers

  • Papad Member Posts: 68 Guru
    What I can't fully understand is this: in the cross validation case we have one set of data for which we already know the result, so how is the model then used for unknown data?