Cross Validation

moises_mjs53 Member Posts: 5 Contributor I
edited November 2018 in Help

Hello! For educational purposes, I built a predictive network for stock market price prediction by following a video lesson from Thomas Ott. My question: after the network is trained and cross validated, should I set, in the Set Role operator, the attribute name field to prediction(label) and the target role to prediction? Or should I set the attribute name to prediction(label) and the target role to label?

Best Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    Oh I'm glad you're building off my old tutorials. I really need to update them one day. :)


    For the scoring set, the data you want to predict a label for, you normally don't include the label, so there will be no column for it. When the set gets scored, a prediction column is created automatically, along with a confidence column for each label class. In my videos that would've been prediction(label), confidence(up), and confidence(down).
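
    If it helps to see the same idea outside RapidMiner, here's a rough Python/scikit-learn sketch (the data and column names are made up for illustration; they just mimic what Apply Model produces):

    ```python
    # Illustrative sketch: score an unlabeled set and add a prediction
    # column plus one confidence column per label class ("up"/"down").
    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    # Toy training data: two numeric attributes and a label column.
    train = pd.DataFrame({
        "ret_1d": [0.5, -0.3, 0.8, -0.6],
        "ret_5d": [1.2, -0.9, 0.4, -1.1],
        "label":  ["up", "down", "up", "down"],
    })
    model = DecisionTreeClassifier(random_state=0).fit(
        train[["ret_1d", "ret_5d"]], train["label"]
    )

    # Scoring set: same attributes, deliberately NO label column.
    score = pd.DataFrame({"ret_1d": [0.2, -0.4], "ret_5d": [0.7, -0.8]})

    # Scoring appends prediction(label) and confidence(<class>) columns.
    features = score[["ret_1d", "ret_5d"]]
    proba = model.predict_proba(features)
    score["prediction(label)"] = model.predict(features)
    for i, cls in enumerate(model.classes_):
        score[f"confidence({cls})"] = proba[:, i]
    print(score)
    ```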

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    So with Cross Validation, depending on the "k" parameter, it will cut your data set into "k" groups. It trains on "k-1" of the groups and tests on the remaining one, then holds out a different group and repeats, until each group has been the test set once, i.e. "k" times. Finally it trains the model on the whole data set. The reported accuracy is the average over the "k" test folds, plus or minus one standard deviation. This gives you an idea of how stable your model is and what you can reasonably expect on unseen data.
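
    Here's a quick Python/scikit-learn sketch of that procedure (toy data, just to show the mechanics, not RapidMiner's actual implementation):

    ```python
    # k-fold cross validation: train on k-1 folds, test on the held-out
    # fold, repeat k times, then report mean accuracy +/- std deviation.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=200, random_state=0)  # toy data
    scores = cross_val_score(
        DecisionTreeClassifier(random_state=0), X, y, cv=10  # k = 10
    )

    # Mean accuracy with one standard deviation, as RapidMiner reports it.
    print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

    # Afterwards the final model is retrained on the whole data set.
    final_model = DecisionTreeClassifier(random_state=0).fit(X, y)
    ```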


    The Sliding Window Validation operator works differently; it's like a backtest. The window widths are the time periods you want to use for training and testing, stepping forward through the series.
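
    A hand-rolled sketch of that walk-forward idea in Python (the variable names like train_width are illustrative, not the operator's exact parameter names):

    ```python
    # Sliding-window (backtest) validation: train on a window of past
    # data, test on the period right after it, then slide forward.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.RandomState(0)
    X = rng.randn(100, 3)                 # toy time-ordered features
    y = (X[:, 0] > 0).astype(int)         # toy label

    train_width, test_width, step = 40, 10, 10
    accuracies = []
    for start in range(0, len(X) - train_width - test_width + 1, step):
        tr = slice(start, start + train_width)                    # training window
        te = slice(tr.stop, tr.stop + test_width)                 # test window
        model = DecisionTreeClassifier(random_state=0).fit(X[tr], y[tr])
        accuracies.append(model.score(X[te], y[te]))

    # No future data ever leaks into training; each window only looks back.
    print(f"windows: {len(accuracies)}, mean accuracy: {np.mean(accuracies):.3f}")
    ```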

Answers

  • moises_mjs53 Member Posts: 5 Contributor I

    Thank you, Thomas. So the percentage in the accuracy report is really what the network managed to hit in the period it wasn't trained on? E.g., with 1000 values, 700 for training and 300 for testing, is the accuracy what the network achieved on those 300 test cases? And to do a cross validation, can I put the Cross Validation operator in place of the Sliding Window Validation operator, training the network and cross validating at the same time?

  • moises_mjs53 Member Posts: 5 Contributor I

    Perfect, Thomas, so it's worth training the network (backtest) and then cross validating. Thank you very much for your help; it was very important to me. Success.
