"How to access train and test instances in each fold for an N-fold cross validation"

kashif_khan Member Posts: 19 Contributor II
edited June 2019 in Help
Hi Folks,

I am working on a data mining problem in RapidMiner where I have to access the instances in each fold of an N-fold cross validation with a classifier. I can access the instances in the "Test" subprocess of the Validation operator, since it gives me an instance of "ExampleSet", but I cannot do the same for the "Training" subprocess, which yields an instance of "DistributionModel". I am trying to iterate over them in my code. How can I get the instances in the test and train splits for each fold separately? How can I cast a DistributionModel to an ExampleSet?

I really appreciate your help ...

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    1) When you open the X-Validation operator in your process in the RapidMiner Studio GUI, you see a "Training" subprocess on the left and a "Testing" subprocess on the right. Notice the ports on the top right side of each subprocess. If you want to access data from them in your code, they need to be connected. So if you want to access the training data, you will have to pipe it to the "thr" port.
    Another option would be to access the input ports on the left instead of the output ports on the right. That way you can access whatever comes into each subprocess.

    2) You cannot cast DistributionModel to an ExampleSet. An ExampleSet is your actual data (think database table) and the DistributionModel is a model which is used to generate predictions based on your actual data. They are completely different things.

    Regards,
    Marco
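
For readers who want to see what "the instances in each fold" amounts to, here is a minimal sketch in plain Java. It does not use the RapidMiner API: the class `FoldSplitSketch`, the type `Fold`, and the method `split` are hypothetical names made up for this illustration, and the round-robin assignment of rows to folds is an assumption, not necessarily how the X-Validation operator samples. It only shows that each fold consists of two disjoint sets of rows (training and test), both of which are data, while the model is a separate object built from the training rows.

```java
import java.util.ArrayList;
import java.util.List;

public class FoldSplitSketch {

    /** Row indices used for training and testing in one fold (hypothetical helper type). */
    public static final class Fold {
        public final List<Integer> trainRows;
        public final List<Integer> testRows;

        Fold(List<Integer> trainRows, List<Integer> testRows) {
            this.trainRows = trainRows;
            this.testRows = testRows;
        }
    }

    /**
     * Splits row indices 0..numRows-1 into k folds.
     * Assumption for this sketch: row i is held out for testing in fold (i % k).
     */
    public static List<Fold> split(int numRows, int k) {
        if (k < 2 || k > numRows) {
            throw new IllegalArgumentException("k must be between 2 and numRows");
        }
        List<Fold> folds = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<Integer> train = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int row = 0; row < numRows; row++) {
                if (row % k == fold) {
                    test.add(row);   // held out in this fold
                } else {
                    train.add(row);  // used to build the model in this fold
                }
            }
            folds.add(new Fold(train, test));
        }
        return folds;
    }

    public static void main(String[] args) {
        // Print the training and test row indices of every fold.
        for (Fold f : split(6, 3)) {
            System.out.println("train=" + f.trainRows + " test=" + f.testRows);
        }
    }
}
```

With 6 rows and 3 folds, the first fold tests rows 0 and 3 and trains on rows 1, 2, 4, and 5. Every row is tested in exactly one fold and used for training in all the others.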