How can I specify a validation dataset for an H2O Deep Learning model in RapidMiner?

Mlobri Member Posts: 2 Newbie
Hello,

I'm using the Deep Learning model from the H2O framework that is available in RapidMiner.
For my analyses, I don't see how to control which data is used for the validation step at the end of each epoch.
For example, with Keras you specify a split ratio, and the corresponding part of your data is used for validation.
With PyTorch, you provide the training, validation, and test sets yourself.
How can I do this with RapidMiner?
I am also wondering how to view the loss curve, so that I can evaluate whether the model overfits as the number of epochs increases.

Do you have any ideas on either of these questions?

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,
    I think you want to check out the Deep Learning extension, which allows more complex setups. It lets you specify the test set manually.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Mlobri Member Posts: 2 Newbie
    Hello Martin,

    Thanks a lot for your reply.
    Is there no way to do that with the operator provided by H2O?


  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    You can check the expert parameters; it may be possible. For example:

    fold_column: Column name with cross-validation fold index assignment per observation. Type: column, Default: no fold column
    This then performs cross-validation, using the folds you define in that column as the test batches.

    Best,
    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
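
To illustrate the fold_column idea from the last answer, here is a minimal sketch in plain Python of what "a fold index assignment per observation" could look like. It is not RapidMiner or H2O code: the function name assign_folds and the column name "fold" are hypothetical, and the toy rows are made up for illustration. Note that this sets up cross-validation, where each fold is held out in turn, rather than a single fixed validation set.

```python
def assign_folds(n_rows, n_folds):
    """Return one fold index per row, cycling 0..n_folds-1 in row order.

    Hypothetical helper: any scheme that gives every observation an
    integer fold id would serve the same purpose.
    """
    if n_folds < 2:
        raise ValueError("need at least 2 folds")
    return [i % n_folds for i in range(n_rows)]


# Toy data set (made up): ten observations with one feature and a label.
rows = [{"x": i, "label": i % 2} for i in range(10)]

# Attach the fold index as an extra column named "fold" (assumed name).
for row, fold in zip(rows, assign_folds(len(rows), 5)):
    row["fold"] = fold

# Rows sharing a fold index are held out together as one test batch.
held_out = [row["x"] for row in rows if row["fold"] == 0]
print(held_out)
```

In a real process, a column like this would be added to the example set and its name selected in the fold_column expert parameter; controlling the assignment yourself is what lets you decide which observations are tested together.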