08-25-2016 02:51 AM
I want to try out the 3 new algorithms that came with 7.2 on my dataset (4,500 examples with 25 numerical attributes). What are the most important parameters to tune in a grid optimization operator for them, and over what intervals? Does anyone have experience with this?
08-25-2016 08:50 AM
No free hunch
For deep learning, it basically depends on the network design and specific domain knowledge:
how to choose the activation function, number of epochs, hidden layer sizes, learning rate, parameters for avoiding overfitting, etc.
Why not download the booklet and take a look at the reference for the supervised models you just mentioned from
You will find more helpful information there.
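To make the parameter list above concrete, here is a minimal sketch of how a grid over those settings could be laid out and counted before running anything. It is plain Python, not the grid optimization operator itself, and every name and value in it (`log_spaced`, `expand`, the activation names, the ranges) is an illustrative assumption rather than a recommendation. The learning rate is spaced on a log scale, which is a common choice for parameters that span several orders of magnitude.

```python
import itertools
import math


def log_spaced(low, high, steps):
    """Return `steps` values spaced evenly on a log10 scale from low to high."""
    if steps == 1:
        return [low]
    lo, hi = math.log10(low), math.log10(high)
    step = (hi - lo) / (steps - 1)
    return [10 ** (lo + i * step) for i in range(steps)]


# Illustrative values only (assumptions, not tuned recommendations).
grid = {
    "activation": ["tanh", "rectifier"],
    "epochs": [10, 50, 100],
    "hidden_layer_sizes": [(25,), (50, 50)],
    "learning_rate": log_spaced(1e-4, 1e-1, 4),
    "l2": [0.0, 1e-4],
}


def expand(grid):
    """Expand a dict of value lists into one dict per parameter combination."""
    keys = list(grid)
    value_lists = [grid[k] for k in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


combos = expand(grid)
print(len(combos))  # 2 * 3 * 2 * 4 * 2 = 96 combinations
```

Counting the combinations first shows how quickly a grid grows: each extra parameter multiplies the number of runs, so it usually pays to start with a few coarse values per parameter and refine around the best region.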
08-28-2016 05:32 PM
Yeah, I'd like to get some ideas about the best parameters and their ranges to start tweaking.
I ran some sweeps and was rather disappointed.
I cannot sweep 10 parameters, so any pointers are welcome.
Also, surprisingly, I get better generalization from a smaller set than from a bigger one (my total dataset is just a few thousand examples). What gives?
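When a full grid over many parameters is too expensive, one commonly used alternative is random search with a fixed budget of runs. Note that this is a different technique from grid optimization, not something proposed in this thread. The sketch below is plain Python under stated assumptions: `sample_config`, `random_search`, and `toy_score` are hypothetical names, the ranges are illustrative, and `toy_score` is only a stand-in for whatever cross-validated performance measure you actually use.

```python
import math
import random


def sample_config(rng):
    """Draw one random configuration; the ranges are illustrative assumptions."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform in [1e-4, 1e-1]
        "epochs": rng.choice([10, 50, 100]),
        "hidden_units": rng.randint(10, 100),
        "l2": 10 ** rng.uniform(-6, -2),             # log-uniform in [1e-6, 1e-2]
    }


def random_search(score_fn, budget, seed=0):
    """Evaluate `budget` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, -math.inf
    for _ in range(budget):
        config = sample_config(rng)
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Placeholder score that peaks at learning_rate = 1e-2.

    Replace this with a real cross-validated performance estimate.
    """
    return -abs(math.log10(config["learning_rate"]) + 2)


best_config, best_score = random_search(toy_score, budget=30)
print(best_config, best_score)
```

The budget stays fixed at 30 runs no matter how many parameters are sampled, which is the main appeal when 10 parameters cannot be swept exhaustively.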