Optimize Deep Learning's network structure and parameters
I'm doing regression with the "Deep Learning" operator on 480 input features. This is a predictive maintenance problem: each feature is a meter reading, and we want to predict the time to the next failure of an asset. After training, the root mean square error (RMSE) on the training dataset is still quite high (from 0.08 to 0.2), even though the training dataset is normalized into [-1, 1]. I have tried many network structures, including increasing the number of hidden layers (up to 15) and the number of nodes in each hidden layer (up to 1000 nodes/layer). In some cases this even increases the RMSE on the training dataset, which suggests the model is under-fitting. I used the default values for the other deep learning parameters, including the adaptive learning rate and the rectifier activation.
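To put the RMSE numbers in context, here is a minimal Python sketch. It is not RapidMiner's implementation, and it assumes the label is min-max scaled into [-1, 1] along with the features. The function names and the sample values are hypothetical and only for illustration. Under that assumption the label range has width 2, so an RMSE of 0.08 to 0.2 is about 4% to 10% of the range.

```python
import math


def scale_to_unit_range(values):
    """Min-max scale a list of numbers linearly into [-1, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_as_fraction_of_range(error, lo=-1.0, hi=1.0):
    """Express an RMSE as a fraction of the label range [lo, hi]."""
    return error / (hi - lo)


# Hypothetical time-to-failure values (made up, not from my data).
hours_to_failure = [10.0, 40.0, 70.0, 100.0]
scaled = scale_to_unit_range(hours_to_failure)

# The two ends of the RMSE range I observed on the training set.
for observed in (0.08, 0.2):
    share = rmse_as_fraction_of_range(observed)
    print(f"RMSE {observed} is {share:.0%} of the [-1, 1] range")
```

If the label is not scaled this way, the same RMSE values would have to be compared against the label's actual range instead.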
Do you have any advice for this situation, or is there a way to optimize the network structure? I already tried the "Optimize Parameters" operator with "Deep Learning", but could not find the operator's parameters for the network structure. Alternatively, is there any way to make the "Deep Learning" operator fit the training data better?
Thank you very much for your help!