"Question about prediction values and applying weights from results"
To begin with, I am very new to data mining and RapidMiner. I have been experimenting with the software for several months now, and it has proven to be a fantastic and powerful tool for learning and exploration. Thank you for your wonderful efforts on this application!
I do have a couple of questions, though, about how to accurately replicate the prediction values I am receiving. I am performing both classification and regression tests on labelled data. Overall, I am running experiments where I:
1. optimize the number of attributes (either through PCA or the Genetic/Evolutionary "optimize selection"). I typically normalize the resulting weights so that each is either 0 or 1, and pass only the attributes with a weight of 1 on to the next processing step.
2. run the same data set, restricted to the "selected" attributes, through the same learner that was used inside "optimize selection" (typically SVM) in order to obtain the weights/model for the data with the selected attributes.
3. apply these weights/model to a new set of unseen data containing just the selected attributes, and obtain the performance of the model on that data.
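To make step 1 concrete, here is a rough sketch in plain Python (not RapidMiner code) of what I mean by turning weights into 0/1 and keeping only the attributes with a weight of 1. The attribute names, the weight values, and the 0.1 threshold are all made up for illustration; I am not claiming this is how RapidMiner does it internally.

```python
# Hypothetical attribute weights, e.g. as produced by a selection/weighting step.
weights = {"a1": 0.82, "a2": 0.05, "a3": 0.40, "a4": 0.00}


def binarize(weights, threshold=0.0):
    """Map each weight to 1 if it exceeds the threshold, otherwise 0."""
    return {name: 1 if w > threshold else 0 for name, w in weights.items()}


def select_attributes(rows, binary_weights):
    """Keep only the attributes whose binary weight is 1."""
    keep = [name for name, w in binary_weights.items() if w == 1]
    return [{name: row[name] for name in keep} for row in rows]


# One made-up example row with all four attributes.
rows = [{"a1": 1.0, "a2": 2.0, "a3": 3.0, "a4": 4.0}]

binary = binarize(weights, threshold=0.1)   # threshold is an assumption
selected = select_attributes(rows, binary)  # rows reduced to a1 and a3
```

Only the reduced rows in `selected` would then be passed to the learner in step 2.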
When the test is complete, I view the data set, which displays the selected attributes along with the label value and the predicted value. I also view the weights of the selected attributes. In an effort to replicate the predicted value, I multiply the transposed weight matrix by the attribute value matrix. However, the values I obtain this way are usually nowhere near the predicted values that are displayed. This happens for both the classification and the regression problems. I realize that the various learners often have a bias term, which I add to or subtract from the calculated values, but the results are still not close to the predicted values. It seems like this should be pretty straightforward, so I am clearly missing something.
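In case it helps to see exactly what I am computing by hand, here is a minimal Python sketch of the calculation. The weights, bias, and attribute values are placeholders, not numbers from my experiments. It also assumes that the model is a plain linear one (prediction = weights times attributes plus bias) and that a binominal prediction comes from the sign of that score; both of those are assumptions on my part and may be exactly where I am going wrong.

```python
def linear_score(weights, values, bias=0.0):
    """Return the dot product of weights and attribute values, plus the bias."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * x for w, x in zip(weights, values)) + bias


def classify(score, positive="positive", negative="negative"):
    """Assumed binominal rule: pick the class from the sign of the score."""
    return positive if score >= 0 else negative


# Placeholder numbers for one example with three selected attributes.
weights = [0.5, -1.0, 2.0]
example = [2.0, 1.0, 0.5]
bias = -0.5

score = linear_score(weights, example, bias)  # my attempt at the regression prediction
label = classify(score)                       # my attempt at the class prediction
```

For regression I compare `score` with the displayed prediction, and for classification I compare `label` with the predicted class.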
Could anyone explain how to apply the weights obtained from the various learners in order to reproduce the prediction values accurately, especially for binominal classification?
Thanks in advance to anyone able to assist!