"Weka - Random Forest"
I have a training set and a test set, each with 130 attributes. I train Weka's Random Forest on the training set using all the attributes. The program selects 8 of the attributes and reaches 100% accuracy on the training set, but its performance on the test set is rather poor: only 53.7% accuracy.
I then trained 130 separate classifiers, each on a single attribute, and applied each of them to the test set. Some of these classifiers reach 80% accuracy on the test set, even though they are not among the best of the 130 classifiers on the training set.
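The per-attribute experiment can be sketched roughly as follows. This is only an illustration, not the actual Weka run: it swaps Random Forest for a one-threshold decision stump, uses synthetic data (5 attributes instead of 130, with only attribute 0 informative), and all function names are hypothetical.

```python
import random

def predict(stump, xs):
    """Predict class 1 where polarity * (x - threshold) >= 0, else class 0."""
    threshold, polarity = stump
    return [1 if polarity * (x - threshold) >= 0 else 0 for x in xs]

def accuracy(stump, xs, ys):
    """Fraction of examples the stump classifies correctly."""
    preds = predict(stump, xs)
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

def fit_stump(xs, ys):
    """Pick the (threshold, polarity) pair with the highest training accuracy."""
    best_acc, best_stump = -1.0, (xs[0], 1)
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            acc = accuracy((threshold, polarity), xs, ys)
            if acc > best_acc:
                best_acc, best_stump = acc, (threshold, polarity)
    return best_stump

def per_attribute_scores(train_X, train_y, test_X, test_y):
    """Train one classifier per attribute; return (index, train_acc, test_acc) tuples."""
    scores = []
    for j in range(len(train_X[0])):
        col_train = [row[j] for row in train_X]
        col_test = [row[j] for row in test_X]
        stump = fit_stump(col_train, train_y)
        scores.append((j,
                       accuracy(stump, col_train, train_y),
                       accuracy(stump, col_test, test_y)))
    return scores

def make_data(n, n_attrs, rng):
    """Synthetic data (assumption): attribute 0 tracks the label, the rest are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [label + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1) for _ in range(n_attrs - 1)]
        X.append(row)
        y.append(label)
    return X, y

if __name__ == "__main__":
    rng = random.Random(0)
    train_X, train_y = make_data(200, 5, rng)
    test_X, test_y = make_data(200, 5, rng)
    for j, train_acc, test_acc in per_attribute_scores(train_X, train_y, test_X, test_y):
        print(f"attribute {j}: train={train_acc:.3f} test={test_acc:.3f}")
```

Note that the sketch reports test accuracy only as a measurement; ranking attributes by that column and then retraining on the winners would let information from the test set leak into training.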
What I want to know is how to train an even better classifier for the test set using the attributes that reach 80% accuracy (of course, I can't use the test set to train the classifier). Should I simply pick the good attributes and feed them into the Random Forest training, or is there a better way to do this?