Optimizing Random Forest
Hello! I'm working on a random forest model that predicts a binary label: in my case, whether a customer has paid in advance or not. I have the following attributes:
date, article code, product name, producer, unit price, sales quantity, customer ID, county, payment habits.
The process consists of reading the data (there are no missing values in the data set), normalizing unit price and quantity (Z-transformation), and running cross-validation on the training data.
Performance is not good: accuracy is about 75%, weighted recall 51%, and weighted precision 58%.
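One possible reading of these numbers, sketched in plain Python: if the two classes are imbalanced (the post does not give the class distribution, so the 75/25 split below is an assumption) and the recall figure is an average of the per-class recalls, then a model that almost always predicts the majority class would show high accuracy together with a mean recall near 50%. The labels and helper functions here are hypothetical and only illustrate the arithmetic.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its examples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Assumed data: 75 customers who did not pay in advance, 25 who did.
y_true = ["no"] * 75 + ["yes"] * 25
# A degenerate model that always predicts the majority class.
y_pred = ["no"] * 100

acc = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
mean_recall = sum(recalls.values()) / len(recalls)

print("accuracy:", acc)
print("recall per class:", recalls)
print("mean of class recalls:", mean_recall)
```

If the confusion matrix of the real model looks similar (one class with recall close to zero), class imbalance is a likely cause and is worth checking before tuning anything else.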
I'm not sure whether what I am doing is right or wrong.
How can performance be improved? Any suggestions?
Sorry for my bad English.
Please watch these two videos; both may give you some ideas. I also recommend taking the Machine Learning Professional certification: it is completely free and will help you better understand all these topics.
Sampling & Weighting demo | RapidMiner Studio
Optimize demo | RapidMiner Studio
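For readers who want to see the sampling idea outside of RapidMiner Studio, here is a minimal sketch of random under-sampling in plain Python. It is not the RapidMiner operator from the video, only an illustration of the general technique; the function name `downsample_to_balance`, the column name `paid_in_advance`, and the toy rows are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def downsample_to_balance(rows, label_key, seed=0):
    """Return a shuffled copy of rows in which every class has as many
    examples as the rarest class (random under-sampling)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        # Draw `smallest` rows from each class without replacement.
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Hypothetical training rows: 50 prepaying customers out of 200.
train = [{"customer_id": i, "paid_in_advance": i % 4 == 0} for i in range(200)]

balanced_train = downsample_to_balance(train, "paid_in_advance")
counts = Counter(row["paid_in_advance"] for row in balanced_train)
print(counts)
```

Balancing should be applied only to the training part of each cross-validation fold; the test part should keep the original class distribution, otherwise the reported performance will not reflect real data.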