Classifier accuracy with Grid Search differs from accuracy without Grid Search
Hello guys, I'm using Grid Search to tune Random Forest parameters. When the process ends, it gives me the best parameter set and the accuracy achieved with those parameters. My question: when I run the process without Grid Search, setting the Random Forest parameters to the values I got from Grid Search, I get a lower accuracy. Can anyone explain the difference? Both approaches are the same; the only difference is that the first run uses Grid Search and the second does not.
I have included screenshots of my process.
My dataset is Glass Type with 214 samples. It contains 1 duplicate row and 6 unbalanced classes. I run my process as follows:
Send the dataset into the Optimize Parameters (Grid) operator.
Inside the Optimize Parameters (Grid) operator:
1- Remove duplicates
2- Normalize
3- Split data 80:20
4- Apply SMOTE to the training data only
5- Train RF
6- Evaluate the model
Answers
Stratified sampling creates random subsets, so each run can produce a different split.
I suggest you use the split operator once, store the results, and then use the stored example sets in both runs of your comparison.
Best
Cesar
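To illustrate the suggestion above, here is a minimal sketch in plain Python (not RapidMiner) of why a random stratified split has to be fixed before two runs can be compared. The `stratified_split` helper and the class sizes are hypothetical and chosen only for illustration; they are not taken from the process in the screenshots.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_ratio=0.2, seed=None):
    """Return (train_idx, test_idx), sampling about test_ratio of each class for testing."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)  # the random part of the split
        n_test = max(1, round(len(members) * test_ratio))
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical labels: 213 rows in 6 unbalanced classes (illustrative sizes).
labels = [0] * 70 + [1] * 76 + [2] * 17 + [3] * 13 + [4] * 9 + [5] * 28

split_a = stratified_split(labels, seed=1)
split_b = stratified_split(labels, seed=2)
split_a_again = stratified_split(labels, seed=1)

# Same seed reproduces the split; a different seed picks different test rows.
same_seed_matches = split_a == split_a_again
different_seed_differs = split_a[1] != split_b[1]
print(same_seed_matches, different_seed_differs)
```

With a small dataset, a different set of test rows can shift the measured accuracy noticeably, which is why storing the split once (or fixing the seed) matters before comparing the two runs.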
I did as you said: I split the data and stored the results in two separate files.
After that, I ran Grid Search and got the best parameters and accuracy.
Then I tested without Grid Search, but I still get a lower accuracy.
Please check my screenshots and tell me if I'm doing something wrong.
I have used SMOTE only once.
I also removed SMOTE and tested again without the split operator, and I still get a lower accuracy. I think using the performance operator inside Grid Search and outside Grid Search gives slightly different results. Anyhow, thanks.
best regards
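One possible contributing factor, offered as an assumption rather than a diagnosis of this process: Grid Search reports the best score among many parameter combinations, and the maximum of many noisy measurements tends to be optimistic. The simplified simulation below uses plain Python, not RapidMiner. It assumes every combination has the same hypothetical true accuracy of 0.70, a test set of 43 rows, 50 combinations, and independent noise per evaluation; all of these numbers are made up for illustration.

```python
import random

def measured_accuracy(rng, true_acc, n_test):
    """Simulate accuracy measured on n_test rows for a model with the given true accuracy."""
    correct = sum(rng.random() < true_acc for _ in range(n_test))
    return correct / n_test

def simulate(n_trials=200, n_candidates=50, true_acc=0.70, n_test=43, seed=0):
    """Compare the grid-search winner's score with a fresh re-evaluation of the same model."""
    rng = random.Random(seed)
    best_scores, rerun_scores = [], []
    for _ in range(n_trials):
        # Every candidate is equally good here; only the measurement noise differs.
        scores = [measured_accuracy(rng, true_acc, n_test) for _ in range(n_candidates)]
        best_scores.append(max(scores))  # what the grid search would report
        rerun_scores.append(measured_accuracy(rng, true_acc, n_test))  # a later standalone run
    return sum(best_scores) / n_trials, sum(rerun_scores) / n_trials

mean_best, mean_rerun = simulate()
print(f"mean best-of-grid accuracy: {mean_best:.3f}")
print(f"mean re-run accuracy:       {mean_rerun:.3f}")
```

In this toy setup the re-run averages close to the true accuracy while the reported best score sits clearly above it, even though no combination is actually better than the others. If the standalone run differs in any random component from the run inside the optimizer, a drop of this kind would be expected.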