[SOLVED] discretize by entropy evaluation
Hi,
I would like to use the Discretize by Entropy operator with a Naive Bayes classifier. As far as I understand, Discretize by Entropy depends on the class value, so it would not be correct to discretize the whole dataset first and then perform cross-validation. I would like to set up an experiment where, in every fold of the cross-validation, the training data is discretized by entropy and the classifier is then evaluated on the test set discretized with the bin intervals from the training fold. Is this possible? I am not sure if I was clear: I simply wish to classify new data using a classifier built on discretized data. How should I apply the same discretization intervals to the new data?
Any help or comments would be much appreciated.
Thank you.
Matus
Answers
in the X-Validation to ensure that the same preprocessing model from train is used with the test set:
So that means the model (the bins) created by the discretization process is applied to the new data, right?
That is, we do not apply a new "Discretize by Entropy" preprocessing step to the new data? I'm sorry if this is confusing; I only want to make sure.
Thank you
Hi,
Yes. You should apply the preprocessing model to the new data set.
BR,
Martin
Dortmund, Germany
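For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of "learn the bins on the training fold, then reuse them on new data". It is only an illustration under stated assumptions: the function names `fit_entropy_cut` and `apply_cut` are hypothetical, the data is made up, and the single entropy-minimising split is a simplified stand-in, not the algorithm RapidMiner's Discretize by Entropy operator actually uses.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def fit_entropy_cut(values, labels):
    """Learn one cut point that minimises the weighted class entropy.

    Hypothetical helper: uses the class labels, so call it on training data only.
    Returns None if all values are identical (no split possible).
    """
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut

def apply_cut(values, cut):
    """Map values to bin 0 or 1 using a previously learned cut; no labels needed."""
    return [0 if v <= cut else 1 for v in values]

# Made-up training fold (with labels) and new/test data (labels not used).
train_x = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
train_y = ["no", "no", "no", "yes", "yes", "yes"]
test_x = [2.5, 7.0]

cut = fit_entropy_cut(train_x, train_y)   # "preprocessing model" from the training fold
train_bins = apply_cut(train_x, cut)      # the classifier would be trained on these
test_bins = apply_cut(test_x, cut)        # same intervals reused, nothing is re-fitted
print(cut, train_bins, test_bins)
```

The point of the split into two functions is that only the fitting step ever sees class labels; the test fold or new data is merely mapped through the stored cut, which is what applying the preprocessing model does in the thread above.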