Determining value for parameter
Please help me, I am stuck.
I am using a general decision tree as well as CHAID and ID-3.
The parameters are
- minimal size for split
- minimal leaf size
- minimal gain
- maximal depth
- confidence
My training data has 400 examples.
My number of features is 6707.
My total amount of text is 27910.
How can I determine good values for these parameters without test runs? Test runs would take too much time due to the enormous amount of data.
Who has an idea for me?
Thank you!!!
Answers
If you are working with text and a lot of attributes, and you are short on time, you could give Naive Bayes a try.
You can also try pruning some of your text vectors and removing correlated attributes.
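To make the idea of removing correlated attributes concrete, here is a minimal Python sketch. It is not how RapidMiner implements it; the helper names `pearson` and `drop_correlated`, the 0.95 threshold, and the toy term counts are all assumptions for illustration. It keeps an attribute only if its absolute Pearson correlation with every attribute kept so far stays below the threshold.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat it as uncorrelated
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(columns, threshold=0.95):
    """Greedily keep a column only if |r| with every kept column is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical term counts for five documents
columns = {
    "price": [1, 0, 2, 0, 3],
    "prices": [2, 0, 4, 0, 6],    # moves in lockstep with "price"
    "delivery": [0, 1, 0, 2, 1],
}
kept = drop_correlated(columns)
print(kept)
```

Note that the pairwise comparison grows quadratically with the number of attributes, so with several thousand features it is usually worth pruning rare terms first.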
The data is not as big as you might think. It sounds pretty reasonable to run a parameter optimization on that. You can do this either with a grid search or with an evolutionary approach.
If this is text mining, I would recommend an SVM. They usually score better, and in the linear case you only have one parameter to optimize (C).
Cheers,
Martin
Dortmund, Germany
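The grid approach mentioned in the reply above can be sketched in a few lines of Python, which also shows how many training runs it implies. This is only an illustration under stated assumptions: the candidate values, the fold count, and the `toy_score` function are made up, and in practice the score would be the cross-validated accuracy of the tree trained with those parameters.

```python
from itertools import product

# Parameter names follow the question; the candidate values are hypothetical
grid = {
    "minimal_size_for_split": [4, 8, 16],
    "minimal_leaf_size": [2, 4, 8],
    "minimal_gain": [0.01, 0.05, 0.1],
    "maximal_depth": [5, 10, 20],
    "confidence": [0.1, 0.25, 0.5],
}

def grid_search(grid, score):
    """Try every combination in the grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score

def toy_score(params):
    """Stand-in for a real evaluation (assumption): peaks at depth 10, leaf size 4."""
    return -abs(params["maximal_depth"] - 10) - abs(params["minimal_leaf_size"] - 4)

# Cost estimate: one training run per combination and per validation fold
n_combinations = 1
for candidates in grid.values():
    n_combinations *= len(candidates)
folds = 10  # assumed number of cross-validation folds
total_runs = n_combinations * folds

best_params, best_score = grid_search(grid, toy_score)
print(n_combinations, total_runs, best_params)
```

With three candidates for each of the five tree parameters, the grid has 3^5 = 243 combinations. A linear SVM only needs a single list of candidate values for C, so the same loop would cover just a handful of combinations.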