Optimized parameters for SVM Linear

ferandi · March 2012

Hi friends,

Does anyone know the best way to optimize the C and epsilon parameters of a linear SVM model (libsvm)? I'm classifying about 3,000 documents into 11 categories.
Is there a relationship to the number of features, documents, or categories that would give me a starting point? I'm getting a cross-validation accuracy of 70.59% and I want to improve it.

Another question: how is the accuracy value in the performance vector result calculated? Is there a relationship with the precision and recall of the classes in each cross-validation step? Does it use the F1 score?

Many thanks in advance!

MariusHelf · March 2012

Hi my friend,

The accuracy is defined as (correctly_classified_examples)/(all_examples), i.e. the fraction of examples that are classified correctly or, in other words, an estimate of the probability that an unseen example is classified correctly by the model. (The latter holds only if the class ratio in the test set matches the class ratio in the new examples.)
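To make the definition concrete, here is a minimal sketch in plain Python. The label lists are made up for illustration, and the function name `accuracy` is my own choice, not something taken from RapidMiner. Note that precision, recall and F1 do not appear anywhere in this formula.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("need at least one example")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels, invented purely for this example.
y_true = ["sports", "politics", "sports", "tech", "tech"]
y_pred = ["sports", "sports", "sports", "tech", "politics"]

acc = accuracy(y_true, y_pred)  # 3 of the 5 predictions match
print(f"accuracy = {acc:.2f}")
```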

Concerning the SVM: there is no general rule of thumb for good parameters, but with the Parameter Optimization (Grid) you can easily optimize them. For C, a good starting point is a range from about 10^-4 to 10^4 on a logarithmic scale. I also suggest trying the radial (RBF) kernel. In that case the gamma parameter must be optimized as well; try the same value range as for C.
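As a rough sketch of what such a grid search does (not the actual RapidMiner implementation), the snippet below builds the logarithmic grid from 10^-4 to 10^4 and tries every (C, gamma) combination. The helper names `log_grid` and `grid_search` are hypothetical, and `toy_score` is only a placeholder: in a real setup it would train the SVM with the given parameters and return the cross-validation accuracy.

```python
import math
from itertools import product


def log_grid(low_exp, high_exp):
    """Powers of ten from 10**low_exp to 10**high_exp, inclusive."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


def grid_search(score_fn, grids):
    """Evaluate score_fn on every parameter combination; return the best one."""
    names = list(grids)
    best_params, best_score = None, float("-inf")
    for values in product(*(grids[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, gamma):
    # Placeholder only: an artificial score that peaks at C=10, gamma=0.01.
    # Replace with "train the SVM, return cross-validation accuracy".
    return -((math.log10(C) - 1) ** 2 + (math.log10(gamma) + 2) ** 2)


c_values = log_grid(-4, 4)      # 9 values: 0.0001, 0.001, ..., 10000
gamma_values = log_grid(-4, 4)  # same range, as suggested for the RBF kernel

best_params, best_score = grid_search(
    toy_score, {"C": c_values, "gamma": gamma_values}
)
print(best_params)
```

With 9 values per parameter this is 81 model evaluations, each of which is a full cross-validation in practice, so a coarse grid first and a finer one around the best cell is a common way to keep the run time down.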

Best,
Marius

ferandi · March 2012

Hi Marius, thanks so much for your response!!!

One more question: my classes are very unbalanced. In your opinion, does this have any influence on my model's accuracy?

Thanks a lot!!

MariusHelf · March 2012

Yes. Suppose you have 99% positives and your model always says "yes": it will never detect any negatives, yet it still has an accuracy of 99%.

Keep that in mind when you talk about accuracies.
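The 99% case can be checked with a few lines of plain Python. The data below is synthetic, and the per-class recall is an addition of mine to show what the accuracy figure hides; it is not something the performance vector discussion above mentions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of the examples of class `label` that were predicted as `label`."""
    predictions_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_class) / len(predictions_for_class)


# Synthetic data: 99 positives, 1 negative, and a model that always says "pos".
y_true = ["pos"] * 99 + ["neg"] * 1
y_pred = ["pos"] * len(y_true)

acc = accuracy(y_true, y_pred)
rec_pos = recall(y_true, y_pred, "pos")
rec_neg = recall(y_true, y_pred, "neg")

print(f"accuracy={acc:.2f} recall(pos)={rec_pos:.2f} recall(neg)={rec_neg:.2f}")
```

The accuracy looks excellent while the recall on the negative class is zero, which is why per-class figures are worth reading alongside the overall accuracy when the classes are unbalanced.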

For better comparability, you could change the class ratio to 1:1 in your training set.
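One simple way to get equal class sizes is random downsampling, sketched below under my own assumptions: the function name `downsample_to_balance` is hypothetical, the documents are dummy strings, and every class is cut down to the size of the smallest one (which also covers the multi-class case). Only the training set should be treated this way.

```python
import random
from collections import Counter, defaultdict


def downsample_to_balance(examples, labels, seed=0):
    """Randomly keep the same number of examples per class (the smallest class size)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_label = defaultdict(list)
    for example, label in zip(examples, labels):
        by_label[label].append(example)

    target = min(len(group) for group in by_label.values())

    balanced = []
    for label, group in by_label.items():
        for example in rng.sample(group, target):
            balanced.append((example, label))
    rng.shuffle(balanced)  # avoid having the classes in contiguous blocks
    return balanced


# Dummy training set: 90 "pos" documents and 10 "neg" documents.
docs = [f"doc{i}" for i in range(100)]
labels = ["pos"] * 90 + ["neg"] * 10

balanced = downsample_to_balance(docs, labels)
counts = Counter(label for _, label in balanced)
print(counts)
```

Downsampling throws away data from the larger classes, so with small classes it may be worth comparing against alternatives such as class weights before settling on it.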

Best,
Marius
