# Optimized parameters for SVM Linear

Hi friends,

Does anyone know the best way to optimize the C and epsilon parameters of a linear SVM model (libsvm)? I'm classifying about 3000 docs into 11 categories.

Is there a relationship to the number of features, docs, or categories that would give me a starting point? I'm getting an x-validation accuracy of 70.59% and I want to improve it.

Another question: how is the accuracy value in the performance vector result calculated? Is there a relationship with the precision and recall of the classes in each x-validation step? Does it use the F1 score?

Many thanks in advance!

## Answers

The accuracy is defined as (correctly_classified_examples)/(all_examples), i.e. the ratio of correctly classified examples or, in other words, the probability that an unseen example is classified correctly by the model. (The latter holds only if the class ratio in the new examples is the same as in the testing set.)
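To make the definition concrete, here is a minimal Python sketch (not RapidMiner's own code) that computes accuracy next to per-class precision, recall and F1. The labels are made up for illustration and the function names are hypothetical. It shows that accuracy only counts correct predictions over all examples, so no F1 score enters the calculation.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def per_class_report(y_true, y_pred):
    """Precision, recall and F1 for every label, computed one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # label p was predicted, but wrongly
            fn[t] += 1  # an example of label t was missed
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        report[label] = {"precision": prec, "recall": rec, "f1": f1}
    return report

# Made-up labels, for illustration only.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]

acc = accuracy(y_true, y_pred)
report = per_class_report(y_true, y_pred)
print(f"accuracy = {acc:.4f}")
for label, scores in report.items():
    print(label, scores)
```

Accuracy is the sum of the per-class true positives divided by the number of examples, so it is related to the per-class counts, but it can hide poor recall on individual classes.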

Concerning the SVM: there is no general rule of thumb for good parameters, but with the Parameter Optimization (Grid) you can easily optimize them. For C, a good starting point is a range from about 10^-4 to 10^4 on a logarithmic scale. I also suggest trying the radial (RBF) kernel. In that case the gamma parameter must be optimized as well; try the same value range as for C.
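As a rough illustration of what such a grid looks like outside RapidMiner, the Python sketch below builds the logarithmic range 10^-4 to 10^4 and tries every (C, gamma) pair. The `toy_evaluate` function is a hypothetical stand-in: in practice you would replace it with the cross-validated accuracy of your SVM for the given parameters.

```python
import math
from itertools import product

def log_grid(low_exp=-4, high_exp=4):
    """Powers of ten from 10^low_exp to 10^high_exp, inclusive."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]

def grid_search(evaluate, c_values, gamma_values):
    """Try every (C, gamma) pair and return the best pair with its score."""
    best_params, best_score = None, float("-inf")
    for c, gamma in product(c_values, gamma_values):
        score = evaluate(c, gamma)
        if score > best_score:
            best_params, best_score = (c, gamma), score
    return best_params, best_score

def toy_evaluate(c, gamma):
    """Hypothetical stand-in for cross-validated accuracy.

    It peaks at C = 10 and gamma = 0.01 purely so the search has
    something to find; it does not model a real SVM.
    """
    distance = abs(math.log10(c) - 1) + abs(math.log10(gamma) + 2)
    return 1.0 / (1.0 + distance)

c_values = log_grid()      # 1e-4, 1e-3, ..., 1e4
gamma_values = log_grid()  # same range, as suggested for the RBF kernel
best_params, best_score = grid_search(toy_evaluate, c_values, gamma_values)
print("best (C, gamma):", best_params, "score:", best_score)
```

With 9 values per parameter this evaluates 81 combinations, each of which costs a full cross-validation, so a coarse grid like this one is usually refined around the best cell afterwards.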

Best,

Marius

One more question: my classes are very unbalanced. In your opinion, does this have any influence on my model's accuracy?

Thanks a lot!!

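Yes, keep the class imbalance in mind when you talk about accuracies: a model that always predicts the largest class already reaches an accuracy equal to that class's share of the examples.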

For better comparability, you could change the class ratio to 1:1 in your training set.
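One way to get an equal class ratio is random undersampling. The Python sketch below assumes that "1:1" means every class is cut down to the size of the smallest one; the function name and the toy documents are hypothetical. It should be applied to the training set only, so that the test set keeps its original distribution.

```python
import random
from collections import Counter, defaultdict

def undersample_to_balance(examples, labels, seed=0):
    """Randomly keep the same number of examples for every label.

    Each class is reduced to the size of the smallest class.
    Returns a shuffled list of (example, label) pairs.
    """
    by_label = defaultdict(list)
    for x, y in zip(examples, labels):
        by_label[y].append(x)
    n = min(len(xs) for xs in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for y, xs in by_label.items():
        for x in rng.sample(xs, n):
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced

# Made-up training set: 6 "sports", 3 "politics", 2 "science" documents.
docs = [f"doc{i}" for i in range(11)]
labels = ["sports"] * 6 + ["politics"] * 3 + ["science"] * 2

balanced = undersample_to_balance(docs, labels)
counts = Counter(y for _, y in balanced)
print(counts)
```

Undersampling throws data away, which can hurt when the smallest class is tiny; it is shown here only because it is the simplest way to reach an equal ratio.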

Best,

Marius