# "Optimizing Parameters for SVM"

Using various optimization operators in combination with cross-validation and performance operators, I want to improve the performance of my SVM. I have tried all the available kernels and different values for C.

Are there any “rules of thumb” of what ranges the parameter “C” can be? Are there any other parameters you would recommend varying?

Thanks,

Cleo


## Answers

http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf

According to this guide, it is best to start with a loose grid search over C = 2^-5, 2^-3, ..., 2^15 and gamma = 2^-15, 2^-13, ..., 2^3, and then, once a promising region is determined, to search that region with a finer grid. Is this correct?
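For illustration, the coarse-then-fine search could be sketched in plain Python roughly as follows. This is only a sketch under assumptions: `toy_score` is a made-up stand-in for a cross-validated performance estimate, and the refinement width and step (`half_width`, `step`) are arbitrary choices, not values taken from the guide.

```python
import math
from itertools import product


def power_grid(lo, hi, step):
    """Return [2**lo, 2**(lo + step), ..., 2**hi]."""
    n = int(round((hi - lo) / step))
    return [2.0 ** (lo + i * step) for i in range(n + 1)]


def grid_search(score_fn, c_values, gamma_values):
    """Return (best_score, best_C, best_gamma); a higher score is better."""
    return max((score_fn(c, g), c, g) for c, g in product(c_values, gamma_values))


def refine(best_exp, half_width=2.0, step=0.25):
    """Finer grid of powers of two centred on the best coarse exponent."""
    return power_grid(best_exp - half_width, best_exp + half_width, step)


def toy_score(c, gamma):
    # Hypothetical stand-in for a cross-validated performance estimate.
    return -((math.log2(c) - 3.4) ** 2 + (math.log2(gamma) + 6.7) ** 2)


# Loose grid: C = 2^-5 ... 2^15 and gamma = 2^-15 ... 2^3, exponent step 2.
coarse_c = power_grid(-5, 15, 2)
coarse_gamma = power_grid(-15, 3, 2)
_, c0, g0 = grid_search(toy_score, coarse_c, coarse_gamma)

# Tighter grid around the best coarse cell.
_, c1, g1 = grid_search(toy_score, refine(math.log2(c0)), refine(math.log2(g0)))
print("coarse best:", c0, g0)
print("fine best:  ", c1, g1)
```

In a real process, `toy_score` would be replaced by whatever the cross-validation returns for a given (C, gamma) pair.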

They also recommend LIBSVM with the RBF kernel.

Another paper suggests using a hybrid system:

"The nonlinear SVM model is applied right after the linear SVM to forecast the nonlinear data pattern of residuals from the linear SVM model."

Could this be accomplished with the "Stacking" operator?
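To make the quoted two-stage idea concrete, here is a minimal Python sketch of sequential residual fitting. It is not the paper's method: ordinary least squares and a nearest-neighbour average stand in for the linear and nonlinear SVMs, and the data are synthetic. Note that this is sequential residual fitting, which differs from classical stacking, where a meta-learner is trained on the base learners' predictions.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


def knn_predict(train_x, train_r, x, k=3):
    """Average the residuals of the k training points nearest to x."""
    nearest = sorted(zip(train_x, train_r), key=lambda p: abs(p[0] - x))[:k]
    return sum(r for _, r in nearest) / len(nearest)


def fit_hybrid(xs, ys, k=3):
    """Stage 1 fits a line; stage 2 models what the line missed."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

    def predict(x):
        # Final forecast = linear part + predicted residual.
        return (a + b * x) + knn_predict(xs, residuals, x, k)

    return predict, residuals


# Synthetic data: a linear trend plus a nonlinear wiggle.
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + math.sin(3.0 * x) for x in xs]

predict, residuals = fit_hybrid(xs, ys)
linear_mse = sum(r * r for r in residuals) / len(residuals)
hybrid_mse = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print("linear only:", linear_mse, "hybrid:", hybrid_mse)
```

The comparison here is in-sample only, so it shows the mechanics and says nothing about how well either stage generalizes.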

Thanks,

Cleo

About predicting the residuals with a non-linear model after fitting the global model first: this can help, sometimes. Stefan Rüping discussed this "global" vs. "local" model approach in his PhD thesis, but in general I did not get the feeling that it necessarily helps in terms of accuracy; the benefit is more in understandability. The non-linear model is likely to capture the basic linear relationship as well. It's more about the risk of overfitting (which should not happen with correct parameters) and the fact that people understand linear models better.

Cheers,

Ingo

Thanks for the response, and congratulations on the nomination for the dissertation award. On February 9, 2010 I took the “Financial Data Mining with RapidMiner” course with Ralf Klinkenberg, and I have been trying, unsuccessfully, to duplicate the results he presented.

The first process that Ralf Klinkenberg demonstrated used the closing price of the S&P 500 as the only input, and I have made a very simple 5.0 version based on his 4.6 version.

The problem, I think, is that it seems every data point becomes a support vector, which leads me to believe the model is memorizing the data instead of learning any patterns. I have tried adding an optimize parameters operator, both the grid and the evolutionary variant, to adjust the window size, the kernel type, and the C and epsilon values.
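One way to see why so many points can end up as support vectors: in epsilon-insensitive regression, only points whose error lies on or outside the epsilon tube become support vectors, so an epsilon that is tiny compared with the scale of the target (for example an unscaled index price) leaves almost every point outside the tube. The Python sketch below just counts that fraction for a fixed set of made-up residuals. It is a rough approximation, since in a real fit the residuals themselves change with epsilon.

```python
def support_vector_fraction(residuals, epsilon):
    """Fraction of points on or outside the epsilon tube (|residual| >= epsilon)."""
    outside = sum(1 for r in residuals if abs(r) >= epsilon)
    return outside / len(residuals)


# Made-up prediction errors, in the same units as the target.
residuals = [-12.0, -7.5, -3.0, -0.4, 0.2, 1.1, 4.0, 6.5, 9.0, 15.0]

for eps in (0.001, 1.0, 5.0, 10.0):
    print(eps, support_vector_fraction(residuals, eps))
```

If this is the cause, searching epsilon over a range matched to the scale of the target, or normalizing the series first, would be the natural things to try.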

I then plan to add inputs. Several papers suggest moving averages of different lengths and wavelet transformations, and Ralf Klinkenberg suggested using a Fourier transformation.
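As a sketch of the moving-average inputs, the following Python computes trailing simple moving averages of several lengths and puts them next to the closing price. The lengths (3 and 5) and the function names are illustrative assumptions; trailing windows are used so that each row only depends on values available on that day.

```python
def moving_average(series, length):
    """Trailing simple moving average; None until `length` values are available."""
    out, running = [], 0.0
    for i, value in enumerate(series):
        running += value
        if i >= length:
            running -= series[i - length]  # drop the value that left the window
        out.append(running / length if i >= length - 1 else None)
    return out


def add_ma_features(closes, lengths=(3, 5)):
    """One row per day: [close, MA(len1), MA(len2), ...]; warm-up rows are skipped."""
    columns = [moving_average(closes, n) for n in lengths]
    rows = []
    for i, close in enumerate(closes):
        feats = [col[i] for col in columns]
        if None not in feats:
            rows.append([close] + feats)
    return rows


closes = [1, 2, 3, 4, 5, 6]  # toy closing prices
for row in add_ma_features(closes):
    print(row)
```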

How would you recommend improving this model?

Cheers,

Cleo

Data

http://dl.dropbox.com/u/3978768/daily_sap.csv

In general, I would suggest optimizing the appropriate kernel parameters as well, for example gamma (or sigma) for a radial basis function kernel. Those parameters, in combination with C, are often much more important than all other SVM parameters. Taking the window size into account is also recommended.
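For reference, a minimal Python sketch of the radial basis function kernel and of how gamma and sigma relate. It assumes the common parameterizations k(x, y) = exp(-gamma * ||x - y||^2) and k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); individual tools may define these parameters differently, so check the documentation of the operator in use.

```python
import math


def rbf_kernel(x, y, gamma):
    """k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gamma_from_sigma(sigma):
    """gamma equivalent to width sigma, assuming k = exp(-||x - y||^2 / (2 sigma^2))."""
    return 1.0 / (2.0 * sigma ** 2)


x, y = [0.0, 0.0], [1.0, 1.0]
for gamma in (0.01, 0.5, 10.0):
    print(gamma, rbf_kernel(x, y, gamma))
```

With a very large gamma the similarity between distinct points drops towards zero, so each training example mostly resembles only itself, which is one reason gamma matters so much alongside C.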

And that's the important point: I would also recommend shifting your focus to extracting additional features, and considering this to be much more important than the actual learning scheme. Appropriate features plus a simple linear regression often perform much better than a highly optimized SVM or neural net. On the other hand, those more complex non-linear learning schemes often add little accuracy on a well-optimized feature space.

All the additional features you mentioned can help. I would also consider additional single features taken from the frequency space and even from the phase space.
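One plain reading of "features from the frequency space" is to take the magnitudes of a few Fourier coefficients of each window and use them as extra attributes. The Python sketch below does this with a naive discrete Fourier transform; which coefficients are actually useful is not specified in this thread, the test signal is synthetic, and the sketch does not cover phase space features.

```python
import cmath
import math


def dft_magnitudes(window):
    """Normalized magnitudes of DFT coefficients 0..N//2 of a real-valued window (naive O(N^2))."""
    n = len(window)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        mags.append(abs(coeff) / n)
    return mags


# Synthetic window: a sine wave completing two cycles in 16 samples.
window = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
mags = dft_magnitudes(window)
print([round(m, 3) for m in mags])
```

For a price series it usually makes sense to remove the trend or work on returns before taking the transform, otherwise the lowest coefficients dominate.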

Cheers,

Ingo