
# Is an SVM invariant to skewness and kurtosis?

Member Posts: 344 Unicorn
edited November 2018 in Help

I have data that is highly positively skewed, and I want to train an SVM (LibSVM) classifier on it with 3 classes. My question: do skewness and/or kurtosis affect the performance of an SVM classifier, or is it invariant to those statistical measures? Or should I rather apply a log transform to, e.g., the skewed columns?

And which other classifiers are invariant to these statistical measures, and which ones are affected by them?

Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

Fred,

help me understand the connection. Kurtosis and skewness are univariate measures of a distribution. They do not depend on the label class, and they do not have much to do with how an SVM works.

The interesting part is a scatter plot of the label against the observed attribute(s). The non-linearity there is the interesting thing to catch.

~Martin

- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Member Posts: 344 Unicorn

Well, what I mean is: in linear models like the Generalized Linear Model, I think the distribution (skewness etc.) plays a role. It is therefore useful, at least with my dataset, to apply a log transformation to the skewed columns; I got 10%+ better performance after doing that. Outliers also play an important role, I think.

I just wanted to know whether the same also applies to SVMs. I tried my dataset both untransformed and log-transformed, and somehow I got about 1-1.5% better performance on the log-transformed dataset. How can that be?
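To make the effect of a log transform on skewness concrete, here is a minimal sketch on synthetic data. The lognormal column, the sample size, and the helper `sample_skewness` are assumptions for illustration only; this is not the dataset discussed in the thread.

```python
import math
import random

def sample_skewness(values):
    """Moment-based sample skewness: the mean of the cubed z-scores."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / std) ** 3 for v in values) / n

random.seed(0)
# Synthetic, strictly positive, right-skewed column (assumed lognormal).
raw = [random.lognormvariate(0.0, 1.0) for _ in range(10_000)]
# Plain log only works for strictly positive values.
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)
print(f"skewness raw: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

The raw column should show a clearly positive skewness, while the logged column should be close to symmetric. Columns containing zeros or negative values would need a different transform.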

Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

Mh,

overall it's an interesting question for a regression problem. The problem with GLMs comes from the underlying distribution assumption. By minimizing least squares you implicitly assume normally distributed errors. Data with high skewness/kurtosis tends to violate this assumption, and the model then performs poorly.

For SVM regression I am not 100% sure whether there is such an assumption in the loss measure. I think it uses an absolute loss with an epsilon below which errors are ignored. Maybe @IngoRM or @RalfKlinkenberg can help; they have more theoretical experience with SVMs.

In any case, I can explain your increase in performance from a totally different point of view. If you apply a log to all attributes, your kernel function changes. That makes an obvious difference.
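The loss described here is commonly called the epsilon-insensitive loss. As a rough sketch of how it differs from squared loss, the snippet below compares the two on a few residuals. The function names and the value `epsilon=0.1` are illustrative assumptions, not settings taken from LibSVM or RapidMiner.

```python
def squared_loss(y_true, y_pred):
    """Least-squares loss: grows quadratically with the residual."""
    return (y_true - y_pred) ** 2

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, absolute (linear) loss outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Compare both losses for a small, a medium, and a large residual.
for residual in (0.05, 0.5, 5.0):
    sq = squared_loss(residual, 0.0)
    eps = epsilon_insensitive_loss(residual, 0.0)
    print(f"residual={residual}: squared={sq:.4f}, eps-insensitive={eps:.4f}")
```

Because the penalty outside the tube grows linearly rather than quadratically, a single large residual weighs less here than under least squares.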

Best,

Martin

Member Posts: 344 Unicorn

OK, thanks, but what do you mean by my kernel function changing? In what way is it different?

Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,507 RM Data Scientist

Hey,

if you look, for example, at the RBF kernel (https://en.wikipedia.org/wiki/Radial_basis_function_kernel ), you would simply replace x and x' with log(x) and log(x'). That's simply a different kernel. Not necessarily good or bad, but different.
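A small sketch of that substitution, assuming the common parameterisation k(x, x') = exp(-gamma * ||x - x'||^2) with an arbitrary `gamma=1.0`; the helper names and the example points are made up for illustration:

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def log_transform(x):
    """Element-wise natural log; requires strictly positive values."""
    return [math.log(a) for a in x]

# Two pairs with the same ratio (1:2) but very different absolute gaps.
small_pair = ([1.0], [2.0])
large_pair = ([100.0], [200.0])

for name, (x, y) in (("small", small_pair), ("large", large_pair)):
    k_raw = rbf_kernel(x, y)
    k_log = rbf_kernel(log_transform(x), log_transform(y))
    print(f"{name}: raw kernel = {k_raw:.4f}, log kernel = {k_log:.4f}")
```

On the raw values the kernel measures similarity by absolute differences, so the two pairs look very different. After the log it depends on ratios, so both pairs get the same similarity.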

~Martin
