# Need reference for Optimize Parameters (Evolutionary) [SOLVED]

Hi

I built a model with SVM and Optimize Parameters (Evolutionary), and the results are good, so I decided to publish it. However, I could not find any reference for Optimize Parameters (Evolutionary), and I do not know which paper this operator is based on. If anyone has information about this, please tell me.

## Answers

Actually, we don't have more detailed documentation for the guts of that operator. Unfortunately, the only answer I can give you right now is to look at the code - that's also what I would have to do...

Best regards,

Marius

I am new to RapidMiner and do not know how to trace code. Please tell me how I can trace the code during a run, or where to find the source code of Optimize Parameters (Evolutionary), so I can get my answers.

With the best regards

(Although he is currently in Leiden).

http://arnetminer.org/person/thomas-back-1509429.html

Alternatively you can cite his close friend: Gusz Eiben.

http://www.cs.vu.nl/~gusz/ecbook/ecbook.html

Or, if the code is CMA-ES rather than ES, you can cite Nikolaus Hansen.

https://www.lri.fr/~hansen/cmsa-versus-cma.html

Look at the 3 links, you are guaranteed to find a good paper here.

To find the code of a certain class, open OperatorsDoc.xml and search for the operator name, then search for the respective key in Operators.xml, which will point you to the underlying Java class.

Best regards,

Marius

This is an SVM implementation using an evolutionary algorithm (ES) to solve the dual optimization problem of an SVM. It turns out that on many datasets this simple implementation is as fast and accurate as the usual SVM implementations. In addition, it is also capable of learning with kernels which are not positive semi-definite, and it can be used for multi-objective learning, which makes the selection of C unnecessary before learning.

Mierswa, Ingo. Evolutionary Learning with Kernels: A Generic Solution for Large Margin Problems. In Proc. of the Genetic and Evolutionary Computation Conference (GECCO 2006), 2006.

http://dl.acm.org/citation.cfm?id=1144249

Evolutionary learning with kernels: a generic solution for large margin problems

Author: Ingo Mierswa, University of Dortmund

Published in: GECCO '06 Proceedings of the 8th annual conference on Genetic and evolutionary computation, pages 1553-1560

ACM, New York, NY, USA, ©2006

ISBN: 1-59593-186-4, doi: 10.1145/1143997.1144249

http://wing2.ddns.comp.nus.edu.sg/downloads/keyphraseCorpus/89/89.pdf

http://i.snag.gy/MlUz8.jpg

As far as I understand:

a: constrained real values

n: number of support vectors (a)

y: labels (with values -1 and 1)

k(.,.): a kernel function
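The linked image is not reproduced in this thread. Assuming it shows the usual soft-margin SVM dual as given in standard textbooks (with the post's a written as α, and with the regularization constant C; neither has been checked against the image), the formula under discussion would read:

$$
\max_{\alpha}\; \sum_{i=1}^{n} \alpha_i \;-\; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i\, y_j\, \alpha_i\, \alpha_j\, k(x_i, x_j)
\quad \text{subject to} \quad 0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i\, y_i = 0 .
$$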

So how is n chosen? Is this maybe the number of data points?

So now I should be able to understand fully what this formula does.

We have a double loop, so we get all possible combinations of two data points in our data set.

y_i * y_j takes the value 1 when both data points are of the same class and the value -1 when they are of different classes.

a_i * a_j * k(x_i, x_j) also evaluates to a scalar value.

Since we are maximizing, we want a_i * a_j * k(x_i, x_j) to evaluate to some positive value if they are the same class, and some negative value if they are not the same class.

Maybe k(x_i, x_j) can be interpreted as a notion of similarity between the two data points, where very similar is high positive and very dissimilar is high negative.

It is pretty clear to me that I don't fully understand what is going on here.

For me, optimizing the a's that maximize this formula using an ES is trivial, but why this formula optimizes the margin is unclear to me.
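To make "optimizing the a's with an ES" concrete, here is a minimal sketch under stated assumptions. It is not RapidMiner's implementation and not the algorithm from the cited paper. It assumes the standard dual objective, sum_i a_i - 1/2 * sum_i sum_j y_i y_j a_i a_j k(x_i, x_j), an RBF kernel, a box constraint 0 <= a_i <= C enforced by clipping, and it drops the equality constraint sum_i a_i y_i = 0 (which corresponds to an SVM without a bias term). The function names, the toy data, and the values of `gamma`, `C`, `sigma`, and `steps` are all made up for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Gram matrix with K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def dual_objective(a, y, K):
    """sum_i a_i - 1/2 * sum_i sum_j y_i y_j a_i a_j K[i, j]."""
    ay = a * y
    return a.sum() - 0.5 * (ay @ K @ ay)


def one_plus_one_es(y, K, C=1.0, sigma=0.1, steps=2000, seed=0):
    """Elitist (1+1)-ES: mutate all a_i with Gaussian noise, clip to [0, C],
    and keep the candidate only if the objective does not get worse."""
    rng = np.random.default_rng(seed)
    a = np.zeros(len(y))
    best = dual_objective(a, y, K)
    for _ in range(steps):
        candidate = np.clip(a + sigma * rng.standard_normal(len(y)), 0.0, C)
        value = dual_objective(candidate, y, K)
        if value >= best:
            a, best = candidate, value
    return a, best


# Toy data (assumed): two well-separated clusters with labels -1 and +1.
X = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.0], [2.1, 1.9]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
K = rbf_kernel(X)

a, value = one_plus_one_es(y, K)
scores = K @ (a * y)  # decision values on the training points (no bias term)
print("a =", np.round(a, 3))
print("objective =", round(float(value), 3))
print("predicted signs =", np.sign(scores))
```

Two things may help when reading the sketch. With an RBF kernel, k(x_i, x_j) lies in (0, 1], so very dissimilar points give a value near 0 rather than a large negative one. And in this standard form the double sum is subtracted, so two similar points of the same class with large a values lower the objective rather than raise it.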

After the paper mentions "Wolfe dual" I'm lost, but I would like to understand!

As far as I remember, n is the number of examples in the dataset. You are right about the assumptions of the other examples.

Tibshirani's Elements of Statistical Learning contains a good introduction and mathematical derivation of the SVM and the formula you cite: http://www-stat.stanford.edu/~tibs/ElemStatLearn/index.html

Best regards,

Marius

I appreciate your help.

With the best regards

Ali Kavian