Heuristic for choice of parameter C in LibSVM

DannDann Member Posts: 5 Contributor II
edited November 2018 in Help
Hi there,

I was trying to find more info on the heuristic choice of the parameter C used by RapidMiner (i.e. the one used when the parameter is given as 0). However, the only place I could find mentioning this heuristic was this old thread from this forum. Is there any other place where this heuristic is discussed? I could not find the references cited by Ingo on that thread. Hastie and Tibshirani have far too many publications to search them one by one. Perhaps someone could give me any directions for justifying this heuristic value?

Cheers,
Dann

Answers

  • gutompfgutompf Member Posts: 21 Contributor II
    Maybe try this:
    Cherkassky: Practical Selection of SVM Parameters and Noise Estimation for SVM Regression

    I would be happy if you let me know if it worked...
    Milan
  • DannDann Member Posts: 5 Contributor II
    Well, this paper deals explicitly with the problem of SVM regression. I suspect RapidMiner uses this heuristic for classification as well. I am just looking for the source of this information, so I could properly quote it in a scientific publication.

    Cheers,
    Dann
  • DannDann Member Posts: 5 Contributor II
    Hello Haddock,

    Thanks for replying, but what exactly do you mean by posting a link to the guide by Hsu, Chang and Lin? I could have overlooked it, but I didn't find any reference about the heuristic used by RapidMiner on it. Would you care to explicitly quote where they mention the heuristic described by Ingo Mierswa on this thread?

    Regards,
    Dann.
  • haddockhaddock Member Posts: 849 Maven
    According to the RapidMIner  source distribution  libSVM is ...
    Copyright (c) 2000-2005 Chih-Chung Chang and Chih-Jen Lin
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.

    2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.

    3. Neither name of copyright holders nor the names of its contributors
    may be used to endorse or promote products derived from this software
    without specific prior written permission.


    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR
    CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
    EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
    PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
    LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
    NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
    SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.



    And the help for the operator points to them, so I figured that their guide to setting parameters for their library might be relevant.




  • DannDann Member Posts: 5 Contributor II
    Hi Haddock,

    I am aware that the LibSVM library is written by Chang and Lin. But actually, the heuristic parameter seems to be set by RapidMiner, not by LibSVM. At least this was the impression I had by reading Ingo's answer. Sorry for not making this clear. Thanks for the reply, by the way.

    Regards,
    Dann
  • haddockhaddock Member Posts: 849 Maven
    Hi,

    Reading that thread myself I found it unclear, but figured that whetever RapidMiner does it must depend on the properties of the underlying functions. If you search this forum you'll see that even the meaning of 'C' in SVMs gets vague, so zero 'C' ....

    That's why I like Association Rules for booleans!
  • DannDann Member Posts: 5 Contributor II
    Hi Haddock,

    Thanks for the observation. In either case, I am starting to thing that this heuristic arises from the Wahba, Lin, and Zhang bounds on the leave-one-out cross-validation error. A paper by Olivier Chappele [1] states that the Wahba, Lin, and Zhang bound is such that T = sum( a_i k'(x_i, x_i ) ), in which k' is assumed to be an augmented kernel function with the parameter C embedded within it so that k'(x_i, x_i) = k(x_i, x_i) + 1/c. All sums are from 1 to n, in which n is the number of samples in the data set.

    Now, if we go on and substitute k' with the aforementioned relation, I guess we can arrive on

    T = sum( a_i k(x_i, x_i) ) + sum(a_i 1/c).

    Since we would like to find the minimum of T, I suppose a straightforward strategy would be to assume that T is dependent on the Lagrange multipliers a_i, derive in respect to a and impose stationarity. This would lead to:

    del T / del a = sum( k(x_i, x_i) ) + sum (1/c) = sum( k(x_i, x_i) ) + n/c = 0

    And with a little algebra this can result in expression for C very similar to the heuristic suggestion as C = n / sum(K(x_i, x_i), albeit with the sign for C inverted. Something else may still be missing.

    Cheers,
    Dann

    [1] http://research.microsoft.com/en-us/um/people/manik/projects/trade-off/papers/ChapelleML02.pdf
Sign In or Register to comment.