"Non-linear regression"

Strauss Member Posts: 10 Contributor II
edited May 2019 in Help
Is there an operator for non-linear regression, e.g. polynomial regression? I didn't find anything like that.

Answers

  • haddock Member Posts: 849 Maven
    There is a polynomial kernel available in LibSVMLearner.
  • Strauss Member Posts: 10 Contributor II
    Thank you very much. Is there a code example somewhere that shows how to use it and which parameters have to be chosen?
  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    you can find many examples in the "sample" directory of RapidMiner. There should also be one for a general regression setting. For the polynomial LibSVM, you have to set the type to one of the two "SVR" types, select the kernel type "polynomial", and define an appropriate degree and a value for C. Which parameter values are appropriate can be evaluated with one of the parameter optimization operators (please also refer to the sample directory). Here is a simple setup (the model is applied to the training data; never do this in real life ;-):

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="NoiseGenerator" class="NoiseGenerator">
            <list key="noise">
            </list>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="10000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="degree" value="2"/>
            <parameter key="keep_example_set" value="true"/>
            <parameter key="kernel_type" value="poly"/>
            <parameter key="svm_type" value="epsilon-SVR"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
    </operator>
    However, I would usually prefer an RBF kernel or an (additional) feature construction step (for example with YAGGA2) instead, but if polynomial works for your data this is of course fine.

    Cheers,
    Ingo
  • Strauss Member Posts: 10 Contributor II
    I am having real problems using this regression type. My approach is to load an example set from a database and build a prediction model. But I think it takes too much time (e.g. more than 2 minutes) and the results are not satisfying.

    My operator tree is the following:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_system" value="HSQLDB"/>
            <parameter key="database_url" value="jdbc:hsqldb:file:SnapshotDB"/>
            <parameter key="label_attribute" value="SNAPSHOT"/>
            <parameter key="query" value="SELECT SID, SNAPSHOT FROM snapshots"/>
            <parameter key="username" value="sa"/>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="10000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="kernel_type" value="poly"/>
            <parameter key="svm_type" value="epsilon-SVR"/>
        </operator>
        <operator name="ModelWriter" class="ModelWriter">
            <parameter key="model_file" value="prediction.mod"/>
        </operator>
    </operator>
    I hope someone can help me solve these problems or explain how to calculate this model...
  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    if the runtime is too high, you could try to reduce the value of "C". If the results are not satisfying, I would always try the RBF kernel with an optimized value for gamma / sigma. This often leads to much better fits. Instead of introducing the non-linearity in the learner, you could also construct additional (polynomial) features before learning and simply apply a linear regression scheme afterwards. This is often faster and leads to understandable models.

    Cheers,
    Ingo
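
The feature construction idea from the post above can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions, not the RapidMiner operators themselves: the helper names (`polynomial_features`, `solve_linear_system`, `fit_polynomial`, `predict`) and the toy data are hypothetical. The powers of x are built first, and an ordinary linear least-squares fit is applied afterwards.

```python
def polynomial_features(x, degree):
    """Return [1, x, x^2, ..., x^degree] for a single numeric input."""
    return [x ** d for d in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w


def fit_polynomial(xs, ys, degree):
    """Linear least squares on the constructed features (normal equations)."""
    rows = [polynomial_features(x, degree) for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, x):
    """Apply the fitted linear model to the constructed features of x."""
    features = polynomial_features(x, len(weights) - 1)
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical toy data generated from y = 2x^2 - 3x + 1 (no noise).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x * x - 3.0 * x + 1.0 for x in xs]
weights = fit_polynomial(xs, ys, degree=2)
print([round(w, 6) for w in weights])
```

The fitted weights are directly readable as the coefficients of the polynomial, which is the sense in which such a model stays understandable.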
  • Strauss Member Posts: 10 Contributor II

    ... I would always try the RBF kernel with an optimized value for gamma / sigma.
    It would be great if you could explain to me how to do this... Which operator do I have to choose for this?
  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    here are the basic settings for an RBF SVM:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="attributes_lower_bound" value="-20.0"/>
            <parameter key="attributes_upper_bound" value="15.0"/>
            <parameter key="number_examples" value="300"/>
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="2000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="gamma" value="1.0"/>
            <parameter key="keep_example_set" value="true"/>
            <parameter key="svm_type" value="epsilon-SVR"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
    </operator>

    For the parameter optimization, you could have a look into the sample directory (..._Meta.../...ParameterOptimization.xml).

    Cheers,
    Ingo
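
What "an optimized value for gamma" means can be sketched as a small grid search scored on held-out data. This is not the RapidMiner parameter optimization operator and not an SVR: as a loudly labeled stand-in, the sketch uses a simple RBF-weighted average of the training targets so that it runs with the Python standard library alone. The target function, the noise level, the grid values, and all function names are assumptions chosen for illustration; only the attribute range and the number of examples mirror the process above.

```python
import math
import random


def rbf_weight(a, b, gamma):
    """RBF similarity between two scalar inputs."""
    return math.exp(-gamma * (a - b) ** 2)


def predict(train, x, gamma):
    """RBF-weighted average of training targets (a stand-in for an SVR)."""
    weights = [rbf_weight(xi, x, gamma) for xi, _ in train]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean of the targets.
        return sum(yi for _, yi in train) / len(train)
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / total


def rmse(train, holdout, gamma):
    """Root mean squared error on examples that were not used for fitting."""
    errors = [(predict(train, x, gamma) - y) ** 2 for x, y in holdout]
    return math.sqrt(sum(errors) / len(errors))


# Hypothetical data: one attribute in [-20, 15], 300 examples, noisy sine target.
rng = random.Random(0)
data = []
for _ in range(300):
    x = rng.uniform(-20.0, 15.0)
    data.append((x, math.sin(x / 3.0) + rng.gauss(0.0, 0.1)))
train, holdout = data[:200], data[200:]

# Try each candidate gamma and keep the one with the lowest hold-out error.
grid = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
scores = {g: rmse(train, holdout, g) for g in grid}
best_gamma = min(scores, key=scores.get)
print(best_gamma, round(scores[best_gamma], 3))
```

Scoring on a separate hold-out set rather than on the training data is the point of the earlier warning about never applying the model to the training data in real life.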
  • Strauss Member Posts: 10 Contributor II
    Okay, thank you very much. I think I got it now.

    But could it be that setting the degree of the function has no influence on the result?
  • IngoRM Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    the kernel parameter "degree" is only used for a polynomial kernel; the parameters "sigma" / "gamma" are only used for RBF kernels.

    Cheers,
    Ingo
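
Why "degree" has no effect with an RBF kernel becomes visible when the two kernel functions are written out. The sketch below follows the kernel definitions documented for LIBSVM, polynomial (gamma * u·v + coef0)^degree and RBF exp(-gamma * |u - v|^2); the Python function names and the sample vectors are hypothetical, and the default values in the signatures are chosen for the example rather than taken from RapidMiner.

```python
import math


def poly_kernel(u, v, degree, gamma=1.0, coef0=0.0):
    """Polynomial kernel: (gamma * u.v + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


def rbf_kernel(u, v, gamma):
    """RBF kernel: exp(-gamma * |u - v|^2). There is no degree term at all."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


u = [1.0, 2.0]
v = [0.5, -1.0]

# Changing the degree changes the polynomial kernel value ...
print(poly_kernel(u, v, degree=2), poly_kernel(u, v, degree=3))
# ... while the RBF kernel only reacts to gamma.
print(rbf_kernel(u, v, gamma=0.1), rbf_kernel(u, v, gamma=1.0))
```

So if a process uses the RBF kernel, any "degree" setting is simply never read by the kernel function, which would explain results that do not change.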