"How to specify/load/force a linear regression model?"

anthill Member Posts: 1 Contributor I
edited May 2019 in Help
Hi there, great product, really enjoying learning how to use RapidMiner.

For my first project, I'm trying to assess an existing model (made several years ago) with a new dataset, and see how its performance is holding up.

However, I can't figure out how to load a linear regression model! I can export models generated in RapidMiner to .MOD / XML files and re-load them, but how do I manually enter coefficients and intercepts? I tried kludging my own .XML file by hand, but RapidMiner won't load it.

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    Actually, you can't do this directly right now. The only way to create your own linear regression model is to use the Execute Script operator. Here's an example process that shows the trick:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.10" expanded="true" name="Process">
        <process expanded="true" height="314" width="748">
          <operator activated="true" class="generate_data" compatibility="5.0.8" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
          <operator activated="true" class="multiply" compatibility="5.0.8" expanded="true" height="94" name="Multiply" width="90" x="179" y="30"/>
          <operator activated="true" class="execute_script" compatibility="5.0.8" expanded="true" height="76" name="Execute Script" width="90" x="380" y="30">
            <parameter key="script" value="import com.rapidminer.operator.learner.functions.LinearRegressionModel;&#9;&#9;&#13;&#10;&#13;&#10;ExampleSet exampleSet = input[0];&#10;&#9;&#9;int numberOfAttributes = exampleSet.getAttributes().size();&#10;&#9;&#9;boolean[] attributeSelection = new boolean[numberOfAttributes];&#10;&#9;&#9;Arrays.fill(attributeSelection, true);&#10;&#9;&#9;// data just used for information. If not available could be left zero&#10;&#9;&#9;double[] standardErrors = new double[numberOfAttributes];&#10;&#9;&#9;double[] standardizedCoefficients = new double[numberOfAttributes];&#10;&#9;&#9;double[] tStatistics = new double[numberOfAttributes];&#10;&#9;&#9;double[] pValues = new double[numberOfAttributes];&#10;&#9;&#9;&#10;&#9;&#9;// data for calculating results&#10;&#9;&#9;double[] coefficients = new double[numberOfAttributes + 1];&#10;&#9;&#9;// entering all coefficients you want to set &gt; 0. Last coefficient is bias.&#10;&#9;&#9;coefficients[0] = 1;&#10;&#9;&#9;// bias&#10;&#9;&#9;coefficients[coefficients.length - 1] = 5;&#10;&#9;&#9;&#10;&#9;&#9;// class names might be null if regression task is performed&#10;&#9;&#9;String firstClassName = null;&#10;&#9;&#9;String secondClassName = null;&#10;&#9;&#9;&#10; &#9;&#9;return(new LinearRegressionModel(exampleSet, attributeSelection, coefficients, standardErrors, standardizedCoefficients, tStatistics, pValues, true, firstClassName, secondClassName));&#9;&#9;&#10;"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.0.8" expanded="true" height="76" name="Apply Model" width="90" x="447" y="165">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Execute Script" to_port="input 1"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Execute Script" from_port="output 1" to_op="Apply Model" to_port="model"/>
          <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Greetings,
      Sebastian
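
To sanity-check hand-entered coefficients before wiring them into the script, it can help to reproduce the prediction outside RapidMiner. The following is a minimal standalone sketch, not RapidMiner code: the class name `ManualLinearModel` and the method `predict` are hypothetical, and it assumes the model computes the usual weighted sum of attribute values plus a bias, with the coefficient array laid out as in the script above (one entry per attribute, bias last).

```java
public class ManualLinearModel {

    /**
     * Computes a linear regression prediction for one row.
     * Assumed layout (mirrors the script above): coefficients[0..n-1] are the
     * attribute weights and coefficients[n] is the bias (intercept).
     */
    static double predict(double[] coefficients, double[] row) {
        if (coefficients.length != row.length + 1) {
            throw new IllegalArgumentException(
                "Expected " + (row.length + 1) + " coefficients (weights + bias), got "
                + coefficients.length);
        }
        // Start from the bias, then add weight * value for each attribute.
        double prediction = coefficients[coefficients.length - 1];
        for (int i = 0; i < row.length; i++) {
            prediction += coefficients[i] * row[i];
        }
        return prediction;
    }

    public static void main(String[] args) {
        // Same values as the script: weight 1 on the first attribute, bias 5.
        // The attribute count (3) is an arbitrary choice for this illustration.
        double[] coefficients = new double[3 + 1];
        coefficients[0] = 1.0;
        coefficients[coefficients.length - 1] = 5.0;

        double[] row = {2.0, 7.0, -3.0};
        System.out.println(predict(coefficients, row)); // prints 7.0
    }
}
```

With these values only the first attribute contributes, so the result is 1 * 2.0 + 5 = 7.0. Comparing a few such hand-computed values against the output of Apply Model is a quick way to confirm the coefficients ended up in the intended positions.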