Polynomial regression: Excel vs. RapidMiner

Jens_Fleckenste Member Posts: 7 Contributor I
edited November 2018 in Help

Hi All,

 

I am trying to understand polynomial regression in RapidMiner.

I know it from Excel, where it works like this:

For example, take the polynomial
y = 1*x^3+2*x^2+3*x+20


xValues:1,2,3,4,5,6,7,8,9,10
yValues:26,42,74,128,210,326,482,684,938,1250

This is the training set.

The test set uses the same x values
xValues_New:1,2,3,4,5,6,7,8,9,10

and should therefore return the same predictions for y.

In Excel, the function
=TREND(yValues;xValues^{1.2.3};xValues_New^{1.2.3})
returns the expected values 26, 42, 74, 128, ...

The function =RGP(yValues;xValues^{1.2.3}) (RGP is the German-locale name of LINEST)
returns the coefficients of the polynomial:
1,2,3,20
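As a cross-check that does not depend on Excel or RapidMiner, the same least-squares fit can be sketched in plain Python. This is only an illustration of what TREND and RGP compute for this data (a linear regression on x^3, x^2, x and a constant), not a description of how either tool is implemented. The helper names `fit_polynomial` and `predict` are made up for this sketch.

```python
from fractions import Fraction

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (exact arithmetic).

    Returns coefficients ordered from the highest power down to the constant,
    the same order as the Excel output quoted above.
    """
    n = degree + 1
    # Design matrix rows: [x^degree, ..., x^1, x^0]
    rows = [[Fraction(x) ** p for p in range(degree, -1, -1)] for x in xs]
    # Normal equations (A^T A) c = A^T y, stored as an augmented matrix
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * Fraction(y) for r, y in zip(rows, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with a simple nonzero-pivot search
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def predict(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

x_values = list(range(1, 11))
y_values = [26, 42, 74, 128, 210, 326, 482, 684, 938, 1250]

coeffs = fit_polynomial(x_values, y_values, degree=3)
predictions = [predict(coeffs, x) for x in x_values]

print([int(c) for c in coeffs])       # [1, 2, 3, 20]
print([int(p) for p in predictions])  # reproduces y_values
```

Because the data lies exactly on a cubic, the fit recovers the coefficients 1, 2, 3, 20 and the predictions match the training values, which is the result a closed-form least-squares solver should return.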

 

Is it possible to rebuild this in RapidMiner and get the same results?

I set up the process below, but it gives completely different results. Can someone explain what is wrong?

By the way, the same approach works with a linear or quadratic function. (Is a third-degree function too much for RapidMiner?)

 


<?xml version="1.0" encoding="UTF-8"?><process version="7.3.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="generate_data" compatibility="7.3.001" expanded="true" height="68" name="Generate Data" width="90" x="45" y="85">
        <parameter key="target_function" value="random"/>
        <parameter key="number_examples" value="10"/>
        <parameter key="number_of_attributes" value="1"/>
        <parameter key="attributes_lower_bound" value="-10.0"/>
        <parameter key="attributes_upper_bound" value="10.0"/>
        <parameter key="gaussian_standard_deviation" value="10.0"/>
        <parameter key="largest_radius" value="10.0"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
        <parameter key="datamanagement" value="double_array"/>
      </operator>
      <operator activated="true" class="generate_id" compatibility="7.3.001" expanded="true" height="82" name="Generate ID" width="90" x="45" y="187">
        <parameter key="create_nominal_ids" value="false"/>
        <parameter key="offset" value="0"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="7.3.001" expanded="true" height="82" name="Generate Attributes" width="90" x="45" y="289">
        <list key="function_descriptions">
          <parameter key="X" value="id"/>
          <parameter key="Y" value="id^3+2*id^2+3*id+20"/>
        </list>
        <parameter key="keep_all" value="true"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="7.3.001" expanded="true" height="82" name="Set Role" width="90" x="45" y="391">
        <parameter key="attribute_name" value="Y"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="7.3.001" expanded="true" height="82" name="Select Attributes" width="90" x="45" y="493">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attribute" value=""/>
        <parameter key="attributes" value="X|Y"/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="false"/>
        <parameter key="include_special_attributes" value="false"/>
      </operator>
      <operator activated="true" class="polynomial_regression" compatibility="7.3.001" expanded="true" height="82" name="Polynomial Regression" width="90" x="313" y="391">
        <parameter key="max_iterations" value="500000"/>
        <parameter key="replication_factor" value="3"/>
        <parameter key="max_degree" value="3"/>
        <parameter key="min_coefficient" value="-100.0"/>
        <parameter key="max_coefficient" value="100.0"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="7.3.001" expanded="true" height="82" name="Apply Model" width="90" x="581" y="391">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
      <connect from_op="Generate ID" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Polynomial Regression" to_port="training set"/>
      <connect from_op="Polynomial Regression" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Polynomial Regression" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>


Best Answer

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    I think the operator you're looking for is the Local Polynomial Regression operator.
