Using mod output to isolate variables

iasoniason Member Posts: 20 Contributor II
edited November 2018 in Help
I am not sure I am using the correct terms here, so I will try to be descriptive.

I want to isolate (or control for) selected variables. That is, given a dataset (x,y,z,result), I want to create a model and then plot (x,result) with y and z held fixed. The final plot will be extrapolated to a wider range of x values.
I am at the point of having the model created (I used linear regression for testing purposes). The remaining steps, which I can't figure out how to perform, are creating the appropriate dataset and applying the model to it.
Is there a data generator suitable for that or should I manually create the tables like (1,0,0),(2,0,0),(3,0,0),(4,0,0)... ?
After data is entered, or generated, how do I use the "mod" output to predict the result value?
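
One possible way to do this, sketched in Python with NumPy rather than in RapidMiner: build rows (x, y, z) with y and z held at chosen values, then apply the fitted model to those rows. The training data, the coefficients, and the helper names `make_grid` and `predict` are made up for illustration. In RapidMiner terms, the last step corresponds to connecting the "mod" output and the generated example set to an Apply Model operator.

```python
import numpy as np

# Hypothetical, noise-free training data with columns x, y, z and a target `result`.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
result = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + 3.0

# Fit a linear regression with intercept by ordinary least squares.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, result, rcond=None)


def make_grid(x_values, y_fixed, z_fixed):
    """Return rows (x, y_fixed, z_fixed), one for every x in x_values."""
    x_values = np.asarray(x_values, dtype=float)
    return np.column_stack([
        x_values,
        np.full_like(x_values, y_fixed),
        np.full_like(x_values, z_fixed),
    ])


def predict(grid):
    """Apply the fitted linear model to rows of (x, y, z)."""
    return np.column_stack([grid, np.ones(len(grid))]) @ coef


# Rows (1,0,0), (2,0,0), ..., (20,0,0); the x range may extend past the training data.
grid = make_grid(np.arange(1, 21), y_fixed=0.0, z_fixed=0.0)
predicted = predict(grid)  # plot grid[:, 0] against predicted
```

With a purely linear model, changing the fixed y and z only shifts the curve up or down; the shape of the curve in x changes only if the model contains interaction terms.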

Finally, am I using a totally wrong approach for the task I am trying to achieve? Is there a better way to visualize that kind of dependency than this one?

Thank you all in advance.

Answers

  • wesselwessel Member Posts: 537 Maven
    Hey,

    Can you post a few rows of example data?

    Best regards,

    Wessel
  • iasoniason Member Posts: 20 Contributor II
    Sure. Here are a few lines.
    I mostly need to visualize mes=f(res), for a given set (t,r).
    Physical modeling of the problem says I should expect mes=a*res^2+b*res+c, but this is hard to see in the data because of the effect of t and because the variables differ so much in order of magnitude. The values of a and b are not independent of t and r.
    I thought of getting a number of examples, large enough to have a lot of cases with the same t,r but that seems impossible.

    t;r;res;mes
    264;109,68;0,030;29441,9
    298;95,07;0,198;31200,2
    322;92,27;0,782;41563,1
    476;101,09;0,152;51838,0
    181;109,53;0,454;24379,3
    496;108,89;0,497;67559,6
    246;103,28;0,719;34732,9
    247;101,86;0,946;37258,2
    239;108,7;0,536;33074,8
    33;97,8;0,883;4889,7
    436;104,02;0,420;54985,0
    370;97,12;0,901;52325,9
    155;100,89;0,446;19224,7
    367;94,5;0,914;50789,2
    291;102,38;0,537;37936,9
    147;99,8;0,321;17075,6
    490;104,62;0,254;57837,4
    230;107,42;0,197;27214,5
  • wesselwessel Member Posts: 537 Maven
    Hey,

    I'm sure I'm missing something.
    I generated an attribute res^2 and ran linear regression.
    Afterwards I made a scatter plot.
    I used 0-1 range normalization to make it all fit on one plot.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
        <process expanded="true" height="409" width="840">
          <operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve" width="90" x="153" y="146">
            <parameter key="repository_entry" value="//RS/A"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
            <parameter key="name" value="mes"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="generate_attributes" compatibility="5.1.008" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="30">
            <list key="function_descriptions">
              <parameter key="res^2" value="res^2"/>
            </list>
          </operator>
          <operator activated="true" class="linear_regression" compatibility="5.1.008" expanded="true" height="94" name="Linear Regression" width="90" x="447" y="30">
            <parameter key="feature_selection" value="none"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="581" y="120">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="normalize" compatibility="5.1.008" expanded="true" height="94" name="Normalize" width="90" x="715" y="30">
            <parameter key="include_special_attributes" value="true"/>
            <parameter key="method" value="range transformation"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_op="Linear Regression" to_port="training set"/>
          <connect from_op="Linear Regression" from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_op="Linear Regression" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Apply Model" from_port="model" to_port="result 1"/>
          <connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="162"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
  • iasoniason Member Posts: 20 Contributor II
    Thank you again for replying. Your help is very much appreciated.

    The problem is that with mes=f(res)=a*res^2 + b*res + c the values of a,b,c are not independent of t,r.
    Doing that kind of regression would only account for c=g(t,r).
    What I want to do is find a and b, given the values of t,r.
    To put it in a proper form, the function is:

    mes(res, t, r) = a(t,r)*res^2 + b(t,r)*res + c(t,r)

    The quest is to find a(t,r), b(t,r) and c(t,r).
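
    If a form for the coefficients is assumed, the problem reduces to an ordinary linear regression on randomly collected data. The sketch below assumes (this is not stated anywhere in the thread) that a, b and c are each linear in t and r, for example a(t,r) = a0 + a1*t + a2*r. The function names and the synthetic data are hypothetical and only illustrate the idea.

```python
import numpy as np


def design_matrix(res, t, r):
    """Columns: (res^2, res, 1) each multiplied by (1, t, r), 9 columns in total."""
    res, t, r = (np.asarray(v, dtype=float) for v in (res, t, r))
    basis = np.column_stack([np.ones_like(t), t, r])            # terms inside a, b, c
    powers = np.column_stack([res**2, res, np.ones_like(res)])  # res^2, res, 1
    return np.column_stack([powers[:, [i]] * basis for i in range(3)])


def fit_coefficients(res, t, r, mes):
    """Least-squares fit; rows of the result are a, b, c, columns are (constant, t, r)."""
    theta, *_ = np.linalg.lstsq(design_matrix(res, t, r), mes, rcond=None)
    return theta.reshape(3, 3)


def coefficients_at(theta, t, r):
    """Evaluate a(t,r), b(t,r), c(t,r) for one chosen pair (t, r)."""
    return theta @ np.array([1.0, t, r])


# Synthetic, noise-free data with made-up coefficients, sampled at random (t, r, res).
rng = np.random.default_rng(1)
n = 500
t = rng.uniform(30, 500, n)
r = rng.uniform(90, 110, n)
res = rng.uniform(0, 1, n)
true_theta = np.array([[5.0, 0.02, 0.1],
                       [10.0, 0.5, -0.05],
                       [100.0, 120.0, -2.0]])
a_true, b_true, c_true = true_theta @ np.vstack([np.ones(n), t, r])
mes = a_true * res**2 + b_true * res + c_true

theta = fit_coefficients(res, t, r, mes)

# Curve mes vs res for one fixed pair (t, r); repeat for each pair of interest.
a, b, c = coefficients_at(theta, t=300.0, r=100.0)
res_grid = np.linspace(0.0, 2.0, 50)
curve = a * res_grid**2 + b * res_grid + c
```

    Because the model is fitted on all examples at once, no two examples need to share the same t and r. Whether the linear form is adequate would have to be checked against the real measurements.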
  • wesselwessel Member Posts: 537 Maven
    And what is the form of a(t,r), b(t,r) and c(t,r)?

    This does not seem like a problem suitable for RapidMiner.

    You can use a fuzzy neural network, or a genetic algorithm to solve this problem.
    But you will have to write your own Java code.

    Best regards,

    Wessel
  • iasoniason Member Posts: 20 Contributor II
    Actually the exact form of a(t,r), b(t,r), c(t,r) is not known. But it is not needed.
    A visual representation of mes vs res for 4-5 pairs of (t,r) would be enough.
    Still, is it reasonable to ask for enough data values with the same t and r? Gathering 100 examples for each pair will take around 8 months.
    And then I could train the model 5 times and get the required 5 values for a,b,c.
    I was hoping I could find a workaround and work with randomly collected values but it seems quite difficult.
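
    The per-pair approach described above (collect many examples at the same t and r, then fit one quadratic per pair) could look like the following sketch. It uses `numpy.polyfit`; the helper name `fit_per_pair` and the sample values are hypothetical and only stand in for real measurements.

```python
from collections import defaultdict

import numpy as np


def fit_per_pair(t, r, res, mes, min_points=3):
    """Fit mes = a*res^2 + b*res + c separately for every distinct (t, r) pair."""
    groups = defaultdict(list)
    for ti, ri, x, y in zip(t, r, res, mes):
        groups[(ti, ri)].append((x, y))

    fits = {}
    for pair, points in groups.items():
        if len(points) < min_points:
            continue  # a quadratic needs at least three points
        x, y = np.array(points).T
        fits[pair] = np.polyfit(x, y, deg=2)  # (a, b, c), highest power first
    return fits


# Made-up example: two (t, r) pairs, six noise-free measurements each.
res_values = np.linspace(0.0, 1.0, 6)
true_abc = {(100.0, 95.0): (50.0, 200.0, 1000.0),
            (400.0, 105.0): (80.0, 150.0, 5000.0)}
t, r, res, mes = [], [], [], []
for (ti, ri), (a, b, c) in true_abc.items():
    for x in res_values:
        t.append(ti)
        r.append(ri)
        res.append(x)
        mes.append(a * x**2 + b * x + c)

fits = fit_per_pair(t, r, res, mes)
```

    With noisy measurements, more than the minimum of three points per pair is needed for stable values of a, b and c.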