linear regression optimization

dkpengqiuyang Member Posts: 21 Contributor I
edited December 2018 in Help


I need to predict the thermal expansion range from temperature, and I have a test dataset. I tried linear regression, but the results are not good. My settings and data are shown below. Can you help me improve the prediction? Thanks.



Best Answer

  • earmijo Member Posts: 270 Unicorn
    Solution Accepted

    The scatter plot of the two variables suggests that other important predictors are missing from this equation. The relationship is also non-linear, so you could try methods other than plain vanilla linear regression.


    Investigate the physics of the process further. I know absolutely nothing about it, but Wikipedia tells me pressure is another important variable.


    Screen Shot 2017-06-05 at 9.07.39 AM.png
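
One way to check the non-linearity point is to compare a straight-line fit with a low-degree polynomial fit. The sketch below is only an illustration: it uses NumPy rather than RapidMiner, and the data, the variable names (`temperature`, `expansion`) and the helper `rmse_of_polyfit` are made up for the example, not taken from the attached dataset.

```python
import numpy as np

# Synthetic stand-in data (NOT the poster's dataset): a response that
# curves slightly with temperature, plus a little noise.
rng = np.random.default_rng(0)
temperature = np.linspace(20.0, 80.0, 50)
expansion = (0.5 * temperature
             + 0.02 * temperature**2
             + rng.normal(0.0, 1.0, temperature.size))


def rmse_of_polyfit(x, y, degree):
    """Least-squares polynomial fit of the given degree; returns training RMSE."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sqrt(np.mean(residuals**2)))


rmse_linear = rmse_of_polyfit(temperature, expansion, 1)
rmse_quadratic = rmse_of_polyfit(temperature, expansion, 2)
print(f"linear RMSE:    {rmse_linear:.3f}")
print(f"quadratic RMSE: {rmse_quadratic:.3f}")
```

Because the quadratic model contains the linear one, its training error can never be higher, so a held-out set is still needed to judge whether the extra term really helps.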


  • binsetyawan Member Posts: 46 Guru

    You can use the Optimize Parameters (Grid) operator to find the best parameters for your dataset.
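
For readers unfamiliar with the idea, a grid search simply tries each candidate parameter value, scores the resulting model on held-out data, and keeps the best one. The following is a rough sketch in Python with NumPy, not the RapidMiner operator itself. The data are synthetic, the grid values are arbitrary, and `fit_ridge` and `rmse` are hypothetical helpers written for this example. The tuned parameter is a ridge penalty, chosen because Linear Regression exposes a `ridge` setting.

```python
import numpy as np

# Synthetic stand-in data (NOT the poster's dataset).
rng = np.random.default_rng(1)
x = rng.uniform(20.0, 80.0, size=120)
y = 0.8 * x + 5.0 + rng.normal(0.0, 2.0, size=x.size)

# Simple holdout split: first 80 rows for training, the rest for validation.
x_train, y_train = x[:80], y[:80]
x_val, y_val = x[80:], y[80:]


def fit_ridge(x, y, ridge):
    """Closed-form ridge fit for one feature with an unpenalised intercept."""
    x_mean, y_mean = x.mean(), y.mean()
    xc = x - x_mean
    slope = float(xc @ (y - y_mean) / (xc @ xc + ridge))
    intercept = float(y_mean - slope * x_mean)
    return slope, intercept


def rmse(y_true, y_pred):
    """Root mean squared error between two arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Arbitrary grid of ridge values; fit with each and score on validation data.
ridge_grid = [1e-8, 1e-4, 1e-2, 1.0, 100.0]
results = []
for ridge in ridge_grid:
    slope, intercept = fit_ridge(x_train, y_train, ridge)
    results.append((rmse(y_val, slope * x_val + intercept), ridge))

best_rmse, best_ridge = min(results)
print(f"best ridge = {best_ridge}, validation RMSE = {best_rmse:.3f}")
```

With a single input attribute the ridge value is unlikely to change much, so tuning alone may not close a large gap in fit quality.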

  • dkpengqiuyang Member Posts: 21 Contributor I

    My friend did the same job with MATLAB and the result fits the test data well, but I cannot get the same score with RapidMiner. I am still confused about this...

    1.png 11.7K
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Can you post your RapidMiner process? Maybe we can help troubleshoot. 

  • dkpengqiuyang Member Posts: 21 Contributor I


    You can see the RapidMiner process and operator settings in the attachment above; the original data is there too.

    I can get output similar to MATLAB's when I change the input attribute from "tempreture" to "tempreture change". In other words, y=kx+b does not work but y=k(x-x1)+b works well in RapidMiner, while y=kx+b works well in MATLAB.

    I am confused about this.
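
For reference, in ordinary least squares with an intercept, shifting the input by a constant should not change the fitted line: k(x-x1)+b is the same line as kx+(b-k*x1), so only the intercept moves. The sketch below checks this numerically with NumPy. It is an illustration on synthetic data, not a reproduction of the RapidMiner or MATLAB runs, and the reference value `x1` is picked arbitrarily.

```python
import numpy as np

# Synthetic stand-in data (NOT the poster's dataset).
rng = np.random.default_rng(2)
temperature = rng.uniform(20.0, 80.0, size=60)
expansion = 1.2 * temperature + 3.0 + rng.normal(0.0, 0.5, size=temperature.size)

x1 = temperature[0]                    # arbitrary reference temperature
temperature_change = temperature - x1  # plays the role of "tempreture change"

# np.polyfit with degree 1 returns [slope, intercept].
k_raw, b_raw = np.polyfit(temperature, expansion, 1)
k_shift, b_shift = np.polyfit(temperature_change, expansion, 1)

pred_raw = k_raw * temperature + b_raw
pred_shift = k_shift * temperature_change + b_shift

print("same slope:", np.isclose(k_raw, k_shift))
print("same predictions:", np.allclose(pred_raw, pred_shift))
print("intercept shifted by k * x1:", np.isclose(b_shift, b_raw + k_raw * x1))
```

If a tool gives clearly different fits for the raw and the shifted attribute, that suggests something other than plain least squares with an intercept is involved, for example a disabled intercept, feature selection, regularisation, or a preprocessing step. Which of these applies here cannot be determined from the posts alone.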

  • yyhuang Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 364 RM Data Scientist

    @dkpengqiuyang did you try eliminating collinear features?



    The process is attached here.


    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.000">
    <operator activated="true" class="retrieve" compatibility="7.6.000" expanded="true" height="68" name="Retrieve regression" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//RM YY Local Repository/AAA-PROSPECT/data/regression"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="7.6.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
    <parameter key="attribute_name" value="thermal expand"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="linear_regression" compatibility="7.6.000" expanded="true" height="103" name="Linear Regression" width="90" x="380" y="34">
    <parameter key="feature_selection" value="M5 prime"/>
    <parameter key="alpha" value="0.05"/>
    <parameter key="max_iterations" value="10"/>
    <parameter key="forward_alpha" value="0.05"/>
    <parameter key="backward_alpha" value="0.05"/>
    <parameter key="eliminate_colinear_features" value="false"/>
    <parameter key="min_tolerance" value="0.05"/>
    <parameter key="use_bias" value="true"/>
    <parameter key="ridge" value="1.0E-8"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.6.000" expanded="true" height="82" name="Apply Model" width="90" x="514" y="34">
    <list key="application_parameters"/>
    <parameter key="create_view" value="false"/>
    </operator>
    </process>

