RapidMiner

XVPrediction and LinearRegression

Contributor II

Hello,

I'm having a problem with the XVPrediction and LinearRegression operators. What I want to achieve is to learn a linear regression model and then apply the model to some test data. Initially, I use cross-validation to evaluate my approach, and since I am interested in the actual predictions I am using XVPrediction rather than XValidation. While I can see a new attribute "Prediction(...)" in the resulting data set, all values in this column are "unknown".

My process looks like this:

<?xml version="1.0" encoding="MacRoman"?>
<process version="4.3">

  <operator name="Root" class="Process" expanded="yes">
      <operator name="CSVExampleSource" class="CSVExampleSource">
          <parameter key="filename" value="/Users/bfranke/HPCA/rapidminer/ARM.csv"/>
          <parameter key="id_name" value="Filename"/>
          <parameter key="label_name" value="CNT_cycles"/>
          <parameter key="trim_lines" value="true"/>
      </operator>
      <operator name="FeatureNameFilter" class="FeatureNameFilter">
          <parameter key="skip_features_with_name" value="CNT_alloc_OSMs|CNT_retired_OSMs"/>
      </operator>
      <operator name="Numerical2Real" class="Numerical2Real">
      </operator>
      <operator name="XVPrediction" class="XVPrediction" expanded="yes">
          <parameter key="leave_one_out" value="true"/>
          <parameter key="sampling_type" value="linear sampling"/>
          <operator name="LinearRegression" class="LinearRegression">
              <parameter key="feature_selection" value="greedy"/>
              <parameter key="keep_example_set" value="true"/>
          </operator>
          <operator name="ModelApplier" class="ModelApplier">
              <list key="application_parameters">
              </list>
              <parameter key="create_view" value="true"/>
              <parameter key="keep_model" value="true"/>
          </operator>
      </operator>
      <operator name="ResultWriter" class="ResultWriter">
          <parameter key="result_file" value="/Users/bfranke/HPCA/rapidminer/results.res"/>
      </operator>
  </operator>

</process>


I have a data set with 293 examples and 45 attributes, of which one is the textual/nominal ID and another one the numerical label. All other attributes are also numerical; I filter out two of them using the FeatureNameFilter. Because some of the attributes are identified as int and others as real, I also convert all regular attributes with the Numerical2Real operator so that they share the same real type. Up to this point everything seems to be OK (I set a breakpoint and inspected the data). Hence, the problem seems to be related to XVPrediction and LinearRegression.
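To make clear what I expect from XVPrediction with leave_one_out enabled, here is a minimal sketch in Python. It is not RapidMiner's implementation, only an illustration of the idea: each example is predicted by a model trained on all the other examples, so every example should end up with a numeric prediction rather than "unknown". The function names, the single feature, and the toy data are all made up for illustration and are not taken from ARM.csv.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def leave_one_out_predictions(xs, ys):
    """Predict each example from a model trained on all other examples."""
    predictions = []
    for i in range(len(xs)):
        # Hold out example i, train on the rest.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predictions.append(intercept + slope * xs[i])
    return predictions


# Hypothetical toy data (assumption): the label follows y = 1 + 2x exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

preds = leave_one_out_predictions(xs, ys)
for x, y, p in zip(xs, ys, preds):
    print(f"x={x} label={y} prediction={p:.3f}")
```

In this sketch the output has one prediction per input example, which is the behaviour I would expect for the "Prediction(...)" column.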

Checking the older posts in this forum, I haven't found any known issues with either XVPrediction or LinearRegression (the same problem shows up with PolynomialRegression and also GPLearner), so I guess there's something wrong with my process. I've already experimented with explicit feature selection to select fewer attributes, but this didn't solve the problem. Any ideas?

Thanks!

Cheers,

  Bjoern
7 REPLIES
Elite

Re: XVPrediction and LinearRegression

Hi Bjoern,
it seems to me there is a bug in XVPrediction. I will check that and keep you informed.

Greetings,
  Sebastian
Old World Computing - Establishing the Future

Check out the Jackhammer Extension for RapidMiner! Crunch more data easier and with up to 700% speed up! Available only here

Elite

Re: XVPrediction and LinearRegression

Hi,
I have found the bug, but you need to check out the latest developer version from CVS to get a bug-free version.

Greetings,
  Sebastian

Regular Contributor

Re: XVPrediction and LinearRegression

Hello

Do you still use the bug tracker? I'm just asking to avoid spamming ;)

regards,

Steffen
Elite

Re: XVPrediction and LinearRegression

Hi Steffen,
I do my very best to pay appropriate attention to every channel through which bug reports can arrive. But sometimes it's simply one channel too many to keep track of, and the emails had disappeared among some SF spam...
Thanks for reminding me :)

Greetings,
  Sebastian

Contributor II

Re: XVPrediction and LinearRegression

Hello Sebastian,

land wrote:

I have found the bug, but you need to check out the latest developer version from CVS to get a bug-free version.


Many thanks for your prompt help! Now I have another question about how to get access to this latest version. When I check out the code using anonymous CVS, I get an older version of RapidMiner, 4.2. The developer CVS access via SSH does not seem to work for me (password not recognised, permission denied). Do I need to become an "official" developer to check out the latest version, and if so, how do I do this? Thanks.

Cheers,

  Björn
Elite

Re: XVPrediction and LinearRegression

Hi,
simply switch to the developer branch called Zaniah.

Greetings,
  Sebastian

Contributor II

Re: XVPrediction and LinearRegression

Ok, thanks!

Cheers,

  Bjoern