
rloess in RapidMiner

onep Member Posts: 20 Maven
edited November 2018 in Help

Hi.

 

I'm trying to build a predictive regression model and am currently looking at my raw data. The signal contains a lot of noise, which means it shows a lot of variance over short time periods. My goal is to split the data into 10-minute windows, calculate the mean, variance and linear regression coefficient for each window, and then use these windows in my model.
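To make the windowing idea concrete, here is a minimal sketch in plain Python, written outside RapidMiner. It assumes timestamps in seconds and a 600-second window; the function name `window_features` and the choice of population variance are illustrative assumptions, not something taken from this thread.

```python
from statistics import fmean, pvariance


def window_features(times, values, width=600.0):
    """Group (time, value) samples into fixed windows of `width` seconds and
    return one (window_start, mean, variance, slope) tuple per non-empty window."""
    buckets = {}
    for t, v in zip(times, values):
        buckets.setdefault(int(t // width), []).append((t, v))

    rows = []
    for key in sorted(buckets):
        ts = [t for t, _ in buckets[key]]
        vs = [v for _, v in buckets[key]]
        t_mean, v_mean = fmean(ts), fmean(vs)
        # Ordinary least-squares slope of value against time inside the window.
        sxx = sum((t - t_mean) ** 2 for t in ts)
        sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, vs))
        slope = sxy / sxx if sxx > 0 else 0.0
        rows.append((key * width, v_mean, pvariance(vs), slope))
    return rows


# Hypothetical data: one sample per minute for 20 minutes.
times = [60 * i for i in range(20)]
values = [0.5 * t for t in times]
for row in window_features(times, values):
    print(row)
```

Because the variance is computed on whatever values are passed in, running this on smoothed values instead of raw ones is what would reduce the noise contribution.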

 

But if I use the noisy data, the variance will be larger than it actually is. That makes me think I need to fit a smooth line to the noisy data before I use it? I've seen another project do this with the rloess technique, but this is not implemented in RapidMiner?

 

Instead I've tried "Moving average", "Exponential smoothing" and "Fit trend" with a "Local Polynomial Regression" inside. Moving average and Exponential smoothing both produce a fitted line, but it's not possible to down-weight the outliers like you can in rloess. Fit trend takes too long to process because of the amount of data.
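For reference, the kind of robust local regression that rloess refers to can be sketched in plain Python as below. This is only an approximation under stated assumptions: it fits a local line rather than a local quadratic, uses tricube distance weights and bisquare robustness weights, and the name `robust_loess` and its parameters are hypothetical. The brute-force neighbour search is slow on long series, so it would probably need to be applied per window or replaced by a library routine (statsmodels, for example, ships a `lowess` function with robustifying iterations).

```python
from statistics import median


def robust_loess(x, y, frac=0.5, iters=3):
    """Local linear smoothing with tricube distance weights and bisquare
    robustness weights. Returns one fitted value per input point."""
    n = len(x)
    k = max(3, min(n, round(frac * n)))  # neighbours used in each local fit
    robust = [1.0] * n                   # robustness weights, updated per pass
    fitted = list(y)
    for step in range(iters + 1):
        for i in range(n):
            # Brute-force nearest neighbours: simple, but slow on long series.
            idx = sorted(range(n), key=lambda j: abs(x[j] - x[i]))[:k]
            h = abs(x[idx[-1]] - x[i]) or 1.0
            w = [robust[j] * (1.0 - (abs(x[j] - x[i]) / h) ** 3) ** 3 for j in idx]
            sw = sum(w)
            if sw <= 0.0:
                continue  # no usable neighbours: keep the previous estimate
            xm = sum(wj * x[j] for wj, j in zip(w, idx)) / sw
            ym = sum(wj * y[j] for wj, j in zip(w, idx)) / sw
            sxx = sum(wj * (x[j] - xm) ** 2 for wj, j in zip(w, idx))
            sxy = sum(wj * (x[j] - xm) * (y[j] - ym) for wj, j in zip(w, idx))
            slope = sxy / sxx if sxx > 0.0 else 0.0
            fitted[i] = ym + slope * (x[i] - xm)
        if step == iters:
            break
        # Bisquare update: points with large residuals get weight near zero.
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = median([abs(r) for r in resid])
        if s == 0.0:
            break
        robust = [(1.0 - (r / (6.0 * s)) ** 2) ** 2 if abs(r) < 6.0 * s else 0.0
                  for r in resid]
    return fitted
```

Calling it with `iters=0` gives the plain, non-robust local fit, which makes it easy to compare how much a single spike pulls the smoothed line with and without the robustness passes.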

 

 

Does anyone have an idea for the best approach here?

Picture of the noisy data attached.

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist

    Hi Mathias,

     

    Have you thought about using the outlier detection score as a weight in the regression?


    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • onep Member Posts: 20 Maven

    I've tried the "k-NN Global Anomaly Score" but it's taking far too long to process.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,529 RM Data Scientist

    Try one of the clustering-based ones, they are at O(N**2).

     

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
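The suggestion in the answers above, using an outlier score as a weight in the regression, could look roughly like the following sketch. The score here is a simple stand-in (absolute deviation from the median, in MAD units) rather than the output of any RapidMiner outlier operator, and the names `mad_scores` and `weighted_slope` as well as the weight formula `1 / (1 + score)` are assumptions made for illustration.

```python
from statistics import median


def mad_scores(values):
    """Stand-in outlier score: absolute deviation from the median, in MAD units."""
    med = median(values)
    mad = median([abs(v - med) for v in values]) or 1.0
    return [abs(v - med) / mad for v in values]


def weighted_slope(times, values, weights):
    """Weighted least-squares slope of values against times."""
    sw = sum(weights)
    tm = sum(w * t for w, t in zip(weights, times)) / sw
    vm = sum(w * v for w, v in zip(weights, values)) / sw
    sxx = sum(w * (t - tm) ** 2 for w, t in zip(weights, times))
    sxy = sum(w * (t - tm) * (v - vm) for w, t, v in zip(weights, times, values))
    return sxy / sxx


# Hypothetical window: a clean trend of 0.5 per step with one spike.
t = list(range(20))
v = [0.5 * ti for ti in t]
v[10] += 50

plain = weighted_slope(t, v, [1.0] * len(t))
weights = [1.0 / (1.0 + s) for s in mad_scores(v)]  # high score -> low weight
robust = weighted_slope(t, v, weights)
print(plain, robust)
```

Any other score, for example one exported from a clustering-based outlier operator, could be plugged in the same way as long as higher scores are mapped to lower weights.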