I need help with predicting an attribute

ramos213 Member Posts: 8 Newbie
Hello all,

I have a dataset that contains yearly gas use for 2010-2018, temperature for 2010-2018, the provinces of the country that I want to analyze, and wind speed. What I want to do now is predict what the gas usage will be in the future. Looking at the dataset, it is clear that gas usage is decreasing year by year and that there is a correlation between temperature and gas usage. I tried to build a decision tree, but it won't work for some reason.

Can someone help me with predicting the gas usage?

Thanks in advance

Best Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted
    Hi @ramos213,

    Have you tried connecting the in port to the input port of the Generate Macro operator?

    If the issue is still present after making this connection, please share your data and your process so that we can reproduce and understand your issue.

    Regards,

    Lionel
  • ramos213 Member Posts: 8 Newbie
    Solution Accepted
    It worked, thanks a lot, guys!

Answers

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @ramos213

    It seems you are trying to forecast future gas usage. If you don't have any unlabelled future data to score, then you need time series forecasting. Here is a link that explains how time series analysis works.

    https://rapidminer.com/resource/time-series-analysis/

    Do let us know if this helps. If not, please describe clearly what your data looks like, whether you have any unlabelled data to predict, and how you are building your models.
    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing

  • ramos213 Member Posts: 8 Newbie
    Hey Varunm1,

    I watched the video and it is what I need, but I keep getting an error when I try to forecast. I did everything exactly as the video shows, but for some reason the macro doesn't give input to the cross-validation.

  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @ramos213

    The input of cross-validation is an "example set". I guess you are connecting the wrong input to the cross-validation operator.
    Regards,
    Varun

  • ramos213 Member Posts: 8 Newbie
    Hey,

    I had indeed forgotten to connect it, but now I have another problem.
  • varunm1 Moderator, Member Posts: 1,207 Unicorn
    Hello @ramos213

    Can you provide us with the data and your process? You can download your process by going to FILE --> Export Process and then attach it here in the thread along with the dataset.
    Regards,
    Varun

  • ramos213 Member Posts: 8 Newbie
    Oh, this is what I have now
  • ramos213 Member Posts: 8 Newbie
    This is my process
  • ramos213 Member Posts: 8 Newbie
    The dataset "NoordBrabant" is what I used in the process.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    @ramos213

    You have only 9 examples in your initial example set and 4 examples after windowing!
    So you cannot perform a 10-fold cross-validation (there are not enough examples).
    You cannot build a reliable model with so little data: you have to significantly increase the size of your training set.

    Regards,

    Lionel

    PS: if you absolutely want a working process, set the number of folds of the cross-validation to k = 4.
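
The arithmetic behind this limit can be sketched in plain Python. This is a minimal illustration, not the RapidMiner process from the thread: the series values are made up, the function names `make_windows` and `k_fold_indices` are hypothetical, and the window size of 5 with a horizon of 1 is an assumption chosen only because it turns 9 yearly values into the 4 examples mentioned above.

```python
# Made-up yearly values standing in for 2010-2018 (9 examples); not the real data.
series = [100.0, 98.0, 97.0, 95.0, 92.0, 90.0, 89.0, 87.0, 85.0]


def make_windows(values, window_size, horizon=1):
    """Turn a series into (window, target) examples.

    Each window holds `window_size` consecutive values; the target is the
    value `horizon` steps after the end of the window.
    """
    examples = []
    last_start = len(values) - window_size - horizon
    for start in range(last_start + 1):
        window = values[start:start + window_size]
        target = values[start + window_size + horizon - 1]
        examples.append((window, target))
    return examples


def k_fold_indices(n_examples, k):
    """Assign example indices to k folds; fail if a fold would be empty."""
    if k < 2 or k > n_examples:
        raise ValueError(
            f"cannot split {n_examples} examples into {k} folds"
        )
    return [list(range(fold, n_examples, k)) for fold in range(k)]


# Assumed window size: 9 values leave only 4 training examples.
examples = make_windows(series, window_size=5)
print(len(examples))

# 4 folds work (one example per fold); 10 folds cannot be filled.
print(k_fold_indices(len(examples), 4))
try:
    k_fold_indices(len(examples), 10)
except ValueError as error:
    print(error)
```

With k = 4, each fold holds a single example, which makes the process run but says little about how reliable the model is.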




  • ramos213 Member Posts: 8 Newbie
    edited May 2020
    I first tried with this dataset; the one that I just posted covers one of the 3 provinces that I wanted to analyze. So this dataset contains what I just posted, but for some reason I couldn't get it to work with windowing. I think it has to do with the years column.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Yes, indeed, the error is due to the fact that you have duplicate year values in your initial dataset.
    You have 27 examples in your initial dataset for your 3 provinces, so you have 9 examples for each province after splitting by province.
    That's what you did, and you did the right thing, but I have to insist: 9 examples are not enough to build a relevant and reliable model.
    Try to increase the size of your dataset by finding the variable values from before 2010, for example.

    Thank you for your understanding,

    Regards,

    Lionel
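
The duplicate-year problem and the split by province can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the rows are placeholders with a gas value of 0.0, "ProvinceB" and "ProvinceC" are invented names (only "NoordBrabant" appears in the thread), and `has_duplicate_years` and `split_by_province` are hypothetical helpers, not RapidMiner operators.

```python
from collections import defaultdict

# Placeholder rows (province, year, gas_use): 3 provinces x 9 years = 27 examples.
# Only "NoordBrabant" is named in the thread; the other names and all values are invented.
provinces = ("NoordBrabant", "ProvinceB", "ProvinceC")
rows = [(province, year, 0.0) for province in provinces for year in range(2010, 2019)]


def has_duplicate_years(records):
    """Return True if any year occurs more than once in (…, year, …) records."""
    years = [year for _, year, _ in records]
    return len(years) != len(set(years))


def split_by_province(records):
    """Group records into one chronologically sorted series per province."""
    groups = defaultdict(list)
    for province, year, gas_use in records:
        groups[province].append((year, gas_use))
    for series in groups.values():
        series.sort()
    return dict(groups)


# The combined table repeats every year three times, which breaks windowing.
print(len(rows), has_duplicate_years(rows))

# After the split, each province has 9 examples with unique years.
for province, series in split_by_province(rows).items():
    years = [year for year, _ in series]
    print(province, len(series), len(years) == len(set(years)))
```

Splitting removes the duplicate years, but it does not add data: each per-province series still has only 9 examples.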

