"Performance assessment"

islem_h · January 2019

Hi everyone,

I am trying to find a good predictor for my target variable: the annual total electricity consumption of a household in kWh.
In the dataset I am using, the values of this label variable vary between 48 and 72145 kWh.
What do you think would be a good RMSE or Relative Error in this case?

PS: I also tried filtering out the examples with annual electricity consumption values smaller than 3000 kWh (the estimated electricity for the basic needs of a house), but I still got the same RMSE and RE values of about 5000 and 50 percent.

It would be very helpful to get an answer, as I cannot move forward in my work.

Thank you in advance!

lionelderkrikor · January 2019

Hi @islem_h,

"What do you think would be a good RMSE or Relative Error in this case"
Generally, it's very difficult to answer this question, and without your dataset it is (almost) impossible to answer...
I will give you my personal point of view:
I think that "everything is relative..." (as Albert Einstein said).
There is no good or bad model in absolute terms, only relatively good or relatively bad model(s).
First, the preprocessing step (data preparation) is an important step which has a significant influence on the final performance of your model(s).
For example:
- have you removed the correlated attributes?
- have you performed a feature selection? a feature generation?
- how do you manage missing values? etc.
Then, I think it would be interesting for you to train and evaluate some models to get a general idea of the possible performance on your dataset.
Then you can select the best models and improve their performance by optimizing their parameters...
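
One way to make "relatively good" concrete is to compare a model's RMSE with the RMSE of a trivial predictor that always outputs the mean of the label. The sketch below is only an illustration in plain Python, not a RapidMiner process: the consumption values and predictions are made up, and the helper names `rmse` and `baseline_rmse` are hypothetical. In a real evaluation the baseline mean would normally come from the training set rather than from the test labels.

```python
import math
from statistics import mean


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def baseline_rmse(actual):
    """RMSE of a trivial model that always predicts the mean of the labels."""
    avg = mean(actual)
    return rmse(actual, [avg] * len(actual))


# Hypothetical annual consumption values in kWh (made up for illustration).
actual = [2500.0, 4000.0, 5500.0, 12000.0, 3000.0]
# Hypothetical predictions from some candidate model.
predicted = [3000.0, 4200.0, 5000.0, 9000.0, 3500.0]

model_error = rmse(actual, predicted)
trivial_error = baseline_rmse(actual)

# A ratio below 1 means the model beats the "always predict the mean" baseline.
print(f"model RMSE:    {model_error:.1f} kWh")
print(f"baseline RMSE: {trivial_error:.1f} kWh")
print(f"ratio:         {model_error / trivial_error:.2f}")
```

An RMSE of 5000 kWh says little on its own; it becomes informative once it is set against what the trivial baseline achieves on the same examples.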

A good starting point is to submit your dataset to the Auto Model tool of RapidMiner. With this tool all the steps described above are performed automatically.
In parallel, you can share your dataset and your process. I think there will be, here in this community, people (including myself) happy to help you...

Regards,

Lionel

islem_h · January 2019

Thanks a lot, Lionel, for the detailed and quick answer.

Indeed the Auto Model helped in guiding me to what could be good values of RMSE. Thanks for that tip!
I also found another performance measure in RapidMiner that suits my prediction task: the Normalized Absolute Error. It is unit-independent; however, I'm not sure how to interpret its values.
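
For readers with the same question, here is a minimal sketch of how such a measure can be computed and read. It assumes the usual definition, which is also how I understand RapidMiner's description: the total absolute error of the model divided by the total absolute error made if the average label had been predicted for every example. The function name `normalized_absolute_error` and the data are hypothetical and only for illustration.

```python
from statistics import mean


def normalized_absolute_error(actual, predicted):
    """Model's total absolute error divided by that of a mean predictor.

    Assumed definition: sum(|a - p|) / sum(|a - mean(actual)|).
    """
    avg = mean(actual)
    model_abs = sum(abs(a - p) for a, p in zip(actual, predicted))
    baseline_abs = sum(abs(a - avg) for a in actual)
    return model_abs / baseline_abs


# Hypothetical annual consumption values in kWh (made up for illustration).
actual = [2500.0, 4000.0, 5500.0, 12000.0, 3000.0]
predicted = [3000.0, 4200.0, 5000.0, 9000.0, 3500.0]

nae = normalized_absolute_error(actual, predicted)
print(f"normalized absolute error: {nae:.3f}")

# Predicting the mean for every example gives exactly 1 by construction.
mean_only = [mean(actual)] * len(actual)
print(normalized_absolute_error(actual, mean_only))
```

Under this definition, 0 means perfect predictions, values below 1 mean the model does better than always predicting the average, and values of 1 or more mean it does no better than that trivial baseline.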
