"Performance assessment"
Hi everyone,
I am trying to find a good predictor for my target variable: the annual total electricity consumption of a household, in kWh.
In the dataset I am using, the values of this label variable range from 48 to 72145 kWh.
What do you think would be a good RMSE or relative error in this case?
PS: I also tried filtering out the examples with an annual electricity consumption below 3000 kWh (the estimated electricity for the basic needs of a house), but I still got the same RMSE and relative error values, about 5000 and 50 percent respectively.
It would be very helpful to get an answer, as I cannot move forward in my work.
Thank you in advance!
Best Answer
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi @islem_h,
"What do you think would be a good RMSE or Relative Error in this case"
Generally, this question is very difficult to answer, and without your dataset it is (almost) impossible.
I will give you my personal point of view:
I think that "everything is relative" (as Albert Einstein said).
There is no good or bad model in absolute terms, only relatively good or relatively bad models.
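To make "relative" concrete, one option is to compare a model's error with that of a naive baseline that always predicts the mean consumption. The sketch below is only an illustration in Python: the numbers are made up, the function names are hypothetical, and the relative error is assumed to be the average of |actual - predicted| / |actual|.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / |actual| (assumes no actual value is 0)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical annual consumption values in kWh (illustrative only)
actual = [48.0, 2500.0, 4000.0, 6500.0, 12000.0]
model_pred = [900.0, 3000.0, 4200.0, 6000.0, 10500.0]

# Naive baseline: always predict the mean of the actual values
mean_value = sum(actual) / len(actual)
baseline_pred = [mean_value] * len(actual)

model_rmse = rmse(actual, model_pred)
baseline_rmse = rmse(actual, baseline_pred)
model_rel = mean_relative_error(actual, model_pred)

print(f"model RMSE:    {model_rmse:.1f}")
print(f"baseline RMSE: {baseline_rmse:.1f}")
print(f"model relative error: {model_rel:.0%}")
```

In this toy data the model's RMSE is well below the baseline's, yet the relative error is far above 100 percent, because the single 48 kWh household contributes almost all of it. With a label that ranges from 48 to 72145 kWh, a similar effect may inflate the relative error, so it is worth reading it alongside a baseline rather than on its own.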
First, the preprocessing step (data preparation) is important and has a significant influence on the final performance of your model(s).
For example:
- have you removed the correlated attributes?
- have you performed a feature selection? A feature generation?
- how do you manage missing values? etc.
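Two of the checks above (missing values and correlated attributes) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not what RapidMiner does internally: the attribute names, the values, the mean imputation, and the 0.95 correlation threshold are all hypothetical choices.

```python
import math

def impute_with_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def pearson(x, y):
    """Pearson correlation of two equally long numeric lists (non-constant columns only)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.95):
    """Keep a column only if its |correlation| with every already kept column is below threshold."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

# Hypothetical household attributes (names and values are illustrative only)
data = {
    "floor_area_m2": [60.0, 85.0, None, 120.0, 200.0],
    "floor_area_sqft": [645.8, 914.9, 1076.4, 1291.7, 2152.8],  # near-duplicate of the above
    "n_occupants": [1.0, 4.0, 2.0, 2.0, 3.0],
}
cleaned = {name: impute_with_mean(col) for name, col in data.items()}
selected = drop_correlated(cleaned)
print(list(selected))
```

Here the square-foot column is dropped because it carries almost the same information as the square-metre column, while the number of occupants is kept.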
Then, I think it would be interesting to train and evaluate several models to get a general idea of the performance achievable on your dataset.
Then you can select the best models and improve their performance by optimizing their parameters.
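As a rough illustration of comparing candidate models, the sketch below scores a mean baseline and a one-attribute linear regression with k-fold cross-validation and keeps the one with the lower RMSE. The data, the two candidate models, and the function names are assumptions made for the example. Parameter optimization is not shown, but it would follow the same pattern: loop over parameter values and keep the best cross-validated score.

```python
import math

def fit_mean(xs, ys):
    """Baseline: ignore x and always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def fit_line(xs, ys):
    """Ordinary least squares with a single attribute: y = a + b * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

def cross_validated_rmse(fit, xs, ys, k=3):
    """k-fold cross-validation: train on k-1 folds, test on the held-out fold."""
    squared_errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared_errors += [(ys[i] - model(xs[i])) ** 2 for i in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical data: floor area (m2) vs annual consumption (kWh), illustrative only
area = [50.0, 70.0, 90.0, 110.0, 130.0, 150.0, 170.0, 190.0, 210.0]
kwh = [2100.0, 2900.0, 3500.0, 4600.0, 5000.0, 6100.0, 6600.0, 7700.0, 8200.0]

candidates = {"mean baseline": fit_mean, "linear regression": fit_line}
scores = {name: cross_validated_rmse(fit, area, kwh) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "->", best)
```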
A good starting point is to submit your dataset to RapidMiner's Auto Model tool, which performs all the steps described above automatically.
In parallel, you can share your dataset and your process. I think there are people in this community (including myself) who will be happy to help you.
Regards,
Lionel
Answers
Indeed, Auto Model helped guide me toward what good RMSE values could be. Thanks for that tip!
I also found another performance measure in RapidMiner that suits my prediction task: the Normalized Absolute Error. It is unit-independent; however, I'm not sure how to interpret its values.
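One common definition, which is an assumption here and should be checked against the RapidMiner operator documentation, is the model's total absolute error divided by the total absolute error of a predictor that always outputs the mean of the label. Under that assumption, a minimal Python sketch with made-up numbers looks like this:

```python
def normalized_absolute_error(actual, predicted):
    """Sum of absolute errors divided by the sum of absolute errors of a mean predictor."""
    mean = sum(actual) / len(actual)
    model_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    baseline_error = sum(abs(a - mean) for a in actual)
    return model_error / baseline_error

# Hypothetical consumption values in kWh (illustrative only)
actual = [2000.0, 4000.0, 6000.0, 8000.0]
predicted = [2500.0, 3500.0, 6500.0, 7500.0]

nae = normalized_absolute_error(actual, predicted)
print(nae)
```

With this definition, a value of 0 means perfect predictions, a value near 1 means the model is no better than always predicting the mean, and a value above 1 means it is worse than that baseline.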