Need to make a model for economic viability. Not sure how to go ahead with variable selection

Rustyboltcutter Member Posts: 2 Newbie
Hey guys,
So I am trying to build a model to predict the economic viability of a property using the variables below.

I created Listed_Year from the Listed data. Basically, any property that was listed up to the end of 2018 is an old property, and anything listed after that is a new one.
The logic I used to create the economic viability variable is this if rule:

if(listed_year == "New" && overall_satisfaction>3,"Economically viable",if(listed_year == "Old" && overall_satisfaction>3,"Economically viable","not Economically Vaible"))

But when I run this model I get 100% accuracy and a Kappa of 1, which obviously means it is overfitting and not really working at all.
Would really love some input on how to move ahead and actually get this to work.
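
A quick way to see what the if rule actually encodes is to translate it into Python and compare it with a simpler candidate rule. This is only a sketch: it assumes listed_year can only be "New" or "Old", and the rating values in `ratings` are made up for illustration. Under those assumptions both branches of the rule give the same answer, so the label depends on overall_satisfaction alone.

```python
from itertools import product


def original_rule(listed_year, overall_satisfaction):
    """Direct translation of the nested if rule from the post."""
    if listed_year == "New" and overall_satisfaction > 3:
        return "Economically viable"
    if listed_year == "Old" and overall_satisfaction > 3:
        return "Economically viable"
    return "not Economically Vaible"


def simplified_rule(overall_satisfaction):
    """Candidate simplification: the label only looks at the rating."""
    if overall_satisfaction > 3:
        return "Economically viable"
    return "not Economically Vaible"


# Hypothetical rating values, chosen only to exercise both sides of the threshold.
ratings = [1, 2, 2.5, 3, 3.5, 4, 4.5, 5]

# Collect every (year, rating) pair where the two rules disagree.
mismatches = [
    (year, rating)
    for year, rating in product(["New", "Old"], ratings)
    if original_rule(year, rating) != simplified_rule(rating)
]
print(mismatches)
```

If the list of mismatches is empty, any model that is given overall_satisfaction as an input can reproduce the label exactly, which would explain the 100% accuracy.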



    lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @Rustyboltcutter,

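    Indeed, this may be overfitting, or one of your attributes may be perfectly correlated with your label attribute. Since your label is computed from overall_satisfaction, keeping that attribute as an input would let a model simply reproduce your rule.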

    RapidMiner can perform feature selection (and optionally feature generation) automatically. For that, please use
    the operator called Automatic Feature Engineering.

    Another piece of advice I can give you is simply to submit your dataset to RapidMiner's AutoModel. In this case, it is "all inclusive":
    RapidMiner takes care of everything. It first performs a "preliminary" feature selection based on the "quality" of each
    feature, and then performs a feature selection (and optionally a feature generation, depending on your settings), the modelling, and the estimation of each model's performance. At the end of the calculations, RapidMiner presents all the results (the performance of each model).

    Please let me know if you have other questions.

    Hope this helps,


