Model with dummy variables has 100% accuracy?

happy_neid Member Posts: 10 Contributor I
edited June 2019 in Help

I made a classification model using logistic regression. At first, I used a data set that has some nominal variables. Since my task says that I should convert nominal variables to numeric ones, I used dummy coding in the Nominal to Numerical operator to do that. Then I saved that file and built a model just like in the pictures, but I get 100% accuracy every time, so something is not right. Before I did the dummy coding, accuracy was 82% with the same model.

[Attached images: Model.PNG (Process); cross validation.PNG (Cross Validation)]

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,055  RM Data Scientist

    Hi,

     

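    You most likely overtrained. Try putting Nominal to Numerical inside the Cross Validation operator and use Group Models to carry the preprocessing model over to the testing side together with the learner. Keep in mind that cross-validation only validates what is inside it.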

     

    Whenever you derive a transformation from the whole data set, you technically need to do it INSIDE the cross-validation. This includes replacing missing values with averages and normalization, but also dummy coding. That said, it is rare for the effect to be this extreme.
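
Outside of RapidMiner, the same idea can be sketched in a few lines of plain Python. This is only an illustration, not the RapidMiner process from the screenshots: the helper names (`k_fold_indices`, `fit_dummy_coding`, `apply_dummy_coding`) and the toy `colors` column are hypothetical. The point it shows is that the dummy-coding categories are learned from the training fold alone and then applied unchanged to the test fold.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_dummy_coding(values):
    """Learn the list of categories from the given (training) values only."""
    return sorted(set(values))


def apply_dummy_coding(values, categories):
    """Build one 0/1 column per learned category.

    A value that was not seen during fitting becomes an all-zero row.
    """
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical nominal column, used only for illustration.
colors = ["red", "green", "red", "blue", "green", "red"]

for train_idx, test_idx in k_fold_indices(len(colors), k=3):
    train_values = [colors[i] for i in train_idx]
    test_values = [colors[i] for i in test_idx]

    # The transformation is derived INSIDE the fold, from training data only.
    categories = fit_dummy_coding(train_values)
    x_train = apply_dummy_coding(train_values, categories)
    x_test = apply_dummy_coding(test_values, categories)
    # A learner would be trained on x_train and scored on x_test here.
```

Fitting the coding once on the full column before the loop would let information from the test rows influence the preprocessing, which is the kind of leakage described above.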

     

    Best,

    Martin

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany