Should you normalize dummy coded variables in clustering?

Curious Member Posts: 12 Newbie
edited June 2019 in Help
Can you keep them as dummies and only normalize numeric variables?

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,749  RM Founder
    Hi,
    I would say this depends on the normalization. If you normalize the other attributes to the range between 0 and 1, you can keep the dummy variables as they are, since they already take only the values 0 and 1. Otherwise I would personally normalize all columns the same way (e.g. with a z-transformation).
    Hope this helps,
    Ingo
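To illustrate the two options Ingo describes, here is a minimal sketch in plain Python. The column names (`income`, `is_member`) and the toy values are hypothetical, and the helper functions are not RapidMiner operators; they only mirror what a range transformation and a z-transformation do.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical toy data: one numeric attribute and one dummy-coded attribute.
income = [20000.0, 35000.0, 50000.0, 80000.0]
is_member = [0, 1, 1, 0]

# Option 1: range-normalize the numeric column only.
# The dummy column already lies in [0, 1], so it is left untouched.
income_01 = min_max(income)
option_1 = list(zip(income_01, is_member))

# Option 2: z-transform every column, the dummy column included.
option_2 = list(zip(z_score(income), z_score(is_member)))
```

In option 1 both columns span at most one unit, so neither dominates a Euclidean distance purely because of its scale. In option 2 the dummy column is rescaled like any other attribute, so a rare category ends up further from the mean than a common one.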
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,509  RM Data Scientist
    Hi,
    I usually run a PCA after dummy coding to get rid of the problem.
    Best,
    Martin 
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
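A rough sketch of what running a PCA on dummy-coded data looks like, assuming NumPy is available. The matrix `X` is made-up data (one standardized numeric column followed by three dummy columns), and `pca_scores` is a hypothetical helper written for this illustration, not a RapidMiner operator.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return X_centered @ Vt[:n_components].T, explained[:n_components]

# Hypothetical data: a standardized numeric column plus three dummy columns.
X = np.array([
    [-1.2, 1, 0, 0],
    [-0.4, 0, 1, 0],
    [ 0.3, 0, 0, 1],
    [ 1.3, 1, 0, 0],
    [ 0.0, 0, 1, 0],
])

# The clustering algorithm would then be run on `scores` instead of on X.
scores, explained = pca_scores(X, n_components=2)
```

The resulting components are continuous and mutually uncorrelated, so the clustering step no longer sees a mix of binary and numeric columns.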
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,497   Unicorn
    @mschmitz but doesn't that get rid of your underlying attributes as well and replace them with synthetic principal components? That's probably not helpful for clustering, or at least it wouldn't have been for most of the clustering projects I have worked on.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,509  RM Data Scientist
    @Telcontar120,
    I later join the original data back to the clustering results and start interpreting from there.

    BR,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
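The join-back step Martin mentions could look roughly like the sketch below. It assumes every example carries an `id` that survives the clustering pipeline; the rows, the cluster labels, and the helper names (`join_clusters`, `profile`) are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical original rows, before dummy coding and PCA.
original = [
    {"id": 1, "income": 20000.0, "segment": "A"},
    {"id": 2, "income": 35000.0, "segment": "B"},
    {"id": 3, "income": 50000.0, "segment": "B"},
    {"id": 4, "income": 80000.0, "segment": "A"},
]

# Hypothetical cluster labels produced by clustering on the transformed data.
cluster_of = {1: 0, 2: 1, 3: 1, 4: 0}

def join_clusters(rows, labels, key="id"):
    """Attach each row's cluster label to a copy of the original row."""
    return [{**row, "cluster": labels[row[key]]} for row in rows]

def profile(rows, numeric_field):
    """Average an original attribute within each cluster."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["cluster"]].append(row[numeric_field])
    return {cluster: mean(values) for cluster, values in groups.items()}

joined = join_clusters(original, cluster_of)
income_by_cluster = profile(joined, "income")
```

Because the labels are attached to the untransformed rows, the clusters can be described in terms of the original attributes even though the clustering itself ran on synthetic components.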