Continuous and categorical mixed features

YinYin Member Posts: 17 Contributor II
edited July 2022 in Help
What function can I use to perform PCA on a dataset with mixed continuous and categorical features in RapidMiner? I applied PCA to the standardized continuous features and left the categorical variables dummy-encoded, without PCA. Is there a dimensionality reduction method (e.g. FAMD) that can be used on such a dataset? Thanks in advance.

Best Answer

  MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor
    Solution Accepted
    PCA itself is simply not defined on non-numerical types. Any other solution would not be a PCA.

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany


  MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor
    PCA is simply not defined on nominal values. You would need to transform them to numerical values first (e.g. using target encoding).
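
    To make the target-encoding route concrete, here is a minimal sketch in Python with NumPy rather than a RapidMiner process. The helper names `target_encode` and `pca` and the toy data are hypothetical, chosen only for illustration. It assumes a label column is available, and it encodes on the full data, which in a real modelling setup would leak the label unless done inside cross-validation.

```python
import numpy as np

def target_encode(categories, target):
    """Replace each category label with the mean target value of that category."""
    categories = np.asarray(categories)
    target = np.asarray(target, dtype=float)
    means = {c: target[categories == c].mean() for c in np.unique(categories)}
    return np.array([means[c] for c in categories])

def pca(X, n_components):
    """Plain PCA via SVD on standardized columns; returns the component scores."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

# Hypothetical toy data: two continuous columns, one nominal column, one label.
age = np.array([23.0, 35.0, 41.0, 52.0, 29.0, 60.0])
income = np.array([30.0, 52.0, 61.0, 80.0, 45.0, 75.0])
city = np.array(["A", "B", "A", "C", "B", "C"])
label = np.array([0, 1, 0, 1, 1, 1])

# The nominal column becomes numeric, so PCA is defined on the whole table.
city_encoded = target_encode(city, label)
X = np.column_stack([age, income, city_encoded])
scores = pca(X, n_components=2)
print(scores.shape)
```

    After encoding, every column is numeric, so an ordinary PCA can be applied to all of them together.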

  • YinYin Member Posts: 17 Contributor II
    Hi, my data is transformed, but I am asking whether the PCA in RapidMiner can handle categorical variables. Generally, it is not good practice to apply PCA to one-hot or dummy-encoded categorical variables. There should be a specific implementation that deals with mixed data, and I'm asking whether one is already integrated here.
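
    For reference, this is roughly what a mixed-data method such as FAMD does, sketched in Python with NumPy. It is not a RapidMiner operator, and the function name `famd` and the toy data are hypothetical. The sketch assumes the usual formulation: continuous columns are z-scored, each category indicator is divided by the square root of its proportion and then centered, and a PCA (via SVD) is run on the combined matrix.

```python
import numpy as np

def famd(continuous, categorical, n_components=2):
    """Sketch of FAMD: PCA on z-scored continuous columns plus
    indicator columns scaled by 1/sqrt(category proportion)."""
    continuous = np.asarray(continuous, dtype=float)
    # Standardize continuous columns, as in ordinary PCA.
    Z = (continuous - continuous.mean(axis=0)) / continuous.std(axis=0)
    blocks = [Z]
    for col in categorical:  # each entry is a 1-D array of labels
        col = np.asarray(col)
        levels = np.unique(col)
        indicators = (col[:, None] == levels[None, :]).astype(float)
        proportions = indicators.mean(axis=0)
        # Rare categories get a larger weight; then center each column.
        scaled = indicators / np.sqrt(proportions)
        blocks.append(scaled - scaled.mean(axis=0))
    M = np.hstack(blocks)
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Hypothetical toy data: two continuous columns and one nominal column.
continuous = np.array([[23.0, 30.0], [35.0, 52.0], [41.0, 61.0],
                       [52.0, 80.0], [29.0, 45.0], [60.0, 75.0]])
city = np.array(["A", "B", "A", "C", "B", "C"])

scores, explained = famd(continuous, [city], n_components=2)
print(scores.shape, explained)
```

    The scaling is what distinguishes this from running PCA directly on dummy columns: it balances the influence of continuous and categorical variables instead of letting the encoding decide it.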
  • YinYin Member Posts: 17 Contributor II
    Agreed. I should have said dimensionality reduction. I will resolve this question and start a new one.
