Continuous and categorical mixed features

YinYin Member Posts: 17 Contributor II
edited July 2022 in Help
What function can i use to perform PCA on a dataset with mixed continuous and mixed features in rapidminer? I applied PCA on continuous features that have been standardized and left the categorical variables with dummy encoding only without PCA. Is there a dimensionality reduction method (e.g.FAMD) that can be used on such dataset? Thanks in advance.

Best Answer

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Solution Accepted
    Hi,
    PCA itself is simply not defined on non-numerical types. Any other solution would not be a PCA.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany

Answers

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,503 RM Data Scientist
    Hi,
    PCA is simply not defined on nominal values. You would need to transform it first to numericals (i.e. using Target encoding).

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • YinYin Member Posts: 17 Contributor II
    Hi, my data is transformed, but i am asking if the PCA in rapid miner can handle categorical variables. Typically speaking, it is not good to use PCA on one-hot encoded variables or categorical dummy encoded ones. There should be a specific function implementation that deals with mixed data and i'm asking if this is already integrated here.
  • YinYin Member Posts: 17 Contributor II
    Agreed. I should have said dimentionality reduction.  I will resolve this Q and start a new one.
Sign In or Register to comment.