
Continuous and categorical mixed features

YinYin Member Posts: 17 Contributor II
edited July 2022 in Help
What function can I use to perform PCA on a dataset with mixed continuous and categorical features in RapidMiner? I applied PCA to the standardized continuous features and left the categorical variables dummy-encoded, without PCA. Is there a dimensionality reduction method (e.g. FAMD) that can be used on such a dataset? Thanks in advance.
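For readers following along outside RapidMiner, the workflow described in the question can be sketched in Python with NumPy. This is only an illustration under assumptions: the toy data and the helper names (`standardize`, `pca_scores`, `dummy_encode`) are hypothetical and are not RapidMiner operators.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca_scores(X, n_components):
    """Project centered data onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def dummy_encode(values):
    """Build one indicator column per category, in sorted category order."""
    categories = sorted(set(values))
    return np.array([[1.0 if v == c else 0.0 for c in categories] for v in values])

# Hypothetical toy data: two continuous columns and one categorical column.
continuous = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 15.0], [4.0, 19.0], [5.0, 20.0]])
category = ["a", "b", "a", "c", "b"]

# PCA is applied to the standardized continuous block only.
scores = pca_scores(standardize(continuous), n_components=1)

# The categorical column is dummy-encoded and appended without any reduction.
dummies = dummy_encode(category)
features = np.hstack([scores, dummies])
```

Here `features` holds one principal-component column followed by three untouched indicator columns, so only the continuous part of the data is actually reduced.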

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,531 RM Data Scientist
    Solution Accepted
    Hi,
    PCA itself is simply not defined on non-numerical types. Any other solution would not be a PCA.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,531 RM Data Scientist
    Hi,
    PCA is simply not defined on nominal values. You would first need to transform them to numerical values (e.g. using target encoding).

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • YinYin Member Posts: 17 Contributor II
    Hi, my data is already transformed, but I am asking whether the PCA in RapidMiner can handle categorical variables. Generally speaking, it is not advisable to apply PCA to one-hot or dummy-encoded categorical variables. There should be a dedicated method for mixed data, and I'm asking whether one is already integrated here.
  • YinYin Member Posts: 17 Contributor II
    Agreed. I should have said dimensionality reduction. I will resolve this question and start a new one.
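Since the thread ends on dimensionality reduction for mixed data, here is a rough sketch of the FAMD idea mentioned in the question, written in Python with NumPy. It assumes the usual formulation: continuous columns are standardized, each category indicator is divided by the square root of that category's proportion, and an ordinary PCA is run on the combined, centered matrix. The function name `famd_scores` and the toy data are hypothetical, and nothing here claims to reflect what RapidMiner provides.

```python
import numpy as np

def famd_scores(continuous, categorical, n_components=2):
    """FAMD-style scores: PCA on standardized numeric columns plus
    category indicators scaled by 1 / sqrt(category proportion)."""
    X = np.asarray(continuous, dtype=float)
    blocks = [(X - X.mean(axis=0)) / X.std(axis=0)]

    # `categorical` is a list of columns, each a sequence of labels.
    for column in categorical:
        column = np.asarray(column)
        for level in np.unique(column):
            indicator = (column == level).astype(float)
            proportion = indicator.mean()
            blocks.append((indicator / np.sqrt(proportion))[:, None])

    M = np.hstack(blocks)
    M = M - M.mean(axis=0)  # center every column before the SVD

    U, s, _ = np.linalg.svd(M, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained

# Hypothetical toy data: two numeric columns and one nominal column.
numeric = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 15.0], [4.0, 19.0], [5.0, 20.0]])
color = ["red", "blue", "red", "green", "blue"]

scores, explained = famd_scores(numeric, [color], n_components=2)
```

The rescaling is what distinguishes this from running PCA directly on dummy-encoded columns: it down-weights frequent categories so that each nominal variable contributes on a scale comparable to a standardized numeric one.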