# Clustering Dummy Variables

mario_sark, Member, Posts: **13**, Contributor I
Dear all,

I am working on segmenting a list of customers into clusters based on several variables, but some of these variables are dummy variables. Below is the list of variables that I will use for clustering:

Unpaid: Yes/No (dummy)

Deposit: continuous (some customers have zero deposits)

Term Deposits: continuous (some customers have zero term deposits)

Number of Returned Checks: discrete (some customers have zero)

Insurance Product: discrete (some customers have zero); this can be transformed into Yes/No

Credit Card Spending: continuous (some customers have zero since they don't hold credit cards)

Number of Products (Loans): the number of car loans, personal loans, housing loans, ... (some customers have zero)

What is the best algorithm in RapidMiner that I can use to cluster these customers into segments and highlight the less profitable group?

As far as I know, **K-means** can handle only **continuous variables**, and **I am afraid to normalize the dummy variables in the data set**. I hope you can help with this!

Thank you in advance,

**Mario**
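
To make the normalization concern above concrete, here is a minimal Python sketch (not a RapidMiner process). It assumes one possible approach: min-max scale only the continuous columns to [0, 1] so they share the range of the 0/1 dummy, leave the dummy untouched, then run a plain k-means. The customer rows, the column order, and the helper names `min_max_scale` and `k_means` are all made up for illustration; whether this weighting of the dummy is appropriate for the real data is an open question.

```python
import random

def min_max_scale(rows, columns):
    """Rescale the given column indices to [0, 1]; other columns stay untouched."""
    scaled = [list(r) for r in rows]
    for c in columns:
        values = [r[c] for r in rows]
        lo, hi = min(values), max(values)
        span = hi - lo
        for r in scaled:
            r[c] = (r[c] - lo) / span if span else 0.0
    return scaled

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(r, centroids[j]))
            for r in rows
        ]
        # Update step: mean of each cluster (keep the old centroid if empty).
        new_centroids = []
        for j in range(k):
            members = [r for r, label in zip(rows, labels) if label == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers: [unpaid (0/1), deposit, credit_card_spending]
customers = [
    [1, 0.0, 150.0],
    [1, 200.0, 0.0],
    [0, 52000.0, 3100.0],
    [0, 48000.0, 2900.0],
    [1, 100.0, 90.0],
    [0, 61000.0, 4000.0],
]

# Column 0 is already 0/1, so only the continuous columns are rescaled.
scaled = min_max_scale(customers, columns=[1, 2])
labels, centroids = k_means(scaled, k=2)
print(labels)
```

Without the scaling step, the deposit column (tens of thousands) would dominate the Euclidean distance and the dummy would have almost no influence. With it, a 0/1 difference counts as much as the full range of a continuous variable, which may or may not be the weighting wanted here.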

## Answers

Thank you for your reply. The list of customers that I am going to cluster contains around 70,000 customers.

I was wondering if there is any algorithm other than K-means. I am also looking forward to reading about other possibilities.

Thank you,

Mario