Some Questions around Clustering

GudiGudi Member Posts: 1 Learner I
Dear all,

Since I'm neither a mathematician nor a computer scientist, the answers to the following questions might be easy for you, but they don't make sense to me at the moment. So I would kindly ask for your support with the following:

My goal is to run a clustering on the "customer-personality-analysis" data set.
I want to find out whether "education" (5 different types are available) has an impact on the "amount sweet products". To do that, I want to cluster (k-means) the amounts of sweets purchased and afterwards look at the education types behind those amounts.
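
To make the plan above concrete, here is a minimal sketch of the same idea outside RapidMiner, in plain Python: cluster the sweets amounts with a simple one-dimensional k-means, then count how often each education type appears in each cluster. Everything in it is an assumption for illustration only: the toy rows, the education labels "A" to "E", the choice of k = 3, and the helper names `kmeans_1d` and `assign` are all made up and do not come from the actual data set or from RapidMiner's operator.

```python
from collections import Counter, defaultdict

def assign(values, centroids):
    """Return, for each value, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            for v in values]

def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd-style k-means on a list of numbers (hypothetical helper)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [float(ordered[(2 * i + 1) * len(ordered) // (2 * k)])
                 for i in range(k)]
    for _ in range(iterations):
        labels = assign(values, centroids)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:  # nothing moved, so the clustering is stable
            break
        centroids = updated
    return centroids, assign(values, centroids)

# Made-up toy rows: (education type, amount spent on sweet products).
rows = [("A", 2), ("A", 5), ("B", 40), ("B", 45), ("C", 3),
        ("C", 120), ("D", 110), ("D", 48), ("E", 4)]

amounts = [amount for _, amount in rows]
centroids, labels = kmeans_1d(amounts, k=3)

# Cross-tabulate: how many customers of each education type fall in each cluster?
education_by_cluster = defaultdict(Counter)
for (education, _), label in zip(rows, labels):
    education_by_cluster[label][education] += 1

for label in sorted(education_by_cluster):
    print(f"cluster {label}: centroid={centroids[label]:.1f}, "
          f"education counts={dict(education_by_cluster[label])}")
```

In this sketch the clustering only ever sees the amounts; education is brought in afterwards as a count per cluster, which mirrors the two-step approach described above. Whether that is the best way to test for an effect of education is a separate question.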


  • In the screenshot below you can see 5 different clusters. How do I understand their meaning?
  • Will I have to compare 5 different charts, since I have 5 different types of education? (The chart looks the same when I choose a different education type for "color column"; only the colors change.)
  • I have more than 2000 rows. Is it necessary to reduce the data to n=1000 to get a better and more precise result?
  • What does the "Cluster Model" mean? I would have expected 5 different clusters in which the number of items rises (e.g. cluster 1 = 5 items, cluster 2 = 349 items, cluster 3 = 500 items).


Thank you for your help!

Kind regards
Gudi 