RapidMiner Equivalent for SPSS Two Step Clustering
Is there a clustering algorithm in RapidMiner similar to SPSS's Two Step Clustering? I used it a few times in school, and it was very good at automatically clustering large datasets with mixed data types. I have wanted something like it on previous projects, and my current dataset combines three different sources:
- survey responses (nominal)
- user transaction/redemption amount ($) and count (real, integer)
- census data
I've tried cluster analysis with k-means, k-medoids, etc. using Mixed Euclidean Distance, but I often have to repeat the manual variable selection and change the number of clusters k several times before I get clusters that look distinct from one another. Two Step Clustering takes care of variable selection automatically, along with a whole lot of other work in the background, so I only have to clean and prepare my final dataset. Basically, I'd like to avoid going back and forth so much and save time by having an algorithm find the best clusters it can.
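To make the "back and forth" concrete, here is a minimal Python sketch of the loop I keep running by hand: try several values of k and keep the one whose clusters look most distinct. It is not Two Step Clustering and not a RapidMiner operator. The Gower-style mixed distance, the simple k-medoids routine, the silhouette score as the measure of "distinct", and the toy rows are all my own assumptions for illustration, and it does not cover automatic variable selection.

```python
def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance: numeric columns use a range-scaled absolute
    difference, every other column counts as a 0/1 mismatch."""
    total = 0.0
    for col, (x, y) in enumerate(zip(a, b)):
        if col in numeric_ranges:
            span = numeric_ranges[col]
            total += abs(x - y) / span if span else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(a)


def k_medoids(dist, k, max_iter=50):
    """Plain alternating k-medoids on a precomputed distance matrix."""
    n = len(dist)
    # Deterministic start: most central point, then farthest-first picks.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    labels = []
    for _ in range(max_iter):
        # Assign every row to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties out
                updated.append(medoids[c])
                continue
            updated.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return labels


def silhouette(dist, labels):
    """Mean silhouette width; higher means more clearly separated clusters."""
    n = len(labels)
    clusters = set(labels)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n


def best_k(rows, numeric_cols, k_values):
    """Try each k and return the one with the highest silhouette score."""
    ranges = {c: max(r[c] for r in rows) - min(r[c] for r in rows) for c in numeric_cols}
    dist = [[mixed_distance(a, b, ranges) for b in rows] for a in rows]
    results = {k: silhouette(dist, k_medoids(dist, k)) for k in k_values}
    return max(results, key=results.get), results


# Hypothetical rows: (survey answer, redemption amount, redemption count)
rows = [
    ("yes", 10.0, 1), ("yes", 12.0, 2), ("yes", 11.0, 1),
    ("no", 95.0, 9), ("no", 100.0, 10), ("no", 98.0, 9),
]
k, scores = best_k(rows, numeric_cols=[1, 2], k_values=range(2, 5))
print(k, scores)
```

On this toy data the search settles on the two obvious groups. What I'm after is something that does this kind of search for me inside RapidMiner, ideally including the variable selection that this sketch leaves out.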