"Unsupervised Cluster Analysis with RapidMiner"

shaihulud Member Posts: 20 Contributor II
edited May 2019 in Help

I have an instance pool with a couple of attributes, and I want to classify the instances. Which are the "common" unsupervised cluster analysis methods that I can use with RapidMiner? I cannot provide the actual outcome for the samples.

On a side note: can somebody recommend readily comprehensible literature on the topic of unsupervised cluster analysis?



  • el_chief Member Posts: 63 Contributor II

    FYI, classification and clustering are different things. Classification is predicting a discrete class label for each observation, while clustering is grouping similar observations into a number of groups.

    What is your data like and what are you trying to accomplish?

    RapidMiner has 9 clustering methods, but the common ones are k-means (good for huge data sets), agglomerative, expectation maximization, and DBSCAN.

    The book Multivariate Data Analysis by Joseph Hair is the most understandable.

    Good luck

  • shaihulud Member Posts: 20 Contributor II

    Well, it is like this:

    I want to read data from CSV files. Each line represents an instance with a name and a couple of attributes. The attribute values are mostly strings, and they can be arbitrary. I need to find a way to identify some representatives for each "group" of instances I have in the data. I cannot examine all the instances because I have tens of thousands of them in my files, so I need to narrow it down as best I can.
    Many of them have similar or equal attribute values, so I can put them into a group (cluster) and just choose one of the instances from each group as a representative. (Again: I do not know the values of the instances or the attributes.) My first thought was to use cluster analysis, but maybe I am wrong?

    I would appreciate any help.

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    No, clustering sounds exactly like something that could help you here. Please post in the Problems & Support board of this forum if you have any questions about how this can actually be achieved.
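
To make the idea from this thread concrete (string-valued attributes, one representative per group), here is a minimal sketch in plain Python rather than RapidMiner. It is not one of RapidMiner's operators and not a method anyone in the thread proposed. It uses greedy "leader" clustering with a simple matching distance (the share of attributes that differ). The sample column names, the `threshold` value, and the helper names `mismatch_fraction`, `leader_clusters`, and `read_rows` are all assumptions made up for illustration. It also assumes every row has the same number of attributes and at least one.

```python
import csv
import io


def mismatch_fraction(a, b):
    """Simple matching distance: share of attribute positions whose values differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def leader_clusters(rows, threshold=0.34):
    """Greedy one-pass ("leader") clustering for string attributes.

    rows: list of (name, attributes) pairs with equal-length attribute tuples.
    A row joins the first cluster whose representative is within `threshold`;
    otherwise it opens a new cluster and becomes that cluster's representative.
    """
    clusters = []
    for name, attrs in rows:
        for cluster in clusters:
            if mismatch_fraction(attrs, cluster["rep_attrs"]) <= threshold:
                cluster["members"].append(name)
                break
        else:
            # No existing representative is close enough: start a new group.
            clusters.append({"rep": name, "rep_attrs": attrs, "members": [name]})
    return clusters


def read_rows(handle):
    """Read a CSV with a header row; first column is the name, the rest are attributes."""
    reader = csv.reader(handle)
    next(reader)  # skip the header line
    return [(row[0], tuple(row[1:])) for row in reader if row]


# Hypothetical sample data standing in for a real CSV file.
SAMPLE = """name,color,shape,material
a1,red,round,wood
a2,red,round,steel
a3,blue,square,steel
a4,blue,square,steel
a5,green,flat,glass
"""

rows = read_rows(io.StringIO(SAMPLE))
clusters = leader_clusters(rows)
for c in clusters:
    print(c["rep"], "represents", c["members"])
```

For a real file, `io.StringIO(SAMPLE)` could be swapped for `open(path, newline="")`. The result depends on row order and on the threshold, so this is only a starting point; k-means, agglomerative clustering, or DBSCAN as mentioned above would need the string attributes encoded or a suitable distance measure first.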
