"Subspace Clustering on Binary Attributes."

adjo81 Member Posts: 2 Contributor I
edited June 2019 in Help
Hello All,

I am a beginner in data mining and new to the topic of subspace clustering. I have a sample dataset in which each observation is a purchase order and each column is a binary attribute (1/0) describing a customization of the same type of product.

The objective is to find out whether there are any clusters in this data. One approach is to use PCA to convert the binary attributes to numerical scores and use these as input to k-means iterations.

However, I wanted to check whether hierarchical clustering helps on this data. I used the Jaccard dissimilarity metric and then a dendrogram to identify the clusters. There seems to be no clear structure in the data, with the dendrogram containing only a few isolated clusters. This analysis was done in base R.
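
For reference, the Jaccard step can be sketched as follows. This is a minimal illustration in Python rather than the base R used for the actual analysis; the four toy orders, the function names, the single-linkage rule, and the 0.5 cut-off are all assumptions made for the example, not details of the real dataset or workflow.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Return 1 - |a AND b| / |a OR b| for two equal-length 0/1 rows."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two all-zero rows have no set attribute; treat them as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def pairwise_jaccard(rows):
    """Build the symmetric dissimilarity matrix for all rows."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = jaccard_dissimilarity(rows[i], rows[j])
    return dist


def single_linkage(dist, threshold):
    """Merge the two closest clusters until the gap exceeds threshold."""
    clusters = [{i} for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            d = min(dist[i][j] for i in clusters[p] for j in clusters[q])
            if best is None or d < best[0]:
                best = (d, p, q)
        d, p, q = best
        if d > threshold:
            break
        clusters[p] |= clusters[q]
        del clusters[q]  # q > p, so index p stays valid
    return [sorted(c) for c in clusters]


# Hypothetical purchase orders: rows are orders, columns are options.
orders = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
dist = pairwise_jaccard(orders)
clusters = single_linkage(dist, threshold=0.5)
print(clusters)
```

Jaccard ignores positions where both orders have 0, so two orders count as similar only when they share selected options, which is usually the intended behaviour for sparse customization flags.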

Later I came to know about subspace clustering. I am currently trying out a run in RapidMiner using the subspace plugins, specifically the CLIQUE algorithm. However, it has been running for over an hour and has produced no results yet. I have set the tau and xi parameters to 0.1 and 2 respectively, which seem to be correct given the nature of the dataset.
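
To make the two parameters concrete: in CLIQUE, xi is the number of intervals per dimension and tau is the density threshold, so with xi = 2 each binary attribute splits into a "0" unit and a "1" unit, and a unit is dense when more than a fraction tau of the rows fall into it. The sketch below is a simplified, Apriori-style illustration of that bottom-up search in Python, not the plugin's implementation; the toy rows and the function name are assumptions. It only counts dense units per subspace dimensionality, which is enough to see how a low tau lets more candidates survive at each level.

```python
from itertools import combinations


def dense_units(rows, tau):
    """Bottom-up search for dense units on 0/1 data, CLIQUE-style.

    A unit is a frozenset of (attribute_index, value) pairs. It is dense
    when more than a fraction tau of the rows match every pair.
    Returns a list of sets: dense units of dimensionality 1, 2, 3, ...
    """
    n, d = len(rows), len(rows[0])

    def is_dense(unit):
        hits = sum(1 for r in rows if all(r[a] == v for a, v in unit))
        return hits / n > tau

    # Level 1: with xi = 2, every attribute contributes two units.
    level = {frozenset([(a, v)]) for a in range(d) for v in (0, 1)}
    level = {u for u in level if is_dense(u)}
    per_level = []
    while level:
        per_level.append(level)
        candidates = set()
        for u, w in combinations(level, 2):
            merged = u | w
            one_larger = len(merged) == len(u) + 1
            distinct_attrs = len({a for a, _ in merged}) == len(merged)
            if not (one_larger and distinct_attrs):
                continue
            # Apriori pruning: every lower-dimensional face must be dense.
            if all(merged - {pair} in level for pair in merged):
                candidates.add(merged)
        level = {c for c in candidates if is_dense(c)}
    return per_level


# Hypothetical toy data: 4 orders, 3 binary options.
rows = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
strict = [len(units) for units in dense_units(rows, tau=0.4)]
loose = [len(units) for units in dense_units(rows, tau=0.1)]
print(strict, loose)
```

On real data with many attributes, the number of surviving units can grow combinatorially with the subspace dimensionality when tau is low, which is one plausible reason for a run that does not finish; trying a larger tau or fewer attributes first is a cheap way to check.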

I would welcome comments/suggestions on improving the above situation. I am not sure what the output of CLIQUE looks like in RapidMiner, so I would also appreciate some leads on that.

Best Wishes,
Aditya.

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,055  RM Data Scientist
    Hi,

    Where did you find CLIQUE? I just googled around a bit and found this extension: http://www-ai.cs.uni-dortmund.de/SOFTWARE/SUBSPACE_CLUSTERING/index.html which is also new to me.

    Kind of interesting stuff going on.

    ~Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • adjo81 Member Posts: 2 Contributor I
    Hello Martin,

    I found this at the following link: http://dme.rwth-aachen.de/en/OpenSubspace

    It is available as a plugin for RapidMiner and Weka as well. Some preprocessing was required to encode the independent variables/attributes of a purchase order as binary. Then I ran the CLIQUE algorithm, which ran for over an hour before I had to stop it abruptly.

    Later on I also ran PROCLUS on the same data, but I was not able to interpret the results. To my surprise, the records assigned to the same cluster by PROCLUS do not all have the same attributes.

    Do let me know if you or someone else would be able to help me in this case.
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,055  RM Data Scientist
    Sorry - I have never heard of this extension before. I will definitely check it out later on.


    ~Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany