Clustering of SOM

Geoffb Member Posts: 2 Contributor I
edited November 2018 in Help
I was wondering whether it is possible to cluster the results of a SOM, an approach generally known as two-stage clustering. Some literature on the internet shows considerable promise, such as DS2L-SOM: http://cdn.intechopen.com/pdfs-wm/10455.pdf. SOM is attractive because it reduces the dimensionality of the data, offers data analysis tools, and is robust to null values and skewed data. Clustering the results of a large SOM is attractive for identifying groups without prior knowledge of the number of clusters; those groups can then be investigated more qualitatively than with a standard SOM. Any advice on the recommended approach would be helpful. I was not sure whether the SOM within RapidMiner can output the distance and density information required as input to a clustering method.

Answers

  • mschmitz Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 1,872  RM Data Scientist
    Hi,

    Of course, you can use the SOM as a preprocessing step, much like a PCA. In the end it is a change of vector space combined with a dimensionality reduction. You can then run a clustering on the new coordinates. A process for this is attached.
    Which density information would you like to use in such a setting, and how?

    Cheers,
    Martin

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.4.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Sonar" width="90" x="45" y="120">
            <parameter key="repository_entry" value="//Samples/data/Sonar"/>
          </operator>
          <operator activated="true" class="self_organizing_map" compatibility="6.4.000" expanded="true" height="94" name="SOM" width="90" x="179" y="120"/>
          <operator activated="true" class="k_means" compatibility="6.4.000" expanded="true" height="76" name="Clustering" width="90" x="313" y="120"/>
          <connect from_op="Retrieve Sonar" from_port="output" to_op="SOM" to_port="example set input"/>
          <connect from_op="SOM" from_port="example set output" to_op="Clustering" to_port="example set"/>
          <connect from_op="Clustering" from_port="clustered set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
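
The process above chains a SOM into a k-means clustering. For readers who want to see the same idea outside RapidMiner, here is a minimal sketch in plain Python. It is an illustration only, not RapidMiner's implementation: the function names (`train_som`, `best_matching_unit`, `k_means`), the 4x4 grid, the linear decay schedules, and the toy data are all assumptions made for this example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position (row, col) whose weight vector is closest to x."""
    return min(weights,
               key=lambda k: sum((a - b) ** 2 for a, b in zip(weights[k], x)))


def train_som(data, rows, cols, epochs=20, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):      # shuffled pass over the data
            frac = step / total
            lr = 0.5 * (1.0 - frac)                # assumed linear learning-rate decay
            radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def k_means(points, k, iters=50, seed=0):
    """Plain k-means with squared Euclidean distance; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(p, centers[j])))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                            # leave empty clusters where they are
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data (an assumption for this sketch): two Gaussian blobs in 4 dimensions.
rng = random.Random(42)
data = ([[rng.gauss(0.2, 0.05) for _ in range(4)] for _ in range(20)] +
        [[rng.gauss(0.8, 0.05) for _ in range(4)] for _ in range(20)])

som = train_som(data, rows=4, cols=4)
coords = [best_matching_unit(som, x) for x in data]   # stage 1: SOM grid coordinates
labels = k_means(coords, k=2)                         # stage 2: cluster the coordinates
print(list(zip(coords, labels))[:5])
```

Note that this clusters each example's SOM grid coordinates, as the process above does. Many two-stage methods in the literature cluster the trained unit weight vectors instead, which would mean calling `k_means` on `list(som.values())` rather than on `coords`.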