Clustering of SOM
Is it possible to cluster SOM results, an approach generally known as two-stage clustering? Some of the literature shows considerable promise, for example DS2L-SOM: http://cdn.intechopen.com/pdfs-wm/10455.pdf. A SOM is attractive because it reduces the dimensionality of the data, offers data analysis tools, and is robust to null values and skewed data. Clustering the output of a large SOM is attractive because it can identify groups without prior knowledge of the number of clusters, and those groups can then be investigated more qualitatively than with a standard SOM alone. Any advice on the recommended approach would be helpful. I am not sure whether the SOM operator in RapidMiner can output the distance and density information that a clustering method would need as input.
Answers
Of course, you can use the SOM like a PCA, as a preprocessing step. In the end it is a change of the vector space combined with a dimensionality reduction. You can then run a clustering on the new coordinates. A process for this is attached.
Which density information would you like to use in such a setting, and how?
Cheers,
Martin
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.4.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="6.4.000" expanded="true" height="60" name="Retrieve Sonar" width="90" x="45" y="120">
        <parameter key="repository_entry" value="//Samples/data/Sonar"/>
      </operator>
      <operator activated="true" class="self_organizing_map" compatibility="6.4.000" expanded="true" height="94" name="SOM" width="90" x="179" y="120"/>
      <operator activated="true" class="k_means" compatibility="6.4.000" expanded="true" height="76" name="Clustering" width="90" x="313" y="120"/>
      <connect from_op="Retrieve Sonar" from_port="output" to_op="SOM" to_port="example set input"/>
      <connect from_op="SOM" from_port="example set output" to_op="Clustering" to_port="example set"/>
      <connect from_op="Clustering" from_port="clustered set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
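For anyone who wants to see the idea outside RapidMiner, here is a rough sketch in plain Python of the same chain as the process above: train a small SOM, replace each example by the grid coordinates of its best-matching unit, then run k-means on those coordinates. This is not how the RapidMiner operators are implemented. The function names (`train_som`, `k_means`, `two_stage`, `make_toy_data`), the 4x4 grid, the linear decay schedules and the toy data are all assumptions made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_unit(units, x):
    """Grid position (row, col) of the unit whose weights are closest to x."""
    return min(units, key=lambda pos: sq_dist(units[pos], x))


def train_som(data, rows, cols, rounds=20, lr0=0.5, seed=0):
    """Train a rectangular SOM; returns {(row, col): weight_vector}.
    Learning rate and neighbourhood radius decay linearly (an assumption)."""
    rng = random.Random(seed)
    radius0 = max(rows, cols) / 2
    # Initialise every unit with a copy of a randomly chosen example.
    units = {(r, c): list(rng.choice(data))
             for r in range(rows) for c in range(cols)}
    steps, t = rounds * len(data), 0
    for _ in range(rounds):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = t / steps
            lr = lr0 * (1 - frac)
            radius = max(radius0 * (1 - frac), 0.5)
            bmu = best_unit(units, x)
            for pos, w in units.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))  # Gaussian neighbourhood
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
            t += 1
    return units


def k_means(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def two_stage(data, rows, cols, k):
    """Stage 1: map each example to SOM grid coordinates. Stage 2: cluster them."""
    units = train_som(data, rows, cols)
    coords = [best_unit(units, x) for x in data]
    return coords, k_means(coords, k)


def make_toy_data(n_per_blob=20, dim=4, seed=1):
    """Two Gaussian blobs as stand-in data (not the Sonar sample set)."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, 0.3) for _ in range(dim)]
            for mu in (0.0, 3.0) for _ in range(n_per_blob)]


data = make_toy_data()
coords, labels = two_stage(data, rows=4, cols=4, k=2)
for xy, lab in list(zip(coords, labels))[:5]:
    print(xy, lab)
```

Note that k-means still needs the number of clusters up front, so this sketch does not address the "unknown number of clusters" part of the question.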