"ERROR using K-means clustering algorithm with text data"
basel_deeb
Member Posts: 2 Contributor I
Hello,
I'm using text data that contains three attributes (NAME, LABEL, DOMAIN). This is a sample of the data:
NAME             LABEL  DOMAIN
------------------------------
origin           from   string
destination      to     string
departure day    day    date
departure month  month  date
I want to use the k-Means clustering operator to cluster the data, but unfortunately I get this error before execution:
" The setup does not seem to contain any obvious error, but you should check the log messages or activate the debug mode in the setting dialog in order to get more information about this problem"
Here are the log messages:
Dec 26, 2012 1:23:44 AM INFO: Process //NewLocalRepository/IOS/EM starts
Dec 26, 2012 1:23:44 AM INFO: Loading initial data.
Dec 26, 2012 1:23:45 AM SEVERE: Process failed: operator cannot be executed. Check the log messages...
Dec 26, 2012 1:23:45 AM SEVERE: Here: Process[1] (Process)
subprocess 'Main Process'
+- Retrieve[1] (Retrieve)
==> +- Clustering[1] (k-Means)
Dec 26, 2012 1:23:45 AM SEVERE: java.lang.NullPointerException
And here is the XML:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="341" width="480">
      <operator activated="true" class="retrieve" compatibility="5.2.008" expanded="true" height="60" name="Retrieve" width="90" x="126" y="140">
        <parameter key="repository_entry" value="../EXPIO/DDP"/>
      </operator>
      <operator activated="true" class="k_means" compatibility="5.2.008" expanded="true" height="76" name="Clustering" width="90" x="313" y="120">
        <parameter key="k" value="10"/>
        <parameter key="measure_types" value="NominalMeasures"/>
        <parameter key="nominal_measure" value="DiceSimilarity"/>
      </operator>
      <connect from_op="Retrieve" from_port="output" to_op="Clustering" to_port="example set"/>
      <connect from_op="Clustering" from_port="cluster model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Any advice would be greatly appreciated. Thanks!
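For reference, the process sets nominal_measure to DiceSimilarity. The sketch below shows the textbook Dice coefficient applied to the sample rows, treating each row as a set of (attribute, value) pairs. This is only an illustration in Python: it is not RapidMiner's implementation, the function name dice_similarity is hypothetical, and the split of each sample line into NAME / LABEL / DOMAIN is an assumption based on the column headers.

```python
# Textbook Dice coefficient on nominal rows.
# Assumption: this is NOT RapidMiner's DiceSimilarity code, only an illustration.

def dice_similarity(row_a, row_b):
    """Return 2*|A & B| / (|A| + |B|) over (attribute, value) pairs."""
    a = set(row_a.items())
    b = set(row_b.items())
    if not a and not b:
        # Two empty rows are treated as identical to avoid dividing by zero.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Sample rows from the post (column split is an assumption).
rows = [
    {"NAME": "origin", "LABEL": "from", "DOMAIN": "string"},
    {"NAME": "destination", "LABEL": "to", "DOMAIN": "string"},
    {"NAME": "departure day", "LABEL": "day", "DOMAIN": "date"},
    {"NAME": "departure month", "LABEL": "month", "DOMAIN": "date"},
]

# Rows 0 and 1 share only the DOMAIN value; rows 0 and 2 share nothing.
sim_string_rows = dice_similarity(rows[0], rows[1])
sim_mixed_rows = dice_similarity(rows[0], rows[2])
print(sim_string_rows, sim_mixed_rows)
```

With a measure like this, two rows only look similar when they share exact attribute values, so free-text attributes with mostly unique values contribute little to the clustering.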
Answers
P.S.: Please use the code-tags in this forum for your processes and data.
Actually, to my surprise, after I uninstalled RapidMiner and reinstalled it, the process worked.
However, I have a question, if you don't mind: after k-Means generates the cluster centroids, how can I tell the clusters apart? They are only named as follows:
Cluster_0
Cluster_1
Cluster_2
Again, thanks a lot.
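One generic way to give names like Cluster_0 some meaning is to group the rows by their assigned cluster and look at the most frequent value of each attribute. The sketch below does this in plain Python. The cluster assignments are hypothetical (typed by hand for illustration, not produced by the process above), and summarize_clusters is an illustrative helper, not a RapidMiner function.

```python
from collections import Counter, defaultdict

# Hypothetical assignments: in practice the cluster name for each row would
# come from the clustered example set, not be typed by hand.
clustered_rows = [
    ({"NAME": "origin", "LABEL": "from", "DOMAIN": "string"}, "Cluster_0"),
    ({"NAME": "destination", "LABEL": "to", "DOMAIN": "string"}, "Cluster_0"),
    ({"NAME": "departure day", "LABEL": "day", "DOMAIN": "date"}, "Cluster_1"),
    ({"NAME": "departure month", "LABEL": "month", "DOMAIN": "date"}, "Cluster_1"),
]


def summarize_clusters(pairs):
    """Map each cluster name to the most common (value, count) per attribute."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for row, cluster in pairs:
        for attribute, value in row.items():
            counts[cluster][attribute][value] += 1
    return {
        cluster: {attr: c.most_common(1)[0] for attr, c in attrs.items()}
        for cluster, attrs in counts.items()
    }


summary = summarize_clusters(clustered_rows)
for cluster, modes in sorted(summary.items()):
    print(cluster, modes)
```

In this made-up example, Cluster_0 would be described by DOMAIN = string and Cluster_1 by DOMAIN = date, which is the kind of summary that turns a generic cluster name into something readable.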