
"[Solved]Convert Numeric to Nominal after k-means clustering"

nachiket Member Posts: 6 Contributor II
edited June 2019 in Help
I am new to RapidMiner. I have an Excel dataset.
I want to apply k-means clustering to this dataset and then run Bayesian classification on the result.
I imported the Excel file (all fields except FID as text) and applied Nominal to Numerical so that I could run k-means. Now I want the clusters together with the original values from the input Excel file (not the numeric data), so that I can apply Bayes classification to them.
How can I do a Numeric to Nominal conversion on all of the fields?

Sample data (1100 rows):
FID  Geology                            Geomorphology                      Land use_land cover  Rainfall   SLOPE   Soil        zone
0    Fissile hornblende biotite gneiss  HIGHLY DISSECTED DIFLECTION SLOPE  FOREST               1200-1400  >60%    BROWN CLAY  High
1    Fissile hornblende biotite gneiss  HIGHLY DISSECTED DIFLECTION SLOPE  FOREST               1200-1400  30-60%  BROWN CLAY  Moderate

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    I'm not sure I understood you correctly, but does the example process below help? It uses the Multiply operator to create two copies of your data: one copy is converted to numerical values and clustered, and at the end the cluster assignments are joined back to the other, unmodified copy of your original data.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.1.001-SNAPSHOT">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.1.001-SNAPSHOT" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="generate_nominal_data" compatibility="6.1.001-SNAPSHOT" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="75"/>
         <operator activated="true" class="generate_id" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Generate ID" width="90" x="179" y="75"/>
         <operator activated="true" class="multiply" compatibility="6.1.001-SNAPSHOT" expanded="true" height="94" name="Multiply" width="90" x="313" y="75"/>
         <operator activated="true" class="nominal_to_numerical" compatibility="6.1.001-SNAPSHOT" expanded="true" height="94" name="Nominal to Numerical" width="90" x="514" y="30">
           <list key="comparison_groups"/>
         </operator>
         <operator activated="true" class="k_means" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Clustering" width="90" x="647" y="30"/>
         <operator activated="true" class="select_attributes" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Select Attributes" width="90" x="781" y="30">
           <parameter key="attribute_filter_type" value="subset"/>
           <parameter key="attributes" value="cluster|id|label"/>
         </operator>
         <operator activated="true" class="join" compatibility="6.1.001-SNAPSHOT" expanded="true" height="76" name="Join" width="90" x="916" y="75">
           <list key="key_attributes"/>
         </operator>
         <connect from_op="Generate Nominal Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
         <connect from_op="Generate ID" from_port="example set output" to_op="Multiply" to_port="input"/>
         <connect from_op="Multiply" from_port="output 1" to_op="Nominal to Numerical" to_port="example set input"/>
         <connect from_op="Multiply" from_port="output 2" to_op="Join" to_port="right"/>
         <connect from_op="Nominal to Numerical" from_port="example set output" to_op="Clustering" to_port="example set"/>
         <connect from_op="Clustering" from_port="clustered set" to_op="Select Attributes" to_port="example set input"/>
         <connect from_op="Select Attributes" from_port="example set output" to_op="Join" to_port="left"/>
         <connect from_op="Join" from_port="join" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>

    Regards,
    Marco
  • nachiket Member Posts: 6 Contributor II
    Yes, thank you very much. I got the clustered output with the original names. However, there are now extra attributes at the end, cluster and id (the latter looks like a float number). Can you please tell me how I can use Naive Bayes on it?

    PS: I am actually trying to integrate Bayes classification with k-means clustering.
  • Fred12 Member Posts: 344 Unicorn

    I am interested in that topic, too!

    Did you work out how to apply Naive Bayes together with clustering? Can you show me how?
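
For the open question about running Naive Bayes on the joined output, one possible approach is sketched below. It is an assumption and was not confirmed by anyone in this thread. It treats the cluster attribute as the class to predict by giving it the label role with a Set Role operator, then passes the result to a Naive Bayes operator. The operator names "Set Role" and "Naive Bayes" are chosen here for illustration. The id attribute keeps its id role, so the learner should not use it as a predictor. The fragment is meant to go inside the inner process element of the example process above, in place of the existing connection from Join to "result 1".

```xml
<!-- Sketch only: place inside the inner <process expanded="true"> element,
     replacing the existing <connect from_op="Join" ... to_port="result 1"/>. -->
<operator activated="true" class="set_role" compatibility="6.1.001-SNAPSHOT" expanded="true" name="Set Role">
  <!-- Assumption: the cluster assignment is the class Naive Bayes should learn. -->
  <parameter key="attribute_name" value="cluster"/>
  <parameter key="target_role" value="label"/>
  <list key="set_additional_roles"/>
</operator>
<operator activated="true" class="naive_bayes" compatibility="6.1.001-SNAPSHOT" expanded="true" name="Naive Bayes"/>
<!-- Joined data (original nominal values plus cluster and id) goes to Set Role. -->
<connect from_op="Join" from_port="join" to_op="Set Role" to_port="example set input"/>
<!-- The relabelled data trains the Naive Bayes model. -->
<connect from_op="Set Role" from_port="example set output" to_op="Naive Bayes" to_port="training set"/>
<!-- The trained model is delivered as the process result. -->
<connect from_op="Naive Bayes" from_port="model" to_port="result 1"/>
```

If a different attribute should be predicted instead, for example zone from the sample data, its name would go into attribute_name. In that case the cluster attribute would likely also need its role changed to regular before the learner treats it as an input.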
