K Means Clustering - too few examples

thatsbhavik Member Posts: 2 Learner III
edited November 2018 in Help
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.1.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.1.000" expanded="true" height="68" name="Retrieve" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Local Repository/data/SAA/HW 2a_Regression"/>
</operator>
<operator activated="true" class="filter_examples" compatibility="7.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="112" y="136">
<parameter key="condition_class" value="no_missing_attributes"/>
<list key="filters_list"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.1.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="187">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="AA|AB|Airports UASFM|IAP Exists|IFR|O|P|PDARS|Q|R|RWY type|S|T|This Analysis has Traffic Count from 2 sources:&#10;1 - ATADS 2017 (OPSNET)&#10;2 - FAA Airports Data &#10;&#10;Still needs to be combined with Airspace, Track vs IFP, or Distane from ARP (as recomm by Joe)|V|VFR|W|X|Y|Z"/>
<parameter key="invert_selection" value="true"/>
</operator>
<operator activated="true" class="nominal_to_numerical" compatibility="7.1.000" expanded="true" height="103" name="Nominal to Numerical" width="90" x="514" y="238">
<list key="comparison_groups"/>
</operator>
<operator activated="true" class="fast_k_means" compatibility="7.1.000" expanded="true" height="82" name="Clustering" width="90" x="648" y="238">
<parameter key="add_cluster_attribute" value="false"/>
<parameter key="k" value="20"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Nominal to Numerical" to_port="example set input"/>
<connect from_op="Nominal to Numerical" from_port="example set output" to_op="Clustering" to_port="example set"/>
<connect from_op="Clustering" from_port="cluster model" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

Hi - I am new to the RapidMiner community and have a question about how to remove the error in the following k-means clustering process. The error I get is "Example Set contains not enough examples to perform this operation. Needs atleast 5 examples." (I set k=5), even if I increase the number of examples to 20 or 100. I want to see clusters, both supervised (identifying k = "x") and unsupervised (Agglomerative).

Attached is the process file. 

 

Please help!

 

Thanks

Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    It's hard to diagnose this without seeing a sample of your data. I do see that you have a "Filter Examples" operator before the k-means operator. Are you sure that its filter criteria are not reducing the number of examples available downstream to fewer than 5?

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • thatsbhavik Member Posts: 2 Learner III

    Thanks - see attached. The filter was "no missing attributes". The process runs well on the second sheet, which has a smaller sample, but I need it to run on the first sheet. I am also still unable to see the "Cluster Diagram". How do I do that?

  • sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi @thatsbhavik - welcome to the community. My first question is: why are you running RapidMiner 7.1? Version 8.1 is our current version, so any help I can post will be incompatible with your version. Maybe update? :)

     

    Scott
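
Editor's sketch: Brian's question above, whether the "no missing attributes" filter leaves fewer examples than k-means needs, can be illustrated outside RapidMiner with a few lines of plain Python. Everything here is an assumption made for illustration: the sample rows, the attribute names, and the helper functions are hypothetical and are neither the poster's data nor a RapidMiner API. The sketch only shows how a single mostly-empty attribute can cause a row-wise "no missing values" filter to drop almost every example, and how excluding that attribute before filtering can change the count.

```python
# Hypothetical sample rows; None marks a missing value. Not the poster's data.
rows = [
    {"airport": "A1", "ifr": 120, "vfr": 300, "notes": None},
    {"airport": "A2", "ifr": 80, "vfr": None, "notes": None},
    {"airport": "A3", "ifr": 45, "vfr": 210, "notes": "checked"},
    {"airport": "A4", "ifr": 200, "vfr": 150, "notes": None},
    {"airport": "A5", "ifr": 60, "vfr": 90, "notes": None},
    {"airport": "A6", "ifr": 310, "vfr": 400, "notes": None},
]


def complete_rows(rows):
    """Keep only rows in which every attribute has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def missing_per_attribute(rows):
    """Count missing values per attribute to spot mostly-empty columns."""
    counts = {}
    for r in rows:
        for name, value in r.items():
            counts[name] = counts.get(name, 0) + int(value is None)
    return counts


def without_attribute(rows, name):
    """Return copies of the rows with one attribute removed."""
    return [{k: v for k, v in r.items() if k != name} for r in rows]


k = 5  # number of clusters requested

kept = complete_rows(rows)
print("missing values per attribute:", missing_per_attribute(rows))
print("rows left after filtering:", len(kept), "| enough for k?", len(kept) >= k)

# Excluding the mostly-empty attribute first leaves more complete rows.
kept_after_drop = complete_rows(without_attribute(rows, "notes"))
print("rows left if 'notes' is excluded first:", len(kept_after_drop),
      "| enough for k?", len(kept_after_drop) >= k)
```

In the posted process, Filter Examples runs before Select Attributes, so if any of the attributes that are later excluded happen to be mostly empty, a similar effect is possible. That is only a guess without the data; checking the example count at the output of Filter Examples would confirm or rule it out.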

     
