<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000">
<operator activated="true" class="process" compatibility="7.1.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.1.000" expanded="true" height="68" name="Retrieve" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Local Repository/data/SAA/HW 2a_Regression"/>
<operator activated="true" class="filter_examples" compatibility="7.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="112" y="136">
<parameter key="condition_class" value="no_missing_attributes"/>
<list key="filters_list"/>
<operator activated="true" class="select_attributes" compatibility="7.1.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="187">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="AA|AB|Airports UASFM|IAP Exists|IFR|O|P|PDARS|Q|R|RWY type|S|T|This Analysis has Traffic Count from 2 sources:&#10;1 - ATADS 2017 (OPSNET)&#10;2 - FAA Airports Data &#10;&#10;Still needs to be combined with Airspace, Track vs IFP, or Distane from ARP (as recomm by Joe)|V|VFR|W|X|Y|Z"/>
<parameter key="invert_selection" value="true"/>
<operator activated="true" class="nominal_to_numerical" compatibility="7.1.000" expanded="true" height="103" name="Nominal to Numerical" width="90" x="514" y="238">
<list key="comparison_groups"/>
<operator activated="true" class="fast_k_means" compatibility="7.1.000" expanded="true" height="82" name="Clustering" width="90" x="648" y="238">
<parameter key="add_cluster_attribute" value="false"/>
<parameter key="k" value="20"/>
<connect from_op="Retrieve" from_port="output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Nominal to Numerical" to_port="example set input"/>
<connect from_op="Nominal to Numerical" from_port="example set output" to_op="Clustering" to_port="example set"/>
<connect from_op="Clustering" from_port="cluster model" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>

Hi - I am new to the RapidMiner community and have a question about how to get rid of an error in the following k-means clustering process. The error I get is "Example Set contains not enough examples to perform this operation. Needs atleast 5 examples." (I set k=5), and it still appears even if I increase the number of examples to 20 or 100. I want to see clusters, both supervised (identifying k = "x") and unsupervised (Agglomerative).

Attached is the process file. 


Please help!





    Telcontar120 (Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member; Posts: 1,635; Unicorn)

    It's hard to diagnose this without seeing a sample of your data. I do see that you have a "Filter Examples" operator before the k-means. Are you sure the filter criteria in there are not reducing the number of examples available downstream so that it falls below 5?
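    To illustrate the check suggested above, here is a minimal Python sketch (not RapidMiner code). It assumes the "no missing attributes" filter drops every row that has at least one missing value, and it uses made-up sample rows; the helper names `drop_rows_with_missing` and `check_enough_examples` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]


def drop_rows_with_missing(rows: Sequence[Row]) -> List[Row]:
    """Keep only rows where every attribute has a value (None = missing)."""
    return [row for row in rows if all(value is not None for value in row)]


def check_enough_examples(rows: Sequence[Row], k: int) -> Tuple[int, bool]:
    """Return (rows left after filtering, whether that is at least k)."""
    complete = drop_rows_with_missing(rows)
    return len(complete), len(complete) >= k


# Made-up data: six rows, but only two have no missing values.
rows = [
    [1.0, 2.0, None],
    [3.0, None, 1.0],
    [2.0, 2.5, 0.5],
    [None, 1.0, 1.0],
    [4.0, 4.0, 4.0],
    [5.0, None, None],
]

remaining, ok = check_enough_examples(rows, k=5)
print(f"rows after filter: {remaining}, enough for k=5: {ok}")
```

    If most rows have a gap in at least one column, adding more rows upstream may not help, because nearly all of them are removed again before the clustering step.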

    thatsbhavik (Member; Posts: 2; Contributor I)

    Thanks - see attached. The filter was "no missing attributes". The process runs well on the second sheet, which has a smaller sample, but I need it to run on the first sheet. I am also still unable to see the "Cluster Diagram". How do I do that?

    sgenzer (Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator; Posts: 2,959; Community Manager)

    Hi @thatsbhavik - welcome to the community. My first question is: why are you running RapidMiner 7.1? Version 8.1 is our current version, so any help I can post will be incompatible with your version. Maybe update? :)




