"Using a k-means model (.clm file) created with a sample to cluster my population"
Your description for this operator states: "This Operator clusters an exampleset given a cluster model. If an exampleSet does not contain id attributes it is probably not the same as the cluster model has been created on. Since cluster models depend on a static nature of the id attributes, the outcome on another exampleset with different values but same ids will be unpredictable." Does this mean that it will only cluster the records I used to create the model, and will not assign clusters to any new records?
The process below finishes without errors, but it only clustered the records that had been clustered in the sample file. All other records have a blank cluster number in the output file.
Is there a way to use the model I created to cluster new records, or do I have to run the k-means algorithm on the full 1 million record file and not use the .clm file created from the sample data?
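To make clear what I am hoping for: I would expect "applying" a k-means model to mean assigning each new record to its nearest centroid. Here is a minimal Python sketch of that idea. It is not RapidMiner code; the centroids and records are made up, the function name `nearest_centroid` is my own, and I am only assuming the .clm file holds centroids that could be used this way.

```python
import math

def nearest_centroid(record, centroids):
    """Return the index of the centroid closest to record (Euclidean distance)."""
    distances = [math.dist(record, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids, standing in for whatever the .clm file stores.
centroids = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical new records that were NOT part of the sample.
new_records = [(1.0, 2.0), (9.0, 8.5), (4.0, 4.0)]

# Each new record gets the label of its closest centroid.
labels = [nearest_centroid(r, centroids) for r in new_records]
print(labels)
```

With a model applied like this, no record would end up with a blank cluster number, which is the behavior I was expecting from the process.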
Thanks in advance.
<operator name="ClusterModelReader" class="ClusterModelReader">
    <description text="The cluster model 8051_Lifestyle_Matches_Excel.clm is the exact model I used for the Excel study so use it to cluster the population"/>
    <parameter key="cluster_model_file" value="C:\Documents and Settings\krobinson\My Documents\rm_workspace\Clustering\8051_Lifestyle_Matches_Excel.clm"/>
</operator>
<operator name="ClusterModel2ExampleSet" class="ClusterModel2ExampleSet">
</operator>
<operator name="OperatorChain" class="OperatorChain" expanded="yes">
    <operator name="PSVExampleSetWriter" class="CSVExampleSetWriter">
        <parameter key="column_separator" value="|"/>
        <parameter key="csv_file" value="C:\Documents and Settings\krobinson\My Documents\rm_workspace\Clustering\8051_Population_Lifestyle.psv"/>
    </operator>
</operator>