"Optimizing K-Means with Cross Validation"

hgwelec Member Posts: 31 Maven
edited June 2019 in Help

I looked through the example setups but didn't find anything similar.

I am trying to work out how to optimize K-Means (that is, find the optimal number of clusters k) through cross-validation. I tried using an XValidation operator, but I cannot get it to work. Here is my current setup, which I would like to change:

<operator name="Root" class="Process" expanded="yes">
    <operator name="CSVExampleSource" class="CSVExampleSource">
        <parameter key="filename" value="/data-binary.csv"/>
        <parameter key="label_name" value="class"/>
    </operator>
    <operator name="CorrelationMatrix" class="CorrelationMatrix"/>
    <operator name="OperatorChain" class="OperatorChain" expanded="yes">
        <operator name="KMeans" class="KMeans">
            <parameter key="k" value="12"/>
            <parameter key="max_runs" value="50"/>
            <parameter key="max_optimization_steps" value="500"/>
            <parameter key="use_local_random_seed" value="true"/>
            <parameter key="local_random_seed" value="8"/>
        </operator>
        <operator name="ClusterModelWriter" class="ClusterModelWriter">
            <parameter key="cluster_model_file" value="/models/clusterout.clm"/>
        </operator>
        <operator name="ClusterCentroidEvaluator" class="ClusterCentroidEvaluator">
            <parameter key="keep_example_set" value="true"/>
        </operator>
    </operator>
    <operator name="ClusterModelReader" class="ClusterModelReader">
        <parameter key="cluster_model_file" value="/models/clusterout.clm"/>
    </operator>
</operator>

Could someone please help?

PS: I accidentally cross-posted this question to the Getting Started section and couldn't delete it!


    land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    I have answered it there already. Too late :) I will close this thread.
