# AUTOMODEL K-MEANS GRAPHIC RESULT PROBLEMS

Member Posts: 7 Contributor I
edited June 2019 in Help

• Member Posts: 7 Contributor I
1. Something is "on average larger/smaller" than what?
2. How should "on average" be understood? On average of what?
3. How are the percentage numbers calculated? What is the formula?
• Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

tagging @IngoRM

Scott

• Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

Hi,

When it says "X is on average Y% higher" here is what it means:

The average value of feature / attribute X for the examples / cases in the cluster is Y% higher than the average value of feature X for the examples which are not in the cluster.

BTW, those statements are a textual summary of the top 3 contributors as shown in the Heat Map chart.
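To make the comparison above concrete, here is a minimal sketch in plain Python. It is not RapidMiner's actual implementation: the function name `percent_difference`, the column names, and the toy data are all hypothetical, and dividing by the absolute value of the out-of-cluster average is an assumption made so that the sign always reflects "higher" or "lower".

```python
from statistics import mean


def percent_difference(rows, cluster_key, cluster, feature):
    """Percent by which the in-cluster average of `feature` differs
    from the average of the same feature over all other rows."""
    inside = [r[feature] for r in rows if r[cluster_key] == cluster]
    outside = [r[feature] for r in rows if r[cluster_key] != cluster]
    mean_in = mean(inside)
    mean_out = mean(outside)
    if mean_out == 0:
        # A relative difference is undefined when the reference average is zero.
        raise ValueError("out-of-cluster average is zero")
    # Assumption: normalize by |mean_out| so a positive result means "higher".
    return (mean_in - mean_out) / abs(mean_out) * 100.0


# Hypothetical toy data: one numeric feature and a cluster assignment per row.
rows = [
    {"cluster": "cluster_0", "a1": 12.0},
    {"cluster": "cluster_0", "a1": 18.0},
    {"cluster": "cluster_1", "a1": 8.0},
    {"cluster": "cluster_1", "a1": 12.0},
    {"cluster": "cluster_2", "a1": 10.0},
]

# cluster_0 average is 15, everything else averages 10, so a1 is 50% higher.
print(percent_difference(rows, "cluster", "cluster_0", "a1"))
```

Note that the reference group is "all examples not in the cluster", not the whole data set. Whether the reported numbers are based on the raw or the preprocessed values is not stated here, so a hand calculation on the raw input may not reproduce them exactly.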

Hope this helps,

Ingo

• Member Posts: 7 Contributor I
"...Y% higher than the features". These features belong to the original samples I entered, right? But it does not seem to work.
2.PNG 42.2K
• Member Posts: 7 Contributor I

I have tried the sample market data. It is not working.

2.PNG 40.4K
• Member Posts: 7 Contributor I

How to understand this result?

• Member Posts: 7 Contributor I

HOW TO CALCULATE THE AVERAGE?

• Member Posts: 7 Contributor I

tagging @IngoRM

• RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 574 Unicorn

Try putting your data into a process like this. As you can see, I added an aggregation step to the Auto Model results so we can see some additional averages.

`<?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">  <context>    <input/>    <output/>    <macros/>  </context>  <operator activated="true" automodel="EXPORTED" class="process" compatibility="8.1.001" expanded="true" name="Process">    <process expanded="true">      <operator activated="true" automodel="EXPORTED" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve Data" width="90" x="45" y="85">        <parameter key="repository_entry" value="//Samples/data/Polynomial"/>        <description align="center" color="transparent" colored="false" width="126">Load data.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Preprocessing" width="90" x="179" y="85">        <process expanded="true">          <operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Define Target?" 
width="90" x="45" y="34">            <process expanded="true">              <connect from_port="input 1" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Define Target" width="90" x="45" y="34">                <parameter key="attribute_name" value="Survived"/>                <parameter key="target_role" value="label"/>                <list key="set_additional_roles"/>                <description align="center" color="transparent" colored="false" width="126">Define the target column for the predictive model.</description>              </operator>              <connect from_port="input 1" to_op="Define Target" to_port="example set input"/>              <connect from_op="Define Target" from_port="example set output" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Should define a target column?</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Should Discretize?" 
width="90" x="179" y="34">            <process expanded="true">              <connect from_port="input 1" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="discretize_by_bins" compatibility="8.1.001" expanded="true" height="103" name="Binning" width="90" x="45" y="34">                <parameter key="attribute_filter_type" value="single"/>                <parameter key="attribute" value="Age"/>                <parameter key="include_special_attributes" value="true"/>                <parameter key="range_name_type" value="short"/>                <description align="center" color="transparent" colored="false" width="126">Discretize by binning (same range per bin).</description>              </operator>              <connect from_port="input 1" to_op="Binning" to_port="example set input"/>              <connect from_op="Binning" from_port="example set output" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="discretize_by_frequency" compatibility="8.1.001" expanded="true" height="103" name="Frequency" width="90" x="45" y="34">                <parameter key="attribute_filter_type" value="single"/>                <parameter key="attribute" value="Age"/>                <parameter key="include_special_attributes" value="true"/>                <parameter key="range_name_type" value="short"/>         
       <description align="center" color="transparent" colored="false" width="126">Discretize by frequency (same count per bin).</description>              </operator>              <connect from_port="input 1" to_op="Frequency" to_port="example set input"/>              <connect from_op="Frequency" from_port="example set output" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Should discretize numerical target column?</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Map Values?" width="90" x="313" y="34">            <process expanded="true">              <connect from_port="input 1" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="map" compatibility="8.1.001" expanded="true" height="82" name="Map Values" width="90" x="45" y="34">                <parameter key="attribute_filter_type" value="single"/>                <parameter key="attribute" value="Survived"/>                <parameter key="include_special_attributes" value="true"/>                <list key="value_mappings"/>                <description align="center" color="transparent" colored="false" width="126">Map some nominal target values to new values.</description>              </operator>              <connect 
from_port="input 1" to_op="Map Values" to_port="example set input"/>              <connect from_op="Map Values" from_port="example set output" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Should map nominal values?</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Positive Class?" width="90" x="447" y="34">            <process expanded="true">              <connect from_port="input 1" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="nominal_to_binominal" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="45" y="34">                <parameter key="attribute_filter_type" value="single"/>                <parameter key="attribute" value="Survived"/>                <parameter key="include_special_attributes" value="true"/>                <description align="center" color="transparent" colored="false" width="126">Make sure that target is binary for positive class mapping.</description>              </operator>              <operator activated="true" automodel="EXPORTED" class="remap_binominals" compatibility="8.1.001" expanded="true" height="82" name="Define Positive Class" width="90" x="179" y="34">                <parameter 
key="attribute_filter_type" value="single"/>                <parameter key="attribute" value="Survived"/>                <parameter key="include_special_attributes" value="true"/>                <parameter key="negative_value" value="No"/>                <parameter key="positive_value" value="Yes"/>                <description align="center" color="transparent" colored="false" width="126">Potentially define which one should be the positive class.</description>              </operator>              <connect from_port="input 1" to_op="Nominal to Binominal" to_port="example set input"/>              <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Define Positive Class" to_port="example set input"/>              <connect from_op="Define Positive Class" from_port="example set output" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Should define positive class?</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.001" expanded="true" height="82" name="Remove Columns?" 
width="90" x="581" y="34">            <parameter key="select_which" value="2"/>            <process expanded="true">              <connect from_port="input 1" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Columns" width="90" x="45" y="34">                <parameter key="attribute_filter_type" value="regular_expression"/>                <parameter key="regular_expression" value="\Qlabel\E"/>                <parameter key="invert_selection" value="true"/>                <parameter key="include_special_attributes" value="true"/>                <description align="center" color="transparent" colored="false" width="126">Potentially remove columns.</description>              </operator>              <connect from_port="input 1" to_op="Remove Columns" to_port="example set input"/>              <connect from_op="Remove Columns" from_port="example set output" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Should remove columns?</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Unify Value Types" width="90" x="715" y="34">            <process expanded="true">              <operator activated="true" 
automodel="EXPORTED" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Dates" width="90" x="45" y="34">                <parameter key="attribute_filter_type" value="value_type"/>                <parameter key="value_type" value="date_time"/>                <parameter key="invert_selection" value="true"/>                <description align="center" color="transparent" colored="false" width="126">Remove all date columns.</description>              </operator>              <operator activated="true" automodel="EXPORTED" class="nominal_to_text" compatibility="8.1.001" expanded="true" height="82" name="Nominal to Text" width="90" x="179" y="34">                <parameter key="attribute_filter_type" value="value_type"/>                <parameter key="include_special_attributes" value="true"/>                <description align="center" color="transparent" colored="false" width="126">Transform all nominal columns to text so that we make sure that all will have polynominal type after the next transformation.</description>              </operator>              <operator activated="true" automodel="EXPORTED" class="text_to_nominal" compatibility="8.1.001" expanded="true" height="82" name="Text to Nominal" width="90" x="313" y="34">                <parameter key="attribute_filter_type" value="value_type"/>                <parameter key="include_special_attributes" value="true"/>                <description align="center" color="transparent" colored="false" width="126">Transform all text columns into polynominal columns.</description>              </operator>              <operator activated="true" automodel="EXPORTED" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="447" y="34">                <parameter key="attribute_filter_type" value="value_type"/>                <parameter key="use_value_type_exception" value="true"/>                <parameter 
key="except_value_type" value="integer"/>                <parameter key="include_special_attributes" value="true"/>                <description align="center" color="transparent" colored="false" width="126">Turn all numerical columns (not integers though) into real columns.</description>              </operator>              <connect from_port="in 1" to_op="Remove Dates" to_port="example set input"/>              <connect from_op="Remove Dates" from_port="example set output" to_op="Nominal to Text" to_port="example set input"/>              <connect from_op="Nominal to Text" from_port="example set output" to_op="Text to Nominal" to_port="example set input"/>              <connect from_op="Text to Nominal" from_port="example set output" to_op="Numerical to Real" to_port="example set input"/>              <connect from_op="Numerical to Real" from_port="example set output" to_port="out 1"/>              <portSpacing port="source_in 1" spacing="0"/>              <portSpacing port="source_in 2" spacing="0"/>              <portSpacing port="sink_out 1" spacing="0"/>              <portSpacing port="sink_out 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Unify all value types</description>          </operator>          <connect from_port="in 1" to_op="Define Target?" to_port="input 1"/>          <connect from_op="Define Target?" from_port="output 1" to_op="Should Discretize?" to_port="input 1"/>          <connect from_op="Should Discretize?" from_port="output 1" to_op="Map Values?" to_port="input 1"/>          <connect from_op="Map Values?" from_port="output 1" to_op="Positive Class?" to_port="input 1"/>          <connect from_op="Positive Class?" from_port="output 1" to_op="Remove Columns?" to_port="input 1"/>          <connect from_op="Remove Columns?" 
from_port="output 1" to_op="Unify Value Types" to_port="in 1"/>          <connect from_op="Unify Value Types" from_port="out 1" to_port="out 1"/>          <portSpacing port="source_in 1" spacing="0"/>          <portSpacing port="source_in 2" spacing="0"/>          <portSpacing port="sink_out 1" spacing="0"/>          <portSpacing port="sink_out 2" spacing="0"/>        </process>        <description align="center" color="transparent" colored="false" width="126">All general preprocessing steps happen inside this operator - double click on it to see the details.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Replace Missing Values" width="90" x="313" y="85">        <process expanded="true">          <operator activated="true" automodel="EXPORTED" class="replace_missing_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Nominal Missings" width="90" x="45" y="34">            <parameter key="attribute_filter_type" value="value_type"/>            <parameter key="value_type" value="nominal"/>            <parameter key="default" value="value"/>            <list key="columns"/>            <parameter key="replenishment_value" value="MISSING"/>            <description align="center" color="transparent" colored="false" width="126">Replace nominal missings with the word 'missing'.</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="replace_infinite_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Pos Infinite Values" width="90" x="179" y="34">            <parameter key="attribute_filter_type" value="value_type"/>            <parameter key="include_special_attributes" value="true"/>            <parameter key="default" value="missing"/>            <list key="columns"/>            <description align="center" color="transparent" colored="false" width="126">Replace positive 
infinity values by missing.</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="replace_infinite_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Neg Infinite Values" width="90" x="313" y="34">            <parameter key="attribute_filter_type" value="value_type"/>            <parameter key="include_special_attributes" value="true"/>            <parameter key="default" value="missing"/>            <list key="columns"/>            <parameter key="replenish_what" value="negative_infinity"/>            <description align="center" color="transparent" colored="false" width="126">Replace negative infinity values by missing.</description>          </operator>          <operator activated="true" automodel="EXPORTED" class="replace_missing_values" compatibility="8.1.001" expanded="true" height="103" name="Replace Numerical Missings" width="90" x="447" y="34">            <parameter key="attribute_filter_type" value="value_type"/>            <parameter key="value_type" value="numeric"/>            <list key="columns"/>            <description align="center" color="transparent" colored="false" width="126">Replace numerical missings with the average of the column.</description>          </operator>          <connect from_port="in 1" to_op="Replace Nominal Missings" to_port="example set input"/>          <connect from_op="Replace Nominal Missings" from_port="example set output" to_op="Replace Pos Infinite Values" to_port="example set input"/>          <connect from_op="Replace Pos Infinite Values" from_port="example set output" to_op="Replace Neg Infinite Values" to_port="example set input"/>          <connect from_op="Replace Neg Infinite Values" from_port="example set output" to_op="Replace Numerical Missings" to_port="example set input"/>          <connect from_op="Replace Numerical Missings" from_port="example set output" to_port="out 1"/>          <portSpacing port="source_in 1" spacing="0"/>          
<portSpacing port="source_in 2" spacing="0"/>          <portSpacing port="sink_out 1" spacing="0"/>          <portSpacing port="sink_out 2" spacing="0"/>        </process>        <description align="center" color="transparent" colored="false" width="126">Replace missing values.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="447" y="85">        <parameter key="attribute_filter_type" value="value_type"/>        <parameter key="value_type" value="nominal"/>        <description align="center" color="transparent" colored="false" width="126">Check if there are any nominal attributes in the data</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="branch" compatibility="8.1.001" expanded="true" height="103" name="Branch (2)" width="90" x="581" y="85">        <parameter key="condition_type" value="min_attributes"/>        <parameter key="condition_value" value="1"/>        <process expanded="true">          <operator activated="true" automodel="EXPORTED" class="concurrency:loop_attributes" compatibility="8.1.001" expanded="true" height="82" name="Loop Attributes" width="90" x="45" y="34">            <parameter key="attribute_filter_type" value="value_type"/>            <parameter key="value_type" value="nominal"/>            <parameter key="reuse_results" value="true"/>            <parameter key="enable_parallel_execution" value="false"/>            <process expanded="true">              <operator activated="true" automodel="EXPORTED" class="aggregate" compatibility="8.1.001" expanded="true" height="82" name="Aggregate" width="90" x="45" y="34">                <list key="aggregation_attributes"/>                <parameter key="group_by_attributes" value="%{loop_attribute}"/>                <description align="center" color="transparent" colored="false" width="126">Create a new 
data set with one row for each nominal value of the current column (loop).</description>              </operator>              <operator activated="true" automodel="EXPORTED" class="branch" compatibility="8.1.001" expanded="true" height="103" name="Branch" width="90" x="179" y="34">                <parameter key="condition_type" value="min_examples"/>                <parameter key="condition_value" value="10"/>                <process expanded="true">                  <operator activated="true" automodel="EXPORTED" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="45" y="34">                    <parameter key="attribute_filter_type" value="single"/>                    <parameter key="attribute" value="%{loop_attribute}"/>                    <parameter key="invert_selection" value="true"/>                    <description align="center" color="transparent" colored="false" width="126">More than 10 values? Remove current column.</description>                  </operator>                  <connect from_port="input 1" to_op="Select Attributes" to_port="example set input"/>                  <connect from_op="Select Attributes" from_port="example set output" to_port="input 1"/>                  <portSpacing port="source_condition" spacing="0"/>                  <portSpacing port="source_input 1" spacing="0"/>                  <portSpacing port="source_input 2" spacing="0"/>                  <portSpacing port="sink_input 1" spacing="0"/>                  <portSpacing port="sink_input 2" spacing="0"/>                  <portSpacing port="sink_input 3" spacing="0"/>                </process>                <process expanded="true">                  <operator activated="true" automodel="EXPORTED" class="aggregate" compatibility="8.1.001" expanded="true" height="82" name="Aggregate (2)" width="90" x="45" y="34">                    <list key="aggregation_attributes">                      <parameter 
key="%{loop_attribute}" value="count"/>                    </list>                    <parameter key="group_by_attributes" value="%{loop_attribute}"/>                    <description align="center" color="transparent" colored="false" width="126">Count number of occurences for each value.</description>                  </operator>                  <operator activated="true" automodel="EXPORTED" class="sort" compatibility="8.1.001" expanded="true" height="82" name="Sort" width="90" x="179" y="34">                    <parameter key="attribute_name" value="count(%{loop_attribute})"/>                    <description align="center" color="transparent" colored="false" width="126">Sort counts.</description>                  </operator>                  <operator activated="true" automodel="EXPORTED" class="extract_macro" compatibility="8.1.001" expanded="true" height="68" name="Extract Macro" width="90" x="313" y="34">                    <parameter key="macro" value="least_common"/>                    <parameter key="macro_type" value="data_value"/>                    <parameter key="attribute_name" value="%{loop_attribute}"/>                    <parameter key="example_index" value="1"/>                    <list key="additional_macros"/>                    <description align="center" color="transparent" colored="false" width="126">Remember value with smallest count.</description>                  </operator>                  <operator activated="true" automodel="EXPORTED" class="nominal_to_numerical" compatibility="8.1.001" expanded="true" height="103" name="Nominal to Numerical (2)" width="90" x="447" y="136">                    <parameter key="attribute_filter_type" value="single"/>                    <parameter key="attribute" value="%{loop_attribute}"/>                    <parameter key="use_comparison_groups" value="true"/>                    <list key="comparison_groups">                      <parameter key="%{loop_attribute}" value="%{least_common}"/>                
    </list>                    <description align="center" color="transparent" colored="false" width="126">Transform to binary using dummy coding and a comparison group for the least frequent value.</description>                  </operator>                  <connect from_port="input 1" to_op="Aggregate (2)" to_port="example set input"/>                  <connect from_op="Aggregate (2)" from_port="example set output" to_op="Sort" to_port="example set input"/>                  <connect from_op="Aggregate (2)" from_port="original" to_op="Nominal to Numerical (2)" to_port="example set input"/>                  <connect from_op="Sort" from_port="example set output" to_op="Extract Macro" to_port="example set"/>                  <connect from_op="Extract Macro" from_port="example set" to_port="input 2"/>                  <connect from_op="Nominal to Numerical (2)" from_port="example set output" to_port="input 1"/>                  <portSpacing port="source_condition" spacing="0"/>                  <portSpacing port="source_input 1" spacing="0"/>                  <portSpacing port="source_input 2" spacing="0"/>                  <portSpacing port="sink_input 1" spacing="0"/>                  <portSpacing port="sink_input 2" spacing="0"/>                  <portSpacing port="sink_input 3" spacing="0"/>                  <description align="center" color="yellow" colored="false" height="66" resized="false" width="126" x="40" y="210">Less than 10 values? Transform into binary.</description>                </process>                <description align="center" color="transparent" colored="false" width="126">If more than 10, remove column. 
If less, transform to binary.</description>              </operator>              <connect from_port="input 1" to_op="Aggregate" to_port="example set input"/>              <connect from_op="Aggregate" from_port="example set output" to_op="Branch" to_port="condition"/>              <connect from_op="Aggregate" from_port="original" to_op="Branch" to_port="input 1"/>              <connect from_op="Branch" from_port="input 1" to_port="output 1"/>              <portSpacing port="source_input 1" spacing="0"/>              <portSpacing port="source_input 2" spacing="0"/>              <portSpacing port="sink_output 1" spacing="0"/>              <portSpacing port="sink_output 2" spacing="0"/>            </process>            <description align="center" color="transparent" colored="false" width="126">Remove nominal columns with too many values, transform the others to binary.</description>          </operator>          <connect from_port="input 1" to_op="Loop Attributes" to_port="input 1"/>          <connect from_op="Loop Attributes" from_port="output 1" to_port="input 1"/>          <portSpacing port="source_condition" spacing="0"/>          <portSpacing port="source_input 1" spacing="0"/>          <portSpacing port="source_input 2" spacing="0"/>          <portSpacing port="sink_input 1" spacing="0"/>          <portSpacing port="sink_input 2" spacing="0"/>        </process>        <process expanded="true">          <connect from_port="input 1" to_port="input 1"/>          <portSpacing port="source_condition" spacing="0"/>          <portSpacing port="source_input 1" spacing="0"/>          <portSpacing port="source_input 2" spacing="0"/>          <portSpacing port="sink_input 1" spacing="0"/>          <portSpacing port="sink_input 2" spacing="0"/>        </process>        <description align="center" color="transparent" colored="false" width="126">If there are nominal attributes, handle them inside</description>      </operator>      <operator activated="true" 
automodel="EXPORTED" class="sample_stratified" compatibility="8.1.001" expanded="true" height="82" name="Sample (Stratified)" width="90" x="715" y="85">        <parameter key="sample_size" value="500000"/>        <description align="center" color="transparent" colored="false" width="126">Sample down to 500,000 examples in case there are more.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="remove_useless_attributes" compatibility="8.1.001" expanded="true" height="82" name="Remove Useless Attributes" width="90" x="849" y="85">        <description align="center" color="transparent" colored="false" width="126">Remove constant columns, can happen especially for MISSING columns after dummy coding.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="normalize" compatibility="8.1.001" expanded="true" height="103" name="Normalize" width="90" x="983" y="85">        <description align="center" color="transparent" colored="false" width="126">Standardize all columns.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="order_attributes" compatibility="8.1.001" expanded="true" height="82" name="Reorder Attributes" width="90" x="1117" y="85">        <parameter key="sort_mode" value="alphabetically"/>        <description align="center" color="transparent" colored="false" width="126">Order columns alphabetically.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="concurrency:k_means" compatibility="8.1.001" expanded="true" height="82" name="Clustering" width="90" x="1251" y="85"/>      <operator activated="true" automodel="EXPORTED" class="multiply" compatibility="8.1.001" expanded="true" height="103" name="Multiply" width="90" x="1385" y="136">        <description align="center" color="transparent" colored="false" width="126">Create a copy of the data for learning a tree.</description>      </operator>      <operator 
activated="true" automodel="EXPORTED" class="model_simulator:cluster_model_visualizer" compatibility="8.1.001" expanded="true" height="82" name="Cluster Model Visualizer" width="90" x="1519" y="85">        <description align="center" color="transparent" colored="false" width="126">Creates the cluster model visualizations.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="sort" compatibility="8.1.001" expanded="true" height="82" name="Sort (2)" width="90" x="1519" y="289">        <parameter key="attribute_name" value="cluster"/>        <description align="center" color="transparent" colored="false" width="126">Sort according to clusters so that decision tree colors match the cluster colors later on.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="generate_attributes" compatibility="8.1.001" expanded="true" height="82" name="Generate Attributes" width="90" x="1653" y="289">        <list key="function_descriptions">          <parameter key="cluster_label" value="cluster"/>        </list>        <description align="center" color="transparent" colored="false" width="126">Generate a new label attribute from the sorted values to ensure consistent color schemes.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Set Role" width="90" x="1787" y="289">        <parameter key="attribute_name" value="cluster_label"/>        <parameter key="target_role" value="label"/>        <list key="set_additional_roles"/>        <description align="center" color="transparent" colored="false" width="126">Turn the newly generated column into the label for the decision tree.</description>      </operator>      <operator activated="true" automodel="EXPORTED" class="concurrency:parallel_decision_tree" compatibility="8.1.001" expanded="true" height="103" name="Decision Tree" width="90" x="1921" y="187">      
  <parameter key="apply_prepruning" value="false"/>        <description align="center" color="transparent" colored="false" width="126">Learn a model explaining the cluster assignments.</description>      </operator>      <operator activated="true" class="aggregate" compatibility="8.1.001" expanded="true" height="82" name="Aggregate (3)" width="90" x="2055" y="238">        <list key="aggregation_attributes">          <parameter key="a3" value="average"/>          <parameter key="a1" value="average"/>          <parameter key="a4" value="average"/>        </list>        <parameter key="group_by_attributes" value="cluster_label"/>      </operator>      <operator activated="true" class="aggregate" compatibility="8.1.001" expanded="true" height="82" name="Aggregate (4)" width="90" x="2189" y="340">        <list key="aggregation_attributes">          <parameter key="a3" value="average"/>          <parameter key="a1" value="average"/>          <parameter key="a4" value="average"/>        </list>      </operator>      <operator activated="true" class="generate_attributes" compatibility="8.1.001" expanded="true" height="82" name="Generate Attributes (2)" width="90" x="2323" y="289">        <list key="function_descriptions">          <parameter key="cluster_label" value="&quot;Total&quot;"/>        </list>      </operator>      <operator activated="true" class="append" compatibility="8.1.001" expanded="true" height="103" name="Append" width="90" x="2457" y="187"/>      <connect from_op="Retrieve Data" from_port="output" to_op="Preprocessing" to_port="in 1"/>      <connect from_op="Preprocessing" from_port="out 1" to_op="Replace Missing Values" to_port="in 1"/>      <connect from_op="Replace Missing Values" from_port="out 1" to_op="Select Attributes (2)" to_port="example set input"/>      <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Branch (2)" to_port="condition"/>      <connect from_op="Select Attributes (2)" from_port="original" 
to_op="Branch (2)" to_port="input 1"/>      <connect from_op="Branch (2)" from_port="input 1" to_op="Sample (Stratified)" to_port="example set input"/>      <connect from_op="Sample (Stratified)" from_port="example set output" to_op="Remove Useless Attributes" to_port="example set input"/>      <connect from_op="Remove Useless Attributes" from_port="example set output" to_op="Normalize" to_port="example set input"/>      <connect from_op="Normalize" from_port="example set output" to_op="Reorder Attributes" to_port="example set input"/>      <connect from_op="Reorder Attributes" from_port="example set output" to_op="Clustering" to_port="example set"/>      <connect from_op="Clustering" from_port="cluster model" to_op="Cluster Model Visualizer" to_port="model"/>      <connect from_op="Clustering" from_port="clustered set" to_op="Multiply" to_port="input"/>      <connect from_op="Multiply" from_port="output 1" to_op="Cluster Model Visualizer" to_port="clustered data"/>      <connect from_op="Multiply" from_port="output 2" to_op="Sort (2)" to_port="example set input"/>      <connect from_op="Cluster Model Visualizer" from_port="visualizer output" to_port="result 1"/>      <connect from_op="Sort (2)" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>      <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>      <connect from_op="Set Role" from_port="example set output" to_op="Decision Tree" to_port="training set"/>      <connect from_op="Set Role" from_port="original" to_port="result 3"/>      <connect from_op="Decision Tree" from_port="model" to_port="result 2"/>      <connect from_op="Decision Tree" from_port="exampleSet" to_op="Aggregate (3)" to_port="example set input"/>      <connect from_op="Aggregate (3)" from_port="example set output" to_op="Append" to_port="example set 1"/>      <connect from_op="Aggregate (3)" from_port="original" to_op="Aggregate (4)" 
to_port="example set input"/>      <connect from_op="Aggregate (4)" from_port="example set output" to_op="Generate Attributes (2)" to_port="example set input"/>      <connect from_op="Generate Attributes (2)" from_port="example set output" to_op="Append" to_port="example set 2"/>      <connect from_op="Append" from_port="merged set" to_port="result 4"/>      <portSpacing port="source_input 1" spacing="0"/>      <portSpacing port="sink_result 1" spacing="42"/>      <portSpacing port="sink_result 2" spacing="0"/>      <portSpacing port="sink_result 3" spacing="0"/>      <portSpacing port="sink_result 4" spacing="0"/>      <portSpacing port="sink_result 5" spacing="0"/>      <description align="left" color="yellow" colored="false" height="105" resized="true" width="263" x="1062" y="266">Results:&lt;br&gt;1. Cluster Model Visualization&lt;br&gt;2. Decision tree for cluster explanation&lt;br&gt;3. Clustered data</description>    </process>  </operator></process>`
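For readers who prefer not to import the process: the extra operators at the end (Aggregate (3), Aggregate (4), Generate Attributes (2) and Append) amount to one row of averages of a1, a3 and a4 per cluster, plus one "Total" row over all examples. Below is a rough Python sketch of that idea. The sample rows and the helper name `cluster_averages` are made up for illustration and are not RapidMiner output.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical clustered rows; the values are invented for illustration.
rows = [
    {"cluster_label": "cluster_0", "a1": 1.0, "a3": 2.0, "a4": 3.0},
    {"cluster_label": "cluster_0", "a1": 3.0, "a3": 4.0, "a4": 5.0},
    {"cluster_label": "cluster_1", "a1": -2.0, "a3": 0.0, "a4": 1.0},
]
FEATURES = ["a1", "a3", "a4"]


def cluster_averages(rows, features):
    """Return one row of averages per cluster, followed by a 'Total' row."""
    # Group the rows by cluster label (the role of "Aggregate (3)").
    groups = defaultdict(list)
    for row in rows:
        groups[row["cluster_label"]].append(row)

    result = []
    for label in sorted(groups):
        members = groups[label]
        averaged = {"cluster_label": label}
        for f in features:
            averaged[f"average({f})"] = mean(r[f] for r in members)
        result.append(averaged)

    # Average over all rows, labelled "Total" (the role of "Aggregate (4)"
    # followed by "Generate Attributes (2)"), then appended to the table.
    total = {"cluster_label": "Total"}
    for f in features:
        total[f"average({f})"] = mean(r[f] for r in rows)
    result.append(total)
    return result


table = cluster_averages(rows, FEATURES)
for line in table:
    print(line)
```

Note that the "Total" row averages over all examples, while the Auto Model text compares a cluster against the examples that are not in it, so the two baselines are not the same.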
• Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

Also keep in mind that the features are normalized before the clustering (and hence also before the calculation of the averages), so averages computed from the original, unnormalized values will not match the percentages shown.
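Here is a small sketch of why the normalization matters for the percentages. It standardizes one feature (z-score), then compares the in-cluster mean with the mean of the examples outside the cluster, as described earlier in the thread. The data, the function names and the choice to divide by the absolute value of the outside mean are assumptions for illustration; the exact formula Auto Model uses is not stated in this thread.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score: subtract the mean, divide by the standard deviation.

    The population standard deviation is used here as an assumption;
    the percentage below does not depend on that choice, only on the shift.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def percent_difference(values, in_cluster):
    """Percent by which the in-cluster mean differs from the out-of-cluster mean.

    Assumed formula: (mean_in - mean_out) / |mean_out| * 100.
    """
    inside = [v for v, flag in zip(values, in_cluster) if flag]
    outside = [v for v, flag in zip(values, in_cluster) if not flag]
    mean_in, mean_out = mean(inside), mean(outside)
    return (mean_in - mean_out) / abs(mean_out) * 100


# Hypothetical feature values and cluster membership flags.
raw = [10.0, 12.0, 20.0, 22.0]
in_cluster = [False, False, True, True]

raw_pct = percent_difference(raw, in_cluster)
norm_pct = percent_difference(standardize(raw), in_cluster)
print(raw_pct, norm_pct)
```

On this invented data the raw values give roughly 91%, while the standardized values give 200%. The same cluster can therefore look very different depending on whether the averages are taken before or after normalization.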