Doubt about median

mcarballo Member Posts: 1 Newbie
edited November 2018 in Help
Hi, I have 2 rows with a column containing the values 133 and 41. I want to know the median, so I used the Aggregate operator. It gave me back 41 as the result, but the median is in fact 87. Could someone help me with this? Why does RM give me back the first value?

Regards.

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,666 RM Founder

    Are you sure that you selected the median function and did not accidentally use minimum? I just tried this myself, both in Turbo Prep (TP) and with a process, and it returned the midpoint between the two numbers for an even number of rows, as in your example. See the process below for an example.

    Hope this helps,
    Ingo

    <?xml version="1.0" encoding="UTF-8"?>
    <process version="9.1.000-BETA2">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.1.000-BETA2" expanded="true" name="Process" origin="EXPORTED_TURBOPREP">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="subprocess" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Subprocess (4)" width="90" x="45" y="34">
            <process expanded="true">
              <operator activated="true" class="retrieve" compatibility="9.1.000-BETA2" expanded="true" height="68" name="Retrieve" origin="EXPORTED_TURBOPREP" width="90" x="45" y="34">
                <parameter key="repository_entry" value="//Samples/data/Titanic"/>
                <description align="center" color="transparent" colored="false" width="126">Loading Titanic</description>
              </operator>
              <operator activated="true" class="subprocess" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Subprocess" origin="EXPORTED_TURBOPREP" width="90" x="179" y="34">
                <process expanded="true">
                  <operator activated="true" class="nominal_to_text" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Nominal to Text" origin="EXPORTED_TURBOPREP" width="90" x="45" y="34">
                    <parameter key="attribute_filter_type" value="value_type"/>
                    <parameter key="attribute" value=""/>
                    <parameter key="attributes" value=""/>
                    <parameter key="use_except_expression" value="false"/>
                    <parameter key="value_type" value="nominal"/>
                    <parameter key="use_value_type_exception" value="false"/>
                    <parameter key="except_value_type" value="file_path"/>
                    <parameter key="block_type" value="single_value"/>
                    <parameter key="use_block_type_exception" value="false"/>
                    <parameter key="except_block_type" value="single_value"/>
                    <parameter key="invert_selection" value="false"/>
                    <parameter key="include_special_attributes" value="false"/>
                    <description align="center" color="transparent" colored="false" width="126">Change all categorical columns to text</description>
                  </operator>
                  <operator activated="true" class="text_to_nominal" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Text to Nominal" origin="EXPORTED_TURBOPREP" width="90" x="179" y="34">
                    <parameter key="attribute_filter_type" value="value_type"/>
                    <parameter key="attribute" value=""/>
                    <parameter key="attributes" value=""/>
                    <parameter key="use_except_expression" value="false"/>
                    <parameter key="value_type" value="text"/>
                    <parameter key="use_value_type_exception" value="false"/>
                    <parameter key="except_value_type" value="text"/>
                    <parameter key="block_type" value="value_matrix"/>
                    <parameter key="use_block_type_exception" value="false"/>
                    <parameter key="except_block_type" value="value_matrix_row_start"/>
                    <parameter key="invert_selection" value="false"/>
                    <parameter key="include_special_attributes" value="false"/>
                    <description align="center" color="transparent" colored="false" width="126">Change all text columns to nominal</description>
                  </operator>
                  <operator activated="true" class="numerical_to_real" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Numerical to Real" origin="EXPORTED_TURBOPREP" width="90" x="313" y="34">
                    <parameter key="attribute_filter_type" value="value_type"/>
                    <parameter key="attribute" value=""/>
                    <parameter key="attributes" value=""/>
                    <parameter key="use_except_expression" value="false"/>
                    <parameter key="value_type" value="numeric"/>
                    <parameter key="use_value_type_exception" value="false"/>
                    <parameter key="except_value_type" value="real"/>
                    <parameter key="block_type" value="value_series"/>
                    <parameter key="use_block_type_exception" value="false"/>
                    <parameter key="except_block_type" value="value_series_end"/>
                    <parameter key="invert_selection" value="false"/>
                    <parameter key="include_special_attributes" value="false"/>
                    <description align="center" color="transparent" colored="false" width="126">Change all numerical columns to real</description>
                  </operator>
                  <connect from_port="in 1" to_op="Nominal to Text" to_port="example set input"/>
                  <connect from_op="Nominal to Text" from_port="example set output" to_op="Text to Nominal" to_port="example set input"/>
                  <connect from_op="Text to Nominal" from_port="example set output" to_op="Numerical to Real" to_port="example set input"/>
                  <connect from_op="Numerical to Real" from_port="example set output" to_port="out 1"/>
                  <portSpacing port="source_in 1" spacing="0"/>
                  <portSpacing port="source_in 2" spacing="0"/>
                  <portSpacing port="sink_out 1" spacing="0"/>
                  <portSpacing port="sink_out 2" spacing="0"/>
                </process>
                <description align="center" color="transparent" colored="false" width="126">Unify column types</description>
              </operator>
              <operator activated="true" class="subprocess" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Subprocess (2)" origin="EXPORTED_TURBOPREP" width="90" x="313" y="34">
                <process expanded="true">
                  <operator activated="true" class="aggregate" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Aggregate" origin="EXPORTED_TURBOPREP" width="90" x="45" y="34">
                    <parameter key="use_default_aggregation" value="false"/>
                    <parameter key="attribute_filter_type" value="all"/>
                    <parameter key="attribute" value=""/>
                    <parameter key="attributes" value=""/>
                    <parameter key="use_except_expression" value="false"/>
                    <parameter key="value_type" value="attribute_value"/>
                    <parameter key="use_value_type_exception" value="false"/>
                    <parameter key="except_value_type" value="time"/>
                    <parameter key="block_type" value="attribute_block"/>
                    <parameter key="use_block_type_exception" value="false"/>
                    <parameter key="except_block_type" value="value_matrix_row_start"/>
                    <parameter key="invert_selection" value="false"/>
                    <parameter key="include_special_attributes" value="false"/>
                    <parameter key="default_aggregation_function" value="average"/>
                    <list key="aggregation_attributes">
                      <parameter key="Age" value="maximum"/>
                    </list>
                    <parameter key="group_by_attributes" value="Survived"/>
                    <parameter key="count_all_combinations" value="false"/>
                    <parameter key="only_distinct" value="false"/>
                    <parameter key="ignore_missings" value="true"/>
                  </operator>
                  <connect from_port="in 1" to_op="Aggregate" to_port="example set input"/>
                  <connect from_op="Aggregate" from_port="example set output" to_port="out 1"/>
                  <portSpacing port="source_in 1" spacing="0"/>
                  <portSpacing port="source_in 2" spacing="0"/>
                  <portSpacing port="sink_out 1" spacing="0"/>
                  <portSpacing port="sink_out 2" spacing="0"/>
                </process>
                <description align="center" color="transparent" colored="false" width="126">Aggregate Age grouped by Survived</description>
              </operator>
              <connect from_op="Retrieve" from_port="output" to_op="Subprocess" to_port="in 1"/>
              <connect from_op="Subprocess" from_port="out 1" to_op="Subprocess (2)" to_port="in 1"/>
              <connect from_op="Subprocess (2)" from_port="out 1" to_port="out 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">Create data with two rows only...</description>
          </operator>
          <operator activated="true" class="subprocess" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Subprocess (3)" origin="EXPORTED_TURBOPREP" width="90" x="179" y="34">
            <process expanded="true">
              <operator activated="true" class="aggregate" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Aggregate (2)" origin="EXPORTED_TURBOPREP" width="90" x="45" y="34">
                <parameter key="use_default_aggregation" value="false"/>
                <parameter key="attribute_filter_type" value="all"/>
                <parameter key="attribute" value=""/>
                <parameter key="attributes" value=""/>
                <parameter key="use_except_expression" value="false"/>
                <parameter key="value_type" value="attribute_value"/>
                <parameter key="use_value_type_exception" value="false"/>
                <parameter key="except_value_type" value="time"/>
                <parameter key="block_type" value="attribute_block"/>
                <parameter key="use_block_type_exception" value="false"/>
                <parameter key="except_block_type" value="value_matrix_row_start"/>
                <parameter key="invert_selection" value="false"/>
                <parameter key="include_special_attributes" value="false"/>
                <parameter key="default_aggregation_function" value="average"/>
                <list key="aggregation_attributes">
                  <parameter key="maximum(Age)" value="median"/>
                </list>
                <parameter key="group_by_attributes" value=""/>
                <parameter key="count_all_combinations" value="false"/>
                <parameter key="only_distinct" value="false"/>
                <parameter key="ignore_missings" value="true"/>
              </operator>
              <connect from_port="in 1" to_op="Aggregate (2)" to_port="example set input"/>
              <connect from_op="Aggregate (2)" from_port="example set output" to_port="out 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="source_in 2" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">Aggregate (Median)</description>
          </operator>
          <connect from_op="Subprocess (4)" from_port="out 1" to_op="Subprocess (3)" to_port="in 1"/>
          <connect from_op="Subprocess (3)" from_port="out 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
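
As a quick cross-check of the arithmetic outside RapidMiner, here is a minimal Python sketch. It is not part of the original thread: Python's standard `statistics` module serves only as an independent reference, and the variable names are illustrative. It compares the median of the two values from the question with their minimum, which is the value the question reports getting back.

```python
from statistics import median

# The two values from the question (one column, two rows).
values = [133, 41]

# With an even number of values, the median is the mean of the two
# middle values after sorting: (41 + 133) / 2.
med = median(values)

# The minimum, for comparison: this matches the 41 reported in the question.
low = min(values)

print(med)  # 87.0
print(low)  # 41
```

If the Aggregate operator returns 41 for this column, that matches the minimum rather than the median, which is consistent with the suggestion above to double-check the selected aggregation function.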


    RapidMiner Wisdom 2020
    February 11th and 12th 2020 in Boston, MA, USA
