Can we store statistics data in Excel?

Himanshu_Pant Member Posts: 46 Contributor I
The statistics data is stored in an rmhdf5table. Can I get it into Excel?

Best Answer

  • BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Solution Accepted
    Hi,

    you can save some metadata into an Excel file using the Reporting extension.
    https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_reporting

    For your more precise question, check out the Aggregate operator. It has a setting "use default aggregation", which applies the function you select (minimum, maximum or average) to all attributes in one step. Add one Aggregate operator for each statistic you need.

    The "Attributes on rows" part can be done with De-Pivot.

    Here's an example process:
    <?xml version="1.0" encoding="UTF-8"?><process version="9.9.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.9.002" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="-1"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.9.002" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="9.9.002" expanded="true" height="124" name="Multiply" width="90" x="179" y="34"/>
          <operator activated="true" class="aggregate" compatibility="9.9.002" expanded="true" height="82" name="Aggregate max" width="90" x="313" y="238">
            <parameter key="use_default_aggregation" value="true"/>
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="default_aggregation_function" value="maximum"/>
            <list key="aggregation_attributes"/>
            <parameter key="group_by_attributes" value=""/>
            <parameter key="count_all_combinations" value="false"/>
            <parameter key="only_distinct" value="false"/>
            <parameter key="ignore_missings" value="true"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.9.002" expanded="true" height="82" name="Aggregate avg" width="90" x="313" y="34">
            <parameter key="use_default_aggregation" value="true"/>
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="default_aggregation_function" value="average"/>
            <list key="aggregation_attributes"/>
            <parameter key="group_by_attributes" value=""/>
            <parameter key="count_all_combinations" value="false"/>
            <parameter key="only_distinct" value="false"/>
            <parameter key="ignore_missings" value="true"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.9.002" expanded="true" height="82" name="Aggregate min" width="90" x="313" y="136">
            <parameter key="use_default_aggregation" value="true"/>
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="default_aggregation_function" value="minimum"/>
            <list key="aggregation_attributes"/>
            <parameter key="group_by_attributes" value=""/>
            <parameter key="count_all_combinations" value="false"/>
            <parameter key="only_distinct" value="false"/>
            <parameter key="ignore_missings" value="true"/>
          </operator>
          <operator activated="true" class="cartesian_product" compatibility="9.9.002" expanded="true" height="82" name="Cartesian first" width="90" x="514" y="34">
            <parameter key="remove_double_attributes" value="true"/>
          </operator>
          <operator activated="true" class="cartesian_product" compatibility="9.9.002" expanded="true" height="82" name="Cartesian second" width="90" x="648" y="85">
            <parameter key="remove_double_attributes" value="true"/>
          </operator>
          <operator activated="true" class="de_pivot" compatibility="9.9.002" expanded="true" height="82" name="De-Pivot" width="90" x="782" y="85">
            <list key="attribute_name">
              <parameter key="average" value="average.+"/>
              <parameter key="min" value="minimum.+"/>
              <parameter key="max" value="maximum.+"/>
            </list>
            <parameter key="index_attribute" value="idx"/>
            <parameter key="create_nominal_index" value="false"/>
            <parameter key="keep_missings" value="false"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Aggregate avg" to_port="example set input"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Aggregate min" to_port="example set input"/>
          <connect from_op="Multiply" from_port="output 3" to_op="Aggregate max" to_port="example set input"/>
          <connect from_op="Aggregate max" from_port="example set output" to_op="Cartesian second" to_port="right"/>
          <connect from_op="Aggregate avg" from_port="example set output" to_op="Cartesian first" to_port="left"/>
          <connect from_op="Aggregate min" from_port="example set output" to_op="Cartesian first" to_port="right"/>
          <connect from_op="Cartesian first" from_port="join" to_op="Cartesian second" to_port="left"/>
          <connect from_op="Cartesian second" from_port="join" to_op="De-Pivot" to_port="example set input"/>
          <connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="42"/>
          <portSpacing port="sink_result 2" spacing="84"/>
        </process>
      </operator>
    </process>


    This is probably more flexible than just getting the metadata. You can set breakpoints on the operators to see their intermediate results.
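
    For comparison, the same aggregate-then-de-pivot logic can be sketched outside RapidMiner. This is only a rough illustration in plain Python, not part of the process above: the function names `summarize` and `write_summary_csv` and the sample rows are made up for the example, and the column names High, Low and Average are taken from the layout requested in this thread. Missing values are skipped, similar to "ignore missings" in the Aggregate operators.

```python
import csv
import io
from statistics import mean


def summarize(rows, attributes):
    """Return one summary row per attribute (hypothetical helper)."""
    summary = []
    for name in attributes:
        # Skip missing values, similar to ignore_missings="true" in the process.
        values = [float(r[name]) for r in rows if r.get(name) not in (None, "")]
        if not values:
            continue
        summary.append({
            "Attribute": name,
            "High": max(values),
            "Low": min(values),
            "Average": mean(values),
        })
    return summary


def write_summary_csv(summary, out):
    """Write the summary rows as CSV to any file-like object (hypothetical helper)."""
    writer = csv.DictWriter(
        out,
        fieldnames=["Attribute", "High", "Low", "Average"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(summary)


# Made-up sample data; Attribute2 has one missing value.
rows = [
    {"Attribute1": 100, "Attribute2": 10},
    {"Attribute1": 50, "Attribute2": ""},
    {"Attribute1": 75, "Attribute2": 20},
]

summary = summarize(rows, ["Attribute1", "Attribute2"])
buffer = io.StringIO()
write_summary_csv(summary, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

    To write a real file instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")` to `write_summary_csv`. Excel can open the resulting CSV directly.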


    Regards,

    Balázs


Answers

  • Himanshu_Pant Member Posts: 46 Contributor I
    Actually, I want to create an Excel or CSV file that has the high, low and average value of each attribute, e.g.:

    Attribute     High  Low  Average
    Attribute1    100   50   75