Counting occurrences of values

Ritika Member Posts: 11 Newbie
Hi! I want to count how many times each value appears in an attribute and print the count for each value. I tried the Aggregate operator, but I don't think it looks for and counts specific values. Also, I would really appreciate an answer with an explanation or screenshot of the actual process in RapidMiner rather than code.

Comments

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @Ritika

    Did you set the "Group by attributes" parameter?

    Regards,

    Lionel

  • Ritika Member Posts: 11 Newbie
    Hi Lionel,

    I used "group by attributes" to select the specific attribute where my values are. This only separates the entire attribute though. There are multiple values within this attribute -- I want to count the number of times each value occurs.
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    @Ritika,

    Please take a look at the process in the attached file and tell me whether it meets your needs.
    If yes, adapt it to your process.
    If not, please share your dataset and explain in more detail what you want to achieve.

    Regards,

    Lionel 
  • Ritika Member Posts: 11 Newbie
    I'm unable to open the file in RapidMiner; I get an error saying the file is malformed. Sorry about this. Is there another way you could send it to me?
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    @Ritika,

    Please copy the code below, paste it into your XML panel, and click the green check mark (the process will then appear in the main window):
    <?xml version="1.0" encoding="UTF-8"?><process version="9.9.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.9.002" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.9.002" expanded="true" height="68" name="Retrieve Golf" width="90" x="179" y="85">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.9.002" expanded="true" height="82" name="Aggregate" width="90" x="380" y="85">
            <parameter key="use_default_aggregation" value="false"/>
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="default_aggregation_function" value="average"/>
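            <!-- Aggregation to compute: count the Outlook entries within each group -->
            <list key="aggregation_attributes">
              <parameter key="Outlook" value="count"/>
            </list>
            <!-- Group by Outlook, so the result has one row per distinct Outlook value -->
            <parameter key="group_by_attributes" value="Outlook"/>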
            <parameter key="count_all_combinations" value="false"/>
            <parameter key="only_distinct" value="false"/>
            <parameter key="ignore_missings" value="true"/>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

    Regards,

    Lionel


  • Ritika Member Posts: 11 Newbie
    Yes, I was able to adapt it to my process! Thank you so much!
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    You're welcome!

    Good luck!

    Regards,

    Lionel
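
For anyone who wants to cross-check the result outside RapidMiner, here is a rough Python sketch of the idea behind the process in this thread: group on an attribute and count the rows in each group, which gives one count per distinct value. It is not how the Aggregate operator is implemented. The `outlook` list holds made-up sample values rather than the actual Golf data, and `count_occurrences` is a hypothetical helper name. Skipping missing entries is an assumption made for the sketch.

```python
from collections import Counter

# Hypothetical sample values for illustration only; replace with your own column.
outlook = ["sunny", "sunny", "overcast", "rain", "rain", "overcast", "sunny"]


def count_occurrences(values):
    """Return a {value: count} dict, skipping missing entries (None)."""
    return dict(Counter(v for v in values if v is not None))


counts = count_occurrences(outlook)

# Print one line per distinct value, in alphabetical order.
for value, n in sorted(counts.items()):
    print(f"{value}: {n}")
```

The result has the same shape as the Aggregate output: one row per distinct value, with its count next to it.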