Weighted Sum of Different Attributes

emanuelmcruz Member Posts: 6 Contributor I
edited November 2018 in Help

Hi everybody, I'm new to RapidMiner and I'm doing a project with a medical database.

I have 17 attributes with values of 1 or 0. Each attribute carries a score: some 1, some 2, some 3, and others 6. I want to create a new attribute that contains the sum of the scores, depending on the values of the different attributes.

For example, I want to count the score of Attribute_1 to Attribute_17 only when the attribute is 1, and then sum those scores into one new attribute (Sum Score).

 

I know this must be an easy problem, but I can't seem to find the answer. I tried "Generate Attributes" with "if-then" logic, but I can't sum the scores of the different attributes; I only get the last positive one.

Thank you in advance; I hope you can help me.

Best Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi again @emanuelmcruz,

     

    If attributes can only have 2 values (0 or 1), you can use the Generate Aggregation operator : 

    <?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="read_excel" compatibility="8.2.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
    <parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Somme_Si\Somme_Si.xlsx"/>
    <parameter key="imported_cell_range" value="A1:G3"/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <list key="data_set_meta_data_information">
    <parameter key="0" value="Att1.true.integer.attribute"/>
    <parameter key="1" value="Att2.true.integer.attribute"/>
    <parameter key="2" value="Att3.true.integer.attribute"/>
    <parameter key="3" value="Att4.true.integer.attribute"/>
    <parameter key="4" value="Att5.true.integer.attribute"/>
    <parameter key="5" value="Att6.true.integer.attribute"/>
    <parameter key="6" value="Att7.true.integer.attribute"/>
    </list>
    </operator>
    <operator activated="true" class="generate_aggregation" compatibility="8.2.000" expanded="true" height="82" name="Generate Aggregation" width="90" x="179" y="34">
    <parameter key="attribute_name" value="sum"/>
    </operator>
    <connect from_op="Read Excel" from_port="output" to_op="Generate Aggregation" to_port="example set input"/>
    <connect from_op="Generate Aggregation" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
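
    If the attributes carry different scores, as in the original question, a plain sum only counts the positives. Below is a minimal sketch of a weighted variant, assuming a Generate Attributes operator takes the place of Generate Aggregation in the process above. The attribute names Att1 to Att4 come from that process; the weights 1, 2, 3 and 6 and the name SumScore are placeholders, to be replaced by the real score of each attribute.

    ```xml
    <!-- Sketch: replaces the Generate Aggregation operator and its two connections in the process above. -->
    <!-- The weights below are hypothetical; use the actual score of each attribute. -->
    <operator activated="true" class="generate_attributes" compatibility="8.2.000" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="34">
    <list key="function_descriptions">
    <!-- Each 0/1 attribute is multiplied by its score, so only attributes equal to 1 contribute. -->
    <parameter key="SumScore" value="1*Att1 + 2*Att2 + 3*Att3 + 6*Att4"/>
    </list>
    </operator>
    <connect from_op="Read Excel" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
    ```

    Because every attribute is 0 or 1, multiplying by the score avoids nested if-then expressions: an attribute equal to 0 simply adds nothing to the total.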

    I hope it helps,

     

    Regards,

     

    Lionel

     

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    Counting the positives is the same as computing the sum if your possible values are only zero or one.  So "Generate Aggregation" across your 17 attributes with the function "sum" should do the trick for you.
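
    A minimal sketch of those settings, assuming the columns are named Attribute_1 to Attribute_17 as in the question. Only three names are listed here for brevity, and the output name "Sum Score" is illustrative; the full list would contain all 17 names separated by "|".

    ```xml
    <!-- Sketch: sums only the listed attributes into a new attribute. -->
    <operator activated="true" class="generate_aggregation" compatibility="8.2.000" expanded="true" height="82" name="Generate Aggregation" width="90" x="179" y="34">
    <parameter key="attribute_name" value="Sum Score"/>
    <!-- Restrict the aggregation to a subset so other columns (IDs, labels) are not summed. -->
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Attribute_1|Attribute_2|Attribute_3"/>
    <parameter key="aggregation_function" value="sum"/>
    </operator>
    ```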

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts

Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @emanuelmcruz

     

    Can you

     - share your dataset and

     - based on an extract of your dataset, post an example of what you want to obtain.

     

    Regards,

     

    Lionel

  • emanuelmcruz Member Posts: 6 Contributor I

    I want to create a new column that holds the number of preceding attributes that are positive.

    For example, in this image, I want a column that counts how many of the 17 attributes are positive.

    It works like an index: if an attribute is positive it has a score, and I want to sum over the attributes that are positive.

    It's a Charlson Comorbidity Index

    cci.JPG 30.4K
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi again @emanuelmcruz,

     

    Just to be sure I understand: what you want to obtain looks like this (example with 7 attributes):

     

    Somme_Si.png

     

    Regards,

     

    Lionel

  • emanuelmcruz Member Posts: 6 Contributor I

    Exactly like that. The attributes can only be 1 or 0, but the last column is correct.

     

    How can I do that?

  • emanuelmcruz Member Posts: 6 Contributor I

    I used Generate Aggregation, but I can't seem to get the parameters right. Do I have to use Select Attributes and then Generate Aggregation? And which parameters do I use so that only the positives are counted?

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi again @emanuelmcruz,

     

    I have difficulty understanding the content of your dataset:

    I thought you had only 0 and 1 in your dataset, so all your values are positive?

    To sum up, you have 17 attributes containing only 0 and 1, plus additional attribute(s) with negative values? Is that right?

     

    Regards,

     

    Lionel

     

  • emanuelmcruz Member Posts: 6 Contributor I
    I only have 0 and 1 in all 17 attributes; I want to count the number of times the number 1 appears.