user based discretization

satyendra Member Posts: 13 Contributor II
edited November 2018 in Help
Hi all,
I am facing a problem with user-based discretization. I am working on a medical dataset, and I want to divide the attribute 'pregnant' into three parts in the following manner:

low(0,1)
medium(2,4,5,6)
high(6<)

My problem is how to do this. Please help me with it.
With regards,
satyendra

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Satyendra,
    if you really want to find out which of your patients are medium pregnant, here's the way to go:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="-20" width="-50">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="131" y="90"/>
          <operator activated="true" class="rename" expanded="true" height="76" name="Rename" width="90" x="327" y="98">
            <parameter key="old_name" value="att1"/>
            <parameter key="new_name" value="pregnant"/>
          </operator>
          <operator activated="true" class="discretize_by_user_specification" expanded="true" height="94" name="Discretize" width="90" x="530" y="95">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="pregnant"/>
            <list key="classes">
              <parameter key="low" value="1.0"/>
              <parameter key="medium" value="6.0"/>
              <parameter key="high" value="Infinity"/>
            </list>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Discretize" to_port="example set input"/>
          <connect from_op="Discretize" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Anyway, an interesting philosophical question: can one be a little bit pregnant? :)

    Greetings,
      Sebastian
  • satyendra Member Posts: 13 Contributor II
    Hello sir,
    I could not explain my problem properly before. I am working on a diabetes dataset of females, each aged 21 or above. In this dataset I have to predict whether a patient is going to develop diabetes or not. The dataset contains several attributes such as 'plasma glucose', 'number of times the patient has been pregnant', 'body mass index', and so on. I want to do user-based discretization and then apply CHAID for learning. In the user-based discretization I want to discretize in the following manner:
    for plasma glucose

    low(<95)
    medium(95-140)
    high(140<)

    for body mass index

    low(<18.5)
    healthy(18.5-25)
    overweight(25-30)
    obese(30-35)
    severely obese(35<)


    and similarly for the other attributes. My problem is that I am not able to do this, so please help me with it.
    With regards,
    satyendra
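
For the thresholds listed above, the pattern from the process posted earlier can probably be reused with one Discretize by User Specification operator per attribute. The fragment below is only a sketch: the attribute names "plasma glucose" and "body mass index" and the operator names are assumptions and need to match the actual column names in the dataset. Each class is defined by its upper limit, so a value lying exactly on a boundary (for example 95 or 18.5) may fall into the lower class; shift the limits slightly if the other behaviour is wanted.

```xml
<!-- Sketch only: paste inside the inner <process> element and connect the operators. -->
<!-- Attribute names below are assumptions; use the names from your own dataset. -->
<operator activated="true" class="discretize_by_user_specification" expanded="true" name="Discretize Glucose">
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="plasma glucose"/>
  <list key="classes">
    <!-- each value is the upper limit of its class -->
    <parameter key="low" value="95.0"/>
    <parameter key="medium" value="140.0"/>
    <parameter key="high" value="Infinity"/>
  </list>
</operator>
<operator activated="true" class="discretize_by_user_specification" expanded="true" name="Discretize BMI">
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="body mass index"/>
  <list key="classes">
    <parameter key="low" value="18.5"/>
    <parameter key="healthy" value="25.0"/>
    <parameter key="overweight" value="30.0"/>
    <parameter key="obese" value="35.0"/>
    <parameter key="severely obese" value="Infinity"/>
  </list>
</operator>
```

The two operators would be chained one after the other (output of the first into the input of the second) before the data is passed on to the learner.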