
[SOLVED] Convert binominal to numeric?

wessel Member Posts: 537 Maven
edited November 2018 in Help
Dear All,

I have several binominal attributes on which I wish to run linear regression.
So I must convert these binominal attributes with values "true" and "false" to real attributes with values "1" and "0".
How can I do this?

I tried the Generate Attributes operator, but this did not work.
I used the following settings:
attribute name: myNewAtt
functional expression: if(myAtt == true, 1, 0)

Even though this expression looks functionally correct, it always returns 0.

Best regards,

Wessel
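
A possible variation on the Generate Attributes attempt above, offered as an untested sketch rather than a confirmed fix: if the unquoted `true` in the expression is not matched against the nominal value "true", comparing against the quoted string may behave differently. The operator class `generate_attributes` and the list key `function_descriptions` are written from memory of RapidMiner 5 process exports and are assumptions here; building the operator in the GUI and checking its XML view is the reliable way to confirm them. `myAtt` and `myNewAtt` are the names from the question.

```xml
<!-- Untested sketch: Generate Attributes with the nominal value quoted. -->
<!-- Class name and list key are assumptions; layout attributes (x, y, height, width) are omitted. -->
<operator activated="true" class="generate_attributes" compatibility="5.1.017" expanded="true" name="Generate Attributes">
  <list key="function_descriptions">
    <!-- &quot;true&quot; is the XML-escaped form of "true", so the comparison is against the text value -->
    <parameter key="myNewAtt" value="if(myAtt == &quot;true&quot;, 1, 0)"/>
  </list>
</operator>
```

If this still returns 0 for every example, the Replace and Parse Numbers route or the Nominal to Numerical route described in the answers does not depend on the expression parser at all.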

Answers

  • wessel Member Posts: 537 Maven
    A process that does work uses the following operators:
    1. Replace (replace all "true" values with 1)
    2. Replace (replace all "false" values with 0)
    3. Parse Numbers (convert the resulting "1" and "0" strings to numbers)

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.017">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
       <process expanded="true" height="642" width="778">
         <operator activated="true" class="replace" compatibility="5.1.017" expanded="true" height="76" name="Replace" width="90" x="59" y="140">
           <parameter key="attribute_filter_type" value="subset"/>
           <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
           <parameter key="replace_what" value="true"/>
           <parameter key="replace_by" value="1"/>
         </operator>
         <operator activated="true" class="replace" compatibility="5.1.017" expanded="true" height="76" name="Replace (2)" width="90" x="187" y="85">
           <parameter key="attribute_filter_type" value="subset"/>
           <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
           <parameter key="replace_what" value="false"/>
           <parameter key="replace_by" value="0"/>
         </operator>
         <operator activated="true" class="parse_numbers" compatibility="5.1.017" expanded="true" height="76" name="Parse Numbers" width="90" x="315" y="30">
           <parameter key="attribute_filter_type" value="subset"/>
           <parameter key="attributes" value="|cluster_2|cluster_1|cluster_0"/>
         </operator>
         <connect from_op="Replace" from_port="example set output" to_op="Replace (2)" to_port="example set input"/>
         <connect from_op="Replace (2)" from_port="example set output" to_op="Parse Numbers" to_port="example set input"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
       </process>
     </operator>
    </process>
  • earmijo Member Posts: 271 Unicorn
    Hi Wessel:

    Two additional solutions:

    1) Use Weka's Linear Regression operator. It will code the binominal attributes for you automatically. This is sooooo convenient.

    2) Use the "Nominal to Numerical" operator and select dummy coding. You then have to define, for each binominal variable, a "comparison group", which is the value that gets coded as 0. Going by your message, the comparison group would be "false".

    Regards,

    \E

    Here's an example that uses the Golf dataset:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.1.017">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
        <process expanded="true" height="637" width="950">
          <operator activated="true" class="retrieve" compatibility="5.1.017" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="nominal_to_numerical" compatibility="5.1.017" expanded="true" height="94" name="Nominal to Numerical" width="90" x="182" y="72">
            <parameter key="coding_type" value="dummy coding"/>
            <parameter key="use_comparison_groups" value="true"/>
            <list key="comparison_groups">
              <parameter key="Wind" value="false"/>
              <parameter key="Outlook" value="sunny"/>
            </list>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
          <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>