RapidMiner

RM Certified Expert

Re: Setting Binominal Label to positive or negative

Hi,
the solution is comparatively easy:
Your label simply isn't binominal, so the remapping operator can't do anything with it. I must admit that it should notify you about this, and I have added the proper meta data check to the operator. This will be available in the next version.

To make your process work, you have to insert a Nominal to Binominal operator, as in the process below:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="463" width="1083">
      <operator activated="true" class="set_macros" expanded="true" height="60" name="Define Makros" width="90" x="45" y="30">
        <list key="macros">
          <parameter key="DataRoot" value="/home/jdoe/data/"/>
        </list>
      </operator>
      <operator activated="true" class="subprocess" expanded="true" height="76" name="Read Zd 15-25" width="90" x="45" y="120">
        <process expanded="true" height="463" width="844">
          <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV (7)" width="90" x="112" y="30">
            <description>Reads two CSV files and combines the example sets</description>
            <parameter key="file_name" value="C:\Dokumente und Einstellungen\sland\Desktop\Hadron_4000_Zd_15_25.csv"/>
          </operator>
          <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV (8)" width="90" x="112" y="210">
            <description>Reads two CSV files and combines the example sets</description>
            <parameter key="file_name" value="C:\Dokumente und Einstellungen\sland\Desktop\Gamma_4000_Zd_15_25.csv"/>
          </operator>
          <operator activated="true" class="append" expanded="true" height="94" name="Append (4)" width="90" x="315" y="30"/>
          <connect from_op="Read CSV (7)" from_port="output" to_op="Append (4)" to_port="example set 1"/>
          <connect from_op="Read CSV (8)" from_port="output" to_op="Append (4)" to_port="example set 2"/>
          <connect from_op="Append (4)" from_port="merged set" to_port="out 1"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" expanded="true" height="76" name="Mark Id" width="90" x="185" y="120">
        <parameter key="name" value="idx"/>
        <parameter key="target_role" value="id"/>
      </operator>
      <operator activated="true" class="set_role" expanded="true" height="76" name="Mark Label" width="90" x="313" y="120">
        <parameter key="name" value="ExpectedLabel"/>
        <parameter key="target_role" value="label"/>
      </operator>
      <operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="447" y="210">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="ExpectedLabel"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="remap_binominals" expanded="true" height="76" name="Remap Binominals" width="90" x="581" y="255">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="ExpectedLabel"/>
        <parameter key="include_special_attributes" value="true"/>
        <parameter key="negative_value" value="Gamma"/>
        <parameter key="positive_value" value="Hadron"/>
      </operator>
      <operator activated="true" class="naive_bayes" expanded="true" height="76" name="Naive Bayes (2)" width="90" x="648" y="75"/>
      <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model (2)" width="90" x="782" y="75">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" breakpoints="after" class="performance_binominal_classification" expanded="true" height="76" name="Performance (2)" width="90" x="916" y="120">
        <parameter key="main_criterion" value="accuracy"/>
        <parameter key="precision" value="true"/>
        <parameter key="recall" value="true"/>
      </operator>
      <connect from_op="Read Zd 15-25" from_port="out 1" to_op="Mark Id" to_port="example set input"/>
      <connect from_op="Mark Id" from_port="example set output" to_op="Mark Label" to_port="example set input"/>
      <connect from_op="Mark Label" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
      <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Remap Binominals" to_port="example set input"/>
      <connect from_op="Remap Binominals" from_port="example set output" to_op="Naive Bayes (2)" to_port="training set"/>
      <connect from_op="Naive Bayes (2)" from_port="model" to_op="Apply Model (2)" to_port="model"/>
      <connect from_op="Naive Bayes (2)" from_port="exampleSet" to_op="Apply Model (2)" to_port="unlabelled data"/>
      <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
      <connect from_op="Performance (2)" from_port="performance" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>


Greetings,
  Sebastian
Old World Computing - Establishing the Future

Professional consulting for your Data Science problems

Contributor II Rasputin

Re: Setting Binominal Label to positive or negative

Hi,
thanks for your reply. With the modified process I got it working.
But the "Nominal to Binominal" operator has no effect unless I enable the option "transform binominal", so it seems to believe that the data is already binominal. With this option enabled, the process works. One remark, though: the "Remap Binominals" operator should output a warning if the example set does not contain the specified attribute or if the attribute does not contain the specified values. Currently it just continues silently, which is very annoying if you have a typo in one of the fields.
RM Certified Expert

Re: Setting Binominal Label to positive or negative

Hi,
this is exactly what I added to the code immediately. By the way: you must NOT turn this parameter on unless you want your binominal attribute dichotomized. If you just want to change the attribute type, leave it off. If you set a breakpoint just after the operator, you will see that the type has changed, but nothing more.
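
As a minimal sketch of the setting described here, the operator fragment below writes the option out explicitly. The parameter key `transform_binominal` is an assumption derived from the option label "transform binominal"; all other keys and values are copied from the process posted earlier in this thread.

```xml
<operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="447" y="210">
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="ExpectedLabel"/>
  <parameter key="include_special_attributes" value="true"/>
  <!-- Assumed key for the "transform binominal" option. Off: only the attribute type changes. On: the attribute is dichotomized. -->
  <parameter key="transform_binominal" value="false"/>
</operator>
```

A breakpoint after this operator (`breakpoints="after"`, as used on the Performance operator in the process above) should let you check the attribute type in the meta data view.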

Greetings,
  Sebastian

Contributor II Rasputin

Re: Setting Binominal Label to positive or negative

No, the type does not change if I do not enable "transform binominal". If I do, I get those two new binominal attributes "Label = A" and "Label = B", which is also not exactly what I want.
RM Certified Expert

Re: Setting Binominal Label to positive or negative

Hi,
I have posted a small process to the RapidMiner community extension. If you install this extension, you can open the process called "Correct Attribute Type to Binominal". Please take a look at it, and if it does not work as expected, update your RapidMiner to the latest version.

Greetings,
  Sebastian

Contributor II Rasputin

Re: Setting Binominal Label to positive or negative

Thank you, now I got it working. For some reason, I had "create view" enabled in the operator. After disabling it, everything worked fine.
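
For readers hitting the same problem, this fix might look roughly as follows in the process XML. This is only a sketch: the key `create_view` is an assumption derived from the option label "create view", and the post does not say which operator had it enabled, so Nominal to Binominal is used here purely as an illustration.

```xml
<operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="447" y="210">
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="ExpectedLabel"/>
  <parameter key="include_special_attributes" value="true"/>
  <!-- Assumed key for the "create view" option, disabled as described in the post above -->
  <parameter key="create_view" value="false"/>
</operator>
```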