Issue with Nominal to Binominal

mattwl Member Posts: 5 Contributor I
edited November 2018 in Help
Hello

I am new to RapidMiner and still finding my way around.
While building a model, I keep hitting the same error and have been stuck on it for hours.
The error occurs when I try to convert an attribute with the 'Nominal to Binominal' operator.
The message is 'Attribute filter does not match any attributes', raised when I apply 'Nominal to Binominal' to the attribute named 'location3'.
However, I have a breakpoint on the preceding step, and it shows that the step does output the attribute named 'location3'.
Any ideas?

Thanks in advance.

Here is my XML:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.4.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="6.4.000" expanded="true" name="Process">
    <parameter key="resultfile" value="/Users/ml/Documents/Development/Social Data/tourism/validation-location-select.res"/>
    <process expanded="true">
      <operator activated="true" class="read_csv" compatibility="6.4.000" expanded="true" height="60" name="Read CSV" width="90" x="45" y="75">
        <parameter key="csv_file" value="/Users/ml/Documents/Development/Social Data/tourism/training locaton data for melb.csv"/>
        <parameter key="column_separators" value=","/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="encoding" value="UTF-8"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="id.true.integer.attribute"/>
          <parameter key="1" value="twitter_id.true.real.attribute"/>
          <parameter key="2" value="location(2).true.text.attribute"/>
          <parameter key="3" value="locatoin3.true.nominal.attribute"/>
          <parameter key="4" value="tourist.true.nominal.label"/>
        </list>
      </operator>
      <operator activated="true" class="set_role" compatibility="6.4.000" expanded="true" height="76" name="Set Role" width="90" x="179" y="75">
        <parameter key="attribute_name" value="id"/>
        <parameter key="target_role" value="id"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="text:process_document_from_data" compatibility="6.1.000" expanded="true" height="76" name="Process Documents from Data" width="90" x="313" y="75">
        <parameter key="keep_text" value="true"/>
        <parameter key="prune_below_absolute" value="2"/>
        <parameter key="prune_above_absolute" value="999999"/>
        <list key="specify_weights"/>
        <process expanded="true">
          <operator activated="true" class="text:tokenize" compatibility="6.1.000" expanded="true" height="60" name="Tokenize" width="90" x="112" y="75"/>
          <operator activated="true" class="text:transform_cases" compatibility="6.1.000" expanded="true" height="60" name="Transform Cases" width="90" x="246" y="75"/>
          <operator activated="true" class="text:filter_by_length" compatibility="6.1.000" expanded="true" height="60" name="Filter Tokens (by Length)" width="90" x="447" y="75">
            <parameter key="min_chars" value="2"/>
            <parameter key="max_chars" value="200"/>
          </operator>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Transform Cases" to_port="document"/>
          <connect from_op="Transform Cases" from_port="document" to_op="Filter Tokens (by Length)" to_port="document"/>
          <connect from_op="Filter Tokens (by Length)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="set_role" compatibility="6.4.000" expanded="true" height="76" name="Set Role (2)" width="90" x="447" y="75">
        <parameter key="attribute_name" value="locatoin3"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="x_validation" compatibility="6.4.000" expanded="true" height="130" name="Validation" width="90" x="447" y="210">
        <parameter key="number_of_validations" value="5"/>
        <parameter key="sampling_type" value="stratified sampling"/>
        <process expanded="true">
          <operator activated="true" breakpoints="after" class="select_attributes" compatibility="6.4.000" expanded="true" height="76" name="Select Attributes" width="90" x="45" y="75">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="locatoin3"/>
            <parameter key="attributes" value="locatoin3|tourist"/>
          </operator>
          <operator activated="true" class="nominal_to_binominal" compatibility="6.4.000" expanded="true" height="94" name="Nominal to Binominal" width="90" x="45" y="210">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="locatoin3"/>
            <parameter key="transform_binominal" value="true"/>
          </operator>
          <operator activated="true" class="support_vector_machine_linear" compatibility="6.4.000" expanded="true" height="76" name="SVM (Linear)" width="90" x="179" y="435"/>
          <connect from_port="training" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
          <connect from_op="Nominal to Binominal" from_port="example set output" to_op="SVM (Linear)" to_port="training set"/>
          <connect from_op="SVM (Linear)" from_port="model" to_port="model"/>
          <portSpacing port="source_training" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="6.4.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="120">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance" compatibility="6.4.000" expanded="true" height="76" name="Performance" width="90" x="179" y="210"/>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_averagable 1" spacing="0"/>
          <portSpacing port="sink_averagable 2" spacing="0"/>
          <portSpacing port="sink_averagable 3" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
      <connect from_op="Process Documents from Data" from_port="example set" to_op="Set Role (2)" to_port="example set input"/>
      <connect from_op="Set Role (2)" from_port="example set output" to_op="Validation" to_port="training"/>
      <connect from_op="Validation" from_port="model" to_port="result 2"/>
      <connect from_op="Validation" from_port="training" to_port="result 1"/>
      <connect from_op="Validation" from_port="averagable 1" to_port="result 3"/>
      <connect from_op="Validation" from_port="averagable 2" to_port="result 4"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
      <portSpacing port="sink_result 5" spacing="0"/>
    </process>
  </operator>
</process>

Answers

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,046  RM Data Scientist
    Hi!

    have you checked that the type of location3 is really binominal?

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • mattwl Member Posts: 5 Contributor I
    Hi Martin, thanks for your reply.

    I'm a little confused by what you mean. Can you explain your question in more detail?

    Regards

    Matt
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,046  RM Data Scientist
    Hi,

    Sorry, that was my mistake. If you use Nominal to Binominal and the chosen attribute is not nominal, the operator fails. Are you sure that your location attribute is nominal and not text? You can check this in the Statistics tab.

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • mattwl Member Posts: 5 Contributor I
    Hello,
    I just checked the statistics on the result. Yes, the attribute I've chosen is nominal, not text.
    Any other ideas?
    Thanks
    Matt
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,046  RM Data Scientist
    Hi again,

    Have you checked the execution order? In some weird cases, the order might simply be wrong.

    Best,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • mattwl Member Posts: 5 Contributor I
    Hello,
    The execution order seems to be correct. Can you tell from the XML I posted?
    I basically tried to replicate the process described here:
    http://vancouverdata.blogspot.com.au/2010/11/text-analytics-with-rapidminer-part-5.html
    All the operators are in a similar order.
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,046  RM Data Scientist
    Hi,

    Just another quick idea: is your attribute a special attribute? Try checking the "include special attributes" box in the operator.

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • mattwl Member Posts: 5 Contributor I
    Hi Martin,
    Thanks for your help. I ticked 'include special attributes' and it seemed to 'fix' my problem somewhat.
    However, now I have another problem: the data for the label (attribute name: 'tourist', role: label) doesn't seem to come out of the Nominal to Binominal operator.
    In other words, nominal data goes in for this attribute, but no data comes out for it.

    Any ideas?

    Thanks again.

    matt

    See XML below:

    <operator activated="true" class="nominal_to_binominal" compatibility="6.4.000" expanded="true" height="94" name="Nominal to Binominal" width="90" x="45" y="210">
      <parameter key="attribute_filter_type" value="single"/>
      <parameter key="attribute" value="tourist"/>
      <parameter key="include_special_attributes" value="true"/>
    </operator>

  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,046  RM Data Scientist
    Hi,

    That seems odd. Could you send me the data and the current XML by mail? It is always hard to look into such things without the data itself.
    My mail address is mschmitz at rapidminer.com

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • mschmitz Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 2,046  RM Data Scientist
    Hi all,

    For the record: we fixed the issue via mail. The problem was related to "include special attributes" and the Select Attributes operator, which accidentally filtered out everything.

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
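
The mail exchange is not part of this thread, so the exact fix is unknown. What follows is only a sketch of one Select Attributes configuration that would be consistent with the summary above, not the fix that was actually applied. It assumes that both 'locatoin3' and 'tourist' should survive the selection. In the process posted at the top, the filter type is "single", so only the "attribute" parameter is read and the "attributes" list is ignored. The comment about special attributes reflects the operator documentation as far as I understand it, so please verify it against your RapidMiner version.

```xml
<!-- Sketch only: Select Attributes inside the training subprocess of Validation. -->
<operator activated="true" class="select_attributes" compatibility="6.4.000" expanded="true" name="Select Attributes">
  <!-- "subset" reads the "attributes" list; "single" would read only "attribute" -->
  <parameter key="attribute_filter_type" value="subset"/>
  <parameter key="attributes" value="locatoin3|tourist"/>
  <!-- Assumption: test special attributes (id, label) against the filter too.
       When this is false, special attributes are passed through regardless of the filter. -->
  <parameter key="include_special_attributes" value="true"/>
</operator>
```

A breakpoint after this operator, as in the original process, is a quick way to confirm which attributes and roles actually reach Nominal to Binominal.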