"[Delayed] Neural net predicts wrong label"

rapidnewbe Member Posts: 2 Contributor I
edited June 2019 in Help
Hello everybody,

I am trying to use the standard Neural Net operator to solve a classification problem with nominal attributes. To test whether the operator works properly, I connected the labeled data to the Neural Net operator and the same data, without the label, to the Apply Model operator. Beforehand, I use the Nominal to Numerical operator to convert the source data into the numerical form the neural net requires.

Here is an example of the labeled data:

ID Value Label
1 J no
2 G no
3 B no
4 E no
5 E no
6 J no
7 A no
8 D no
9 H yes
10 J no
11 F no
12 H yes
13 A no
14 C no
15 G no
16 D no
17 H yes
18 G no
19 J no


As you can see, the value "H" is labeled "yes"; all other values are labeled "no".
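
To make the expected behaviour concrete, here is a minimal Python sketch. It is not part of the RapidMiner process. It assumes that dummy coding produces one 0/1 indicator per nominal value and that the categories are ordered alphabetically; the names `ROWS`, `CATEGORIES`, `dummy_code` and `values_by_label` are made up for this illustration.

```python
# Training rows copied from the table above: (ID, Value, Label).
ROWS = [
    (1, "J", "no"), (2, "G", "no"), (3, "B", "no"), (4, "E", "no"),
    (5, "E", "no"), (6, "J", "no"), (7, "A", "no"), (8, "D", "no"),
    (9, "H", "yes"), (10, "J", "no"), (11, "F", "no"), (12, "H", "yes"),
    (13, "A", "no"), (14, "C", "no"), (15, "G", "no"), (16, "D", "no"),
    (17, "H", "yes"), (18, "G", "no"), (19, "J", "no"),
]

# Assumption: fix the category order once (alphabetically here) so that
# training data and application data are encoded in exactly the same way.
CATEGORIES = sorted({value for _, value, _ in ROWS})


def dummy_code(value, categories=CATEGORIES):
    """Return one 0/1 indicator per category; unknown values give all zeros."""
    return [1 if value == category else 0 for category in categories]


def values_by_label(rows):
    """Group the distinct nominal values by the label they carry."""
    groups = {}
    for _, value, label in rows:
        groups.setdefault(label, set()).add(value)
    return groups


groups = values_by_label(ROWS)
print(CATEGORIES)          # column order used for the indicators
print(dummy_code("H"))     # indicator vector for the value "H"
print(groups["yes"])       # values that carry the label "yes"
```

In this sketch the only value in the "yes" group is "H", so a model trained on these indicator columns would be expected to predict "yes" exactly where the "H" indicator is 1.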

I apply the trained neural net to this data:

ID Value
1 J
2 G
3 B
4 E
5 E
6 J
7 A
8 D
9 H
10 J
11 F
12 H
13 A
14 C
15 G
16 D
17 H
18 G
19 J

The result is that the value "F" is predicted as "yes" and all the others as "no". So I conclude that the operator does run, but not correctly.
Can anybody help me? What am I doing wrong?

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.000">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
   <parameter key="logverbosity" value="init"/>
   <parameter key="random_seed" value="2001"/>
   <parameter key="send_mail" value="never"/>
   <parameter key="notification_email" value=""/>
   <parameter key="process_duration_for_mail" value="30"/>
   <parameter key="encoding" value="SYSTEM"/>
   <process expanded="true" height="404" width="815">
     <operator activated="true" class="read_excel" compatibility="5.3.000" expanded="true" height="60" name="Read Excel" width="90" x="112" y="75">
       <parameter key="excel_file" value="C:\Users\knolljli\Desktop\testdata.xlsx"/>
       <parameter key="sheet_number" value="1"/>
       <parameter key="imported_cell_range" value="A1:C730"/>
       <parameter key="encoding" value="SYSTEM"/>
       <parameter key="first_row_as_names" value="false"/>
       <list key="annotations">
         <parameter key="0" value="Name"/>
       </list>
       <parameter key="date_format" value=""/>
       <parameter key="time_zone" value="SYSTEM"/>
       <parameter key="locale" value="German"/>
       <list key="data_set_meta_data_information">
         <parameter key="0" value="ID.true.integer.id"/>
         <parameter key="1" value="Value.true.polynominal.attribute"/>
         <parameter key="2" value="Label.true.binominal.label"/>
       </list>
       <parameter key="read_not_matching_values_as_missings" value="true"/>
       <parameter key="datamanagement" value="double_array"/>
     </operator>
     <operator activated="true" class="nominal_to_numerical" compatibility="5.3.000" expanded="true" height="94" name="Nominal to Numerical" width="90" x="313" y="75">
       <parameter key="return_preprocessing_model" value="false"/>
       <parameter key="create_view" value="false"/>
       <parameter key="attribute_filter_type" value="all"/>
       <parameter key="attribute" value=""/>
       <parameter key="attributes" value=""/>
       <parameter key="use_except_expression" value="false"/>
       <parameter key="value_type" value="nominal"/>
       <parameter key="use_value_type_exception" value="false"/>
       <parameter key="except_value_type" value="file_path"/>
       <parameter key="block_type" value="single_value"/>
       <parameter key="use_block_type_exception" value="false"/>
       <parameter key="except_block_type" value="single_value"/>
       <parameter key="invert_selection" value="false"/>
       <parameter key="include_special_attributes" value="false"/>
       <parameter key="coding_type" value="dummy coding"/>
       <parameter key="use_comparison_groups" value="false"/>
       <list key="comparison_groups"/>
       <parameter key="unexpected_value_handling" value="all 0 and warning"/>
       <parameter key="use_underscore_in_name" value="false"/>
     </operator>
     <operator activated="true" class="neural_net" compatibility="5.3.000" expanded="true" height="76" name="Neural Net" width="90" x="514" y="75">
       <list key="hidden_layers"/>
       <parameter key="training_cycles" value="500"/>
       <parameter key="learning_rate" value="0.3"/>
       <parameter key="momentum" value="0.2"/>
       <parameter key="decay" value="false"/>
       <parameter key="shuffle" value="true"/>
       <parameter key="normalize" value="true"/>
       <parameter key="error_epsilon" value="1.0E-5"/>
       <parameter key="use_local_random_seed" value="false"/>
       <parameter key="local_random_seed" value="1992"/>
     </operator>
     <operator activated="true" class="read_excel" compatibility="5.3.000" expanded="true" height="60" name="Read Excel (2)" width="90" x="112" y="255">
       <parameter key="excel_file" value="C:\Users\knolljli\Desktop\testdata.xlsx"/>
       <parameter key="sheet_number" value="1"/>
       <parameter key="imported_cell_range" value="A1:B730"/>
       <parameter key="encoding" value="SYSTEM"/>
       <parameter key="first_row_as_names" value="false"/>
       <list key="annotations">
         <parameter key="0" value="Name"/>
       </list>
       <parameter key="date_format" value=""/>
       <parameter key="time_zone" value="SYSTEM"/>
       <parameter key="locale" value="German"/>
       <list key="data_set_meta_data_information">
         <parameter key="0" value="ID.true.integer.id"/>
         <parameter key="1" value="Value.true.polynominal.attribute"/>
       </list>
       <parameter key="read_not_matching_values_as_missings" value="true"/>
       <parameter key="datamanagement" value="double_array"/>
     </operator>
     <operator activated="true" class="nominal_to_numerical" compatibility="5.3.000" expanded="true" height="94" name="Nominal to Numerical (2)" width="90" x="313" y="255">
       <parameter key="return_preprocessing_model" value="false"/>
       <parameter key="create_view" value="false"/>
       <parameter key="attribute_filter_type" value="all"/>
       <parameter key="attribute" value=""/>
       <parameter key="attributes" value=""/>
       <parameter key="use_except_expression" value="false"/>
       <parameter key="value_type" value="nominal"/>
       <parameter key="use_value_type_exception" value="false"/>
       <parameter key="except_value_type" value="file_path"/>
       <parameter key="block_type" value="single_value"/>
       <parameter key="use_block_type_exception" value="false"/>
       <parameter key="except_block_type" value="single_value"/>
       <parameter key="invert_selection" value="false"/>
       <parameter key="include_special_attributes" value="false"/>
       <parameter key="coding_type" value="dummy coding"/>
       <parameter key="use_comparison_groups" value="false"/>
       <list key="comparison_groups"/>
       <parameter key="unexpected_value_handling" value="all 0 and warning"/>
       <parameter key="use_underscore_in_name" value="false"/>
     </operator>
     <operator activated="true" class="apply_model" compatibility="5.3.000" expanded="true" height="76" name="Apply Model" width="90" x="715" y="120">
       <list key="application_parameters"/>
       <parameter key="create_view" value="false"/>
     </operator>
     <connect from_port="input 1" to_op="Read Excel" to_port="file"/>
     <connect from_port="input 2" to_op="Read Excel (2)" to_port="file"/>
     <connect from_op="Read Excel" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
     <connect from_op="Nominal to Numerical" from_port="example set output" to_op="Neural Net" to_port="training set"/>
     <connect from_op="Neural Net" from_port="model" to_op="Apply Model" to_port="model"/>
     <connect from_op="Read Excel (2)" from_port="output" to_op="Nominal to Numerical (2)" to_port="example set input"/>
     <connect from_op="Nominal to Numerical (2)" from_port="example set output" to_op="Apply Model" to_port="unlabelled data"/>
     <connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="source_input 2" spacing="0"/>
     <portSpacing port="source_input 3" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    This process and data indeed demonstrate some inconsistent behaviour. I'll create an internal ticket for this.

    Best regards,
    Marius