[SOLVED] Why does Split not work on nominal data read from csv file?

tennenrishin Member Posts: 177 Contributor II
edited December 2019 in Help
I must be missing something obvious, but why, in the following process, does one Split work and the other not?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
   <process expanded="true" height="654" width="970">
     <operator activated="true" class="generate_data_user_specification" compatibility="5.2.008" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="45" y="30">
       <list key="attribute_values">
         <parameter key="line" value="&quot;a,b&quot;"/>
       </list>
       <list key="set_additional_roles"/>
     </operator>
     <operator activated="true" class="write_csv" compatibility="5.2.008" expanded="true" height="76" name="Write CSV" width="90" x="179" y="120">
       <parameter key="write_attribute_names" value="false"/>
       <parameter key="quote_nominal_values" value="false"/>
       <parameter key="format_date_attributes" value="false"/>
     </operator>
     <operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" height="60" name="Read CSV" width="90" x="380" y="165">
       <parameter key="use_quotes" value="false"/>
       <parameter key="parse_numbers" value="false"/>
       <parameter key="first_row_as_names" value="false"/>
       <list key="annotations"/>
       <list key="data_set_meta_data_information">
         <parameter key="0" value="line.true.nominal.regular"/>
       </list>
     </operator>
     <operator activated="true" class="split" compatibility="5.2.008" expanded="true" height="76" name="Split (2)" width="90" x="514" y="165">
       <parameter key="attribute_filter_type" value="single"/>
       <parameter key="attribute" value="line"/>
     </operator>
     <operator activated="true" class="split" compatibility="5.2.008" expanded="true" height="76" name="Split" width="90" x="380" y="30">
       <parameter key="attribute_filter_type" value="single"/>
       <parameter key="attribute" value="line"/>
     </operator>
     <connect from_op="Generate Data by User Specification" from_port="output" to_op="Write CSV" to_port="input"/>
     <connect from_op="Write CSV" from_port="through" to_op="Split" to_port="example set input"/>
     <connect from_op="Write CSV" from_port="file" to_op="Read CSV" to_port="file"/>
     <connect from_op="Read CSV" from_port="output" to_op="Split (2)" to_port="example set input"/>
     <connect from_op="Split (2)" from_port="example set output" to_port="result 2"/>
     <connect from_op="Split" from_port="example set output" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
     <portSpacing port="sink_result 3" spacing="0"/>
   </process>
 </operator>
</process>

Answers

  • tennenrishin Member Posts: 177 Contributor II
    Comparing the metadata at the two Split operators' ori ports, I see that the Read CSV operator outputs what might be described as a "regular but special" attribute. What does this mean?

    So I guess an easy workaround is to enable "include special attributes" in the Split operator.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi,

    you woke a demon bred by a half-bug coupled with a GUI inconsistency.
    To make the Read CSV operator create a regular attribute, you have to select "attribute" from the role drop-down list. If you type "regular" instead, the operator creates a special attribute whose role is named "regular". Yes, this is horrible. I will create an internal ticket requesting a fix.

    Best regards,
    Marius
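
The two remedies mentioned in the replies could look roughly like the fragments below when applied to the process from the question. This is an untested sketch against the 5.2.008 process format: the role token `attribute` in the metadata value mirrors the drop-down entry Marius names, and the parameter key `include_special_attributes` is assumed from the GUI label "include special attributes". Either change on its own should be enough.

```xml
<!-- Fix at the source: declare the column with role "attribute" instead of "regular",
     so Read CSV creates a truly regular attribute. -->
<operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" height="60" name="Read CSV" width="90" x="380" y="165">
  <parameter key="use_quotes" value="false"/>
  <parameter key="parse_numbers" value="false"/>
  <parameter key="first_row_as_names" value="false"/>
  <list key="annotations"/>
  <list key="data_set_meta_data_information">
    <!-- was: line.true.nominal.regular -->
    <parameter key="0" value="line.true.nominal.attribute"/>
  </list>
</operator>

<!-- Workaround: leave Read CSV as it is and let Split also consider special attributes.
     The key name below is an assumption derived from the GUI label. -->
<operator activated="true" class="split" compatibility="5.2.008" expanded="true" height="76" name="Split (2)" width="90" x="514" y="165">
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="line"/>
  <parameter key="include_special_attributes" value="true"/>
</operator>
```

Changing the role in Read CSV is probably the cleaner option, since downstream operators that skip special attributes by default would otherwise run into the same problem.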