[SOLVED] Problem with Write CSV

siamak_want Member Posts: 98 Contributor II
edited November 2018 in Help
Hi Forum,

I have a problem with the "Write CSV" operator. I feed it the output of a "Format Numbers" operator, in which I set all "real" attributes to the integer format. The Format Numbers operator works fine, but when I write its result to a file with "Write CSV", all of my "real" values still appear in the form "ex. 1.0" instead of "ex. 1" in the resulting CSV file. Does anyone know what is wrong here?

Any ideas would be appreciated.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    If you use Format Numbers, all your "real" attributes should be "nominal" afterwards, so no changes should occur on writing. Integers being written as reals is a known issue, though. However, I have no idea how this "ex." prefix gets into your output file. Please provide a minimal example process as described in my signature.

    Best, Marius
  • siamak_want Member Posts: 98 Contributor II
    Hi nice guy: Marius

    I made a mistake by writing "ex. 1.0": by "ex." I meant "for example" :) Sorry for my poor English. So please disregard the "ex." prefix, Marius. Here is my process, as you requested:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="read_csv" compatibility="5.2.006" expanded="true" height="60" name="Read CSV" width="90" x="45" y="75">
        <parameter key="csv_file" value="C:\Users\Qodmanan\Desktop\test.csv"/>
        <parameter key="column_separators" value=","/>
        <parameter key="trim_lines" value="false"/>
        <parameter key="use_quotes" value="true"/>
        <parameter key="quotes_character" value="&quot;"/>
        <parameter key="escape_character_for_quotes" value="\"/>
        <parameter key="skip_comments" value="false"/>
        <parameter key="comment_characters" value="#"/>
        <parameter key="parse_numbers" value="true"/>
        <parameter key="decimal_character" value="."/>
        <parameter key="grouped_digits" value="false"/>
        <parameter key="grouping_character" value=","/>
        <parameter key="date_format" value=""/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations"/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="locale" value="English (United States)"/>
        <parameter key="encoding" value="windows-1252"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="att1.true.integer.attribute"/>
          <parameter key="1" value="att2.true.integer.attribute"/>
          <parameter key="2" value="att3.true.integer.attribute"/>
          <parameter key="3" value="att4.true.polynominal.label"/>
        </list>
        <parameter key="read_not_matching_values_as_missings" value="true"/>
        <parameter key="datamanagement" value="short_array"/>
      </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="write_csv" compatibility="5.2.006" expanded="true" height="76" name="Write CSV" width="90" x="581" y="75">
        <parameter key="csv_file" value="C:\train\TruncatedData\StratifiedSampledData_0.1Ratio\sampled4.csv"/>
        <parameter key="column_separator" value=","/>
        <parameter key="write_attribute_names" value="false"/>
        <parameter key="quote_nominal_values" value="false"/>
        <parameter key="format_date_attributes" value="false"/>
        <parameter key="encoding" value="SYSTEM"/>
      </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="write_file" compatibility="5.2.006" expanded="true" height="60" name="Write File" width="90" x="715" y="120">
        <parameter key="resource_type" value="file"/>
        <parameter key="filename" value="C:\Users\Qodmanan\Desktop\testtttt.csv"/>
        <parameter key="mime_type" value="application/octet-stream"/>
      </operator>
    </process>
    Any help would be appreciated.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hey, I can't paste your code into my RapidMiner. Please follow the instructions in the post linked in my signature and use the XML view to retrieve the XML of your process.

    Best, Marius
  • siamak_want Member Posts: 98 Contributor II
    Hi, sorry for the inconvenience.
    Here is the code:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.006">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
        <process expanded="true" height="424" width="705">
          <operator activated="true" class="retrieve" compatibility="5.2.006" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="sample_stratified" compatibility="5.2.006" expanded="true" height="76" name="Sample (Stratified)" width="90" x="180" y="30">
            <parameter key="sample" value="relative"/>
            <parameter key="use_local_random_seed" value="true"/>
            <parameter key="local_random_seed" value="2222"/>
          </operator>
          <operator activated="true" class="remove_useless_attributes" compatibility="5.2.006" expanded="true" height="76" name="Remove Useless Attributes" width="90" x="315" y="30"/>
          <operator activated="true" class="format_numbers" compatibility="5.2.006" expanded="true" height="76" name="Format Numbers" width="90" x="450" y="30">
            <parameter key="attribute_filter_type" value="value_type"/>
            <parameter key="value_type" value="real"/>
            <parameter key="include_special_attributes" value="true"/>
            <parameter key="format_type" value="integer"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.2.006" expanded="true" height="76" name="Write CSV" width="90" x="585" y="30">
            <parameter key="csv_file" value="C:\train\TruncatedData\StratifiedSampledData_0.1Ratio\sampled4.csv"/>
            <parameter key="column_separator" value=","/>
            <parameter key="write_attribute_names" value="false"/>
            <parameter key="quote_nominal_values" value="false"/>
            <parameter key="format_date_attributes" value="false"/>
          </operator>
          <operator activated="true" class="write_file" compatibility="5.2.006" expanded="true" height="60" name="Write File" width="90" x="581" y="165">
            <parameter key="filename" value="C:\train\TruncatedData\sample5"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Sample (Stratified)" to_port="example set input"/>
          <connect from_op="Sample (Stratified)" from_port="example set output" to_op="Remove Useless Attributes" to_port="example set input"/>
          <connect from_op="Remove Useless Attributes" from_port="example set output" to_op="Format Numbers" to_port="example set input"/>
          <connect from_op="Format Numbers" from_port="example set output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="file" to_op="Write File" to_port="file"/>
          <connect from_op="Write File" from_port="file" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>


  • Nils_Woehler Member Posts: 463 Maven
    Hi,

    As Marius stated, all numerical values are written as real values, so you have to insert a "Numerical to Polynominal" operator before "Write CSV". Take a look here:


    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.007">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.007" expanded="true" name="Process">
        <process expanded="true" height="424" width="705">
          <operator activated="true" class="retrieve" compatibility="5.2.007" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="sample_stratified" compatibility="5.2.007" expanded="true" height="76" name="Sample (Stratified)" width="90" x="180" y="30">
            <parameter key="sample" value="relative"/>
            <parameter key="use_local_random_seed" value="true"/>
            <parameter key="local_random_seed" value="2222"/>
          </operator>
          <operator activated="true" class="remove_useless_attributes" compatibility="5.2.007" expanded="true" height="76" name="Remove Useless Attributes" width="90" x="315" y="30"/>
          <operator activated="true" class="format_numbers" compatibility="5.2.007" expanded="true" height="76" name="Format Numbers" width="90" x="450" y="30">
            <parameter key="attribute_filter_type" value="value_type"/>
            <parameter key="value_type" value="real"/>
            <parameter key="include_special_attributes" value="true"/>
            <parameter key="format_type" value="integer"/>
          </operator>
          <operator activated="true" class="numerical_to_polynominal" compatibility="5.2.007" expanded="true" height="76" name="Numerical to Polynominal" width="90" x="447" y="165">
            <parameter key="attribute_filter_type" value="value_type"/>
            <parameter key="value_type" value="integer"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.2.007" expanded="true" height="76" name="Write CSV" width="90" x="581" y="30">
            <parameter key="csv_file" value="C:\train\TruncatedData\StratifiedSampledData_0.1Ratio\sampled4.csv"/>
            <parameter key="column_separator" value=","/>
            <parameter key="write_attribute_names" value="false"/>
            <parameter key="quote_nominal_values" value="false"/>
            <parameter key="format_date_attributes" value="false"/>
          </operator>
          <operator activated="true" class="write_file" compatibility="5.2.007" expanded="true" height="60" name="Write File" width="90" x="581" y="165">
            <parameter key="filename" value="C:\Users\nwoehler\Desktop\sample5"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Sample (Stratified)" to_port="example set input"/>
          <connect from_op="Sample (Stratified)" from_port="example set output" to_op="Remove Useless Attributes" to_port="example set input"/>
          <connect from_op="Remove Useless Attributes" from_port="example set output" to_op="Format Numbers" to_port="example set input"/>
          <connect from_op="Format Numbers" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
          <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="file" to_op="Write File" to_port="file"/>
          <connect from_op="Write File" from_port="file" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Best,
    Nils
  • siamak_want Member Posts: 98 Contributor II
    Thanks to both Marius and Nils,
    The problem was nicely resolved with the "Numerical to Polynominal" operator.

    Thanks again, guys.

    Best regards,
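
For readers outside RapidMiner, the effect discussed in this thread can be sketched in a few lines of Python. This is only an analogy under stated assumptions, not RapidMiner's implementation: the rows are hypothetical values in the spirit of the Golf sample data, and `write_rows` and `to_nominal` are illustrative helper names. The idea is that a writer which stores every number as a double will emit `85.0`, while values converted to text (nominal) beforehand are written exactly as they look.

```python
import csv
import io


def write_rows(rows):
    """Serialize rows to CSV text with the standard csv module."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_nominal(value):
    """Convert a value to text; whole-number floats lose the trailing '.0'."""
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


# Hypothetical rows (assumption): numeric columns are held as doubles,
# which mirrors numbers being written as reals.
rows = [[85.0, 85.0, "no"], [80.0, 90.0, "no"], [83.0, 78.0, "yes"]]

# Written as numbers: the floats keep their decimal part in the output.
as_numbers = write_rows(rows)

# Converted to text first (the analogue of Numerical to Polynominal):
# the writer now copies the strings unchanged.
as_nominal = write_rows([[to_nominal(v) for v in row] for row in rows])

print(as_numbers)
print(as_nominal)
```

The conversion happens before the writer is called, which matches the placement of the extra operator directly in front of "Write CSV" in the process above.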