
[SOLVED] Write TSV

4of4 Member Posts: 7 Contributor II
edited November 2018 in Help
Hi,
I need to write an "Example set" as a tsv file.
I'm trying to use the "Write CSV" operator.
What value can I enter in the "column separator" field?
The value "\t" seems to work only in "Read CSV".
Thanks in advance for your support.

Answers

  • MacPhotoBiker Member Posts: 60 Contributor II
    Are you looking to insert a tab separator? I'm not sure about that, but if you don't want to use a standard separator like a comma or semicolon, the pipe (|) is a commonly used option.
  • 4of4 Member Posts: 7 Contributor II
    Thanks for the suggestion, but unfortunately it doesn't work.
    Here's my process:
    Bye

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="5.3.005" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="120">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.3.005" expanded="true" height="76" name="Write CSV" width="90" x="514" y="120">
            <parameter key="csv_file" value="C:\a.txt"/>
            <parameter key="column_separator" value="|"/>
            <parameter key="quote_nominal_values" value="false"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

  • MacPhotoBiker Member Posts: 60 Contributor II
    Well, it works for me. Below are the first four lines that your process generates:

    a1|a2|a3|a4|id|label
    5.1|3.5|1.4|0.2|id_1|Iris-setosa
    4.9|3.0|1.4|0.2|id_2|Iris-setosa
    4.7|3.2|1.3|0.2|id_3|Iris-setosa

    Which error message are you getting?
  • 4of4 Member Posts: 7 Contributor II
    No, there is no error message.
    The point is that I need the columns to be separated by a tab character to complete my data processing. This is the first part of an ETL process, and the output file will be processed by another program that requires tab separators.
    Bye
  • MacPhotoBiker Member Posts: 60 Contributor II
    I see, you definitely need the tab as a separator. I found the following to work for me, but I'm not sure whether it is a real solution or just a workaround.

    I manually created a tab-separated file, then read it with the "Read CSV" operator and chose "tab" as the separator (again, while READING). Then I went to the settings of this operator and copied whatever was in the "column separator" field. It looked empty, but I double-clicked in it and copied. You may also just double-click between the two brackets below and copy (without the brackets):

    ( )

    Then paste this as the column separator into your "Write CSV" operator.

    I hope that works, it did the job for me. I opened the generated file in LibreOffice and indicated "tab" as delimiter, and it opened as expected.
  • MacPhotoBiker Member Posts: 60 Contributor II
    I just realized that whatever I pasted between the brackets got lost when I posted the message, sorry about that.

    But just follow the procedure as I described it and copy the column separator value from the "Read CSV" operator to the "Write CSV" operator. This should do the job.

    I know it doesn't look very smooth, but I hope it gets you a step forward...

    Or, here's the code for the operator:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.008">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="write_csv" compatibility="5.3.008" expanded="true" height="76" name="Write CSV" width="90" x="380" y="75">
           <parameter key="csv_file" value="/home/macphotobiker/Desktop/tsv.tsv"/>
           <parameter key="column_separator" value="&#9;"/>
         </operator>
         <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>
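
    For reference, here is a minimal sketch of the Iris process from earlier in this thread with the tab separator applied. It only merges the two processes posted above. The output path C:\a.tsv is a hypothetical placeholder, and the version numbers are copied from the first process. The separator is written as the XML character reference &#9; because a conforming XML parser would normalize a literal tab inside an attribute value to a space.

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <process expanded="true">
          <!-- Load the Iris sample data, as in the original process -->
          <operator activated="true" class="retrieve" compatibility="5.3.005" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="120">
            <parameter key="repository_entry" value="//Samples/data/Iris"/>
          </operator>
          <operator activated="true" class="write_csv" compatibility="5.3.005" expanded="true" height="76" name="Write CSV" width="90" x="514" y="120">
            <!-- Placeholder output path (assumption); adjust to your system -->
            <parameter key="csv_file" value="C:\a.tsv"/>
            <!-- &#9; is the XML character reference for a tab character -->
            <parameter key="column_separator" value="&#9;"/>
            <parameter key="quote_nominal_values" value="false"/>
          </operator>
          <connect from_op="Retrieve Iris" from_port="output" to_op="Write CSV" to_port="input"/>
          <connect from_op="Write CSV" from_port="through" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    If this imports as expected, the generated file should resemble the pipe-separated output shown earlier in the thread, with a tab in place of each pipe.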
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    The problem is that the Java framework does not let you type a tab character into the input field: pressing the Tab key moves the cursor to the next field instead.

    To get a tab character into the parameter, you have to copy it from somewhere. For example, you can press Tab in a normal text editor and copy the resulting (seemingly empty) character into RapidMiner.

    Best regards,
    Marius
  • 4of4 Member Posts: 7 Contributor II
    Thank you very much, MacPhotoBiker!
    Your solution works perfectly for my purpose!
    Thanks also to Marius.
    Bye
  • MacPhotoBiker Member Posts: 60 Contributor II
    Perfect, 4of4, glad I could help.

    Good luck with your project.