
"Reading a CSV file"

ripple Member Posts: 1 Contributor I
edited June 2019 in Help
Hi everyone, I'm new to RapidMiner and work in the analytics division of a bank. One issue I haven't been able to resolve: when reading CSV files, the reader handles them properly if the delimiter is ~ or ;. But how do I read files that use | as the delimiter? If I'm not wrong, | is taken as the XOR operator in RapidMiner. Unfortunately the data I receive is occasionally a text file of around 700 MB, so replacing each | with ~ doesn't seem to be a feasible option.
Regards

Answers

  • haddock Member Posts: 849 Maven
    Greets,

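    After uni, my first job was in credit analysis, so you have my sympathies! The column separators setting of the CSV reader is a regular expression, and in a regex an unescaped | means "or" (alternation, not XOR), so a bare | is not read as a literal delimiter. You can still use the CSV reader if you escape the pipe: add "\|" to the front of the regex string that matches column separators, like this ...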
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="381" width="868">
          <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV" width="90" x="93" y="75">
            <parameter key="column_separators" value="\||,\s*|;\s*|\s+"/>
          </operator>
          <connect from_op="Read CSV" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    You just need to point to the file.
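
To see what the escaped pipe does, here is a minimal sketch outside RapidMiner, using Python's `re` module purely as an illustration. It assumes the separator pattern behaves like an ordinary regex split; `split_fields`, `PIPE_ONLY_PATTERN`, and the sample line are made up for the example and are not part of RapidMiner.

```python
import re

# Separator pattern copied from the column_separators value in the process above.
PROCESS_PATTERN = r"\||,\s*|;\s*|\s+"
# Hypothetical narrower pattern: the escaped pipe only.
PIPE_ONLY_PATTERN = r"\|"


def split_fields(line, pattern):
    """Split one line of text on every match of the separator regex."""
    return re.split(pattern, line)


# Made-up sample record, not real data.
line = "1001|John Smith|250.75"

# The escaped pipe is matched literally, so the line splits at each "|".
print(split_fields(line, PIPE_ONLY_PATTERN))

# The broader pattern also treats whitespace as a separator,
# so "John Smith" is split into two fields.
print(split_fields(line, PROCESS_PATTERN))
```

If fields in your files can contain spaces or commas, it may be worth trimming the separator pattern down to just `\|` so that only the pipe splits columns. That is an untested suggestion, so check it on a small sample file first.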

    Ciao

