Limiting number of rows

kavuch Member Posts: 6 Contributor I
edited November 2018 in Help
I'm working with multiple CSVs with over 100,000 entries.
RapidMiner uses up to 6 GB of RAM (out of 8 GB), and my system becomes very slow.
Is it possible to limit the number of rows to be loaded? For example, I could load only 1,000 rows and play around with them while the system stays fast, similar to LIMIT in SQL (http://www.w3schools.com/sql/sql_top.asp).

Answers

  • earmijo Member Posts: 270 Unicorn
    You can use the Filter Example Range operator (examples 1 to n, where n is the desired size). Check the following process:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.5.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.5.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="6.5.002" expanded="true" height="60" name="Retrieve Sonar" width="90" x="45" y="120">
            <parameter key="repository_entry" value="//Samples/data/Sonar"/>
          </operator>
          <operator activated="true" class="filter_example_range" compatibility="6.5.002" expanded="true" height="76" name="Filter Example Range" width="90" x="313" y="75">
            <parameter key="first_example" value="1"/>
            <parameter key="last_example" value="10"/>
          </operator>
          <connect from_op="Retrieve Sonar" from_port="output" to_op="Filter Example Range" to_port="example set input"/>
          <connect from_op="Filter Example Range" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
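
The process above trims an example set that has already been retrieved. If the aim is to keep the large files from being read in full at all, one possible workaround outside RapidMiner is to cut each CSV down beforehand and import the smaller copy. The following is only a rough sketch using the Python standard library; the function name `write_head` and the file names are hypothetical, and it assumes comma-separated files whose first line is a header.

```python
import csv
from itertools import islice


def write_head(src_path, dst_path, n_rows, encoding="utf-8"):
    """Copy the header line plus the first n_rows data rows of a CSV file.

    Hypothetical helper: assumes a comma-separated file with a header row.
    Returns the number of data rows written (header not counted).
    """
    rows_written = 0
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        # islice stops after n_rows + 1 records (header + data rows),
        # so the remainder of the source file is never read.
        for row in islice(reader, n_rows + 1):
            writer.writerow(row)
            rows_written += 1
    return max(rows_written - 1, 0)


if __name__ == "__main__":
    # Example file names are placeholders, not taken from the thread.
    kept = write_head("big.csv", "big_first_1000.csv", 1000)
    print(f"kept {kept} data rows")
```

The reduced file can then be imported in place of the original, much like a LIMIT clause applied before the data reaches RapidMiner.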