RapidMiner

"Read CSV" to example set

Contributor II

Hi,

I'm just starting to experiment with RapidMiner and am having trouble with the "Read CSV" operator.
I can send its output to res (and see the ExampleSet), but when other operators require an example set as input, no data is available. Is this a limitation of Read CSV, or is there a way to make the data available as an example set?
Regards.
12 REPLIES
Regular Contributor

Re: "Read CSV" to example set

Hi, and welcome!

Start RapidMiner and go to Help->Tutorial. That loads runnable examples, so you get some idea of what RM can and cannot do. Believe me, it saves time in the long run!

Regular Contributor

Re: "Read CSV" to example set

Hi Monaco,

if your operator provides an example set to the results port of the process, it will do the same for other operators. Did you check the connection from the output port of "Read CSV" to the input port of the following operator? Perhaps you might want to post your process (code from XML tab) here to reveal possible mistakes in process design.

Regards
Matthias
Contributor II

Re: "Read CSV" to example set

Hi Colo,

Many thanks for your quick reply.
Here is the code (nothing fancy). It doesn't work with Read CSV but works well with Read Excel or Retrieve.
When you modify a file that has been stored as a Data Table in the repository, do you know how to automatically update this Data Table?

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true" height="426" width="673">
      <operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" height="60" name="Read CSV" width="90" x="45" y="120">
        <parameter key="csv_file" value="D:\Data.csv"/>
        <parameter key="date_format" value="yyyyMMdd"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="locale" value="French"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Date.true.date.id"/>
          <parameter key="1" value="Data.true.integer.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="179" y="30">
        <parameter key="horizon" value="1"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="Data"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Contributor II

Re: "Read CSV" to example set

haddock wrote:

Hi, and welcome!

Start RapidMiner and go to Help->Tutorial. That loads runnable examples, so you get some idea of what RM can and cannot do. Believe me, it saves time in the long run!

Hi Haddock,

Thank you for your insight. I studied this tutorial last week, and the resource is indeed amazingly powerful and educational. But I haven't found an answer to my current problem. I've posted the code, but I don't think it will help. You can try for yourself with a very simple CSV file: when you hover the mouse cursor over the operator output, it indicates "number of examples=-1".
Regards
RMStaff

Re: "Read CSV" to example set

Ahem, just a quick question: did you actually execute the process (i.e. press the "Play" icon in the toolbar)? Does it work then?

Cheers,
Ingo
Contributor II

Re: "Read CSV" to example set

Hi Ingo,

When I execute the process, it works fine for displaying the data (even though number of examples=-1). But when I add a Windowing operator, which requires the number of examples to be greater than the horizon (set to 1), it fails.
Cheers
RMStaff

Re: "Read CSV" to example set

Ok, then try the following:

1. Load the data with "Read CSV", add an operator "Store" and save the data set directly again in your repository.
2. Drag the freshly saved data from your repository (it will be transformed into a new operator named "Retrieve" which will load the data for you from the repository)

Try again with this data set loaded with "Retrieve". Expected behaviour: everything works as expected. Reason for your confusion: search the forum for "Repository" and "meta data". Best solution for you: book a training at Rapid-I - it will definitely help :D
This would probably also be the best option if you do not know what I mean by "Repository" ;D
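
The first of the two steps above can be sketched as process XML. This is a minimal sketch rather than a process taken from this thread: the repository path `//Local Repository/data/csv_data` is a hypothetical placeholder, and the `store` class name, its `repository_entry` parameter, and the `input`/`through` port names are assumed to match RapidMiner 5.1.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true">
      <!-- Read the CSV file, as in the process posted earlier in the thread -->
      <operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" name="Read CSV">
        <parameter key="csv_file" value="D:\Data.csv"/>
      </operator>
      <!-- Write the example set to the repository (hypothetical entry path) -->
      <operator activated="true" class="store" compatibility="5.1.006" expanded="true" name="Store">
        <parameter key="repository_entry" value="//Local Repository/data/csv_data"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Store" to_port="input"/>
      <connect from_op="Store" from_port="through" to_port="result 1"/>
    </process>
  </operator>
</process>
```

For the second step, a separate process would start from a "Retrieve" operator pointing at the same repository entry and connect its output to the "example set input" port of Windowing. Re-running the process above after the CSV file changes should overwrite the stored entry, which may be one way to keep the repository copy current.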

Cheers,
Ingo

P.S. (for the more experienced readers here...): I never expected that this - definitely very unique and innovative - feature of RapidMiner called "meta data propagation" would cause so much uncertainty for some users. I am open to all suggestions on how we could make the difference between "meta data" and "actual data" clearer, and on how to explain why it is sometimes impossible to provide meta data (like for CSV files...)
Regular Contributor

Re: "Read CSV" to example set

Hi Monaco,

just to be sure... you didn't use the "Window Document" operator after "Read CSV", did you? Which operators did you try?
I hoped you would post your process with this second operator to reveal possible problems ;)

Regards
Matthias
Contributor II

Re: "Read CSV" to example set

Hey Ingo,

Just read your post at http://rapid-i.com/rapidforum/index.php/topic,2902.msg11559.html#msg11559
Frequent update of my csv files is why I don't use the repository (unless there is a way to easily and automatically update it).
I don't understand why the same data can be output when it is in xls format but not in csv format. Fortunately I have found alternative ways to properly deal with this issue, but I would have preferred (it's not crucial) to output directly from Read CSV.
Many thanks for your support.

Best regards.