"Problem with Filter Example Range"

Nick_ColdhandNick_Coldhand Member Posts: 4 Contributor I
edited June 2019 in Help
I'm using Excel sheets with Date as ID, and found this strange behaviour, everytime the example set goes through the "Filter Example Range" operator:
1. If Date is in Excel "Date" format (MM/DD/YYYY), every filtered example is dated 1/0/1900
2. If Date is in Excel "Number" format (# days from 1/1/1900), this doesn't happen and all is well
3. If the sheet is imported in my repository, and the "Retrieve" operator is used to read it, every filtered example is dated 1/0/1900, regardless of the date format when the sheet was imported.
Tagged:

Answers

  • haddockhaddock Member Posts: 849 Maven

    Hi there,

    I deal with time series a lot and suspect that the answer is more probably outside RM. Running the following gets some data from an SQL database, samples some into an .xls file, and reads that file. No probs encountered.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="391" width="915">
          <operator activated="true" class="retrieve" expanded="true" height="60" name="Retrieve" width="90" x="67" y="106">
            <parameter key="repository_entry" value="//SQL Data/Updates"/>
          </operator>
          <operator activated="true" class="sample" expanded="true" height="76" name="Sample" width="90" x="246" y="30"/>
          <operator activated="true" class="write_excel" expanded="true" height="60" name="Write Excel" width="90" x="380" y="120">
            <parameter key="excel_file" value="C:\Documents and Settings\Alien\My Documents\test.xls"/>
          </operator>
          <operator activated="true" class="read_excel" expanded="true" height="60" name="Read Excel" width="90" x="447" y="255">
            <parameter key="excel_file" value="C:\Documents and Settings\Alien\My Documents\test.xls"/>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Sample" to_port="example set input"/>
          <connect from_op="Sample" from_port="example set output" to_op="Write Excel" to_port="input"/>
          <connect from_op="Write Excel" from_port="through" to_port="result 1"/>
          <connect from_op="Read Excel" from_port="output" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    However....
    With the date 1/0/1900, excel is referencing a cell without data typed in and gives the base date instead
    .
    http://en.allexperts.com/q/Excel-1059/Cell-references.htm
  • Nick_ColdhandNick_Coldhand Member Posts: 4 Contributor I
    Sorry, but I wasn't exhaustive in my original post.

    In the result screen, the example set is always OK.
    But if I try to use "Write Excel" after "Filter Example Range", the written file will show all the examples dated 1/0/1900 (in cases #1 and #3 of my OP). Thus, I'm not sure if the problem is in "Filter Example Range" or in "Write Excel"
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    thank you for this hint. Seems to be a bug in the write excel operator, I will take a look.

    Greetings,
      Sebastian
Sign In or Register to comment.