DatabaseExampleSource returns only 10 rows

Legacy UserLegacy User Member Posts: 0 Newbie
edited November 2018 in Help
Hello,
I have a problem with the DatabaseExampleSource in combination with Oracle and rapidminer version 4.1.

My query that is executed should return over 100.000 data rows. But all i get within rapidminer are the first 10 rows. This matches (maybe) the default fetch size of the ODBC driver? So it seems that the first 10 rows will be fetched and afterwards the operator terminates...

Sometimes I get the following error message:
Jun 24, 2008 8:49:46 AM: [Error] While reading examples from result set: operation not allowed: Unsupported syntax for refreshRow()

Btw: when i switch to version 4.0 everything works fine...

Any suggestions what I'm doing wrong?

greetz
ajay

Answers

  • TobiasMalbrechtTobiasMalbrecht Moderator, Employee, Member Posts: 294 RM Product Management
    Hi,

    I have not yet an idea, what could be the problem here. Does the error show up, when you only try to read the data via the [tt]DatabaseExamplesSource[/tt] operator? Or does your process involve other operators? What settings do you use in the example source operator? Maybe you can post your process XML?

    Regards,
    Tobias
  • Legacy UserLegacy User Member Posts: 0 Newbie
    ok, I tried it with and without further operators, but this does not make any difference.
    still it isn't a real error - it just does not retrieve all data rows :)

    here are some more details:

    rapidminer 4.0

    Process XML:
    <operator name="Root" class="Process">
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_system" value="Oracle"/>
            <parameter key="database_url" value="jdbc:oracle:thin:@*******:1521:***"/>
            <parameter key="password" value="***"/>
            <parameter key="query" value="SELECT JH_JOBNUMBER, DEVFROMMEAN FROM (  SELECT JH_JOBNUMBER, (CD_VALUE-AVG(CD_VALUE) OVER(PARTITION BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION)) AS DEVFROMMEAN  FROM (  SELECT JH_JOBNUMBER, CD_FEATURENAME,    CASE    WHEN CD_FEATURENAME LIKE &#39;%280%&#39; THEN &#39;280&#39;    WHEN CD_FEATURENAME LIKE &#39;%400%&#39; THEN &#39;400&#39;  END AS FEATURE_SIZE,  CASE    WHEN CD_FEATURENAME LIKE &#39;%X%&#39; THEN &#39;X&#39;    WHEN CD_FEATURENAME LIKE &#39;%Y%&#39; THEN &#39;Y&#39;  END AS ORIENTATION,  CD_VALUE  FROM DIM_GENERAL_JOBHEADER header, DIM_CDFEATURE feature, FACT_CDRAWANDLER fact  WHERE header.DIM_JH_SK = fact.DIM_JH_SK  AND feature.DIM_CDFEATURE_SK = fact.DIM_CDFEATURE_SK  AND CD_FEATURENAME LIKE &#39;%_GCD_%&#39;  ORDER BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION, FLOOR((fact.CD_MEASUREMENTPOINTCOORDINATEY+76200)/10500) DESC, (fact.CD_MEASUREMENTPOINTCOORDINATEX+76200)  ) ) -- WHERE DEVFROMMEAN &gt; -30 AND DEVFROMMEAN &lt; 30"/>
            <parameter key="username" value="***"/>
        </operator>
    </operator>
    output at the console:
    P Jun 24, 2008 9:54:32 PM: Initialising process setup
    P Jun 24, 2008 9:54:32 PM: [NOTE] No filename given for result file, using stdout for logging results!
    P Jun 24, 2008 9:54:32 PM: Checking properties...
    P Jun 24, 2008 9:54:32 PM: Properties are ok.
    P Jun 24, 2008 9:54:32 PM: Checking process setup...
    P Jun 24, 2008 9:54:32 PM: Inner operators are ok.
    P Jun 24, 2008 9:54:32 PM: Checking i/o classes...
    P Jun 24, 2008 9:54:32 PM: i/o classes are ok. Process output: ExampleSet.
    P Jun 24, 2008 9:54:32 PM: Process ok.
    P Jun 24, 2008 9:54:32 PM: Process initialised
    P Jun 24, 2008 9:54:32 PM: [NOTE] Process starts
    P Jun 24, 2008 9:54:32 PM: Process:
      Root[0] (Process)
      +- DatabaseExampleSource[0] (DatabaseExampleSource)
    P Jun 24, 2008 9:54:50 PM: [NOTE] Process finished after 18 seconds
    P Jun 24, 2008 9:54:50 PM: Process:
      Root[1] (Process)
      +- DatabaseExampleSource[1] (DatabaseExampleSource)
    P Jun 24, 2008 9:54:50 PM: Produced output:
    IOContainer (1 objects):
    SimpleExampleSet:
    134424 examples,
    2 regular attributes,
    no special attributes
    (created by DatabaseExampleSource)
    P Jun 24, 2008 9:54:50 PM: [NOTE] Process finished successfully
    G Jun 24, 2008 9:54:50 PM: [Error] Plotter: the given data contains missing values. Probably most plotters will not be able to produce proper visualizations. Please replace missing values beforehand if possible.
    G Jun 24, 2008 9:54:50 PM: [Warning] Cannot plot all data points, using only a sample of 1000 rows.
    G Jun 24, 2008 9:54:51 PM: [NOTE] Cannot use plotter 'Scatter 3D': Data table is not allowed to contain missing values. Please replace them beforehand.
    G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Scatter 3D Color': Data table is not allowed to contain missing values. Please replace them beforehand.
    G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'SOM': Data table is not allowed to contain missing values. Please replace them beforehand.
    G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Density': Data table is not allowed to contain missing values. Please replace them beforehand.
    G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Sticks 3D': Data table is not allowed to contain missing values. Please replace them beforehand.
    G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Box 3D': Data table is not allowed to contain missing values. Please replace them beforehand.
    G Jun 24, 2008 9:54:52 PM: [NOTE] Cannot use plotter 'Surface 3D': Data table must have between 0 and 50 rows, was 1000.
    -----------------------------------

    rapidminer 4.1

    Process XML:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_system" value="Oracle"/>
            <parameter key="database_url" value="jdbc:oracle:thin:@*******:1521:***"/>
            <parameter key="password" value="***"/>
            <parameter key="query" value="SELECT JH_JOBNUMBER, DEVFROMMEAN FROM (  SELECT JH_JOBNUMBER, (CD_VALUE-AVG(CD_VALUE) OVER(PARTITION BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION)) AS DEVFROMMEAN  FROM (  SELECT JH_JOBNUMBER, CD_FEATURENAME,    CASE    WHEN CD_FEATURENAME LIKE &#39;%280%&#39; THEN &#39;280&#39;    WHEN CD_FEATURENAME LIKE &#39;%400%&#39; THEN &#39;400&#39;  END AS FEATURE_SIZE,  CASE    WHEN CD_FEATURENAME LIKE &#39;%X%&#39; THEN &#39;X&#39;    WHEN CD_FEATURENAME LIKE &#39;%Y%&#39; THEN &#39;Y&#39;  END AS ORIENTATION,  CD_VALUE  FROM DIM_GENERAL_JOBHEADER header, DIM_CDFEATURE feature, FACT_CDRAWANDLER fact  WHERE header.DIM_JH_SK = fact.DIM_JH_SK  AND feature.DIM_CDFEATURE_SK = fact.DIM_CDFEATURE_SK  AND CD_FEATURENAME LIKE &#39;%_GCD_%&#39;  ORDER BY JH_JOBNUMBER, FEATURE_SIZE, ORIENTATION, FLOOR((fact.CD_MEASUREMENTPOINTCOORDINATEY+76200)/10500) DESC, (fact.CD_MEASUREMENTPOINTCOORDINATEX+76200)  ) ) -- WHERE DEVFROMMEAN &gt; -30 AND DEVFROMMEAN &lt; 30"/>
            <parameter key="username" value="***"/>
        </operator>
    </operator>
    output at the console:
    P Jun 24, 2008 10:15:23 PM: Initialising process setup
    P Jun 24, 2008 10:15:23 PM: [NOTE] No filename given for result file, using stdout for logging results!
    P Jun 24, 2008 10:15:23 PM: Checking properties...
    P Jun 24, 2008 10:15:23 PM: Properties are ok.
    P Jun 24, 2008 10:15:23 PM: Checking process setup...
    P Jun 24, 2008 10:15:23 PM: Inner operators are ok.
    P Jun 24, 2008 10:15:23 PM: Checking i/o classes...
    P Jun 24, 2008 10:15:23 PM: i/o classes are ok. Process output: ExampleSet.
    P Jun 24, 2008 10:15:23 PM: Process ok.
    P Jun 24, 2008 10:15:23 PM: Process initialised
    P Jun 24, 2008 10:15:23 PM: [NOTE] Process starts
    P Jun 24, 2008 10:15:23 PM: Process:
      Root[0] (Process)
      +- DatabaseExampleSource[0] (DatabaseExampleSource)
    G Jun 24, 2008 10:15:28 PM: [Error] While reading examples from result set: operation not allowed: Unsupported syntax for refreshRow()
    P Jun 24, 2008 10:15:28 PM: [NOTE] Process finished after 5 seconds
    P Jun 24, 2008 10:15:28 PM: Process:
      Root[1] (Process)
      +- DatabaseExampleSource[1] (DatabaseExampleSource)
    P Jun 24, 2008 10:15:28 PM: Produced output:
    IOContainer (1 objects):
    SimpleExampleSet:
    10 examples,
    2 regular attributes,
    no special attributes
    (created by DatabaseExampleSource)
    P Jun 24, 2008 10:15:28 PM: [NOTE] Process finished successfully
    when i execute the process for a second time, the error message disappear..
    P Jun 24, 2008 10:15:59 PM: Initialising process setup
    P Jun 24, 2008 10:15:59 PM: [NOTE] No filename given for result file, using stdout for logging results!
    P Jun 24, 2008 10:15:59 PM: Checking properties...
    P Jun 24, 2008 10:15:59 PM: Properties are ok.
    P Jun 24, 2008 10:15:59 PM: Checking process setup...
    P Jun 24, 2008 10:15:59 PM: Inner operators are ok.
    P Jun 24, 2008 10:15:59 PM: Checking i/o classes...
    P Jun 24, 2008 10:15:59 PM: i/o classes are ok. Process output: ExampleSet.
    P Jun 24, 2008 10:15:59 PM: Process ok.
    P Jun 24, 2008 10:15:59 PM: Process initialised
    P Jun 24, 2008 10:15:59 PM: [NOTE] Process starts
    P Jun 24, 2008 10:15:59 PM: Process:
      Root[1] (Process)
      +- DatabaseExampleSource[1] (DatabaseExampleSource)
    P Jun 24, 2008 10:16:06 PM: [NOTE] Process finished after 6 seconds
    P Jun 24, 2008 10:16:06 PM: Process:
      Root[1] (Process)
      +- DatabaseExampleSource[1] (DatabaseExampleSource)
    P Jun 24, 2008 10:16:06 PM: Produced output:
    IOContainer (1 objects):
    SimpleExampleSet:
    10 examples,
    2 regular attributes,
    no special attributes
    (created by DatabaseExampleSource)
    P Jun 24, 2008 10:16:06 PM: [NOTE] Process finished successfully
    I use the latest java jdk 1.5 with the ojdbc driver for java 1.4/1.5 (ojdbc14.jar).
    Hope this helps a little ?!

    -ajay-
  • IngoRMIngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hello,

    hmm, it could be that there is a problem with the Oracle JDBC driver and order / grouping statements:

    http://www.orafaq.com/forum/?t=msg&;th=34929/0/

    and

    http://forum.java.sun.com/thread.jspa?threadID=272795&;messageID=1282624

    Maybe these links and the suggested workaround could help? Although I would not yet have an idea why it worked in RM 4.0 then. I try to check what differs between both version with respect to database access.

    Cheers,
    Ingo
Sign In or Register to comment.