"write Database Operator - Error: table already exists"

BArndtBArndt Member Posts: 2 Contributor I
edited June 12 in Help
Hi RM community,
since a few days i try solve a litte problem.

I have a MySQL DB with table x, this one is created in a preprocess to realise parallel processing. If i try to write something into this table with the write Database Operator (prdefined, append, different batch sizes) i always get the message:
Databse error occured: Table 'x' already exists.

Bug, Feature or some stupid mistake.

It is working one time if i dont create the table before. The Operator creates the table and fill the data inside, but to read out the data and recreate the table every subprocess is no solution for me.

- columns have the rigth names and metadata

Thanks in advance,

Benjamin
Tagged:

Answers

  • CharlieFirpoCharlieFirpo Member Posts: 48 Contributor II
    Dear Benjamin,

    Maybe fine logging can give you some extra information about the problem (it's only a tip):
    Tools -> Preferences -> Gui -> rapidminer.gui.log.level : FINE

    Run the process then check out the Log tab.

    Good luck!
    Charlie
  • JEdwardJEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 564   Unicorn
    It sounds like a problem with you chosen parameters on the operator. 
    RapidMiner has a few options with the write database operator.
    1: that it creates a table & writes data to it.  For this you would choose the parameter 'none'.
    2: that it appends data into the table.
    3: that it overwrites an existing table.

    Choose which one you would like to use depending on your purpose. 
    There is also another operator Update Database which takes a key field you provide it and either adds new records or updates existing matches. 

    Try having a look at the below process with the comments on the operators.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.015">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_data" compatibility="5.3.015" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
          <operator activated="true" class="generate_id" compatibility="5.3.015" expanded="true" height="76" name="Generate ID" width="90" x="45" y="120"/>
          <operator activated="true" class="multiply" compatibility="5.3.015" expanded="true" height="130" name="Multiply" width="90" x="45" y="210"/>
          <operator activated="true" class="update_database" compatibility="5.3.015" expanded="true" height="60" name="Update Database" width="90" x="246" y="300">
            <description>This is the update database operator. 
    It will update existing records with a matching ID.  Any new records will be appended to the database table. </description>
            <parameter key="table_name" value="MyTable4"/>
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="id"/>
          </operator>
          <operator activated="true" class="write_database" compatibility="5.3.015" expanded="true" height="60" name="Write Database (3)" width="90" x="246" y="210">
            <description>This operator has Overwrite Mode set to 'append'.  If the table already exists then the records from your process will be written into this database alongside the existing records. </description>
            <parameter key="table_name" value="MyTable3"/>
            <parameter key="overwrite_mode" value="append"/>
          </operator>
          <operator activated="true" class="write_database" compatibility="5.3.015" expanded="true" height="60" name="Write Database" width="90" x="246" y="30">
            <description>This operator has Overwrite Mode set to 'none'.  The table will be created and data written to it. 
    If the table already exists this parameter will throw an error. </description>
            <parameter key="table_name" value="MyTable"/>
          </operator>
          <operator activated="true" class="write_database" compatibility="5.3.015" expanded="true" height="60" name="Write Database (2)" width="90" x="246" y="120">
            <description>This operator has Overwrite Mode set to 'overwrite'.  If the table already exists the existing data will be overwritten. </description>
            <parameter key="table_name" value="MyTable2"/>
            <parameter key="overwrite_mode" value="overwrite"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Write Database" to_port="input"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Write Database (2)" to_port="input"/>
          <connect from_op="Multiply" from_port="output 3" to_op="Write Database (3)" to_port="input"/>
          <connect from_op="Multiply" from_port="output 4" to_op="Update Database" to_port="input"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>
  • BArndtBArndt Member Posts: 2 Contributor I
    Thx Edward,
    i know about the different options and i choose append, because the table already existe. This is the point which looks like a bug. Choosing append and getting the message "table already exists". I also tryed Update Database, in this case i always get the message that there is no colum named 'xy'. But the table is correct...

    I choose another option, convert the data to csv and load the csv into the table, this is working.

    but thx for the help
  • JEdwardJEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 564   Unicorn
    Hi Benjamin,

    Is there a column named 'xy'?  I often get messages when I forget that the column names need to match exactly. 
    For example attribute 'email' goes into database column 'email'
    but an error is thrown for attribute 'email' goes into database column 'Email'

    I often use these operators for reading, writing & updating data into databases via SQL Server, MySQL & PostgreSQL so I'm not sure it's a bug. 
    Are you able to replicate your problem in a simple example and upload the process(es) here? 

    One thing I have learned when doing this is that it's often better to create database tables using the Execute SQL operator as it gives more fine control over indexes, keys & column attributes. 
    Another tip when dealing with very large datasets it can sometimes be better to use the Execute SQL operator with a LOAD INFILE statement as this means that your RapidMiner isn't overloaded by memory out errors.
Sign In or Register to comment.