"streaming database"

knulph Member Posts: 3 Contributor I
edited May 23 in Help
Hi guys,
I have to classify an image, and the CSV file is about 2GB. So I decided to use Ingres and the Stream Database function in RM.

Ingres offers a nice GUI to import a CSV file, but when I try to "stream the database" in RM, I get the following error:


Aug 29, 2010 1:17:09 PM SEVERE: Process failed: Database error occurred: ALTER TABLE: Invalid combination of attribute qualifiers specified. 
ADD COLUMN does not support 'with system_maintained', or 'not null' 
without an accompanying 'with default' (with no explicit default 
value).
Aug 29, 2010 1:17:09 PM SEVERE: Here:          Process[1] (Process)
          subprocess 'Main Process'
      ==>  +- Stream Database[1] (Stream Database)


I get the same error even when I use the Iris data set already included in the RM-Ingres bundle.

Can anyone tell me what the error is and how I can fix it? I admit I know nothing about database systems.

Thanks a lot :-)
Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi,
    you will need a primary key in the table to identify each example. If there's no primary key, RapidMiner tries to create one, but that seems to fail here. Unfortunately the SQL "standard" is more of a rule of thumb for the database vendors, and the actual syntax differs a lot between them...

    Greetings,
      Sebastian
  • knulph Member Posts: 3 Contributor I
    Thanks for your reply. Not sure what that means though :-(

    If I read the RM help about streaming database:

    "if no primary key and index is present, a new column named RM_INDEX is created and automatically used as primary key,
    if a primary key is already present in the specified table, a new table named RM_MAPPED_INDEX is created mapping a new index column RM_INDEX to the original primary key."

    it seems that the primary key is defined automatically by RM. However, I believe something still needs to be set in Ingres, because I keep getting the same error (ALTER TABLE).
  • dan_agape Member Posts: 106  Guru
    Hi,

    Indeed, the documentation says that RM_INDEX is created if you do not have a primary key. However, with some DBMS, problems may still appear with this and/or other aspects (data types, etc.). For instance, I ran into such problems with PostgreSQL, a successor of Ingres, which is the most advanced and powerful free DBMS and second in popularity after MySQL (which owes its position to its Web applicability). Hopefully the RM team will test various aspects of connecting RM with PostgreSQL in more detail at some point.

    But you may wish to switch to one of the DBMS that users have utilised without particular problems with RM - including MySQL and SQL Server (the Express Edition is free and quite capable - 10GB database storage).

    Dan 
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi,
    or you could define the primary key yourself, if RapidMiner isn't allowed to create it. Maybe it's just a user rights problem?

    Greetings,
      Sebastian
  • knulph Member Posts: 3 Contributor I
    Hi guys, thanks for your help and suggestions.

    First of all, I am running 64-bit Windows 7, and I noticed some differences when running both RM and Ingres as administrator instead of with the normal setup.
    So I followed the Ingres user guide to create and populate the database, and defined a primary key. Then, if I run RM with the "recreate index" option checked, I get this result:

    Sep 3, 2010 10:06:05 AM SEVERE: Process failed: Database error occurred: ALTER TABLE: Invalid combination of attribute qualifiers specified. 
    ADD COLUMN does not support 'with system_maintained', or 'not null' 
    without an accompanying 'with default' (with no explicit default 
    value).
    Sep 3, 2010 10:06:05 AM SEVERE: Here:          Process[1] (Process)
              subprocess 'Main Process'
          ==>  +- Stream Database[1] (Stream Database)


    If I don't check that option, I get a different result (which is the main difference compared with a week ago):

    Sep 3, 2010 10:06:10 AM SEVERE: Process failed: Database error occurred: line 1, Column 'rm_index' not found in any specified table.
    Sep 3, 2010 10:06:10 AM SEVERE: Here:          Process[1] (Process)
              subprocess 'Main Process'
          ==>  +- Stream Database[1] (Stream Database)

    and I notice that RM writes the first column into a new table within Ingres, but only the first one. So overall I believe I made some minor progress, but I still haven't achieved my goal. Dan, I will try to use a different DB system. Any help in the meantime would be highly appreciated.

    Sebastian, by the way, I am converting my training samples from a TIFF image to a CSV file. It would be a big improvement if RM could read images (there is a huge remote sensing community that uses classification methods for various projects).

    Thanks a lot for your help!
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,525   Unicorn
    Hi,
    what happens if you just use the Ingres tools to add a primary key column to your table?

    Besides this, we are aware of the image processing needs and we would really like to add such an extension. But as long as nobody helps us, either by contributing code or by paying us so that we can hire an additional programmer, there's simply no way to achieve this in the near future. We are, however, releasing an R extension during the RCOMM. Is there an image processing library for R? Then you could simply use it and transfer the data to RapidMiner.

    Greetings,
      Sebastian Land
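
For readers who land on this thread with the same error, the idea discussed above (the table needs an integer key so rows can be fetched in chunks) can be sketched roughly as follows. This is only an illustration, not what RapidMiner does internally and not Ingres syntax: it uses Python's built-in sqlite3 module as a stand-in database, and the table `pixels`, its columns, and both helper functions are hypothetical names. Only the column name `rm_index` comes from the thread. Instead of adding a NOT NULL column with ALTER TABLE (the statement Ingres rejected), the sketch copies the data into a new table that already has the key column.

```python
import sqlite3


def add_index_copy(conn, source, target, index_col="rm_index"):
    """Copy `source` into a new table `target` with an integer primary key.

    Hypothetical helper: avoids ALTER TABLE ... ADD COLUMN entirely.
    """
    # Column names of the source table (SQLite-specific introspection).
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({source})")]
    col_list = ", ".join(cols)
    conn.execute(
        f"CREATE TABLE {target} ({index_col} INTEGER PRIMARY KEY, {col_list})"
    )
    # SQLite fills the INTEGER PRIMARY KEY automatically: 1, 2, 3, ...
    conn.execute(
        f"INSERT INTO {target} ({col_list}) SELECT {col_list} FROM {source}"
    )


def stream_batches(conn, table, index_col="rm_index", batch_size=2):
    """Yield rows in key order, `batch_size` rows at a time."""
    last = 0
    while True:
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE {index_col} > ? "
            f"ORDER BY {index_col} LIMIT ?",
            (last, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last = rows[-1][0]  # the key is the first column of each row


# Tiny demo with made-up data standing in for the 2GB CSV.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pixels (band1 REAL, band2 REAL, label TEXT)")
conn.executemany(
    "INSERT INTO pixels VALUES (?, ?, ?)",
    [(0.1 * i, 0.2 * i, "water") for i in range(5)],
)
add_index_copy(conn, "pixels", "pixels_indexed")
batches = list(stream_batches(conn, "pixels_indexed"))
```

Whether a table prepared this way satisfies Stream Database on Ingres is not confirmed anywhere in this thread, so treat it as a starting point for experiments rather than a fix.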