RapidMiner

Cannot store data in repository at entry.... Violation of primary key constraint

SOLVED
Contributor II


When I try to store a dataset in a repository on a remote server, I get the error message below. It appears to be inserting the attribute names into a metadata table that already contains them. How do I clear out these metadata tables?

 

rm_server_error.png

5 REPLIES
Community Manager

Re: Cannot store data in repository at entry.... Violation of primary key constraint

You could try a Materialize Data operator right before you write the data to the database, but I'm not sure that will fix it.

Regards,
Thomas
LinkedIn: Thomas Ott
Blog: Neural Market Trends
Moderator

Re: Cannot store data in repository at entry.... Violation of primary key constraint

Hello,

 

Could you attach the more verbose error from the log view for that process? You should be able to access it by clicking the log icon next to the completion times in your process scheduler.

 

 

image.png

 

 

 

This fuller log segment will give us a better idea of the specific error.

Contributor II

Re: Cannot store data in repository at entry.... Violation of primary key constraint

Sure, here is the full error from the log file.

 

SEVERE: Process failed: com.rapidminer.operator.UserError: Cannot store data in repository at entry '../data/model_table_test'. Reason: Cannot store example set in database as es_31: com.microsoft.sqlserver.jdbc.SQLServerException: Violation of PRIMARY KEY constraint 'PK__es_31_me__1842C92D2CCBCF70'. Cannot insert duplicate key in object 'dbo.es_31_meta'. The duplicate key value is (STATUS)..
Jul 19, 2017 12:56:02 PM <unknown> <unknown>
SEVERE: Here:           Process[1] (Process)
           subprocess 'Main Process'
             +- Retrieve sample_table[1] (Retrieve)
             +- prep table[1] (Subprocess)
           subprocess 'Nested Process'
             |     +- Set Role[1] (Set Role)
             |     +- Generate ID[1] (Generate ID)
             |     +- Nominal to Text[1] (Nominal to Text)
             +- Multiply[1] (Multiply)
             +- Filter disk vars[1] (Select Attributes)
             +- Select id & desc[1] (Select Attributes)
             +- Token and Filter[1] (Process Documents from Data)
           subprocess 'Vector Creation'
             |     +- Filter Tokens (by Content)[0] (Filter Tokens (by Content))
             |     +- Transform Cases[2000] (Transform Cases)
             |     +- Remove xslt[2000] (Execute Script)
             |     +- Tokenize[2000] (Tokenize)
             |     +- Filter Tokens (by Length)[2000] (Filter Tokens (by Length))
             |     +- Filter junk out[2000] (Execute Script)
             +- Execute GenAttr[1] (Execute Process)
             +- Join[1] (Join)
             +- Generate Attributes[1] (Generate Attributes)
             +- Select Attributes[1] (Select Attributes)
       ==>   +- store model_table[1] (Store)
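
The SQL Server message in the first log line already names the offending value: the duplicate key is `STATUS`, in the metadata table `dbo.es_31_meta`. As a minimal sketch, the relevant fields could be pulled out of such a message with a regular expression. The pattern and the function name `parse_pk_violation` are illustrative assumptions based only on the wording quoted in this log; they are not part of RapidMiner or the JDBC driver.

```python
import re

# Pattern modelled on the wording of the message in the log above (an assumption;
# other SQL Server versions or locales may phrase it differently).
PK_VIOLATION = re.compile(
    r"Violation of PRIMARY KEY constraint '(?P<constraint>[^']+)'\. "
    r"Cannot insert duplicate key in object '(?P<table>[^']+)'\. "
    r"The duplicate key value is \((?P<key>.*?)\)\."
)


def parse_pk_violation(message):
    """Return constraint, table and duplicate key as a dict, or None if no match."""
    match = PK_VIOLATION.search(message)
    return match.groupdict() if match else None


# Excerpt copied from the log line above.
log_excerpt = (
    "Violation of PRIMARY KEY constraint 'PK__es_31_me__1842C92D2CCBCF70'. "
    "Cannot insert duplicate key in object 'dbo.es_31_meta'. "
    "The duplicate key value is (STATUS).."
)

info = parse_pk_violation(log_excerpt)
print(info["table"], info["key"])
```

The extracted key is the attribute name to look for in the ExampleSet before it reaches the Store operator.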

 

RMStaff

Re: Cannot store data in repository at entry.... Violation of primary key constraint

Hi ccricha,

 

The primary key in the metadata tables contains the attribute names, which need to be unique in RapidMiner (and they definitely are in the RapidMiner object you want to store).

Nevertheless, I suspect there are characters in these names that your database may not interpret correctly.

Just for testing purposes, could you please try using the Rename by Generic Names operator to rename all attributes?

The attributes in the resulting ExampleSet should then be named 'att_1', 'att_2', and so on.

 

Best,

Edin

Contributor II

Re: Cannot store data in repository at entry.... Violation of primary key constraint

It turns out that two attributes had the same name, which explains the violation of the primary key constraint in the database. I removed some unnecessary columns and it now works.
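
For anyone hitting the same error, here is a rough sketch of how colliding attribute names could be spotted before storing. The thread does not say how the two names differed, so the normalization used here (trimming whitespace and ignoring case, on the assumption that the database may compare names more loosely than RapidMiner does) is a guess, and the attribute list is made up for illustration.

```python
from collections import defaultdict


def find_colliding_names(names):
    """Group names that become identical after trimming whitespace and ignoring case."""
    groups = defaultdict(list)
    for name in names:
        # Assumed normalization: the database might treat these variants as equal.
        groups[name.strip().casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical attribute names; only "STATUS" appears in the actual error message.
attribute_names = ["id", "STATUS", "Status", "description"]

collisions = find_colliding_names(attribute_names)
print(collisions)  # [['STATUS', 'Status']]
```

Any group reported this way is a candidate for renaming or removal before the Store operator runs.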