🦉🦉   WOOT WOOT!   RAPIDMINER WISDOM 2020 EARLY BIRD REGISTRATION ENDS FRIDAY DEC 13!   REGISTER NOW!   ðŸ¦‰ðŸ¦‰
ALL FEATURE REQUESTS HERE ARE MONITORED BY OUR PRODUCT TEAM.

VOTING MATTERS!

IDEAS WITH HIGH NUMBERS OF VOTES (USUALLY ≥ 10) ARE PRIORITIZED IN OUR ROADMAP.

NOTE: IF YOU WISH TO SUGGEST A NEW FEATURE, PLEASE POST A NEW QUESTION AND TAG AS "FEATURE REQUEST". THANK YOU.

Update Database is unexpectedly slow

BalazsBaranyBalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 339   Unicorn
edited March 28 in Product Ideas
I had a situation today where a big table (nicely set up with indexes) has to be updated with Update Database, with the integer ID field as the key (attribute filter type: single, attribute: id). The ID is displayed by RapidMiner as an integer, too.

It was much slower than it should have been: when testing an update, it finished in milliseconds, but in RapidMiner it took seconds instead.

Having already seen this kind of unexpected slowness with other software, I checked the access statistics in the database. The table showed many (and growing) full table scans, but a constant number of index scans. So the index of the primary key field was not being used.

Creating a second index on the id field with the double precision (floating point number) type helped, the throughput of Update Database improved thousand fold, the index was being used according to the table statistics, and the number of full table scans stayed constant. This is of course not a nice workaround, but for me it helped.

Update Database uses "prepared statements". In these, all columns get an assigned data type. Depending on the compatibility between types in the database, an index can or can't be used if the data type doesn't match. Here, the integer index was not used with the floating point data type.

Hopefully this will be fixed in one of the next releases. Until then, you can work around the problem by creating a second index with the appropriate data type in your database to speed up Update Database.

This is on PostgreSQL 10, your database might work differently. 
mstogicaBalazsBarany
1
1 votes

Open for Voting · Last Updated

31 May 2019 - redesignated as feature request - moved to Product Ideas PROD-703

Sign In or Register to comment.