Number of rows in Write Database operator
Perhaps this has been considered, but just in case:
Writing large scored example sets to a database with the Write Database operator takes quite long in one of my applications. For instance, writing 5 million rows to a MySQL database took about one hour. In addition to tuning the database itself, some optimisation on RM's side may be possible and would be very welcome. For instance, writing an example set to a database table may be sped up if rows/examples are written in groups (i.e. each group of examples is written via a single INSERT SQL command). If there is only one row per INSERT, the time for the connection, for sending the query to the database server, and for parsing the query is spent unnecessarily for every row. If several rows are grouped into one INSERT command, comparable times are spent once per group of rows instead of once per individual row.
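To illustrate the grouping idea, here is a minimal sketch in Python. It is not how the Write Database operator works internally; it uses an in-memory SQLite database as a stand-in for MySQL, and the names `write_grouped`, `rows_per_insert`, and the `scored` table are made up for the example. Table and column names are assumed to be trusted, since they are pasted into the SQL text.

```python
import sqlite3
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def write_grouped(conn, table, columns, rows, rows_per_insert):
    """Write rows with one multi-row INSERT per group; return the statement count."""
    col_list = ", ".join(columns)
    one_row = "(" + ", ".join("?" for _ in columns) + ")"
    statements = 0
    for chunk in chunked(rows, rows_per_insert):
        # One statement carries len(chunk) rows: INSERT ... VALUES (?, ?), (?, ?), ...
        sql = f"INSERT INTO {table} ({col_list}) VALUES " + ", ".join([one_row] * len(chunk))
        flat_values = [value for row in chunk for value in row]
        conn.execute(sql, flat_values)
        statements += 1
    conn.commit()
    return statements


# Hypothetical example data: 1050 scored rows written 100 rows per INSERT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scored (id INTEGER, score REAL)")
rows = [(i, i * 0.5) for i in range(1050)]
n_statements = write_grouped(conn, "scored", ["id", "score"], rows, rows_per_insert=100)
n_rows = conn.execute("SELECT COUNT(*) FROM scored").fetchone()[0]
```

With these numbers the 1050 rows go to the database in 11 statements instead of 1050. The group size cannot grow without limit, because databases cap the statement size or the number of bound parameters, so a configurable value seems more practical than a fixed one.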
So it would be useful to add a parameter "number of rows" to the Write Database operator that specifies how many rows are grouped per INSERT command. If some DBMSs do not support several rows per INSERT, this number could at least determine how many individual INSERT commands (with one row per command) are grouped in the same SQL transaction, so an optimisation would result either way.
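The fallback variant (single-row INSERTs grouped per transaction) could look roughly like the following Python sketch. Again this is only an illustration under assumptions: SQLite stands in for the real DBMS, and `write_in_transactions`, `rows_per_transaction`, and the `scored` table are hypothetical names, not existing operator parameters.

```python
import sqlite3


def write_in_transactions(conn, rows, rows_per_transaction):
    """Issue one single-row INSERT per row, committing once per group.

    Returns the number of commits performed.
    """
    commits = 0
    pending = 0
    for row in rows:
        # sqlite3 opens a transaction implicitly before the first INSERT of a group.
        conn.execute("INSERT INTO scored (id, score) VALUES (?, ?)", row)
        pending += 1
        if pending == rows_per_transaction:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the last, partially filled group.
        conn.commit()
        commits += 1
    return commits


# Hypothetical example data: 250 rows, 100 single-row INSERTs per transaction.
conn2 = sqlite3.connect(":memory:")
conn2.execute("CREATE TABLE scored (id INTEGER, score REAL)")
commit_count = write_in_transactions(
    conn2, [(i, i * 0.5) for i in range(250)], rows_per_transaction=100
)
stored = conn2.execute("SELECT COUNT(*) FROM scored").fetchone()[0]
```

Here 250 rows need only 3 commits. Each row still costs one statement, so this saves less than the multi-row INSERT, but it should work on any DBMS that supports transactions.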