Type: Improvement
Resolution: Fixed
Priority: Major
Affects Versions: 2.1, 2.2, 2.3, 2.6
Affected Branches: MOODLE_21_STABLE, MOODLE_22_STABLE, MOODLE_23_STABLE, MOODLE_26_STABLE
Fixed Branches: MOODLE_27_STABLE
Pull Branch: w04_MDL-34055_m27_insertrecords
Story Points: 8
Sprint: BACKEND Sprint 9
When inserting a very large number of records, insert_record() can be quite slow (a few hundred ms per insert).
I've done some testing and found that it's possible to get a ~100x performance boost by bundling multiple inserts into one query using the syntax:
INSERT INTO table (field1, field2, field3) VALUES (a1, a2, a3),(b1, b2, b3),(c1, c2, c3),...
I've written a generic function to do bulk inserts:
/**
 * Insert multiple records into a table using batch insert syntax for speed.
 *
 * @param string $table The database table to be inserted into
 * @param iterator $iterator An iterable object (such as moodle_recordset or array)
 *        containing objects to be inserted into $table
 * @return true
 * @throws dml_exception if error
 */
public function insert_records_via_batch($table, $iterator) {}
The function builds the query by adding records from the iterator until it reaches either the maximum query length or maximum record count (both predefined constants), then it does an insert and starts building a fresh query.
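To make the batching concrete, here is a minimal sketch of that loop. The limit values, the column handling, and the call to $this->execute() with ? placeholders are my assumptions for illustration only; the actual implementation may differ.

public function insert_records_via_batch($table, $iterator) {
    $maxrecords  = 500;      // hypothetical: flush after this many buffered records
    $maxquerylen = 1000000;  // hypothetical: flush before the query grows past ~1MB

    $fields = null;    // column names, taken from the first record
    $values = array(); // one '(?,?,...)' group per buffered record
    $params = array(); // flat list of bound values matching the placeholders
    $length = 0;       // rough running estimate of the query size

    foreach ($iterator as $record) {
        $record = (array)$record;
        if ($fields === null) {
            $fields = array_keys($record);
        }
        $values[] = '(' . implode(',', array_fill(0, count($fields), '?')) . ')';
        foreach ($fields as $field) {
            $params[] = $record[$field];
            $length  += strlen((string)$record[$field]) + 3;
        }
        // Flush the buffer once either limit is reached, then start a fresh query.
        if (count($values) >= $maxrecords || $length >= $maxquerylen) {
            $sql = 'INSERT INTO {' . $table . '} (' . implode(',', $fields) . ') VALUES ' . implode(',', $values);
            $this->execute($sql, $params);
            $values = array();
            $params = array();
            $length = 0;
        }
    }

    // Insert whatever remains in the final partial batch.
    if (!empty($values)) {
        $sql = 'INSERT INTO {' . $table . '} (' . implode(',', $fields) . ') VALUES ' . implode(',', $values);
        $this->execute($sql, $params);
    }
    return true;
}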
Because the function accepts an iterator, you can pass in a whole array or a recordset (so the full set of data to be inserted never has to sit in memory).
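For example, usage might look like this ($DB is the global moodle_database instance; the table and field names are made up for illustration):

// Build an array of records and insert them in batches.
$records = array();
for ($i = 0; $i < 10000; $i++) {
    $record = new stdClass();
    $record->name  = 'example ' . $i;
    $record->value = $i;
    $records[] = $record;
}
$DB->insert_records_via_batch('local_example', $records);

// Or stream records straight from another table via a recordset,
// so the source data never has to be loaded into memory in full.
$rs = $DB->get_recordset_sql('SELECT name, value FROM {local_example_old}');
$DB->insert_records_via_batch('local_example', $rs);
$rs->close();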
Although it works in simple cases, I'm finding that there are quite a few subtleties in the depths of dmllib, and to make it work with all field types and databases I think some core dml changes might be needed.
I'm wondering if there would be any interest in getting this into core?