-
Improvement
-
Resolution: Unresolved
-
Minor
-
None
-
3.11.12, 4.1
-
MOODLE_311_STABLE, MOODLE_401_STABLE
We are running a Moodle instance that currently hosts around 300k users. Our course sizes range between a few hundred and over 20k users. Owners of courses with more than a few thousand users started complaining a while ago about timeouts when trying to export completion reports in CSV or Excel format.
We initially thought that there might be a missing index in the DB or something similar, but couldn't really find a query that took significantly longer than others. After adding a debug statement to log the SQL queries we were shocked to find out that over 250k queries had been generated in the process of one request (https://<MOODLE-HOST>/report/completion/index.php?course=<COURSE-ID>&format=csv). Over 99% look like this: SELECT FROM mdl_course_completion_crit_compl WHERE course = ? AND userid = ? AND criteriaid = ?. It seems like the queries are generated in this nested loop, which, for each user, goes through each completion criterium. This doesn't seem to be a problem for small courses with few criteria, but can grow quickly the bigger the course is (O(n*m)).
Unfortunately, I don't have too much knowledge about the code base. But it would probably be already an improvement to only retrieve the criteria_completion for each user once. Or, better, just have a query like SELECT FROM mdl_course_completion_crit_compl WHERE course = ?.
How to reproduce
- Create a course that has a few completion criteria (in our case we have 12) and a few thousand users (in our case around 21k).
- In report/completion/index.php add $DB->set_debug(true); at the beginning of the file to enable SQL debugging
- Visit /report/completion/index.php?course=<COURSE-ID>&format=csv
- Count the number of occurrences of SELECT * FROM mdl_course_completion_crit_compl WHERE course = ? AND userid = ? AND criteriaid = ?. It should be around 250k
What I expected
The report should start to download within a few seconds
What actually happens
It took around 10 minutes until the download started, which is way above the idle timeout of our load balancer (1 minute).