Details
-
Bug
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
3.10.9
-
None
-
MOODLE_310_STABLE
Description
A client with a large number of users (~760k students) in a single course is observing a significant performance issue with queries of the form:
SELECT COUNT(eu.id) FROM (SELECT DISTINCT eu1_u.id |
FROM mdl_user eu1_u |
JOIN mdl_user_enrolments ej1_ue ON ej1_ue.userid = eu1_u.id |
JOIN mdl_enrol ej1_e ON (ej1_e.id = ej1_ue.enrolid AND ej1_e.courseid = '11149') |
JOIN (SELECT DISTINCT userid |
FROM mdl_role_assignments |
WHERE contextid IN (1,3,10094,10102,11955,1333860) |
AND roleid IN (5) |
) ra ON ra.userid = eu1_u.id |
WHERE 1 = 1 AND ej1_ue.status = '0' AND ej1_e.status = '0' AND ej1_ue.timestart < 1638389800 AND (ej1_ue.timeend = 0 OR ej1_ue.timeend > 1638389800) AND eu1_u.deleted = 0 AND eu1_u.id <> '1' AND eu1_u.deleted = 0) eu JOIN mdl_user u ON u.id = eu.id WHERE u.firstname LIKE 'E%' ESCAPE '\\' AND u.lastname LIKE 'W%' ESCAPE '\\' |
- There are a number of variations of this query taking a long time to complete; the variations mostly appear contained to different values for `WHERE u.firstname LIKE ... AND u.lastname LIKE...`
- These kinds of queries take a long time to complete, with logged times between 15 minutes and 3+ hours.
- Due to the long-running nature of this kind of query, the client is experiencing performance issues on their DB server(s) - these queries generally hit reader nodes, as expected.
- roleid IN (5) refers to the Student role
- contextid IN (1,3,10094,10102,11955,1333860) can be traced back to a single course (1333860 is the course's context; 11955, 10102, 10094, 1 are the course's containing course categories)
- It appears these kind of queries are triggered by visiting the Course Completion and Activity Completion reports, with the queries themselves almost certainly originating from lib/completionlib.php::get_num_tracked_users()
Notes from a DBA working for the client:
- looking at explain plan, mdl_role_assignments subquery causing most of the load, the DISTINCT clause causing unnecessary deduplication, user list is dereplicated again later in top select, can get rid of find JOIN to mdl_user, not needed for counts
- can't create new index, index already exist for role, contextid for mdl_role_assignments table
- the mdl_role_assignments subquery is forced to range scan table, which will net [almost everyone in the system] from the given context values.
More context from the server logs
# Query_time: 849.897175 Lock_time: 0.000195 Rows_sent: 1 Rows_examined: 4273909