A client with a large number of users (~760k students) in a single course is observing a significant performance issue with queries of the form:
- There are a number of variations of this query taking a long time to complete; the variations mostly appear contained to different values for `WHERE u.firstname LIKE ... AND u.lastname LIKE...`
- These kinds of queries take a long time to complete, with logged times between 15 minutes and 3+ hours.
- Due to the long-running nature of this kind of query, the client is experiencing performance issues on their DB server(s) - these queries generally hit reader nodes, as expected.
- roleid IN (5) refers to the Student role
- contextid IN (1,3,10094,10102,11955,1333860) can be traced back to a single course (1333860 is the course's context; 11955, 10102, 10094, 1 are the course's containing course categories)
- It appears these kind of queries are triggered by visiting the Course Completion and Activity Completion reports, with the queries themselves almost certainly originating from lib/completionlib.php::get_num_tracked_users()
- looking at explain plan, mdl_role_assignments subquery causing most of the load, the DISTINCT clause causing unnecessary deduplication, user list is dereplicated again later in top select, can get rid of find JOIN to mdl_user, not needed for counts
- can't create new index, index already exist for role, contextid for mdl_role_assignments table
- the mdl_role_assignments subquery is forced to range scan table, which will net [almost everyone in the system] from the given context values.
# Query_time: 849.897175 Lock_time: 0.000195 Rows_sent: 1 Rows_examined: 4273909