Moodle

User/index.php is really slow when on participants page

Details

  • Type: Sub-task Sub-task
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 1.9
  • Fix Version/s: 1.9
  • Component/s: Roles / Access
  • Labels:
    None
  • Affected Branches:
    MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_19_STABLE

Description

The attached patch seems to speed up the display one page from 317,000 users on moodle.org from around 30 seconds to about 4, with no ill effects.

Can I have some review of this before I check it in? It's running on moodle.org already by the way.

Activity

Hide
Martin Dougiamas added a comment -

Just committed a simpler fix based on discussion in Moodle HQ chat - remove one DISTINCT! Good one, ML!

Show
Martin Dougiamas added a comment - Just committed a simpler fix based on discussion in Moodle HQ chat - remove one DISTINCT! Good one, ML!
Hide
Martín Langhoff added a comment -

Notss from collaborative debugging and benchmarking on MHQ:

  • the DISTINCT over a very wide set of rows is incredibly expensive if the dataset is large
  • it gets specially bad when we add the outer join for contexts (30s on moodle.org)
  • some cases don't need the DISTINCT, MD fixing one of them

So the fix where we need DISTINCT is to push it into a subselect. Do all the "selection of the user id (based on enrolments), with a distinct to avoid dups" in a subselect, and the outer select picks the user fields, lastaccess, and context stuff.

Show
Martín Langhoff added a comment - Notss from collaborative debugging and benchmarking on MHQ:
  • the DISTINCT over a very wide set of rows is incredibly expensive if the dataset is large
  • it gets specially bad when we add the outer join for contexts (30s on moodle.org)
  • some cases don't need the DISTINCT, MD fixing one of them
So the fix where we need DISTINCT is to push it into a subselect. Do all the "selection of the user id (based on enrolments), with a distinct to avoid dups" in a subselect, and the outer select picks the user fields, lastaccess, and context stuff.
Hide
Martín Langhoff added a comment -

Martin - the query that was performing badly (1.18s?) according to skype was this:

SELECT DISTINCT u.id, u.username, u.firstname, u.lastname, u.email, u.city, u.country, u.picture, u.lang, u.timezone, u.emailstop, u.maildisplay, u.imagealt, COALESCE( ul.timeaccess, 0 ) AS lastaccess, r.hidden, ctx.id AS ctxid, ctx.path AS ctxpath, ctx.depth AS ctxdepth, ctx.contextlevel AS ctxlevel
FROM user u
LEFT OUTER JOIN context ctx
ON ( u.id = ctx.instanceid
AND ctx.contextlevel =30 )
JOIN role_assignments r
ON u.id = r.userid
LEFT OUTER JOIN user_lastaccess ul
ON ( r.userid = ul.userid
AND ul.courseid =5 )
WHERE (
r.contextid =53
OR r.contextid
IN ( 54, 50 )
)
AND u.deleted =0
AND (
ul.courseid =5
OR ul.courseid IS NULL
)
AND u.username != 'guest'
AND r.roleid NOT
IN ( 1, 2 )
ORDER BY lastaccess DESC
LIMIT 20

According to your notes, removing DISTINCT made it drop to ~0.58s. And then removing the left outer join to about 0.40s.

I've rewritten it below to use a subselect where we apply a much smarter GROUP BY which fixes a bug we had before (hidden/unhidden role-assignments would mean that a user was listed twice) – can you tell me if it compares well?

SELECT u.id, u.username, u.firstname, u.lastname, u.email, u.city, u.country, u.picture, u.lang, u.timezone, u.emailstop, u.maildisplay, u.imagealt, COALESCE( ul.timeaccess, 0 ) AS lastaccess, eu.hidden, ctx.id AS ctxid, ctx.path AS ctxpath, ctx.depth AS ctxdepth, ctx.contextlevel AS ctxlevel
FROM user u
JOIN (SELECT iu.id, MIN(r.hidden) AS hidden
FROM user iu
JOIN role_assignments r
ON iu.id = r.userid
WHERE ( r.contextid =53
OR r.contextid IN ( 54, 50 ) )
AND r.roleid NOT
IN ( 1, 2 )
GROUP BY iu.id
) eu
ON u.id=eu.id
LEFT OUTER JOIN context ctx
ON ( u.id = ctx.instanceid
AND ctx.contextlevel =30 )
LEFT OUTER JOIN user_lastaccess ul
ON ( u.id = ul.userid
AND ul.courseid =5 )
WHERE
u.deleted =0
AND u.username != 'guest'
ORDER BY lastaccess DESC
LIMIT 20

Show
Martín Langhoff added a comment - Martin - the query that was performing badly (1.18s?) according to skype was this: SELECT DISTINCT u.id, u.username, u.firstname, u.lastname, u.email, u.city, u.country, u.picture, u.lang, u.timezone, u.emailstop, u.maildisplay, u.imagealt, COALESCE( ul.timeaccess, 0 ) AS lastaccess, r.hidden, ctx.id AS ctxid, ctx.path AS ctxpath, ctx.depth AS ctxdepth, ctx.contextlevel AS ctxlevel FROM user u LEFT OUTER JOIN context ctx ON ( u.id = ctx.instanceid AND ctx.contextlevel =30 ) JOIN role_assignments r ON u.id = r.userid LEFT OUTER JOIN user_lastaccess ul ON ( r.userid = ul.userid AND ul.courseid =5 ) WHERE ( r.contextid =53 OR r.contextid IN ( 54, 50 ) ) AND u.deleted =0 AND ( ul.courseid =5 OR ul.courseid IS NULL ) AND u.username != 'guest' AND r.roleid NOT IN ( 1, 2 ) ORDER BY lastaccess DESC LIMIT 20 According to your notes, removing DISTINCT made it drop to ~0.58s. And then removing the left outer join to about 0.40s. I've rewritten it below to use a subselect where we apply a much smarter GROUP BY which fixes a bug we had before (hidden/unhidden role-assignments would mean that a user was listed twice) – can you tell me if it compares well? SELECT u.id, u.username, u.firstname, u.lastname, u.email, u.city, u.country, u.picture, u.lang, u.timezone, u.emailstop, u.maildisplay, u.imagealt, COALESCE( ul.timeaccess, 0 ) AS lastaccess, eu.hidden, ctx.id AS ctxid, ctx.path AS ctxpath, ctx.depth AS ctxdepth, ctx.contextlevel AS ctxlevel FROM user u JOIN (SELECT iu.id, MIN(r.hidden) AS hidden FROM user iu JOIN role_assignments r ON iu.id = r.userid WHERE ( r.contextid =53 OR r.contextid IN ( 54, 50 ) ) AND r.roleid NOT IN ( 1, 2 ) GROUP BY iu.id ) eu ON u.id=eu.id LEFT OUTER JOIN context ctx ON ( u.id = ctx.instanceid AND ctx.contextlevel =30 ) LEFT OUTER JOIN user_lastaccess ul ON ( u.id = ul.userid AND ul.courseid =5 ) WHERE u.deleted =0 AND u.username != 'guest' ORDER BY lastaccess DESC LIMIT 20

People

Vote (0)
Watch (4)

Dates

  • Created:
    Updated:
    Resolved: