Moodle
  1. Moodle
  2. MDL-34776

Some enrolment and role related queries are slow

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.1, 2.4
    • Fix Version/s: 2.3.2
    • Component/s: Enrolments
    • Labels:
    • Testing Instructions:
      Hide

      1/ try assigning roles for enrolled users bellow the course level ("locally assigned roles" in module settings block)
      2/ try "normal manual" user enrolling
      3/ try "normal" assigning of roles to not-enrolled users ("Other users" in course settings block)
      4/ disable JS and repeat user ernolment via the old two pane widget
      5/ disable JS and repeat other user role assignment via the old two pane widget
      6/ try to add/remove enrolled users from a group

      Show
      1/ try assigning roles for enrolled users bellow the course level ("locally assigned roles" in module settings block) 2/ try "normal manual" user enrolling 3/ try "normal" assigning of roles to not-enrolled users ("Other users" in course settings block) 4/ disable JS and repeat user ernolment via the old two pane widget 5/ disable JS and repeat other user role assignment via the old two pane widget 6/ try to add/remove enrolled users from a group
    • Affected Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE
    • Pull from Repository:
    • Pull Master Branch:
      w34_MDL-34776_m24_notinsql
    • Rank:
      43264

      Description

      I have a moodle with 600,000 users enrolled and this query is never completing when simply trying to enrol one user to a course using postgres.

      SELECT COUNT(1) FROM mdl_user u
      WHERE id <> $1 AND u.deleted = 0 AND u.confirmed = 1
      AND u.id NOT IN (SELECT ue.userid
      FROM mdl_user_enrolments ue
      JOIN mdl_enrol e ON (e.id = ue.enrolid AND e.id = $2))
      

      I'm wondering if we can avoid the subselect somehow, something like:

      SELECT COUNT(1) FROM mdl_user u 
      LEFT OUTER join mdl_user_enrolments ue on u.id = ue.userid and ue.enrolid = $2 
      WHERE ue IS NULL AND u.id <> $1 AND u.deleted = 0 AND u.confirmed = 1;
      

        Activity

        Hide
        Dan Poltawski added a comment - - edited

        Old query - cost *11058679686.88*!!

        integration=# explain SELECT COUNT(1) FROM mdl_user u
        WHERE id <> 1 AND u.deleted = 0 AND u.confirmed = 1
        AND u.id NOT IN (SELECT ue.userid
        FROM mdl_user_enrolments ue
        JOIN mdl_enrol e ON (e.id = ue.enrolid AND e.id = 1))
        ;
                                                         QUERY PLAN                                                 
        ------------------------------------------------------------------------------------------------------------
         Aggregate  (cost=11058679686.88..11058679686.89 rows=1 width=0)
           ->  Seq Scan on mdl_user u  (cost=0.00..11058678821.84 rows=346016 width=0)
                 Filter: ((id <> 1) AND (deleted = 0) AND (confirmed = 1) AND (NOT (SubPlan 1)))
                 SubPlan 1
                   ->  Materialize  (cost=0.00..30232.83 rows=690784 width=8)
                         ->  Nested Loop  (cost=0.00..24079.91 rows=690784 width=8)
                               ->  Index Scan using mdl_enro_id_pk on mdl_enrol e  (cost=0.00..8.27 rows=1 width=8)
                                     Index Cond: (id = 1)
                               ->  Seq Scan on mdl_user_enrolments ue  (cost=0.00..17163.80 rows=690784 width=16)
                                     Filter: (enrolid = 1)
        (10 rows)
        

        New query cost *129018.94*:

        integration=# explain ANALYZE SELECT COUNT(1) FROM mdl_user u 
        LEFT OUTER join mdl_user_enrolments ue on u.id = ue.userid and ue.enrolid = 1 
        WHERE ue IS NULL AND u.id <> 1 AND u.deleted = 0 AND u.confirmed = 1;
                                                                             QUERY PLAN                                                                     
        ----------------------------------------------------------------------------------------------------------------------------------------------------
         Aggregate  (cost=129018.94..129018.95 rows=1 width=0) (actual time=3010.499..3010.499 rows=1 loops=1)
           ->  Hash Left Join  (cost=36592.60..129010.29 rows=3460 width=0) (actual time=2163.198..3010.492 rows=2 loops=1)
                 Hash Cond: (u.id = ue.userid)
                 Filter: (ue.* IS NULL)
                 ->  Seq Scan on mdl_user u  (cost=0.00..63252.58 rows=692032 width=8) (actual time=0.008..590.134 rows=690786 loops=1)
                       Filter: ((id <> 1) AND (deleted = 0) AND (confirmed = 1))
                 ->  Hash  (cost=17163.80..17163.80 rows=690784 width=104) (actual time=974.618..974.618 rows=690784 loops=1)
                       Buckets: 1024  Batches: 128  Memory Usage: 729kB
                       ->  Seq Scan on mdl_user_enrolments ue  (cost=0.00..17163.80 rows=690784 width=104) (actual time=0.016..351.870 rows=690784 loops=1)
                             Filter: (enrolid = 1)
         Total runtime: 3010.555 ms
        
        Show
        Dan Poltawski added a comment - - edited Old query - cost * 11058679686.88 *!! integration=# explain SELECT COUNT(1) FROM mdl_user u WHERE id <> 1 AND u.deleted = 0 AND u.confirmed = 1 AND u.id NOT IN (SELECT ue.userid FROM mdl_user_enrolments ue JOIN mdl_enrol e ON (e.id = ue.enrolid AND e.id = 1)) ; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Aggregate (cost=11058679686.88..11058679686.89 rows=1 width=0) -> Seq Scan on mdl_user u (cost=0.00..11058678821.84 rows=346016 width=0) Filter: ((id <> 1) AND (deleted = 0) AND (confirmed = 1) AND (NOT (SubPlan 1))) SubPlan 1 -> Materialize (cost=0.00..30232.83 rows=690784 width=8) -> Nested Loop (cost=0.00..24079.91 rows=690784 width=8) -> Index Scan using mdl_enro_id_pk on mdl_enrol e (cost=0.00..8.27 rows=1 width=8) Index Cond: (id = 1) -> Seq Scan on mdl_user_enrolments ue (cost=0.00..17163.80 rows=690784 width=16) Filter: (enrolid = 1) (10 rows) New query cost * 129018.94 *: integration=# explain ANALYZE SELECT COUNT(1) FROM mdl_user u LEFT OUTER join mdl_user_enrolments ue on u.id = ue.userid and ue.enrolid = 1 WHERE ue IS NULL AND u.id <> 1 AND u.deleted = 0 AND u.confirmed = 1; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=129018.94..129018.95 rows=1 width=0) (actual time=3010.499..3010.499 rows=1 loops=1) -> Hash Left Join (cost=36592.60..129010.29 rows=3460 width=0) (actual time=2163.198..3010.492 rows=2 loops=1) Hash Cond: (u.id = ue.userid) Filter: (ue.* IS NULL) -> Seq Scan on mdl_user u (cost=0.00..63252.58 rows=692032 width=8) (actual time=0.008..590.134 rows=690786 loops=1) Filter: ((id <> 1) AND (deleted = 0) AND (confirmed = 1)) -> Hash (cost=17163.80..17163.80 rows=690784 width=104) (actual time=974.618..974.618 rows=690784 loops=1) Buckets: 1024 Batches: 128 Memory Usage: 729kB -> Seq Scan on mdl_user_enrolments ue (cost=0.00..17163.80 rows=690784 width=104) (actual time=0.016..351.870 rows=690784 loops=1) Filter: (enrolid = 1) Total runtime: 3010.555 ms
        Hide
        Dan Poltawski added a comment -

        The current query I gave up on timing after 20mins. As you can see the new query takes 3000ms.

        NOTE: I haven't tested the correctness of the query - just proving the concept.

        Show
        Dan Poltawski added a comment - The current query I gave up on timing after 20mins. As you can see the new query takes 3000ms. NOTE: I haven't tested the correctness of the query - just proving the concept.
        Hide
        Petr Škoda added a comment -

        I have found a few places that use "NOT IN" where "LEFT JOIN" should be faster, thanks for the report.

        Show
        Petr Škoda added a comment - I have found a few places that use "NOT IN" where "LEFT JOIN" should be faster, thanks for the report.
        Hide
        Aparup Banerjee added a comment -

        i was wondering about indexes needed here (if any).

        Show
        Aparup Banerjee added a comment - i was wondering about indexes needed here (if any).
        Hide
        Dan Poltawski added a comment -

        Well, we hav one on enrolid, userid which seems sensible. I'm a bit rusty on explain output and the query I was looking at is not one you do every page load so I don't think we necessarily should overoptimisme it.

        If it completeles in a few seconds rather than > 20mins then thats a win.

        Show
        Dan Poltawski added a comment - Well, we hav one on enrolid, userid which seems sensible. I'm a bit rusty on explain output and the query I was looking at is not one you do every page load so I don't think we necessarily should overoptimisme it. If it completeles in a few seconds rather than > 20mins then thats a win.
        Hide
        Eloy Lafuente (stronk7) added a comment -

        The integration of this issue has been delayed to next week because the integration period is over (Monday, Tuesday) and testing must happen on Wednesday.

        This change to a more rigid timeframe on each integration/testing cycle aims to produce a better and clear separation and organization of tasks for everybody.

        This is a bulk-automated message, so if you want to blame somebody/thing/where, don't do it here (use git instead) :-D :-P

        Apologizes for the inconvenient, this will be integrated next week. Thanks for your collaboration & ciao

        Show
        Eloy Lafuente (stronk7) added a comment - The integration of this issue has been delayed to next week because the integration period is over (Monday, Tuesday) and testing must happen on Wednesday. This change to a more rigid timeframe on each integration/testing cycle aims to produce a better and clear separation and organization of tasks for everybody. This is a bulk-automated message, so if you want to blame somebody/thing/where, don't do it here (use git instead) :-D :-P Apologizes for the inconvenient, this will be integrated next week. Thanks for your collaboration & ciao
        Hide
        Eloy Lafuente (stronk7) added a comment -

        Integrated, thanks! (23 & master)

        Show
        Eloy Lafuente (stronk7) added a comment - Integrated, thanks! (23 & master)
        Hide
        Andrew Davis added a comment -

        It seems to be working nicely. Passing.

        Show
        Andrew Davis added a comment - It seems to be working nicely. Passing.
        Hide
        Eloy Lafuente (stronk7) added a comment -

        YEAR!*

        CAF*, TOT!*

        • Your effort amazingly resulted. (unbelievable :-P)
        • Closing as fixed.
        • Tons of thanks.
        Show
        Eloy Lafuente (stronk7) added a comment - YEAR!* CAF*, TOT!* Your effort amazingly resulted. (unbelievable :-P) Closing as fixed. Tons of thanks.

          People

          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: