Moodle
  1. Moodle
  2. MDL-34156

Inefficient query in meta enrol plugin

    Details

    • Database:
      MySQL
    • Testing Instructions:
      Hide

      1/ set up some meta course links
      2/ execute meta sync from CLI with verbose flag, verify results
      3/ modify some users in the linked courses
      4/ execute meta sync from CLI with verbose flag, verify results

      Show
      1/ set up some meta course links 2/ execute meta sync from CLI with verbose flag, verify results 3/ modify some users in the linked courses 4/ execute meta sync from CLI with verbose flag, verify results
    • Affected Branches:
      MOODLE_23_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE
    • Pull Master Branch:
      MDL-34156-master
    • Rank:
      42490

      Description

      We have been noticing an extremely long-running SQL query in our MySQL query logs from the enrol_meta_sync function in enrol/meta/locallib.php. (Notice the rows examined is just under 2 billion! Our user_enrolments table has about 187,000-odd rows)

      # Query_time: 1269.173287 Lock_time: 0.000215 Rows_sent: 0 Rows_examined: 1899410877
      SET timestamp=1341238953;
      SELECT ue.*
      FROM mdl_user_enrolments ue
      JOIN mdl_enrol e ON (e.id = ue.enrolid AND e.enrol = 'meta' )
      LEFT JOIN (SELECT xpue.userid, xpe.courseid
      FROM mdl_user_enrolments xpue
      JOIN mdl_enrol xpe ON (xpe.id = xpue.enrolid AND xpe.enrol <> 'meta' AND xpe.enrol IN ('manual','guest','self','cohort','database','meta','category','strathds'))
      ) pue ON (pue.courseid = e.customint1 AND pue.userid = ue.userid)
      WHERE pue.userid IS NULL;

      As far as I can see the subselect is unnecessary and the query is equivalent to this one, which runs in a couple of seconds:

      SELECT ue.*
      FROM mdl_user_enrolments ue
      JOIN mdl_enrol e ON (e.id = ue.enrolid)
      LEFT JOIN (mdl_user_enrolments xpue
      JOIN mdl_enrol xpe ON (xpe.id = xpue.enrolid AND xpe.enrol <> 'meta'
      AND xpe.enrol IN ('manual','guest','self','cohort','database','meta','category','strathds'))
      ) ON (xpe.courseid = e.customint1 AND xpue.userid = ue.userid)
      WHERE e.enrol = 'meta'
      AND xpue.userid IS NULL;

      This is on MySQL 5.5.22

        Issue Links

          Activity

          Hide
          Michael Aherne added a comment -

          I've attached a pull request for this change, if it's useful.

          Show
          Michael Aherne added a comment - I've attached a pull request for this change, if it's useful.
          Hide
          Petr Škoda added a comment -

          Hi, thanks for the patch, I agree that it should be possible to remove the subselect, but I do not like that you moved the conditions to WHERE because I intentionally put it there to make it more readable for me (I know Eloy and others may disagree, but I have to maintain it). I thought that the placement of conditions should not affect perf, right?

          Show
          Petr Škoda added a comment - Hi, thanks for the patch, I agree that it should be possible to remove the subselect, but I do not like that you moved the conditions to WHERE because I intentionally put it there to make it more readable for me (I know Eloy and others may disagree, but I have to maintain it). I thought that the placement of conditions should not affect perf, right?
          Hide
          Michael Aherne added a comment -

          Hi Petr. I just changed the condition to the WHERE clause because I vaguely remember the MySQL documentation recommending this, but I don't know if it was for performance reasons or not, and I can't find the page now! If I find it I'll post it.

          Show
          Michael Aherne added a comment - Hi Petr. I just changed the condition to the WHERE clause because I vaguely remember the MySQL documentation recommending this, but I don't know if it was for performance reasons or not, and I can't find the page now! If I find it I'll post it.
          Hide
          Michael Aherne added a comment -

          I should also say that I don't have a Postgres environment to test this on, so I've no idea if it'll cause problems on that. I've only tested it on MySQL.

          Show
          Michael Aherne added a comment - I should also say that I don't have a Postgres environment to test this on, so I've no idea if it'll cause problems on that. I've only tested it on MySQL.
          Hide
          Petr Škoda added a comment -

          My reasoning is that if you have more than two joins you need to mentally separate them into those joins every time you read it which is bad for my brain, so if there is not any major performance reason we should keep it as is imo. Thanks!!

          Show
          Petr Škoda added a comment - My reasoning is that if you have more than two joins you need to mentally separate them into those joins every time you read it which is bad for my brain, so if there is not any major performance reason we should keep it as is imo. Thanks!!
          Hide
          Petr Škoda added a comment -

          Oh and it also gets inconsistent for left joins or when nesting them using () which is exactly this case...

          Show
          Petr Škoda added a comment - Oh and it also gets inconsistent for left joins or when nesting them using () which is exactly this case...
          Hide
          Michael Aherne added a comment -

          I must admit I don't quite understand the implications of having these kinds of conditions in the JOIN as opposed to the WHERE clause! Anyway, I've run EXPLAIN on the query with the "e.enrol = 'meta'" in both places and it gives exactly the same plan, so it certainly doesn't appear to make any difference in this case.

          Show
          Michael Aherne added a comment - I must admit I don't quite understand the implications of having these kinds of conditions in the JOIN as opposed to the WHERE clause! Anyway, I've run EXPLAIN on the query with the "e.enrol = 'meta'" in both places and it gives exactly the same plan, so it certainly doesn't appear to make any difference in this case.
          Hide
          Petr Škoda added a comment - - edited

          the problem is in my head, imagine following:

          select * from a join b on a.y=b.z join c on b.r=an join d on d.e=bf where b.o='xx' and c.o=a.t and b.p=1 and c.i=2

          why should I keep decoding this over and over again in my head? That is why I created my own SQL style over the time that helps me to understand it immediately (and applied it to most places during the big DML conversion in 2.0) and one of the rules is to write relevant conditions to joins, so please do not unnecessarily refactor my SQL code when submitting patches for enrol plugins.

          Show
          Petr Škoda added a comment - - edited the problem is in my head, imagine following: select * from a join b on a.y=b.z join c on b.r=an join d on d.e=bf where b.o='xx' and c.o=a.t and b.p=1 and c.i=2 why should I keep decoding this over and over again in my head? That is why I created my own SQL style over the time that helps me to understand it immediately (and applied it to most places during the big DML conversion in 2.0) and one of the rules is to write relevant conditions to joins, so please do not unnecessarily refactor my SQL code when submitting patches for enrol plugins.
          Hide
          Michael Aherne added a comment -

          That certainly makes sense to me! I'll rewrite the patch to put the condition back in the JOIN.

          Show
          Michael Aherne added a comment - That certainly makes sense to me! I'll rewrite the patch to put the condition back in the JOIN.
          Hide
          Petr Škoda added a comment -

          yay! very big thanks!

          Show
          Petr Škoda added a comment - yay! very big thanks!
          Hide
          Michael Aherne added a comment -

          Patch updated. Thanks for the feedback!

          Show
          Michael Aherne added a comment - Patch updated. Thanks for the feedback!
          Hide
          Petr Škoda added a comment -

          submitting for integration, I think this is safe for stable provided we test it on all supported databases, thanks

          Show
          Petr Škoda added a comment - submitting for integration, I think this is safe for stable provided we test it on all supported databases, thanks
          Hide
          Michael Aherne added a comment - - edited

          I found the bit in the documentation I was thinking about. It's in http://dev.mysql.com/doc/refman/5.5/en/join.html:

          Generally, you should use the ON clause for conditions that specify how to join tables, and the WHERE clause to restrict which rows you want in the result set.

          It doesn't go into any more detail than that, so it's probably not relevant.

          Show
          Michael Aherne added a comment - - edited I found the bit in the documentation I was thinking about. It's in http://dev.mysql.com/doc/refman/5.5/en/join.html: Generally, you should use the ON clause for conditions that specify how to join tables, and the WHERE clause to restrict which rows you want in the result set. It doesn't go into any more detail than that, so it's probably not relevant.
          Hide
          Dan Poltawski added a comment -

          Hi,

          My brain works the same as Petr - I think the conditions near the joins helps try and break it down.

          I have integrated this now, although I have to say, the table aliases have me bamboozled!

          xpue = ? (X=join? p=? u=user e=enrollments)
          ??

          Show
          Dan Poltawski added a comment - Hi, My brain works the same as Petr - I think the conditions near the joins helps try and break it down. I have integrated this now, although I have to say, the table aliases have me bamboozled! xpue = ? (X=join? p=? u=user e=enrollments) ??
          Hide
          Petr Škoda added a comment - - edited

          hehe, I use the 'x' for "extra" aliases necessary for subquery results.

          Show
          Petr Škoda added a comment - - edited hehe, I use the 'x' for "extra" aliases necessary for subquery results.
          Hide
          David Monllaó added a comment -

          Test passed, results verified both with mysql 5.5 and postgres 9.1

          Show
          David Monllaó added a comment - Test passed, results verified both with mysql 5.5 and postgres 9.1
          Hide
          Dan Poltawski added a comment -

          Congratulations!

          You've made it into the weekly release!

          Thanks for your contribution - here are some random drummers to keep you inspired for the next week!
          http://www.youtube.com/watch?v=_QhpHUmVCmY

          Show
          Dan Poltawski added a comment - Congratulations! You've made it into the weekly release! Thanks for your contribution - here are some random drummers to keep you inspired for the next week! http://www.youtube.com/watch?v=_QhpHUmVCmY

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: