Moodle
  1. Moodle
  2. MDL-34156

Inefficient query in meta enrol plugin

    Details

    • Database:
      MySQL
    • Testing Instructions:
      Hide

      1/ set up some meta course links
      2/ execute meta sync from CLI with verbose flag, verify results
      3/ modify some users in the linked courses
      4/ execute meta sync from CLI with verbose flag, verify results

      Show
      1/ set up some meta course links 2/ execute meta sync from CLI with verbose flag, verify results 3/ modify some users in the linked courses 4/ execute meta sync from CLI with verbose flag, verify results
    • Affected Branches:
      MOODLE_23_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE
    • Pull Master Branch:
      MDL-34156-master

      Description

      We have been noticing an extremely long-running SQL query in our MySQL query logs from the enrol_meta_sync function in enrol/meta/locallib.php. (Notice the rows examined is just under 2 billion! Our user_enrolments table has about 187,000-odd rows)

      # Query_time: 1269.173287 Lock_time: 0.000215 Rows_sent: 0 Rows_examined: 1899410877
      SET timestamp=1341238953;
      SELECT ue.*
      FROM mdl_user_enrolments ue
      JOIN mdl_enrol e ON (e.id = ue.enrolid AND e.enrol = 'meta' )
      LEFT JOIN (SELECT xpue.userid, xpe.courseid
      FROM mdl_user_enrolments xpue
      JOIN mdl_enrol xpe ON (xpe.id = xpue.enrolid AND xpe.enrol <> 'meta' AND xpe.enrol IN ('manual','guest','self','cohort','database','meta','category','strathds'))
      ) pue ON (pue.courseid = e.customint1 AND pue.userid = ue.userid)
      WHERE pue.userid IS NULL;

      As far as I can see the subselect is unnecessary and the query is equivalent to this one, which runs in a couple of seconds:

      SELECT ue.*
      FROM mdl_user_enrolments ue
      JOIN mdl_enrol e ON (e.id = ue.enrolid)
      LEFT JOIN (mdl_user_enrolments xpue
      JOIN mdl_enrol xpe ON (xpe.id = xpue.enrolid AND xpe.enrol <> 'meta'
      AND xpe.enrol IN ('manual','guest','self','cohort','database','meta','category','strathds'))
      ) ON (xpe.courseid = e.customint1 AND xpue.userid = ue.userid)
      WHERE e.enrol = 'meta'
      AND xpue.userid IS NULL;

      This is on MySQL 5.5.22

        Gliffy Diagrams

          Issue Links

            Activity

            Hide
            Michael Aherne added a comment -

            I've attached a pull request for this change, if it's useful.

            Show
            Michael Aherne added a comment - I've attached a pull request for this change, if it's useful.
            Hide
            Petr Skoda added a comment -

            Hi, thanks for the patch, I agree that it should be possible to remove the subselect, but I do not like that you moved the conditions to WHERE because I intentionally put it there to make it more readable for me (I know Eloy and others may disagree, but I have to maintain it). I thought that the placement of conditions should not affect perf, right?

            Show
            Petr Skoda added a comment - Hi, thanks for the patch, I agree that it should be possible to remove the subselect, but I do not like that you moved the conditions to WHERE because I intentionally put it there to make it more readable for me (I know Eloy and others may disagree, but I have to maintain it). I thought that the placement of conditions should not affect perf, right?
            Hide
            Michael Aherne added a comment -

            Hi Petr. I just changed the condition to the WHERE clause because I vaguely remember the MySQL documentation recommending this, but I don't know if it was for performance reasons or not, and I can't find the page now! If I find it I'll post it.

            Show
            Michael Aherne added a comment - Hi Petr. I just changed the condition to the WHERE clause because I vaguely remember the MySQL documentation recommending this, but I don't know if it was for performance reasons or not, and I can't find the page now! If I find it I'll post it.
            Hide
            Michael Aherne added a comment -

            I should also say that I don't have a Postgres environment to test this on, so I've no idea if it'll cause problems on that. I've only tested it on MySQL.

            Show
            Michael Aherne added a comment - I should also say that I don't have a Postgres environment to test this on, so I've no idea if it'll cause problems on that. I've only tested it on MySQL.
            Hide
            Petr Skoda added a comment -

            My reasoning is that if you have more than two joins you need to mentally separate them into those joins every time you read it which is bad for my brain, so if there is not any major performance reason we should keep it as is imo. Thanks!!

            Show
            Petr Skoda added a comment - My reasoning is that if you have more than two joins you need to mentally separate them into those joins every time you read it which is bad for my brain, so if there is not any major performance reason we should keep it as is imo. Thanks!!
            Hide
            Petr Skoda added a comment -

            Oh and it also gets inconsistent for left joins or when nesting them using () which is exactly this case...

            Show
            Petr Skoda added a comment - Oh and it also gets inconsistent for left joins or when nesting them using () which is exactly this case...
            Hide
            Michael Aherne added a comment -

            I must admit I don't quite understand the implications of having these kinds of conditions in the JOIN as opposed to the WHERE clause! Anyway, I've run EXPLAIN on the query with the "e.enrol = 'meta'" in both places and it gives exactly the same plan, so it certainly doesn't appear to make any difference in this case.

            Show
            Michael Aherne added a comment - I must admit I don't quite understand the implications of having these kinds of conditions in the JOIN as opposed to the WHERE clause! Anyway, I've run EXPLAIN on the query with the "e.enrol = 'meta'" in both places and it gives exactly the same plan, so it certainly doesn't appear to make any difference in this case.
            Hide
            Petr Skoda added a comment - - edited

            the problem is in my head, imagine following:

            select * from a join b on a.y=b.z join c on b.r=an join d on d.e=bf where b.o='xx' and c.o=a.t and b.p=1 and c.i=2

            why should I keep decoding this over and over again in my head? That is why I created my own SQL style over the time that helps me to understand it immediately (and applied it to most places during the big DML conversion in 2.0) and one of the rules is to write relevant conditions to joins, so please do not unnecessarily refactor my SQL code when submitting patches for enrol plugins.

            Show
            Petr Skoda added a comment - - edited the problem is in my head, imagine following: select * from a join b on a.y=b.z join c on b.r=an join d on d.e=bf where b.o='xx' and c.o=a.t and b.p=1 and c.i=2 why should I keep decoding this over and over again in my head? That is why I created my own SQL style over the time that helps me to understand it immediately (and applied it to most places during the big DML conversion in 2.0) and one of the rules is to write relevant conditions to joins, so please do not unnecessarily refactor my SQL code when submitting patches for enrol plugins.
            Hide
            Michael Aherne added a comment -

            That certainly makes sense to me! I'll rewrite the patch to put the condition back in the JOIN.

            Show
            Michael Aherne added a comment - That certainly makes sense to me! I'll rewrite the patch to put the condition back in the JOIN.
            Hide
            Petr Skoda added a comment -

            yay! very big thanks!

            Show
            Petr Skoda added a comment - yay! very big thanks!
            Hide
            Michael Aherne added a comment -

            Patch updated. Thanks for the feedback!

            Show
            Michael Aherne added a comment - Patch updated. Thanks for the feedback!
            Hide
            Petr Skoda added a comment -

            submitting for integration, I think this is safe for stable provided we test it on all supported databases, thanks

            Show
            Petr Skoda added a comment - submitting for integration, I think this is safe for stable provided we test it on all supported databases, thanks
            Hide
            Michael Aherne added a comment - - edited

            I found the bit in the documentation I was thinking about. It's in http://dev.mysql.com/doc/refman/5.5/en/join.html:

            Generally, you should use the ON clause for conditions that specify how to join tables, and the WHERE clause to restrict which rows you want in the result set.

            It doesn't go into any more detail than that, so it's probably not relevant.

            Show
            Michael Aherne added a comment - - edited I found the bit in the documentation I was thinking about. It's in http://dev.mysql.com/doc/refman/5.5/en/join.html: Generally, you should use the ON clause for conditions that specify how to join tables, and the WHERE clause to restrict which rows you want in the result set. It doesn't go into any more detail than that, so it's probably not relevant.
            Hide
            Dan Poltawski added a comment -

            Hi,

            My brain works the same as Petr - I think the conditions near the joins helps try and break it down.

            I have integrated this now, although I have to say, the table aliases have me bamboozled!

            xpue = ? (X=join? p=? u=user e=enrollments)
            ??

            Show
            Dan Poltawski added a comment - Hi, My brain works the same as Petr - I think the conditions near the joins helps try and break it down. I have integrated this now, although I have to say, the table aliases have me bamboozled! xpue = ? (X=join? p=? u=user e=enrollments) ??
            Hide
            Petr Skoda added a comment - - edited

            hehe, I use the 'x' for "extra" aliases necessary for subquery results.

            Show
            Petr Skoda added a comment - - edited hehe, I use the 'x' for "extra" aliases necessary for subquery results.
            Hide
            David Monllaó added a comment -

            Test passed, results verified both with mysql 5.5 and postgres 9.1

            Show
            David Monllaó added a comment - Test passed, results verified both with mysql 5.5 and postgres 9.1
            Hide
            Dan Poltawski added a comment -

            Congratulations!

            You've made it into the weekly release!

            Thanks for your contribution - here are some random drummers to keep you inspired for the next week!
            http://www.youtube.com/watch?v=_QhpHUmVCmY

            Show
            Dan Poltawski added a comment - Congratulations! You've made it into the weekly release! Thanks for your contribution - here are some random drummers to keep you inspired for the next week! http://www.youtube.com/watch?v=_QhpHUmVCmY

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: