Moodle
  1. Moodle
  2. MDL-20946

Grade book averages are wrong if a student has multiple roles or has been deleted

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 1.9.6, 1.9.11
    • Fix Version/s: 1.9.12, 2.0.3, 2.1
    • Component/s: Gradebook
    • Labels:
    • Database:
      Any
    • Affected Branches:
      MOODLE_19_STABLE
    • Fixed Branches:
      MOODLE_19_STABLE, MOODLE_20_STABLE, MOODLE_21_STABLE
    • Rank:
      2474

      Description

      Moodle version: 1.9.6 (Build: 20091021)
      Grades selected for column averages set to 'Non-empty grades'
      Graded roles as enrolled student and manually assigned student.

      If a student has multiple roles, the averages are counted multiple times.

      John & Jeff got 50% for the quiz and James didn't get anything.
      John is an enrolled student and manually assigned student.
      Jeff & James is are enrolled students.

      Average was 50 * 2 + 50 + 0 = 150 / 1 = 150.
      Average should be 50 + 50 + 0 = 100 / 2 = 50.

      Reason for difference is the scores were added multiple times to the total and non-emply grades were less because the count of students with graders was greater because of the multiple roles. ie: 14 students - 9 grades = 5 non empty grades when it should be 6.

      Fix attached.

        Issue Links

          Activity

          Hide
          Tim Hunt added a comment -

          First note that MDL-15277 may well be a duplicate of this bug.

          Next note that MDL-25879 is somewhat related.

          I got here be investigating a related but different problem. Some queries in the gradebook correctly check AND u.deleted = 0, and others, including the two involved here, don't. Because of this, we were seeing averages that looked like -6.7 (-3) (that is, an average of minus three users) because it was subtracting a number of ungraded users that included deleted users from a total number of users that excluded deleted users.

          I have now turned Tim Lock's patch, and a patch by me to take into account the deleted users issue, into a branch here: https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19

          However, grepping the gradebook code for "->gradebookroles" I find a number of other database queries which seem to suffer from similar problems, so I think we need to do a more careful review and try to fix all of them before we push this into 1.9. We also need to prepare a patch/branch for 2.0.

          Show
          Tim Hunt added a comment - First note that MDL-15277 may well be a duplicate of this bug. Next note that MDL-25879 is somewhat related. I got here be investigating a related but different problem. Some queries in the gradebook correctly check AND u.deleted = 0, and others, including the two involved here, don't. Because of this, we were seeing averages that looked like -6.7 (-3) (that is, an average of minus three users) because it was subtracting a number of ungraded users that included deleted users from a total number of users that excluded deleted users. I have now turned Tim Lock's patch, and a patch by me to take into account the deleted users issue, into a branch here: https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 However, grepping the gradebook code for "->gradebookroles" I find a number of other database queries which seem to suffer from similar problems, so I think we need to do a more careful review and try to fix all of them before we push this into 1.9. We also need to prepare a patch/branch for 2.0.
          Hide
          Tim Hunt added a comment -

          By the way, I am getting hassled about this in relation to the OU VLE, so I would like to get this fixed ASAP. I know you have a public holiday in Australia on Monday, but hopefully we can work on this in the remainder of next week.

          Show
          Tim Hunt added a comment - By the way, I am getting hassled about this in relation to the OU VLE, so I would like to get this fixed ASAP. I know you have a public holiday in Australia on Monday, but hopefully we can work on this in the remainder of next week.
          Hide
          Tim Hunt added a comment -

          OK, so reviewing all the mentions of ->gradebookroles in the code, as a plausible way to find all the DB queries that might be affected.

          1. grade/import/lib.php line 175: Seems to be OK. Hmm... unless we should change
          "AND ra.id IS NULL" to "AND (ra.id IS NULL OR u.deleted = 1)"?

          2. grade/lib.php class graded_users_iterator::init(): Two queries here, both of which will duplicate users if they have multiple role assignments. Yes. For example, this is used by all the export code. If you have a user with two roles, then they are included twice in the export file.

          (By the way, it is crazy the way this code is doing the $ofields stuff. Surely that is not really necessary!)

          3. grade/report/lib.php grade_report::grade_report constructor: Just copies $this->gradebookroles = $CFG->gradebookroles; No problem.

          (Note that there is a nasty use of Error here, that gives the wrong place to go for the setting admin->appearance->graderoles)

          4. grade/report/lib.php grade_report::get_numusers method: No problem with the multiple roles thing, because it is using COUNT(DISTINCT ...)) in includes u.delete = 0.

          5. grade/report/grader/lib.php grade_report_grader::load_users method: Oh look at the comment at the top "// the MAX() magic is required in order to please PG". No, Postgres was pointing out this bug, and you chose to ignore it!

          Anyway the queries here are wrong, but this is probably masked by the fact that get_records_sql eliminates duplicate rows when it converts the recordset to an array. Well, one of the queries does a SELECT DISTINCT, but those tend to give bad performance. Mind you, I am not sure that SELECT ... WHERE EXISTS is particularly efficient either.

          (I also wonder why this method cannot use the graded_users_iterator class? Anyway, I will just fix the existing code.)

          There is horrible duplicated code here. Yes, I will refactor.

          6. grade/report/grader/lib.php grade_report_grader::get_avghtml method: This is where we started, and my previous commits should have fixed this. I just reviewed the code again and tweaked the coding style a bit.

          7. mentions in mod/assignment/lib.php are not relevant to us (and appear to be fine on a quick look.)

          Note that there are mentions of gradebookroles in other places in Moodle 2.0. For example, in 2.0 there are separate queries in the user report that are not present in 1.9. For now, let's get 1.9 right, before porting the changes to 2.0.

          Note, I just did a quick test, which I think covered all the code I changed, and I think it worked for me.

          OK, all these changes pushed to https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19. I am looking for review and testing please.

          Show
          Tim Hunt added a comment - OK, so reviewing all the mentions of ->gradebookroles in the code, as a plausible way to find all the DB queries that might be affected. 1. grade/import/lib.php line 175: Seems to be OK. Hmm... unless we should change "AND ra.id IS NULL" to "AND (ra.id IS NULL OR u.deleted = 1)"? 2. grade/lib.php class graded_users_iterator::init(): Two queries here, both of which will duplicate users if they have multiple role assignments. Yes. For example, this is used by all the export code. If you have a user with two roles, then they are included twice in the export file. (By the way, it is crazy the way this code is doing the $ofields stuff. Surely that is not really necessary!) 3. grade/report/lib.php grade_report::grade_report constructor: Just copies $this->gradebookroles = $CFG->gradebookroles; No problem. (Note that there is a nasty use of Error here, that gives the wrong place to go for the setting admin->appearance->graderoles) 4. grade/report/lib.php grade_report::get_numusers method: No problem with the multiple roles thing, because it is using COUNT(DISTINCT ...)) in includes u.delete = 0. 5. grade/report/grader/lib.php grade_report_grader::load_users method: Oh look at the comment at the top "// the MAX() magic is required in order to please PG". No, Postgres was pointing out this bug, and you chose to ignore it! Anyway the queries here are wrong, but this is probably masked by the fact that get_records_sql eliminates duplicate rows when it converts the recordset to an array. Well, one of the queries does a SELECT DISTINCT, but those tend to give bad performance. Mind you, I am not sure that SELECT ... WHERE EXISTS is particularly efficient either. (I also wonder why this method cannot use the graded_users_iterator class? Anyway, I will just fix the existing code.) There is horrible duplicated code here. Yes, I will refactor. 6. grade/report/grader/lib.php grade_report_grader::get_avghtml method: This is where we started, and my previous commits should have fixed this. I just reviewed the code again and tweaked the coding style a bit. 7. mentions in mod/assignment/lib.php are not relevant to us (and appear to be fine on a quick look.) Note that there are mentions of gradebookroles in other places in Moodle 2.0. For example, in 2.0 there are separate queries in the user report that are not present in 1.9. For now, let's get 1.9 right, before porting the changes to 2.0. Note, I just did a quick test, which I think covered all the code I changed, and I think it worked for me. OK, all these changes pushed to https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 . I am looking for review and testing please.
          Hide
          Tim Hunt added a comment -

          Just adding my colleague Colin as a watcher.

          Show
          Tim Hunt added a comment - Just adding my colleague Colin as a watcher.
          Hide
          Tim Hunt added a comment -

          Sadly, it occurred to me to search all the gradebook code for 'SELECT' and there are some more queries with problems.

          8. In both grade/report/overview/lib.php and grade/report/user/lib.php inside the if ($this->showrank) { bit, it should ignore grades belonging to un-enrolled or deleted users when computing the rank, but it includes them. The OU does not use ranks, so I am not inclined to fix this, Perhpas we could file a separate MDL bug about that.

          9. Drat! I missed the second query in graded_users_iterator::init(). I'll do another commit. Done.

          But otherwise, I think we have now found all the problems.

          Show
          Tim Hunt added a comment - Sadly, it occurred to me to search all the gradebook code for 'SELECT' and there are some more queries with problems. 8. In both grade/report/overview/lib.php and grade/report/user/lib.php inside the if ($this->showrank) { bit, it should ignore grades belonging to un-enrolled or deleted users when computing the rank, but it includes them. The OU does not use ranks, so I am not inclined to fix this, Perhpas we could file a separate MDL bug about that. 9. Drat! I missed the second query in graded_users_iterator::init(). I'll do another commit. Done. But otherwise, I think we have now found all the problems.
          Hide
          Tim Hunt added a comment -

          One of the grader report queries was broken in groups mode. Thanks to Colin for finding that.

          Acutally, I can feel a bit smug, because it was Tim Lock's original patch that broke that That is the only problem found so far, and I have just done another commit to fix it.

          So, the state of play is that https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 is, to the best of my knowledge, the correct fix for 1.9. However it needs more testing.

          Show
          Tim Hunt added a comment - One of the grader report queries was broken in groups mode. Thanks to Colin for finding that. Acutally, I can feel a bit smug, because it was Tim Lock's original patch that broke that That is the only problem found so far, and I have just done another commit to fix it. So, the state of play is that https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 is, to the best of my knowledge, the correct fix for 1.9. However it needs more testing.
          Hide
          Andrew Davis added a comment -

          Prior to working through the 1.9 patch Ive had a go at reproducing this in 2.0 and have done so. I made student and non-editing teacher gradeable roles then made a user that was already a student also a non-editing teacher. Averages immediately become whacky.

          Ill set this same situation up in 1.9, apply the patch and see how we go

          Show
          Andrew Davis added a comment - Prior to working through the 1.9 patch Ive had a go at reproducing this in 2.0 and have done so. I made student and non-editing teacher gradeable roles then made a user that was already a student also a non-editing teacher. Averages immediately become whacky. Ill set this same situation up in 1.9, apply the patch and see how we go
          Hide
          Andrew Davis added a comment -

          ok. Looking at https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 and I have that branch on my machine.

          Youve added queries that look like this:

          SELECT u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber
          FROM mdl_user u
          WHERE EXISTS (
          SELECT 1
          FROM mdl_role_assignments ra
          WHERE u.id = ra.userid
          AND ra.roleid IN (4,5)
          AND ra.contextid  IN (14,3,1)
          )
          AND u.deleted = 0
          

          The following also seems to resolve the issue of duplicate rows. Using a distinct instead of the exists.

          SELECT distinct u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber
          FROM mdl_user u
          LEFT JOIN mdl_role_assignments ra ON u.id = ra.userid
          WHERE ra.roleid IN (4,5)
          AND ra.contextid  IN (14,3,1)
          AND u.deleted = 0
          

          Is there are reason Im not seeing for not doing it using an exists clause? The way youve done it certainly seems to work just fine but seems a little more complex so Im curious if Im not seeing something or if its just personal preference.

          re the change in grade/import/lib.php, be wary of white space changes. They can make the integrators unhappy.

          Otherwise I think it looks good

          Show
          Andrew Davis added a comment - ok. Looking at https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 and I have that branch on my machine. Youve added queries that look like this: SELECT u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber FROM mdl_user u WHERE EXISTS ( SELECT 1 FROM mdl_role_assignments ra WHERE u.id = ra.userid AND ra.roleid IN (4,5) AND ra.contextid IN (14,3,1) ) AND u.deleted = 0 The following also seems to resolve the issue of duplicate rows. Using a distinct instead of the exists. SELECT distinct u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber FROM mdl_user u LEFT JOIN mdl_role_assignments ra ON u.id = ra.userid WHERE ra.roleid IN (4,5) AND ra.contextid IN (14,3,1) AND u.deleted = 0 Is there are reason Im not seeing for not doing it using an exists clause? The way youve done it certainly seems to work just fine but seems a little more complex so Im curious if Im not seeing something or if its just personal preference. re the change in grade/import/lib.php, be wary of white space changes. They can make the integrators unhappy. Otherwise I think it looks good
          Hide
          Tim Hunt added a comment -

          Thanks for looking at this.

          In the past there have been nasty performance problems with SELECT DISTINCT queries, particularly where there are a lot of columns. This about what the DB has to do to implement it. Of course, we really ought to measure to confirm that the WHERE EXISTS (SELECT) version works better. I will try to do some basic timings of the queries on our live DB to see what I get.

          I will get my colleagues to test the functionality on our acceptance test server. We want this bug fix ASAP.

          I will re-base the branch onto the latest MOODLE_19_STABLE, including taking out the white-space only changes to grade/import/lib.php.

          Will you then be able to do the 2.0 changes?

          Show
          Tim Hunt added a comment - Thanks for looking at this. In the past there have been nasty performance problems with SELECT DISTINCT queries, particularly where there are a lot of columns. This about what the DB has to do to implement it. Of course, we really ought to measure to confirm that the WHERE EXISTS (SELECT) version works better. I will try to do some basic timings of the queries on our live DB to see what I get. I will get my colleagues to test the functionality on our acceptance test server. We want this bug fix ASAP. I will re-base the branch onto the latest MOODLE_19_STABLE, including taking out the white-space only changes to grade/import/lib.php. Will you then be able to do the 2.0 changes?
          Hide
          Tim Hunt added a comment -

          However, in this case I was wrong:

          My query: 11,000 ms

          Your query: 50 ms

          Best query: 25 ms

          SELECT u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber
          FROM mdl_user u
          WHERE u.deleted = 0 AND u.id IN (
          SELECT DISTINCT ra.userid FROM mdl_role_assignments ra
          WHERE ra.roleid IN (5, 162)
          AND ra.contextid  IN (1161021, 830650, 32, 1))
          

          That idiom seems to work better on other queries too, so I adoped it everywhere.

          See the last commit on https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 - note I have rebased this branch.

          I also did a version with cleaned up history: https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19_simplehistory. I don't think there is any benefit in the full history, so I am proposing to use this version as the basis for a PULL request, once we have done a bit more testing here at the OU.

          Show
          Tim Hunt added a comment - However, in this case I was wrong: My query: 11,000 ms Your query: 50 ms Best query: 25 ms SELECT u.id, u.firstname, u.lastname, u.imagealt, u.picture, u.idnumber FROM mdl_user u WHERE u.deleted = 0 AND u.id IN ( SELECT DISTINCT ra.userid FROM mdl_role_assignments ra WHERE ra.roleid IN (5, 162) AND ra.contextid IN (1161021, 830650, 32, 1)) That idiom seems to work better on other queries too, so I adoped it everywhere. See the last commit on https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19 - note I have rebased this branch. I also did a version with cleaned up history: https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19_simplehistory . I don't think there is any benefit in the full history, so I am proposing to use this version as the basis for a PULL request, once we have done a bit more testing here at the OU.
          Hide
          Tim Hunt added a comment -

          Grrr! I managed to screw up a few things. https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19_simplehistory now rewritten and fixed. I now think this is working on my development server. Time to put it on the OU's acceptance test server.

          Show
          Tim Hunt added a comment - Grrr! I managed to screw up a few things. https://github.com/timhunt/moodle/compare/MOODLE_19_STABLE...MDL-20946_19_simplehistory now rewritten and fixed. I now think this is working on my development server. Time to put it on the OU's acceptance test server.
          Hide
          Andrew Davis added a comment -

          See how you go with your testing but it looks fine to me. I'm doing the 2.0 version now.

          Show
          Andrew Davis added a comment - See how you go with your testing but it looks fine to me. I'm doing the 2.0 version now.
          Hide
          Andrew Davis added a comment -

          I think the 2.0 version is good to go.

          repo: git://github.com/andyjdavis/moodle.git
          branch: MDL-20946_roles_averages
          diff: https://github.com/andyjdavis/moodle/compare/master...MDL-20946_roles_averages

          Show
          Andrew Davis added a comment - I think the 2.0 version is good to go. repo: git://github.com/andyjdavis/moodle.git branch: MDL-20946 _roles_averages diff: https://github.com/andyjdavis/moodle/compare/master...MDL-20946_roles_averages
          Hide
          Andrew Davis added a comment -

          Will of course reopen if any problems are detected during the ongoing OU testing.

          2.0 PULL-486
          1.9 PULL-487

          Show
          Andrew Davis added a comment - Will of course reopen if any problems are detected during the ongoing OU testing. 2.0 PULL-486 1.9 PULL-487
          Hide
          Tim Hunt added a comment -

          Some review comments on the 2.0 patch:

          • You've got a FROM {$CFG->prefix}role_assignments ra (second query)
          • Don't we need the $enrolledsql bit in the queries in grade/lib.php ?
          Show
          Tim Hunt added a comment - Some review comments on the 2.0 patch: You've got a FROM {$CFG->prefix}role_assignments ra (second query) Don't we need the $enrolledsql bit in the queries in grade/lib.php ?
          Hide
          Andrew Davis added a comment -

          well spotted. fixed those.

          Show
          Andrew Davis added a comment - well spotted. fixed those.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Reopening this as far as a bit more of work is needed with the queries just to check we are in the safe side.

          Thanks for the effort! Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Reopening this as far as a bit more of work is needed with the queries just to check we are in the safe side. Thanks for the effort! Ciao
          Hide
          Tim Hunt added a comment -

          Grrr! Why fail PULL-487 out-of-hand because it is the sister of PULL-486, which affects 2.0 where the code if different because of the enrol stuff?

          And, if you actually bother to read what was written above, then you will see that I did test all the 1.9 queries on the OU's database. (700,000+ users, 100,000+ role assignments, 100,000+ contexts. 3,000+ courses - I think it is actually much more than that but I don't have the exact number right now.)

          But if you want to do more testing with a bigger database, be my guest

          Show
          Tim Hunt added a comment - Grrr! Why fail PULL-487 out-of-hand because it is the sister of PULL-486, which affects 2.0 where the code if different because of the enrol stuff? And, if you actually bother to read what was written above, then you will see that I did test all the 1.9 queries on the OU's database. (700,000+ users, 100,000+ role assignments, 100,000+ contexts. 3,000+ courses - I think it is actually much more than that but I don't have the exact number right now.) But if you want to do more testing with a bigger database, be my guest
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Tim,

          I read this 100% (in fact I did that more than once), so I knew perfectly about your results above, so please, don't make false assumptions.

          And they where failed because both are using the same (IN + subquery having DISTINCT) thing. I'm sure you tested it under PostgreSQL... but what happens with MySQL (yes, that tiny database that nobody uses).

          So I think it's fair enough to dedicate some time for testing and comparing for one more week. The rest of the patch seemed perfect, and that was stated in the PULL request, so I really cannot see any problem with the decision.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Tim, I read this 100% (in fact I did that more than once), so I knew perfectly about your results above, so please, don't make false assumptions. And they where failed because both are using the same (IN + subquery having DISTINCT) thing. I'm sure you tested it under PostgreSQL... but what happens with MySQL (yes, that tiny database that nobody uses). So I think it's fair enough to dedicate some time for testing and comparing for one more week. The rest of the patch seemed perfect, and that was stated in the PULL request, so I really cannot see any problem with the decision. Ciao
          Hide
          Andrew Davis added a comment -

          Ok, in this issue we've created queries like this...

          SELECT u.* $ofields
          FROM {$CFG->prefix}user u
          INNER JOIN {$CFG->prefix}role_assignments ra ON u.id = ra.userid
          $groupsql
          WHERE ra.roleid $gradebookroles
          AND ra.contextid $relatedcontexts
          WHERE u.id IN (
              SELECT DISTINCT ra.userid
              FROM {$CFG->prefix}role_assignments ra
              WHERE ra.roleid $gradebookroles
              AND ra.contextid $relatedcontexts
              )
          AND u.deleted = 0
          $groupwheresql
          ORDER BY $order
          

          From what I can gather from the comments here and in the 2 pull requests Eloy and Petr are suggesting we try something like this...

          SELECT u.* $ofields
          FROM {$CFG->prefix}user u
          INNER JOIN {$CFG->prefix}role_assignments ra ON u.id = ra.userid
          $groupsql
          WHERE ra.roleid $gradebookroles
          AND ra.contextid $relatedcontexts
          JOIN (
              SELECT DISTINCT ra.userid
              FROM {$CFG->prefix}role_assignments ra
              WHERE ra.roleid $gradebookroles
              AND ra.contextid $relatedcontexts
              ) uinner ON u.id = uinner.userid
          WHERE u.deleted = 0
          $groupwheresql
          ORDER BY $order
          

          Which is actually a variation we didnt try. I think... Anyhow, Tim, would you be able to test that variation against your dataset?

          Show
          Andrew Davis added a comment - Ok, in this issue we've created queries like this... SELECT u.* $ofields FROM {$CFG->prefix}user u INNER JOIN {$CFG->prefix}role_assignments ra ON u.id = ra.userid $groupsql WHERE ra.roleid $gradebookroles AND ra.contextid $relatedcontexts WHERE u.id IN ( SELECT DISTINCT ra.userid FROM {$CFG->prefix}role_assignments ra WHERE ra.roleid $gradebookroles AND ra.contextid $relatedcontexts ) AND u.deleted = 0 $groupwheresql ORDER BY $order From what I can gather from the comments here and in the 2 pull requests Eloy and Petr are suggesting we try something like this... SELECT u.* $ofields FROM {$CFG->prefix}user u INNER JOIN {$CFG->prefix}role_assignments ra ON u.id = ra.userid $groupsql WHERE ra.roleid $gradebookroles AND ra.contextid $relatedcontexts JOIN ( SELECT DISTINCT ra.userid FROM {$CFG->prefix}role_assignments ra WHERE ra.roleid $gradebookroles AND ra.contextid $relatedcontexts ) uinner ON u.id = uinner.userid WHERE u.deleted = 0 $groupwheresql ORDER BY $order Which is actually a variation we didnt try. I think... Anyhow, Tim, would you be able to test that variation against your dataset?
          Hide
          Tim Hunt added a comment -

          Andrew, I don't understand your comment. I don't think any of our queries have both a INNER JOIN {$CFG->prefix}role_assignments ... and a WHERE u.id IN (... Is that just a copy/paste error?

          I thought Eloy was just suggesting we try removing the DISTINCT from WHERE u.id IN (SELECT DISTINCT ...

          The switch from IN to JOINing a view is an idea that might be worth trying. My guess (and experience) would be that in cases where we have another way to limit the list of users (either $groupsql, or a join no grade_grades, the JOIN will perform worse, because the DB will probably evaluate the inner query for the whole role_assignments table, whereas we only need to think about a small subset of users. In the particular query you show, with no other restriction on userid, the JOIN may be slightly faster, or about the same speed, as the query I wrote.

          Anyway, the only way to know is to test on a large example database on both Postgres and MySQL (and the other DBs if you can be bothered.)

          Show
          Tim Hunt added a comment - Andrew, I don't understand your comment. I don't think any of our queries have both a INNER JOIN {$CFG->prefix}role_assignments ... and a WHERE u.id IN (... Is that just a copy/paste error? I thought Eloy was just suggesting we try removing the DISTINCT from WHERE u.id IN (SELECT DISTINCT ... The switch from IN to JOINing a view is an idea that might be worth trying. My guess (and experience) would be that in cases where we have another way to limit the list of users (either $groupsql, or a join no grade_grades, the JOIN will perform worse, because the DB will probably evaluate the inner query for the whole role_assignments table, whereas we only need to think about a small subset of users. In the particular query you show, with no other restriction on userid, the JOIN may be slightly faster, or about the same speed, as the query I wrote. Anyway, the only way to know is to test on a large example database on both Postgres and MySQL (and the other DBs if you can be bothered.)
          Hide
          Petr Škoda added a comment -

          http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html describes some mysql problems with IN (SELECT ...), we have already tested the JOIN (SELECT DISTINCT ...) approach in enrol related queries. I think we were bitten by this a few times before and after a few nights in stats hacking I am not using it any more - for uncorrelated subqueries the JOIN is better, EXISTS works fine for the rest. Please note 1.9.x supports older versions of mysql which may affect the results too.

          Show
          Petr Škoda added a comment - http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html describes some mysql problems with IN (SELECT ...), we have already tested the JOIN (SELECT DISTINCT ...) approach in enrol related queries. I think we were bitten by this a few times before and after a few nights in stats hacking I am not using it any more - for uncorrelated subqueries the JOIN is better, EXISTS works fine for the rest. Please note 1.9.x supports older versions of mysql which may affect the results too.
          Hide
          Andrew Davis added a comment -

          Tim, apparently I cant reliably copy and paste. Sorry for the confusion.

          So I believe we have 2 alternatives that need to be compared.

          2) WHERE u.id IN (SELECT ra.userid... with no distinct in the inner query

          3) JOIN (SELECT DISTINCT ra.userid...
          ) rainner ON u.id = rainner.userid

          Tim's testing above found that EXISTS performed terribly.

          Show
          Andrew Davis added a comment - Tim, apparently I cant reliably copy and paste. Sorry for the confusion. So I believe we have 2 alternatives that need to be compared. 2) WHERE u.id IN (SELECT ra.userid... with no distinct in the inner query 3) JOIN (SELECT DISTINCT ra.userid... ) rainner ON u.id = rainner.userid Tim's testing above found that EXISTS performed terribly.
          Hide
          Andrew Davis added a comment - - edited

          I'm using mysql and Moodle 2.0. I used /admin/generate.php to create a whole bunch of users, course, role assignments etc. I'm just using Moodle's own profiling to get some numbers. The numbers Tim gets will be more valid as its based on real data.

          First, I removed the distinct in the inner query. Execution time to view the grade report dropped 0.3%. Memory usage went up 0.1%. No effect essentially.

          Now going to try using a JOIN instead....

          Switching from IN to JOIN seems to make the grader report execution time 5% slower. Kind of weird but I suspect thats within the "random variation" range so it probably doesn't indicate that switching would be a bad idea.

          Show
          Andrew Davis added a comment - - edited I'm using mysql and Moodle 2.0. I used /admin/generate.php to create a whole bunch of users, course, role assignments etc. I'm just using Moodle's own profiling to get some numbers. The numbers Tim gets will be more valid as its based on real data. First, I removed the distinct in the inner query. Execution time to view the grade report dropped 0.3%. Memory usage went up 0.1%. No effect essentially. Now going to try using a JOIN instead.... Switching from IN to JOIN seems to make the grader report execution time 5% slower. Kind of weird but I suspect thats within the "random variation" range so it probably doesn't indicate that switching would be a bad idea.
          Hide
          Andrew Davis added a comment - - edited

          Ive pushed the 2.0 version using JOIN to github so I can access from home. https://github.com/andyjdavis/moodle/compare/master...MDL-20946_roles_averages

          Show
          Andrew Davis added a comment - - edited Ive pushed the 2.0 version using JOIN to github so I can access from home. https://github.com/andyjdavis/moodle/compare/master...MDL-20946_roles_averages
          Hide
          Tim Hunt added a comment -

          What you should do is, having created the huge test database, then run the queries directly from the command-line (or phpMyAdmin, and the equivalent for other DBs). That lets you time just the query with minimal noise.

          You should also use Explain to see how the database is processing the query.

          Show
          Tim Hunt added a comment - What you should do is, having created the huge test database, then run the queries directly from the command-line (or phpMyAdmin, and the equivalent for other DBs). That lets you time just the query with minimal noise. You should also use Explain to see how the database is processing the query.
          Hide
          Andrew Davis added a comment - - edited

          Running both the JOIN and the IN versions of the queries directly in mysql query browser I get the following...

          Using JOIN: average time 0.0480 seconds
          Using IN: average time 0.0469 seconds

          Thats running each one half a dozen times and averaging the results. Again, switching from IN to JOIN actually makes the query marginally slower. The difference is so small I dont think it really matters either way. If Petr says there are potential problems with using the IN version then Im happy to go with the JOIN version.

          Tim, are you happy to repeat this testing in the 1.9 version and update the branch in your repository? (depending on whether or not your testing differs wildly from mine).

          Show
          Andrew Davis added a comment - - edited Running both the JOIN and the IN versions of the queries directly in mysql query browser I get the following... Using JOIN: average time 0.0480 seconds Using IN: average time 0.0469 seconds Thats running each one half a dozen times and averaging the results. Again, switching from IN to JOIN actually makes the query marginally slower. The difference is so small I dont think it really matters either way. If Petr says there are potential problems with using the IN version then Im happy to go with the JOIN version. Tim, are you happy to repeat this testing in the 1.9 version and update the branch in your repository? (depending on whether or not your testing differs wildly from mine).
          Hide
          Tim Hunt added a comment -

          No, I am not happy to re-test. I have got lots of other things to try to get done by the end of this week.

          I have had previous experiences where a JOIN query like this was a performance disaster, and an IN query was better. There are potential problems with either way of writing the query, depending on the whims of the databases' query optimisers. The only reliable option is to test the performance, which we have now done.

          We have some code (thin IN version) that performs well on our two main database, and has been tested to make sure it is functionally correct. It think we should integrate that.

          If anyone else wants to do more work on this, then be my guest, but if we agonise this much about every change we make, then Moodle development will grind to a halt.

          Show
          Tim Hunt added a comment - No, I am not happy to re-test. I have got lots of other things to try to get done by the end of this week. I have had previous experiences where a JOIN query like this was a performance disaster, and an IN query was better. There are potential problems with either way of writing the query, depending on the whims of the databases' query optimisers. The only reliable option is to test the performance, which we have now done. We have some code (thin IN version) that performs well on our two main database, and has been tested to make sure it is functionally correct. It think we should integrate that. If anyone else wants to do more work on this, then be my guest, but if we agonise this much about every change we make, then Moodle development will grind to a halt.
          Hide
          Tim Hunt added a comment -

          Sorry, that was a bit of a rant. The key point here is that we are not worrying about one query being 5% or 10% slower than another - that will always tend to depend on the exact statistics of the data in the various tables, and the way the query optimiser works in a particular version of a particular database.

          The thing you do have to worry about is if a particular database query triggers a really pathological case in the query optimiser of some database. When that happens, the query can take 1000 times longer than necessary (e.g. minutes not miliseconds). That is what you have to worry about and avoid.

          That is clearly not happening with either of the proposed queries, so we don't have to worry.

          Show
          Tim Hunt added a comment - Sorry, that was a bit of a rant. The key point here is that we are not worrying about one query being 5% or 10% slower than another - that will always tend to depend on the exact statistics of the data in the various tables, and the way the query optimiser works in a particular version of a particular database. The thing you do have to worry about is if a particular database query triggers a really pathological case in the query optimiser of some database. When that happens, the query can take 1000 times longer than necessary (e.g. minutes not miliseconds). That is what you have to worry about and avoid. That is clearly not happening with either of the proposed queries, so we don't have to worry.
          Hide
          Andrew Davis added a comment - - edited

          re the rant, don't worry about it. Our current process is prone to being rage inducing

          Unfortunately, with you saying to use IN and Petr saying to use JOIN I don't feel that I am able to make this decision. Ultimately you and I can make whatever decision we like but if Petr doesn't agree then the fix will simply be rejected. I'll try and get Petr back to comment on this issue before I bother creating pull requests. I'm trying to avoid another MDL-25718 (5 rejected pull requests so far).

          Show
          Andrew Davis added a comment - - edited re the rant, don't worry about it. Our current process is prone to being rage inducing Unfortunately, with you saying to use IN and Petr saying to use JOIN I don't feel that I am able to make this decision. Ultimately you and I can make whatever decision we like but if Petr doesn't agree then the fix will simply be rejected. I'll try and get Petr back to comment on this issue before I bother creating pull requests. I'm trying to avoid another MDL-25718 (5 rejected pull requests so far).
          Hide
          Petr Škoda added a comment -

          My experience was that IN can be a performance disaster for mysql, until somebody actually proves it is good or wrong it should not hit any STABLE branch. At the moment it works only if students have one role, but that is probably majority of installs, we can introduce any major database change without testing because it may create serious performance problems for existing sites that use mysql, sorry.

          To Tim: we are already using the JOIN method for enrolment queries and it seems to work fine so far, we did not test IN yet (if yes please tell me the file name and line number).

          Andrew: your testing of IN x JOIN in 2.0 is not correct because the outer query is already limited by the enrol sql, the results in 1.9 may be different because the outer sql returns more data. You would have to change the enrol SQL to use IN too to get the results you wanted. Ideally the test should be done for a really large outer query (all users) and small number of inner query (enrolled users). You might want to create a new query just for that.

          Show
          Petr Škoda added a comment - My experience was that IN can be a performance disaster for mysql, until somebody actually proves it is good or wrong it should not hit any STABLE branch. At the moment it works only if students have one role, but that is probably majority of installs, we can introduce any major database change without testing because it may create serious performance problems for existing sites that use mysql, sorry. To Tim: we are already using the JOIN method for enrolment queries and it seems to work fine so far, we did not test IN yet (if yes please tell me the file name and line number). Andrew: your testing of IN x JOIN in 2.0 is not correct because the outer query is already limited by the enrol sql, the results in 1.9 may be different because the outer sql returns more data. You would have to change the enrol SQL to use IN too to get the results you wanted. Ideally the test should be done for a really large outer query (all users) and small number of inner query (enrolled users). You might want to create a new query just for that.
          Hide
          Petr Škoda added a comment -

          In any case the mysql manual clearly says that "IN" has performance problems, that is enough for reject if you do not prove the docs do not apply to this case:

          A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.

          The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery: ...

          Show
          Petr Škoda added a comment - In any case the mysql manual clearly says that "IN" has performance problems, that is enough for reject if you do not prove the docs do not apply to this case: A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result. The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery: ...
          Hide
          Tim Hunt added a comment -

          So, following a discussion with Petr in dev chat:

          1. The MySQL docs are http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html (still applies to 5.5 too.)

          2. The correct solution is to rewrite the queries to be like the second query in this comment: http://tracker.moodle.org/browse/MDL-20946?focusedCommentId=107430&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-107430

          3. Sorry, I don't have time to do any more work on this. We have already deployed my fix to our live servers, because we needed it, and it works fine on Postgres. I guess we will go on using that until we move to 1.9.12.

          Chat log: http://moodle.org/mod/cvsadmin/view.php?conversationid=7100#c264809

          Show
          Tim Hunt added a comment - So, following a discussion with Petr in dev chat: 1. The MySQL docs are http://dev.mysql.com/doc/refman/5.1/en/subquery-restrictions.html (still applies to 5.5 too.) 2. The correct solution is to rewrite the queries to be like the second query in this comment: http://tracker.moodle.org/browse/MDL-20946?focusedCommentId=107430&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-107430 3. Sorry, I don't have time to do any more work on this. We have already deployed my fix to our live servers, because we needed it, and it works fine on Postgres. I guess we will go on using that until we move to 1.9.12. Chat log: http://moodle.org/mod/cvsadmin/view.php?conversationid=7100#c264809
          Hide
          Andrew Davis added a comment -

          PULL-559 for 2.0 stable

          Show
          Andrew Davis added a comment - PULL-559 for 2.0 stable
          Hide
          Andrew Davis added a comment -

          PULL-560 for 1.9

          Show
          Andrew Davis added a comment - PULL-560 for 1.9
          Hide
          Andrew Davis added a comment -

          PULL-561 for 2.1 ie master

          Show
          Andrew Davis added a comment - PULL-561 for 2.1 ie master
          Hide
          Andrew Davis added a comment -

          ok. I think thats all done. phew.

          Show
          Andrew Davis added a comment - ok. I think thats all done. phew.
          Hide
          Tim Hunt added a comment -

          Thanks Andrew. I did a quick review of the 1.9 code, and it looks OK to me. Fingers crossed for integration next week.

          Show
          Tim Hunt added a comment - Thanks Andrew. I did a quick review of the 1.9 code, and it looks OK to me. Fingers crossed for integration next week.
          Hide
          David Mudrak added a comment -

          Tested on 1.9, 2.0 and 2.1dev. Averages are counted correctly now. Thanks Tim and Andrew for you work on this!

          Show
          David Mudrak added a comment - Tested on 1.9, 2.0 and 2.1dev. Averages are counted correctly now. Thanks Tim and Andrew for you work on this!
          Hide
          Helen Foster added a comment -

          This issue is fixed in this week's 1.9.11+ and 2.0.2+. Thanks everyone

          Show
          Helen Foster added a comment - This issue is fixed in this week's 1.9.11+ and 2.0.2+. Thanks everyone

            People

            • Votes:
              5 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: