Moodle
  1. Moodle
  2. MDL-38173

Adding module on courses with completion enabled creates course_module section-id corruption

    Details

    • Database:
      PostgreSQL
    • Testing Instructions:
      Hide

      Part 1 on 2.3 only - check adding a module works correctly

      1. On a clean install of moodle
      2. Turn on debugging
      3. Create a new course (will create 1 forum activity)
      4. Add a file to the second section of the course
      5. Verify in the database that the section column of the course_modules table is not the same for both modules.

      Part 2 - check upgrade works for 2.3, 2.4 and 2.5

      1. Install last weeks stable release of 2.3 (2012062504.10)
      2. Perform the steps listed in part 1 and but verify that the section column for both modules is the same (this is the bug)
      3. Upgrade to the integration version of 2.3
      4. Verify the upgrade completes and after the upgrade, that the section column of the course_modules table is not the same for both modules
      1. Install last weeks stable release of 2.3 (2012062504.10)
      2. Perform the steps listed in part 1 and but verify that the section column for both modules is the same (this is the bug)
      3. Upgrade to the integration version of 2.4
      4. Verify the upgrade completes and after the upgrade, that the section column of the course_modules table is not the same for both modules
      1. Install last weeks stable release of 2.3 (2012062504.10)
      2. Perform the steps listed in part 1 and but verify that the section column for both modules is the same (this is the bug)
      3. Upgrade to the integration version of master
      4. Verify the upgrade completes and after the upgrade, that the section column of the course_modules table is not the same for both modules
      Show
      Part 1 on 2.3 only - check adding a module works correctly On a clean install of moodle Turn on debugging Create a new course (will create 1 forum activity) Add a file to the second section of the course Verify in the database that the section column of the course_modules table is not the same for both modules. Part 2 - check upgrade works for 2.3, 2.4 and 2.5 Install last weeks stable release of 2.3 (2012062504.10) Perform the steps listed in part 1 and but verify that the section column for both modules is the same (this is the bug) Upgrade to the integration version of 2.3 Verify the upgrade completes and after the upgrade, that the section column of the course_modules table is not the same for both modules Install last weeks stable release of 2.3 (2012062504.10) Perform the steps listed in part 1 and but verify that the section column for both modules is the same (this is the bug) Upgrade to the integration version of 2.4 Verify the upgrade completes and after the upgrade, that the section column of the course_modules table is not the same for both modules Install last weeks stable release of 2.3 (2012062504.10) Perform the steps listed in part 1 and but verify that the section column for both modules is the same (this is the bug) Upgrade to the integration version of master Verify the upgrade completes and after the upgrade, that the section column of the course_modules table is not the same for both modules
    • Difficulty:
      Difficult
    • Affected Branches:
      MOODLE_23_STABLE, MOODLE_25_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE
    • Pull from Repository:
    • Pull 2.4 Branch:
    • Pull Master Branch:
      MDL-38173-master
    • Rank:
      48004

      Description

      After MDL-37939 was applied, we began seeing corruption of modules moving to other courses. I added a number of comments related to that in the bug, but didn't understand the issue well enough to confirm it was caused by that fix. I have now confirmed that.

      When course completion enabled, you get two database updates on mdl_course_modules;

      UPDATE mdl_course_modules SET section = $2 WHERE id = $1
      parameters: $1 = '519934', $2 = '10931'
      UPDATE mdl_course_modules SET section = $2 WHERE id = $1
      parameters: $1 = '930004', $2 = '10931'

      The first one has the wrong value for id, updating a course_module that is wrong. That id is the instance of the module, not it's course_module.id.

        Issue Links

          Activity

          Hide
          Russell Smith added a comment -

          The attached latrobe-database-fix.txt is what we've been applying to repair the corruption until we would work out the cause stop it happening. After each run we've had to apply the upgrade sequence code from MDL-37939.

          Show
          Russell Smith added a comment - The attached latrobe-database-fix.txt is what we've been applying to repair the corruption until we would work out the cause stop it happening. After each run we've had to apply the upgrade sequence code from MDL-37939 .
          Hide
          Russell Smith added a comment -

          I have been many long days to finally pin this down (as my 2am report indicates

          To test this you need to;

          1. Create a course with activity completion turned on.
          2. Add a new section to the course
          3. Run the analysis select query from latrobe-database-fix.txt and confirm no results
          4. Add a new file activity
          5. Run the analysis select query from latrobe-database-fix.txt and confirm a section with invalid section id
          6. Apply the patch
          7. Add another new file activity
          8. Re-run the analysis select query from latrobe-database-fix.txt a final time to confirm no new corruption occurs.

          Again as with all my reports, I only have PostgreSQL to test on, so fixes and tests were developed against that.

          Show
          Russell Smith added a comment - I have been many long days to finally pin this down (as my 2am report indicates To test this you need to; 1. Create a course with activity completion turned on. 2. Add a new section to the course 3. Run the analysis select query from latrobe-database-fix.txt and confirm no results 4. Add a new file activity 5. Run the analysis select query from latrobe-database-fix.txt and confirm a section with invalid section id 6. Apply the patch 7. Add another new file activity 8. Re-run the analysis select query from latrobe-database-fix.txt a final time to confirm no new corruption occurs. Again as with all my reports, I only have PostgreSQL to test on, so fixes and tests were developed against that.
          Hide
          Dan Poltawski added a comment -

          Thanks for the details report Russell.

          Show
          Dan Poltawski added a comment - Thanks for the details report Russell.
          Hide
          Dan Poltawski added a comment -

          Raised to Blocker as this is causing data corruption.

          Show
          Dan Poltawski added a comment - Raised to Blocker as this is causing data corruption.
          Hide
          Dan Poltawski added a comment -

          Andrew, i've assigned this to you as you're most familiar with it. Please reassign to me if you don't have time for it. Thanks

          Show
          Dan Poltawski added a comment - Andrew, i've assigned this to you as you're most familiar with it. Please reassign to me if you don't have time for it. Thanks
          Hide
          Andrew Nicols added a comment -

          Thanks for finding the cause of this issue Russell - stupid mistake to have made. Frustratingly our unit tests won't have picked this up as cmid == id when you only have a few of the same type.

          I'll try and improve the unit tests and incorporate this.

          Dan Poltawski Do you think that we can get away with incrementing the version numbers in the upgrade script in 2.3 to prevent it running twice for those users going from 2.3.4 to 2.3.5?

          Show
          Andrew Nicols added a comment - Thanks for finding the cause of this issue Russell - stupid mistake to have made. Frustratingly our unit tests won't have picked this up as cmid == id when you only have a few of the same type. I'll try and improve the unit tests and incorporate this. Dan Poltawski Do you think that we can get away with incrementing the version numbers in the upgrade script in 2.3 to prevent it running twice for those users going from 2.3.4 to 2.3.5?
          Hide
          Dan Poltawski added a comment -

          Andrew: not sure exactly what you mean, but guessing you mean updating existing upgrade steps. I think thats OK.

          Show
          Dan Poltawski added a comment - Andrew: not sure exactly what you mean, but guessing you mean updating existing upgrade steps. I think thats OK.
          Hide
          Damyon Wiese added a comment -

          Confirmed this is only an issue with the 23 branch and that it is only adding modules that gives this error - not moving modules.

          Show
          Damyon Wiese added a comment - Confirmed this is only an issue with the 23 branch and that it is only adding modules that gives this error - not moving modules.
          Hide
          Damyon Wiese added a comment -

          This triggers a developer warning when adding a module with the chooser - but because the section is also set correctly after this function returns, does not corrupt any data in this case.

          More info:

          This bug is only an issue if $mod->id != $mod->coursemodule in the call to add_mod_to_section.

          Audit of all cases add_mod_to_section:
          admin/tool/generator/locallib.php: - Generator is for devs only - not an issue
          course/dnduploadlib.php: - Triggers warning but then correctly updates course_module with the correct section - not an issue
          course/modedit.php: - Triggers warning but then correctly updates course_module with the correct section - not an issue
          course/lib.php: $mod->id == $mod->coursemodule - not an issue
          mod/forum/lib.php - Triggers warning but then correctly updates course_module with the correct section - not an issue
          mod/forum/lib.php - There is an issue in forum_convert_to_roles - but this is not used anywhere.
          mod/glossay/import.php - Triggers warning but then correctly updates course_module with the correct section - not an issue

          I couldn't get the query described in this bug to find any broken course modules with or without this patch.

          It is worth fixing this to get rid of the warning.

          Show
          Damyon Wiese added a comment - This triggers a developer warning when adding a module with the chooser - but because the section is also set correctly after this function returns, does not corrupt any data in this case. More info: This bug is only an issue if $mod->id != $mod->coursemodule in the call to add_mod_to_section. Audit of all cases add_mod_to_section: admin/tool/generator/locallib.php: - Generator is for devs only - not an issue course/dnduploadlib.php: - Triggers warning but then correctly updates course_module with the correct section - not an issue course/modedit.php: - Triggers warning but then correctly updates course_module with the correct section - not an issue course/lib.php: $mod->id == $mod->coursemodule - not an issue mod/forum/lib.php - Triggers warning but then correctly updates course_module with the correct section - not an issue mod/forum/lib.php - There is an issue in forum_convert_to_roles - but this is not used anywhere. mod/glossay/import.php - Triggers warning but then correctly updates course_module with the correct section - not an issue I couldn't get the query described in this bug to find any broken course modules with or without this patch. It is worth fixing this to get rid of the warning.
          Hide
          Damyon Wiese added a comment -

          The patch provided by Russell looks good to me and only needs applying in 23.

          Show
          Damyon Wiese added a comment - The patch provided by Russell looks good to me and only needs applying in 23.
          Hide
          Russell Smith added a comment -

          The only case where $mod->id != $mod->coursemodule is when course completion is on.

          Have you tested a course with course completion on? course/lib.php is an issue, but only with course completion. Hence why unit tests didn't pick this up in the first place.

          Show
          Russell Smith added a comment - The only case where $mod->id != $mod->coursemodule is when course completion is on. Have you tested a course with course completion on? course/lib.php is an issue, but only with course completion. Hence why unit tests didn't pick this up in the first place.
          Hide
          Damyon Wiese added a comment -

          Hi Russell,

          I tried this with completion on and still could not reproduce any data corruption - and I can't see in the code why it would make any difference? Can you provide any more info ?

          Thanks, Damyon

          Show
          Damyon Wiese added a comment - Hi Russell, I tried this with completion on and still could not reproduce any data corruption - and I can't see in the code why it would make any difference? Can you provide any more info ? Thanks, Damyon
          Hide
          Damyon Wiese added a comment -

          The only occurrence of this in course/lib.php is in moveto_module which looks like this:

          (starting from line 3184)
              $mod->coursemodule = $mod->id;                                                                                                  
                                                                                                                                              
              if (! add_mod_to_section($mod, $beforemod)) {                                                                                   
                  return false;                                                                                                               
              }                                                         
          
          
          Show
          Damyon Wiese added a comment - The only occurrence of this in course/lib.php is in moveto_module which looks like this: (starting from line 3184) $mod->coursemodule = $mod->id; if (! add_mod_to_section($mod, $beforemod)) { return false ; }
          Hide
          Russell Smith added a comment -

          Sorry, I misled you with course/lib.php I should have checked.

          course/modedit.php:477 $sectionid = add_mod_to_section($fromform);

          Let me try and explain what's going on;

          course/modedit.php:447 $returnfromfunc = $addinstancefunction($fromform, $mform);

          In my example, that calls resource_add_instance();

          Which does
          mod/resource/lib.php: resource_add_instance(..)

          $data->id = $DB->insert_record('resource', $data);

          Which is instance id.

          My earlier instructions may be not 100% then. That's how the invalid ID gets in. as $data->id is set to the instance and comes from $fromform. So it gets overwridden.

          I'll try to write these test instructions up a little better next time.

          Show
          Russell Smith added a comment - Sorry, I misled you with course/lib.php I should have checked. course/modedit.php:477 $sectionid = add_mod_to_section($fromform); Let me try and explain what's going on; course/modedit.php:447 $returnfromfunc = $addinstancefunction($fromform, $mform); In my example, that calls resource_add_instance(); Which does mod/resource/lib.php: resource_add_instance(..) $data->id = $DB->insert_record('resource', $data); Which is instance id. My earlier instructions may be not 100% then. That's how the invalid ID gets in. as $data->id is set to the instance and comes from $fromform. So it gets overwridden. I'll try to write these test instructions up a little better next time.
          Hide
          Damyon Wiese added a comment -

          Thanks for the extra info Russell - it helped!

          Show
          Damyon Wiese added a comment - Thanks for the extra info Russell - it helped!
          Hide
          Damyon Wiese added a comment -

          Added an upgrade step (it that also improves the performance of the previously added upgrade step). That code will need a good review though so we will have to target this change for next week.

          Show
          Damyon Wiese added a comment - Added an upgrade step (it that also improves the performance of the previously added upgrade step). That code will need a good review though so we will have to target this change for next week.
          Hide
          Damyon Wiese added a comment -

          The upgrade code in particular could do with some more eyes.

          Show
          Damyon Wiese added a comment - The upgrade code in particular could do with some more eyes.
          Hide
          Russell Smith added a comment -

          I'll attempt to provide quality review tomorrow. My initial question is how can you be certain that all orphaned modules will still appear in the sequence. Is it impossible for them to be removed?

          Show
          Russell Smith added a comment - I'll attempt to provide quality review tomorrow. My initial question is how can you be certain that all orphaned modules will still appear in the sequence. Is it impossible for them to be removed?
          Hide
          Kris Stokking added a comment -

          Damyon - the queries in db/upgrade.php do not take advantage of any indices, causing extremely high execution times. You'll want to join on the course field between the course_sections and course_modules table.

          Show
          Kris Stokking added a comment - Damyon - the queries in db/upgrade.php do not take advantage of any indices, causing extremely high execution times. You'll want to join on the course field between the course_sections and course_modules table.
          Hide
          Damyon Wiese added a comment -

          Hi Kris,

          You cannot join on the course field because the course may be different (because of the bug that is being fixed).

          • Damyon
          Show
          Damyon Wiese added a comment - Hi Kris, You cannot join on the course field because the course may be different (because of the bug that is being fixed). Damyon
          Hide
          Dan Poltawski added a comment -

          Hmm... I'm finding this hard for a friday afternoon..

          My initial question is how can you be certain that all orphaned modules will still appear in the sequence. Is it impossible for them to be removed?

          This seems like an excellent point. Trying to think about this (probably not correctly) don't we need it to be something like:

          LEFT OUTER JOIN {course_sections} ON 
             m.section = s.id
          WHERE ($sequenceconcat  LIKE  $moduleconcat AND s.id IS NULL) OR
          ($sequenceconcat  NOT LIKE  $moduleconcat AND s.id IS NOT NULL)
          

          Also, couldn't we delete the existing upgrade step and just introduce the new one.

          Show
          Dan Poltawski added a comment - Hmm... I'm finding this hard for a friday afternoon.. My initial question is how can you be certain that all orphaned modules will still appear in the sequence. Is it impossible for them to be removed? This seems like an excellent point. Trying to think about this (probably not correctly) don't we need it to be something like: LEFT OUTER JOIN {course_sections} ON m.section = s.id WHERE ($sequenceconcat LIKE $moduleconcat AND s.id IS NULL) OR ($sequenceconcat NOT LIKE $moduleconcat AND s.id IS NOT NULL) Also, couldn't we delete the existing upgrade step and just introduce the new one.
          Hide
          Dan Poltawski added a comment -

          BTW the more I think about it, the more I think we can't fix the performance problem being talked about in this upgrade to fix the data properly.

          Show
          Dan Poltawski added a comment - BTW the more I think about it, the more I think we can't fix the performance problem being talked about in this upgrade to fix the data properly.
          Hide
          Damyon Wiese added a comment -

          The more I think about it the more it hurts!

          Show
          Damyon Wiese added a comment - The more I think about it the more it hurts!
          Hide
          Damyon Wiese added a comment -

          Yes you are right correct Russell, the case where the module does not appear in any sequence is not handled by the first upgrade step - we should remove the performance improvement from that step.

          The second step looks correct.

          Show
          Damyon Wiese added a comment - Yes you are right correct Russell, the case where the module does not appear in any sequence is not handled by the first upgrade step - we should remove the performance improvement from that step. The second step looks correct.
          Hide
          Damyon Wiese added a comment -

          Pushed new branch.

          Show
          Damyon Wiese added a comment - Pushed new branch.
          Hide
          Damyon Wiese added a comment -

          Re-removing the second step - no I don't think we can combine them.

          The problem is that in the first case, the correct data was in the "course_modules::section" column, and the second case the correct data (which would have slipped in since the last upgrade was run) is in the "course_sections::sequence" column.

          Show
          Damyon Wiese added a comment - Re-removing the second step - no I don't think we can combine them. The problem is that in the first case, the correct data was in the "course_modules::section" column, and the second case the correct data (which would have slipped in since the last upgrade was run) is in the "course_sections::sequence" column.
          Hide
          Dan Poltawski added a comment -

          Well, i've stared at the code for quite a while and I can't think of any more problems. It might be worth putting a comment in to explain the concact like part because its not entirely obvious, but other than that it looks ok to me.

          Show
          Dan Poltawski added a comment - Well, i've stared at the code for quite a while and I can't think of any more problems. It might be worth putting a comment in to explain the concact like part because its not entirely obvious, but other than that it looks ok to me.
          Hide
          Russell Smith added a comment -

          None of the corruption examples we saw involved a different course id. If you are willing to bet that the sequences are a safe join key and filter, then course is just as safe, if not safer. It's only the section number that gets moved. If you can put that back to the right number, you can safely join on any other key.

          In the SQL we used to fix the issue, we didn't uses sequences, we dumped it all in section 0. Not necessarily the best outcome, but it only affected old subjects and we needed to get this resolved quickly to determine what was going on.

          I went to reasonable lengths to explain the SQL I attached and even though you can't use that as a direct base, I think there is value in reading it and my comments.

          I'm also not convinced of the safety of using the sequences. But that could very well be my inexperience.

          Otherwise, I've read through the code again and understand what's going on. The flow is correct, but as Dan said. Comments always help here. Especially the why are you doing it. And for some things it took me a minute of starring at it to figure out what each part was achieving.

          I vote it looks okay too.

          Show
          Russell Smith added a comment - None of the corruption examples we saw involved a different course id. If you are willing to bet that the sequences are a safe join key and filter, then course is just as safe, if not safer. It's only the section number that gets moved. If you can put that back to the right number, you can safely join on any other key. In the SQL we used to fix the issue, we didn't uses sequences, we dumped it all in section 0. Not necessarily the best outcome, but it only affected old subjects and we needed to get this resolved quickly to determine what was going on. I went to reasonable lengths to explain the SQL I attached and even though you can't use that as a direct base, I think there is value in reading it and my comments. I'm also not convinced of the safety of using the sequences. But that could very well be my inexperience. Otherwise, I've read through the code again and understand what's going on. The flow is correct, but as Dan said. Comments always help here. Especially the why are you doing it. And for some things it took me a minute of starring at it to figure out what each part was achieving. I vote it looks okay too.
          Hide
          François Gannaz added a comment -

          Just a reminder: this bug is not restricted to course completion. It seems this was not clear, e.g. "course/lib.php is an issue, but only with course completion. Hence why unit tests didn't pick this up in the first place."

          • Install a vanilla Moodle 2.3.4+
          • Enable debug mode
          • Create a course with default settings
          • Go to the course page: "Notice: Undefined property: stdClass::$id in ..."

          In "mod/forum/lib.php" l.3021, there is a call to `add_mod_to_section()` of "course/lib.php", and that fails to insert a record into "course_modules" (no field `$mod->id`).

          Show
          François Gannaz added a comment - Just a reminder: this bug is not restricted to course completion. It seems this was not clear, e.g. "course/lib.php is an issue, but only with course completion. Hence why unit tests didn't pick this up in the first place." Install a vanilla Moodle 2.3.4+ Enable debug mode Create a course with default settings Go to the course page: "Notice: Undefined property: stdClass::$id in ..." In "mod/forum/lib.php" l.3021, there is a call to `add_mod_to_section()` of "course/lib.php", and that fails to insert a record into "course_modules" (no field `$mod->id`).
          Hide
          Kris Stokking added a comment -

          You cannot join on the course field because the course may be different (because of the bug that is being fixed).

          I'm not sure that's true. If a course module was affected in a separate course, it still wouldn't match up to the course section in its own course. But even if it were true you could identify those even more easily because the course field on the module wouldn't match the related section.

          Regardless, the query is essentially a cross join because the predicate "m.section != s.id" filters out very little. And we should, at all costs, avoid doing a full scan of a cross join on two tables where the growth isn't capped. In this case we should expect the total number of records in those tables to be quite high. To put those numbers in perspective, in one of our largest sites the cartesian product of those two tables is well over 253 billion rows. Using really rough estimation, that would translate to ~22 hours to complete. And here's the real kicker - the result set would be empty because MDL-37939 hasn't even been applied yet.

          I drive this last point to bring more overall awareness to the impact upgrade scripts can have on large sites, not to push anyone's face in mud.

          Also, +1 for additional comments. If I came across this fix outside of the context of this ticket, I wouldn't really know what it's doing. That's important if an administrator needs to somehow back out of the upgrade script (e.g. because they didn't apply the patch that originally caused the issue), or needs to apply the fix outside of the regular upgrade cycle.

          Show
          Kris Stokking added a comment - You cannot join on the course field because the course may be different (because of the bug that is being fixed). I'm not sure that's true. If a course module was affected in a separate course, it still wouldn't match up to the course section in its own course. But even if it were true you could identify those even more easily because the course field on the module wouldn't match the related section. Regardless, the query is essentially a cross join because the predicate "m.section != s.id" filters out very little. And we should, at all costs, avoid doing a full scan of a cross join on two tables where the growth isn't capped. In this case we should expect the total number of records in those tables to be quite high. To put those numbers in perspective, in one of our largest sites the cartesian product of those two tables is well over 253 billion rows. Using really rough estimation, that would translate to ~22 hours to complete. And here's the real kicker - the result set would be empty because MDL-37939 hasn't even been applied yet. I drive this last point to bring more overall awareness to the impact upgrade scripts can have on large sites, not to push anyone's face in mud. Also, +1 for additional comments. If I came across this fix outside of the context of this ticket, I wouldn't really know what it's doing. That's important if an administrator needs to somehow back out of the upgrade script (e.g. because they didn't apply the patch that originally caused the issue), or needs to apply the fix outside of the regular upgrade cycle.
          Hide
          Russell Smith added a comment -

          François, are you saying that the course_module is missing in some cases? Does that effect the sequence is a safe join key premise? Maybe this is how I have section 0 items in my database.

          Kris I completely agree with you about performance. To support the course join suggestion, my testing showed course was never corrupted in the data and when it constrains the join so much I am surprised it was excluded. Especially since there is less evidence for sequence begin reliable. I'm not saying it's not, I just have seen anything to support it.

          Further to this is there any way to only run this if you been on the broken version? Or must it run for all upgrades?

          Show
          Russell Smith added a comment - François, are you saying that the course_module is missing in some cases? Does that effect the sequence is a safe join key premise? Maybe this is how I have section 0 items in my database. Kris I completely agree with you about performance. To support the course join suggestion, my testing showed course was never corrupted in the data and when it constrains the join so much I am surprised it was excluded. Especially since there is less evidence for sequence begin reliable. I'm not saying it's not, I just have seen anything to support it. Further to this is there any way to only run this if you been on the broken version? Or must it run for all upgrades?
          Hide
          François Gannaz added a comment -

          Russel, I don't know the full consequences of the bug. We applied the update on an instance running in debug mode and immediately saw the notice, so I have little data. The code line that fails is:

          course/lib.php l.2851 add_mod_to_section()
          $DB->set_field("course_modules", "section", $section->id, array("id" => $mod->id));
          

          I've just had a look, and "mod/forum/lib.php" has the following line just after the call to `add_mod_to_section()`:

          mod/forum/lib.php l.3025
          $DB->set_field("course_modules", "section", $sectionid, array("id" => $mod->coursemodule));
          

          Thanks to this redundancy, the "course_module.section" is not kept at 0. Other modules (or the code in course/) may not behave like this, I haven't checked.

          Show
          François Gannaz added a comment - Russel, I don't know the full consequences of the bug. We applied the update on an instance running in debug mode and immediately saw the notice, so I have little data. The code line that fails is: course/lib.php l.2851 add_mod_to_section() $DB->set_field( "course_modules" , "section" , $section->id, array( "id" => $mod->id)); I've just had a look, and "mod/forum/lib.php" has the following line just after the call to `add_mod_to_section()`: mod/forum/lib.php l.3025 $DB->set_field( "course_modules" , "section" , $sectionid, array( "id" => $mod->coursemodule)); Thanks to this redundancy, the "course_module.section" is not kept at 0. Other modules (or the code in course/) may not behave like this, I haven't checked.
          Hide
          Russell Smith added a comment -

          There has been no activity on this for a number of days. The impact is currently limited to 2.3.4+ users. If 2.3.5 is released with the unresolved, there will be a much greater impact. I am aware this is listed a blocker, however searching the tracker reveals that not all blockers are resolved before releases are made. So I'm unconvinced that having this as a blocker tag will really block the 2.3.5 release.

          Damyon do you have and comment on Kris and my comments regarding course joins and performance? Are there any other status updates to be aware of?

          Show
          Russell Smith added a comment - There has been no activity on this for a number of days. The impact is currently limited to 2.3.4+ users. If 2.3.5 is released with the unresolved, there will be a much greater impact. I am aware this is listed a blocker, however searching the tracker reveals that not all blockers are resolved before releases are made. So I'm unconvinced that having this as a blocker tag will really block the 2.3.5 release. Damyon do you have and comment on Kris and my comments regarding course joins and performance? Are there any other status updates to be aware of?
          Hide
          Dan Marsden added a comment -

          yeah - I agree - if 2.3.5 is released without this fix we'll be in really bad shape.

          Show
          Dan Marsden added a comment - yeah - I agree - if 2.3.5 is released without this fix we'll be in really bad shape.
          Hide
          Dan Poltawski added a comment -

          I agree this is a release blocker.

          Show
          Dan Poltawski added a comment - I agree this is a release blocker.
          Hide
          Damyon Wiese added a comment -

          Hi Kris,

          After much thinking I agree it is safe to add the course to the join. I have added more comments to the patch, added the course to the join and put in a check so the second upgrade step only runs on sites that are upgrading from "2012062504.08".

          Show
          Damyon Wiese added a comment - Hi Kris, After much thinking I agree it is safe to add the course to the join. I have added more comments to the patch, added the course to the join and put in a check so the second upgrade step only runs on sites that are upgrading from "2012062504.08".
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Some comments:

          1a) Cosmetic: The "m.section != s.id" condition is not a JOIN condition, but a WHERE clause restricting the results (much the same than the LIKE condition).

          1b) In fact I'm not sure if that "m.section != s.id" condition is correct as far as it will "hide" some borked records within the same section. I mean, in other words, as far as the borked information is course_modules->section, we cannot rely on it for any condition in the main query. As simple as that.

          2) Cosmetic: Please stop naming/aliasing course_modules as "m" or "module", use "cm" instead, it continuously causes problems because it's easy to confuse them (in fact this issue is about to fix the damage done by such confusion). And course_module->id should be "cmid".

          3) Improvement: Current query can return the same ( s.id, s.course, s.sequence) record multiple times (if there are more than one borked module.id in the same sequence). So it's also processed multiple times later in the loop. This can be fixed by adding some DISTINCT to the query or, alternatively, by returning the results ordered and quick-continuing in the loop if dupes are found.

          4) Depending of the number of orphans found, the get_in_or_equal() may fail. I'd suggest to process it in chunks (say, 500 elements each). Edited: Ah, we are under a section loop, perhaps we can assume there won't be sections with so many elements.

          5) Create folloup issue(s) about to cover with tests all those functions. They sound like easily testable.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Some comments: 1a) Cosmetic: The "m.section != s.id" condition is not a JOIN condition, but a WHERE clause restricting the results (much the same than the LIKE condition). 1b) In fact I'm not sure if that "m.section != s.id" condition is correct as far as it will "hide" some borked records within the same section. I mean, in other words, as far as the borked information is course_modules->section, we cannot rely on it for any condition in the main query. As simple as that. 2) Cosmetic: Please stop naming/aliasing course_modules as "m" or "module", use "cm" instead, it continuously causes problems because it's easy to confuse them (in fact this issue is about to fix the damage done by such confusion). And course_module->id should be "cmid". 3) Improvement: Current query can return the same ( s.id, s.course, s.sequence) record multiple times (if there are more than one borked module.id in the same sequence). So it's also processed multiple times later in the loop. This can be fixed by adding some DISTINCT to the query or, alternatively, by returning the results ordered and quick-continuing in the loop if dupes are found. 4) Depending of the number of orphans found, the get_in_or_equal() may fail. I'd suggest to process it in chunks (say, 500 elements each). Edited: Ah, we are under a section loop, perhaps we can assume there won't be sections with so many elements. 5) Create folloup issue(s) about to cover with tests all those functions. They sound like easily testable. Ciao
          Hide
          Damyon Wiese added a comment -

          Thanks Eloy,

          6) (Am I allowed to do this for my own patch) It should be $oldversion >= 2012062504.08 && $oldversion < 2012062504.10 (and those version numbers might need bumping since this hasn't been rebased this week).

          Show
          Damyon Wiese added a comment - Thanks Eloy, 6) (Am I allowed to do this for my own patch) It should be $oldversion >= 2012062504.08 && $oldversion < 2012062504.10 (and those version numbers might need bumping since this hasn't been rebased this week).
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Side note, once this lands, MDL-37642 needs to be informed.

          Show
          Eloy Lafuente (stronk7) added a comment - Side note, once this lands, MDL-37642 needs to be informed.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          ah, yes, versions adjustment may be necessary.

          Show
          Eloy Lafuente (stronk7) added a comment - ah, yes, versions adjustment may be necessary.
          Hide
          Damyon Wiese added a comment -

          Rebased and added a new commit with improvements:

          1a) Changed to a where clause
          1b) This wont hide borked records in the same section - because if they are in the same section they are not borked
          2) Changed the aliases / variables accordingly
          3) Good idea - I added a distinct to the query which should cover this
          4) Yeah - because this is in a section loop I think it's best to keep this simple
          5) Just looking now - might post a patch on this issue (the test would only be valid on 23)
          6) Did the version adjustments in this version.

          Show
          Damyon Wiese added a comment - Rebased and added a new commit with improvements: 1a) Changed to a where clause 1b) This wont hide borked records in the same section - because if they are in the same section they are not borked 2) Changed the aliases / variables accordingly 3) Good idea - I added a distinct to the query which should cover this 4) Yeah - because this is in a section loop I think it's best to keep this simple 5) Just looking now - might post a patch on this issue (the test would only be valid on 23) 6) Did the version adjustments in this version.
          Hide
          Kris Stokking added a comment -

          Damyon - Many thanks, the new query performs much better. On average it returns in < 1s, and on the massive site I referred to above it still only took 24 seconds.

          Regarding the version check, wouldn't that always return true? Are you expecting a developer to rip out the upgrade step for 2012062504.08 if they aren't affected by MDL-37939?

          Show
          Kris Stokking added a comment - Damyon - Many thanks, the new query performs much better. On average it returns in < 1s, and on the massive site I referred to above it still only took 24 seconds. Regarding the version check, wouldn't that always return true? Are you expecting a developer to rip out the upgrade step for 2012062504.08 if they aren't affected by MDL-37939 ?
          Hide
          Dan Poltawski added a comment -

          Kris:
          > Regarding the version check, wouldn't that always return true? Are you expecting a developer to rip out the upgrade step for 2012062504.08 if they aren't affected by MDL-37939?

          We don't update the $oldversion variable during the upgrade process, so it shouldn't be necessary (is my understanding).

          Show
          Dan Poltawski added a comment - Kris: > Regarding the version check, wouldn't that always return true? Are you expecting a developer to rip out the upgrade step for 2012062504.08 if they aren't affected by MDL-37939 ? We don't update the $oldversion variable during the upgrade process, so it shouldn't be necessary (is my understanding).
          Hide
          Kris Stokking added a comment -

          I think you're right. That makes much more sense, thanks Dan!

          Show
          Kris Stokking added a comment - I think you're right. That makes much more sense, thanks Dan!
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited

          hehe, playing with versions that way... really sucks... what if somebody updated last week to 2012062504.08 and today has updated to 2.4.1+ ? lalala, problem transmitted to 2.4.1 and not fixed. Do we need the fix in 24_STABLE and master? yes IMO.

          General rule says that any upgrade step must be present on any "achievable" version. And both 2.4.1+ and 2.5dev are.

          Show
          Eloy Lafuente (stronk7) added a comment - - edited hehe, playing with versions that way... really sucks... what if somebody updated last week to 2012062504.08 and today has updated to 2.4.1+ ? lalala, problem transmitted to 2.4.1 and not fixed. Do we need the fix in 24_STABLE and master? yes IMO. General rule says that any upgrade step must be present on any "achievable" version. And both 2.4.1+ and 2.5dev are.
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited

          (otherwise, perfect and cross-db)

          Show
          Eloy Lafuente (stronk7) added a comment - - edited (otherwise, perfect and cross-db)
          Hide
          Dan Poltawski added a comment - - edited

          urgh. Very good point. So, i'm also glad to see the new query unbelievably too expensive.

          Show
          Dan Poltawski added a comment - - edited urgh. Very good point. So, i'm also glad to see the new query unbelievably too expensive.
          Hide
          Damyon Wiese added a comment -

          Added master and 24 branches with only the upgrade step.

          Show
          Damyon Wiese added a comment - Added master and 24 branches with only the upgrade step.
          Hide
          Damyon Wiese added a comment -

          Changed the testing instructions to cover the upgrades and set the difficulty of this issue to DIFFICULT!

          Show
          Damyon Wiese added a comment - Changed the testing instructions to cover the upgrades and set the difficulty of this issue to DIFFICULT!
          Hide
          Damyon Wiese added a comment -

          Re-pushed 24 and master because they dont need a save point and dont need a version bump.

          Show
          Damyon Wiese added a comment - Re-pushed 24 and master because they dont need a save point and dont need a version bump.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Thanks Damyon, looking...

          Show
          Eloy Lafuente (stronk7) added a comment - Thanks Damyon, looking...
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Uhm... being horribly pessimistic... imagine the upgrade from 2.3 (2012062504.08) to 2.4.1+ fails 2 or 3 steps before the new block. Then, for next run, $oldversion would, already, have been changed to the last executed upgrade step, hence our block won't be executed ever.

          Anyway, I don't want to be horribly pesimistic... so I'm pushing this for a testing round.

          Any feedback, apart from testing will be really welcome, ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Uhm... being horribly pessimistic... imagine the upgrade from 2.3 (2012062504.08) to 2.4.1+ fails 2 or 3 steps before the new block. Then, for next run, $oldversion would, already, have been changed to the last executed upgrade step, hence our block won't be executed ever. Anyway, I don't want to be horribly pesimistic... so I'm pushing this for a testing round. Any feedback, apart from testing will be really welcome, ciao
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Integrated (23, 24 & master), thanks!

          Show
          Eloy Lafuente (stronk7) added a comment - Integrated (23, 24 & master), thanks!
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Note: 1 extra commit has been added to the 23_STABLE branch, simply swapping the conditions, to make the CI servers happy when checking for upgrade steps/savepoints consistency.

          http://git.moodle.org/gw?p=integration.git;a=commitdiff;h=71db4e58d767e79bc77e5d01905e445126d25a12

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Note: 1 extra commit has been added to the 23_STABLE branch, simply swapping the conditions, to make the CI servers happy when checking for upgrade steps/savepoints consistency. http://git.moodle.org/gw?p=integration.git;a=commitdiff;h=71db4e58d767e79bc77e5d01905e445126d25a12 Ciao
          Hide
          Damyon Wiese added a comment -

          About 2am I thought of something else... - Someone upgrades to 2012062504.08, then they upgrade to an older version of 2.4, then then upgrade to the latest version of 2.4 - the upgrade step will be skipped. I think we should use a standard upgrade version check for 2.4 and 2.5.

          Show
          Damyon Wiese added a comment - About 2am I thought of something else... - Someone upgrades to 2012062504.08, then they upgrade to an older version of 2.4, then then upgrade to the latest version of 2.4 - the upgrade step will be skipped. I think we should use a standard upgrade version check for 2.4 and 2.5.
          Hide
          Damyon Wiese added a comment -

          Rebased 24 and master branches and added a patch to make the version change noted above.

          Show
          Damyon Wiese added a comment - Rebased 24 and master branches and added a patch to make the version change noted above.
          Hide
          Frédéric Massart added a comment -

          Passed \o/. orz

          Show
          Frédéric Massart added a comment - Passed \o/. orz
          Hide
          Eloy Lafuente (stronk7) added a comment -

          This is valid for unlimited entries to the, soon to be unveiled, Moodle Codebase Gardens. It includes free access to all facilities.

          Personal and non-transferable to all assignees, reviewers and testers in this issue. Valid until switching to Blackboard (100000€ penalization will be applied).

          Thanks, closing as fixed!

          Show
          Eloy Lafuente (stronk7) added a comment - This is valid for unlimited entries to the, soon to be unveiled, Moodle Codebase Gardens. It includes free access to all facilities. Personal and non-transferable to all assignees, reviewers and testers in this issue. Valid until switching to Blackboard (100000€ penalization will be applied). Thanks, closing as fixed!
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Yay, this issue is going to get all the prizes:

          1) One hardcoded db prefix escaped all our verifications: MDL-38378 . I'm re-rolling all weeklies with that fixed.

          2) The last commits that you added to 24 and master weren't ever integrated. And this went to testing without them (I was sleeping while everything happened, grrr), testing should have been halted. I'm not rolling them now at all. Please create followup issue and add them there.

          3) MDLSITE-2149 has been created to increase the chances of harcoded db prefixes to be detected often by CI servers. Will work on that.

          What an issue, what an issue! Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Yay, this issue is going to get all the prizes: 1) One hardcoded db prefix escaped all our verifications: MDL-38378 . I'm re-rolling all weeklies with that fixed. 2) The last commits that you added to 24 and master weren't ever integrated. And this went to testing without them (I was sleeping while everything happened, grrr), testing should have been halted. I'm not rolling them now at all. Please create followup issue and add them there. 3) MDLSITE-2149 has been created to increase the chances of harcoded db prefixes to be detected often by CI servers. Will work on that. What an issue, what an issue! Ciao
          Hide
          Dan Marsden added a comment -

          wow - so MDL-37430 was fixed but caused a regression (MDL-37939) that was fixed but caused the regression MDL-38173 - we fixed MDL-38173 and caused more regressions?... crazy.....

          Show
          Dan Marsden added a comment - wow - so MDL-37430 was fixed but caused a regression ( MDL-37939 ) that was fixed but caused the regression MDL-38173 - we fixed MDL-38173 and caused more regressions?... crazy.....
          Hide
          Damyon Wiese added a comment -

          ... no words ...

          Show
          Damyon Wiese added a comment - ... no words ...
          Hide
          Aparup Banerjee added a comment -

          Eloy - we can't afford you to sleep :-p
          meanwhile, while he's sleeping - noting here the last commits have been integrated now.

          Show
          Aparup Banerjee added a comment - Eloy - we can't afford you to sleep :-p meanwhile, while he's sleeping - noting here the last commits have been integrated now.
          Hide
          Andrew Nicols added a comment -

          This may be causing another issue for some users: https://moodle.org/mod/forum/discuss.php?d=224238

          Show
          Andrew Nicols added a comment - This may be causing another issue for some users: https://moodle.org/mod/forum/discuss.php?d=224238
          Hide
          Eloy Lafuente (stronk7) added a comment -

          And the story continues this (caused) => MDL-38386 (caused) => MDL-38228 ... not sure if to laugh or to cry.

          Show
          Eloy Lafuente (stronk7) added a comment - And the story continues this (caused) => MDL-38386 (caused) => MDL-38228 ... not sure if to laugh or to cry.

            People

            • Votes:
              4 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: