Moodle
  1. Moodle
  2. MDL-36789

Corrupt modinfo in course table [problem and a fix]

    Details

    • Testing Instructions:
      Hide

      Test 1. Self-rebuild of corrupted cache.

      In order to test the change you need to have a corrupted course modinfo cache.

      Easy way to emulate a corrupt cache:

      1. Create course and couple of modules in it. View the course.
      2. Manually copy the contents of mdl_course.modinfo for this course into buffer
      3. Delete one of the modules. View the course
      4. Manually paste the old contents back into mdl_course.modinfo

      Corrupting the cache similar to production environment is also described in my comment below: https://tracker.moodle.org/browse/MDL-36789?focusedCommentId=237301&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-237301

      1. After the cache is corrupted try viewing the course page
      2. Make sure the debugging message is displayed and course can be viewed.

      Test 2. Performance and potential regressions (on 2.4 and 2.5)

      1. Enrol student in multiple courses
      2. Add in your config.php define('MAX_MODINFO_CACHE_SIZE', 10); - set the number less than number of courses where student is enrolled
      3. View each course page and /my/ page, make sure everything is displayed correctly.
      4. Compare the performance numbers of course view page and /my/ page with and without patch.
      Show
      Test 1. Self-rebuild of corrupted cache. In order to test the change you need to have a corrupted course modinfo cache. Easy way to emulate a corrupt cache: Create course and couple of modules in it. View the course. Manually copy the contents of mdl_course.modinfo for this course into buffer Delete one of the modules. View the course Manually paste the old contents back into mdl_course.modinfo Corrupting the cache similar to production environment is also described in my comment below: https://tracker.moodle.org/browse/MDL-36789?focusedCommentId=237301&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-237301 After the cache is corrupted try viewing the course page Make sure the debugging message is displayed and course can be viewed. Test 2. Performance and potential regressions (on 2.4 and 2.5) Enrol student in multiple courses Add in your config.php define('MAX_MODINFO_CACHE_SIZE', 10); - set the number less than number of courses where student is enrolled View each course page and /my/ page, make sure everything is displayed correctly. Compare the performance numbers of course view page and /my/ page with and without patch.
    • Workaround:
      Hide
      1. Open course edit page /course/edit.php?id=XXX
      2. Save without making any changes

      This action calls rebuilding of course cache.

      Show
      Open course edit page /course/edit.php?id=XXX Save without making any changes This action calls rebuilding of course cache.
    • Affected Branches:
      MOODLE_22_STABLE, MOODLE_24_STABLE
    • Fixed Branches:
      MOODLE_24_STABLE, MOODLE_25_STABLE
    • Pull 2.4 Branch:
      wip-MDL-36789-m24
    • Pull 2.5 Branch:
      wip-MDL-36789-m25
    • Pull Master Branch:
      wip-MDL-36789-master
    • Rank:
      46301
    • Sprint:
      BACKEND Sprint 3

      Description

      Moodle returns an error when viewing a given course at the URI: /course/view.php .
      E.g. https://moodledebug.hi.is/course/view.php?id=929

      The method to reproduce this error is only one step:
      1: https://example.moodle.site/course/view.php?id=929

      Moodle reports this error:
      Can not find data record in database table course_modules.

      Please bear in mind that that the very nature of this error means that it is not repeatable, yet it is consistent, in the sense that anybody could repeat the error and the fix if they had the database with the corrupt data.

      Web server logs show this, (when debugging is on):
      Default exception handler:

      Can not find data record in database table course_modules.
       Debug: 
      SELECT id,course FROM {course_modules} 
      WHERE id = ?\n[array (\n  0 => '20624',\n)]\n* 
      line 1272 of /lib/dml/moodle_database.php: dml_missing_record_exception thrown\n* 
      line 1249 of /lib/dml/moodle_database.php: call to moodle_database->get_record_select()\n* 
      line 6526 of /lib/accesslib.php: call to moodle_database->get_record()\n* 
      line 1019 of /lib/modinfolib.php: call to context_module::instance()\n* 
      line 1006 of /lib/modinfolib.php: call to cm_info->is_user_access_restricted_by_group()\n* 
      line 980 of /lib/modinfolib.php: call to cm_info->update_user_visible()\n* 
      line 272 of /lib/modinfolib.php: call to cm_info->obtain_dynamic_data()\n* 
      line 1123 of /lib/modinfolib.php: call to course_modinfo->__construct()\n* 
      line 1630 of /lib/navigationlib.php: call to get_fast_modinfo()\n* 
      line 1692 of /lib/navigationlib.php: call to global_navigation->generate_sections_and_activities()\n* 
      line 47 of /course/format/weeks/lib.php: call to global_navigation->load_generic_course_sections()\n* 
      line 1605 of /lib/navigationlib.php: call to callback_weeks_load_content()\n* 
      line 1155 of /lib/navigationlib.php: call to global_navigation->load_course_sections()\n* 
      line 2679 of /lib/navigationlib.php: call to global_navigation->initialise()\n* 
      line 659 of /lib/pagelib.php: call to navbar->has_items()\n* 
      line 4 of /theme/lbhi/layout/general.php: call to moodle_page->has_navbar()\n* 
      line 685 of /lib/outputrenderers.php: call to include()\n* 
      line 637 of /lib/outputrenderers.php: call to core_renderer->render_page_layout()\n* 
      line 217 of /course/view.php: call to core_renderer->header()\n
      

      The explanation for this error might be that Moodle builds up a cache in the moodle db, most likely for speeding up the database search. The cache is in the course table and the column is modinfo. Try to have a look at the content in this column. Here is one example:

       select modinfo from mdl_course where id = 923; 

      And it goes like this:

      course.modinfo
       a:23:{i:19867;O:8:"stdClass":10:{s:2:"id";s:4:"2800";s:2:"cm";s:5:"19867";s:3:"mod";s:5:"forum";s:7:"section";s:1:"0";s:9:"sectionid";s:4:"8
      945";s:6:"module";s:1:"7";s:5:"added";s:10:"1345668034";s:7:"visible";s:1:"1";s:10:"visibleold";s:1:"1";s:4:"name";s:20:"Tilkynningar kennara";}i:20923;O:8:"stdClass":12:s:2:"id";s:4:"7591";s:2:"cm";s:5:"20923";s:3:"mod";s:8:"resource";s:
      

      and it goes on and on like this for pages. :-/

      Now if this cache is somehow corrupted - and we have numerous proofs that it does, yet we do not know how it happens or what causes it - the corruption does not corrupt the database itself, but rather the function of the Moodle system.
      This cache can easily be reset for diagnostics.
      In the database, modinfo can be reset without any loss of data. Here is what we do:

       update mdl_course set modinfo='' where id = 929;

      In this example the course number is 929, the same number as in the url when viewing the course. Now if I am impatient, I tend to just reset the cache for all courses in the database like this:

       update mdl_course set modinfo='' where id > 0 and id < 2000;

      Doing this does defeat the purpose of the database cache in modinfo, thus slowing the system down. But for debugging problems, and to recover courses that seem to get lost, this procedure sometimes makes the sysadmin job a little bit easier.
      Good luck.

      1. MDL-36789-modinfo-broken.txt
        91 kB
        Anthony Borrow
      2. MDL-36789-modinfo-rebuild.txt
        90 kB
        Anthony Borrow

        Issue Links

          Activity

          Hide
          Michael de Raadt added a comment -

          Hi, Anna.

          Is 929 an id of an actual course in your Moodle instance?

          The problem may be the result of a corruption in your database. If we cannot reproduce the problem, there is not much we can do to fix it. If you can point to a cause for this problem, that would help.

          Could you please add details about your environment.

          Show
          Michael de Raadt added a comment - Hi, Anna. Is 929 an id of an actual course in your Moodle instance? The problem may be the result of a corruption in your database. If we cannot reproduce the problem, there is not much we can do to fix it. If you can point to a cause for this problem, that would help. Could you please add details about your environment.
          Hide
          Michael de Raadt added a comment -

          I don't think this is a security issue, so I've removed that.

          I'll leave the rest of this for Sam to triage.

          Show
          Michael de Raadt added a comment - I don't think this is a security issue, so I've removed that. I'll leave the rest of this for Sam to triage.
          Hide
          Dan Marsden added a comment -

          Sam - there's a more complete trace in MDL-36690 of an example when modinfo becomes corrupt - I wonder if there's a way we could check to see if modinfo unserializes and if not throw an error earlier/rebuild it somehow? - problem with MDL-36690 is that if you rebuild it will still fail

          Show
          Dan Marsden added a comment - Sam - there's a more complete trace in MDL-36690 of an example when modinfo becomes corrupt - I wonder if there's a way we could check to see if modinfo unserializes and if not throw an error earlier/rebuild it somehow? - problem with MDL-36690 is that if you rebuild it will still fail
          Hide
          Anna Jonna Ármannsdóttir added a comment - - edited

          Hi Michael and thanks for the answer and the question. As a matter of fact, the number 929 is an id of an actual course. We have been looking at this problem for a year but have not been able to pinpoint exactly what causes it. There was in fact a corruption in our database, and I think we managed to solve that thanks to a homebrewn script we developed.
          You see, if the modinfo column is cleared, it is rebuilt through the serialize function and that is done correctly. If this would be caused by a corruption in the database, then the serialization would fill modinfo with wrong information and the course would be fraught with problems. This is not the case!
          On the contrary, when modinfo column is cleared, everything works fine.

          If you like, this script can be sent to you.
          Details about the moodle production environment can also be sent. Maybe you have concise instructions about what you want exactly.
          The setup is two Apache webservers with a Pirahna load balancer/failover front.
          Red Hat Enterprise Linux Server release 6.3 (Santiago)
          The webservers and the loadbalancer is running in virtual machines in a
          KVM environment.

          The database is Postgresql version 8.4.12 .
          Red Hat Enterprise Linux Server release 6.3 (Santiago)
          All databases are backed up, at least once a day, and every time we have had trouble,
          I have copied the database onto my machine and I dont think I have erazed the copies.
          Thus I have a number of examples of databases that have a corrupt modinfo that could be
          rectified just by nulling it. Would you be interested in analyzing those databases?

          The Moodle files are on a file server running NFS and mounted onto the webservers,
          also Red Hat Enterprise Linux Server release 6.3 (Santiago)
          The database machine and the file server are both real physical machines
          (recent IBM blades).

          This was the terse version of the moodle production environment.
          I will be happy to provide some more, and please tell me where I can upload the script I mentioned.
          Best regards, Anna.

          Show
          Anna Jonna Ármannsdóttir added a comment - - edited Hi Michael and thanks for the answer and the question. As a matter of fact, the number 929 is an id of an actual course. We have been looking at this problem for a year but have not been able to pinpoint exactly what causes it. There was in fact a corruption in our database, and I think we managed to solve that thanks to a homebrewn script we developed. You see, if the modinfo column is cleared, it is rebuilt through the serialize function and that is done correctly. If this would be caused by a corruption in the database, then the serialization would fill modinfo with wrong information and the course would be fraught with problems. This is not the case! On the contrary, when modinfo column is cleared, everything works fine. If you like, this script can be sent to you. Details about the moodle production environment can also be sent. Maybe you have concise instructions about what you want exactly. The setup is two Apache webservers with a Pirahna load balancer/failover front. Red Hat Enterprise Linux Server release 6.3 (Santiago) The webservers and the loadbalancer is running in virtual machines in a KVM environment. The database is Postgresql version 8.4.12 . Red Hat Enterprise Linux Server release 6.3 (Santiago) All databases are backed up, at least once a day, and every time we have had trouble, I have copied the database onto my machine and I dont think I have erazed the copies. Thus I have a number of examples of databases that have a corrupt modinfo that could be rectified just by nulling it. Would you be interested in analyzing those databases? The Moodle files are on a file server running NFS and mounted onto the webservers, also Red Hat Enterprise Linux Server release 6.3 (Santiago) The database machine and the file server are both real physical machines (recent IBM blades). This was the terse version of the moodle production environment. I will be happy to provide some more, and please tell me where I can upload the script I mentioned. Best regards, Anna.
          Hide
          Anthony Borrow added a comment -

          I just received a report from one of my sites that is similar in nature to this report. I also deleted modinfo in mdl_course (and sectioncache too) so that it would be rebuilt and everything was fine. I am going to attach the before and after modinfo (should that be helpful in troubleshooting). In my case it was course_module 7785 that was missing. Since this seems to be a caching issue, it would seem that if the cache is failing that we may want to automatically rebuild it rather than not stopping on an error and causing the user to no longer to be able to see the content of the course. Peace - Anthony

          Show
          Anthony Borrow added a comment - I just received a report from one of my sites that is similar in nature to this report. I also deleted modinfo in mdl_course (and sectioncache too) so that it would be rebuilt and everything was fine. I am going to attach the before and after modinfo (should that be helpful in troubleshooting). In my case it was course_module 7785 that was missing. Since this seems to be a caching issue, it would seem that if the cache is failing that we may want to automatically rebuild it rather than not stopping on an error and causing the user to no longer to be able to see the content of the course. Peace - Anthony
          Hide
          Anthony Borrow added a comment -

          Here are the before and after modinfo fields in case they might be of help in tracking down what may be happening. Peace - Anthony

          Show
          Anthony Borrow added a comment - Here are the before and after modinfo fields in case they might be of help in tracking down what may be happening. Peace - Anthony
          Hide
          Anthony Borrow added a comment -

          In terms of a solution, at the very least, I think the site admin should have an option to clear the modinfo (and perhaps the section cache) under course administration. That would provide a way through the web interface to fix the problem without having to directly meddle with the database. Peace - Anthony

          Show
          Anthony Borrow added a comment - In terms of a solution, at the very least, I think the site admin should have an option to clear the modinfo (and perhaps the section cache) under course administration. That would provide a way through the web interface to fix the problem without having to directly meddle with the database. Peace - Anthony
          Hide
          Anthony Borrow added a comment -

          p.s. - Given that this is the only book resource and looking at MDL-36690, it may be a similar issue except with the book module (at least that is where I would start looking).

          Show
          Anthony Borrow added a comment - p.s. - Given that this is the only book resource and looking at MDL-36690 , it may be a similar issue except with the book module (at least that is where I would start looking).
          Hide
          Russell Smith added a comment -

          I do not know if it's related, but I saw this as a failure case for MDL-37939 on our 2.3.4+ instance. I might be linked to a 2.2 module moving bug.

          Show
          Russell Smith added a comment - I do not know if it's related, but I saw this as a failure case for MDL-37939 on our 2.3.4+ instance. I might be linked to a 2.2 module moving bug.
          Hide
          Sam Hemelryk added a comment -

          Hi guys,

          I've up'd the priority on this and set it to the course component.
          It appears what ever is going wrong is happening in the translation of course modinfo and having diff'd Anthony's before and afters it appears for him at least it was a conversion from mod/resource to mod/label.
          Perhaps there is something that we are missing in the upgrade?

          Show
          Sam Hemelryk added a comment - Hi guys, I've up'd the priority on this and set it to the course component. It appears what ever is going wrong is happening in the translation of course modinfo and having diff'd Anthony's before and afters it appears for him at least it was a conversion from mod/resource to mod/label. Perhaps there is something that we are missing in the upgrade?
          Hide
          Darko Grabar added a comment -

          Hi,

          same situation started happening in our main production site after upgrade to 2.4.1+ (from 2.4). This never happened before the upgrade. We are using postgresql 9.2.3.

          Show
          Darko Grabar added a comment - Hi, same situation started happening in our main production site after upgrade to 2.4.1+ (from 2.4). This never happened before the upgrade. We are using postgresql 9.2.3.
          Hide
          Anthony Borrow added a comment -

          For those who need to rebuild the modinfo there is now an admin tool that I have approved in Moodle Plugins - https://moodle.org/plugins/view.php?plugin=tool_rebuildcoursecache

          Show
          Anthony Borrow added a comment - For those who need to rebuild the modinfo there is now an admin tool that I have approved in Moodle Plugins - https://moodle.org/plugins/view.php?plugin=tool_rebuildcoursecache
          Hide
          Ed de Tollenaer added a comment -

          Thank you so much. I solved our problem with your info!

          Show
          Ed de Tollenaer added a comment - Thank you so much. I solved our problem with your info!
          Hide
          Russell Smith added a comment -

          Hi Ed,

          What version of Moodle are you running? Nobody has been able to identify a root cause for this and more supporting information would be great.

          Show
          Russell Smith added a comment - Hi Ed, What version of Moodle are you running? Nobody has been able to identify a root cause for this and more supporting information would be great.
          Hide
          Ed de Tollenaer added a comment -

          Dear Russel,
          We use Moodle 2.4.3+ (Build: 20130322). I don't know what my client did to produce this error, but I will dive into this.

          Regards,
          Ed

          Show
          Ed de Tollenaer added a comment - Dear Russel, We use Moodle 2.4.3+ (Build: 20130322). I don't know what my client did to produce this error, but I will dive into this. Regards, Ed
          Hide
          Dan Poltawski added a comment -

          This issue was assigned to me automatically, however I will not be able to work on this issue in the immediate future. In order to create a truer sense of the state of this issue and to allow other developers to have chance to become involved, I am removing myself as the assignee of this issue.

          For more information, see http://docs.moodle.org/dev/Changes_to_issue_assignment

          Show
          Dan Poltawski added a comment - This issue was assigned to me automatically, however I will not be able to work on this issue in the immediate future. In order to create a truer sense of the state of this issue and to allow other developers to have chance to become involved, I am removing myself as the assignee of this issue. For more information, see http://docs.moodle.org/dev/Changes_to_issue_assignment
          Hide
          Russell Smith added a comment -

          I've been thinking about this for a little while. Is there any possibility this is caused by a race condition between two editors working on the same section and racing each other to various parts of the rebuild_cache code. Something like;

          a. opens transaction
          b. opens transaction
          a. updates module and begins a cache rebuild
          b. updates a different module and begins course cache rebuild
          a. commits with first update.
          b. commits with second update, but missing changes from (a) in the cache.

          We now have a situation where the course cache does not match the course_modules information in the modinfo cache in the database. Under the right update//delete/add scenario we will now possibly attempt to lookup a module that is not there. When that happens you see the 'kaboom' of this issue. And obviously a simple cache rebiuld fixes it as we our of of the race condition territory.

          Thoughts and comments?

          Show
          Russell Smith added a comment - I've been thinking about this for a little while. Is there any possibility this is caused by a race condition between two editors working on the same section and racing each other to various parts of the rebuild_cache code. Something like; a. opens transaction b. opens transaction a. updates module and begins a cache rebuild b. updates a different module and begins course cache rebuild a. commits with first update. b. commits with second update, but missing changes from (a) in the cache. We now have a situation where the course cache does not match the course_modules information in the modinfo cache in the database. Under the right update//delete/add scenario we will now possibly attempt to lookup a module that is not there. When that happens you see the 'kaboom' of this issue. And obviously a simple cache rebiuld fixes it as we our of of the race condition territory. Thoughts and comments?
          Hide
          Paul Vaughan added a comment -

          Just seen this problem on our live Moodle 2.4.3 installation (and the fix worked just fine, thanks). Adjusting the details up top to reflect this.

          Show
          Paul Vaughan added a comment - Just seen this problem on our live Moodle 2.4.3 installation (and the fix worked just fine, thanks). Adjusting the details up top to reflect this.
          Hide
          Tim Fletcher added a comment -

          I think there maybe a race condition, possibly related to some way to the drag and drop AJAX functionality. A Moodle site that I look after has just had this problem on a course that 2 teachers where working on at the same time. I have both Moodle and Apache logs from the incident if they would be helpful to anyone.

          Show
          Tim Fletcher added a comment - I think there maybe a race condition, possibly related to some way to the drag and drop AJAX functionality. A Moodle site that I look after has just had this problem on a course that 2 teachers where working on at the same time. I have both Moodle and Apache logs from the incident if they would be helpful to anyone.
          Hide
          Paul Nicholls added a comment -

          Just encountered a new variation on this - a new Feedback activity was added to a course, but it didn't get added to the modinfo cache. This meant that it showed up in the "upcoming events" block - but nowhere else! Clicking on the link in the "upcoming events" block resulted in a similar error, and rebuilding the course's modinfo cache fixed it.

          This leads me to suspect that it's related to the rebuild step - something causes the rebuild that should happen after a change (module added or deleted) to fail or otherwise be skipped in some circumstances, and Moodle then continues on as if the modinfo cache was fresh.

          Show
          Paul Nicholls added a comment - Just encountered a new variation on this - a new Feedback activity was added to a course, but it didn't get added to the modinfo cache. This meant that it showed up in the "upcoming events" block - but nowhere else! Clicking on the link in the "upcoming events" block resulted in a similar error, and rebuilding the course's modinfo cache fixed it. This leads me to suspect that it's related to the rebuild step - something causes the rebuild that should happen after a change (module added or deleted) to fail or otherwise be skipped in some circumstances, and Moodle then continues on as if the modinfo cache was fresh.
          Hide
          Marina Glancy added a comment -

          Finally I reproduced, with a hack but it potentially can happen on production if DB connection is slow or course is big:

          1. Create a course, enrol student and teacher, make sure AJAX and javascript are enabled.
          2. In browser 1 (teacher): add two activities
          3. In browser 2 (student): open the course
          4. Hack the code and add "sleep(5);" in lib/modinfolib.php inside function rebuild_course_cache() before line $DB->update_record("course", $updateobj);
          5. In browser 1 (teacher): click on eye icon next to one activity to make it invisible
          6. In browser 2 (student): refresh the course page
          7. In browser 1 (teacher): delete another activity
          8. In browser 1 (teacher): Wait until student's page is rendered and refresh the course page - fatal error occurs.
          Show
          Marina Glancy added a comment - Finally I reproduced, with a hack but it potentially can happen on production if DB connection is slow or course is big: Create a course, enrol student and teacher, make sure AJAX and javascript are enabled. In browser 1 (teacher): add two activities In browser 2 (student): open the course Hack the code and add "sleep(5);" in lib/modinfolib.php inside function rebuild_course_cache() before line $DB->update_record("course", $updateobj); In browser 1 (teacher): click on eye icon next to one activity to make it invisible In browser 2 (student): refresh the course page In browser 1 (teacher): delete another activity In browser 1 (teacher): Wait until student's page is rendered and refresh the course page - fatal error occurs.
          Hide
          Marina Glancy added a comment -

          Requesting peer review. The patch does not prevent corrupt modinfo cache - I found it impossible without introducing new DB fields. But there is a relatively cheap way to determine if modinfo contains deleted modules (which leads to fatal error).

          When(if) peer review is successful, I will create branches for 2.4 and master

          Show
          Marina Glancy added a comment - Requesting peer review. The patch does not prevent corrupt modinfo cache - I found it impossible without introducing new DB fields. But there is a relatively cheap way to determine if modinfo contains deleted modules (which leads to fatal error). When(if) peer review is successful, I will create branches for 2.4 and master
          Hide
          Russell Smith added a comment -

          I'm a little confused about the implementation strategy here. You state in the comment above preload_course_contexts($course->id) that course modules are loaded there. If that's the case why do you then exclude the course unless it matches $COURSE. If the modules are already loaded, then excluding them doesn't add any value does it?

          From the research you've done is there a benefit to setting modinfo = null on updates rather than a full rebuild. Future pages would then rebuild the cache. Do that provide any possible protection against this race condition?

          Is it true that this code doesn't protect against uses of the course_overview_page or profile views as $course->id !== $COURSE->id. Is it possible for the corruption to appear on those pages, or will it already be fixed before somebody got to a page like that?

          The following comment could be clarified, however I think the first question needs answering before worrying about the comment clarity.

          // Do not check for not current course (and not frontpage) because contexts cache may be full and presence of modules contexts is not guaranteed.
          
          Show
          Russell Smith added a comment - I'm a little confused about the implementation strategy here. You state in the comment above preload_course_contexts($course->id) that course modules are loaded there. If that's the case why do you then exclude the course unless it matches $COURSE. If the modules are already loaded, then excluding them doesn't add any value does it? From the research you've done is there a benefit to setting modinfo = null on updates rather than a full rebuild. Future pages would then rebuild the cache. Do that provide any possible protection against this race condition? Is it true that this code doesn't protect against uses of the course_overview_page or profile views as $course->id !== $COURSE->id. Is it possible for the corruption to appear on those pages, or will it already be fixed before somebody got to a page like that? The following comment could be clarified, however I think the first question needs answering before worrying about the comment clarity. // Do not check for not current course (and not frontpage) because contexts cache may be full and presence of modules contexts is not guaranteed.
          Hide
          Paul Nicholls added a comment -

          Russell raises an interesting point - if the modinfo hasn't been reset (set to null), other requests shouldn't go rebuilding it unless they also cause it to need rebuilding. Could the user-initiated actions which invalidate the modinfo cache (such as deleting a course module) just rebuild the modinfo cache themselves? It'd slow them down a little, but that performance hit is currently being transferred onto whoever happens to visit the course next, and can probably even be triggered by multiple users at the same time - which is certainly not ideal! It may have been less of an issue before AJAX actions were implemented, as the user would be redirected back to the course immediately after (for instance) deleting a course module.

          There may still be some situations where full rebuilds are not desired, such as iterating over a list of modules (and modifying or deleting them) - but even then, perhaps there's a better approach than just wiping the cache after each one and then waiting for something else to rebuild it.

          Show
          Paul Nicholls added a comment - Russell raises an interesting point - if the modinfo hasn't been reset (set to null), other requests shouldn't go rebuilding it unless they also cause it to need rebuilding. Could the user-initiated actions which invalidate the modinfo cache (such as deleting a course module) just rebuild the modinfo cache themselves? It'd slow them down a little, but that performance hit is currently being transferred onto whoever happens to visit the course next, and can probably even be triggered by multiple users at the same time - which is certainly not ideal! It may have been less of an issue before AJAX actions were implemented, as the user would be redirected back to the course immediately after (for instance) deleting a course module. There may still be some situations where full rebuilds are not desired, such as iterating over a list of modules (and modifying or deleting them) - but even then, perhaps there's a better approach than just wiping the cache after each one and then waiting for something else to rebuild it.
          Hide
          Marina Glancy added a comment -

          Hi Russell, Paul, thanks for your review. Actually you both answered your questions

          Reset or rebuild
          It would be less probable to have a cache corruption if we ensure that course cache is rebuilt and not only reset after each change. Still possible though in case when two people edit the course simultaneously. But total load for the server will increase significantly: 1. course cache may be reset several times during one operation (and it does not have to be bulk operation, even moving a module involves several steps - removing from section & adding to section, each of them resets the course cache). 2. several AJAX operations like changing visibility or group mode of several modules in the course may be performed one after another without rebuilding the whole cache each time. Don't forget that rebuilding the course cache may take lots of resources and time in the big course. I can assure you that we will have much more complains than thanks if we introduce mandatory rebuilding.

          Why check for current course only
          Russell, this comment that you asked to clarify IS actually the answer to your question why do we only check integrity for current course. If you remember recent issues about modinfo cache size and bulk operations, we had significant performance loss because function course_modinfo::__construct() was called several times for one course. Even though it is fixed in course overview block now, it still can be present somewhere else. In this case contexts are preloaded only during the first call to preload_course_contexts() (it has it's own static var to ensure it) and even if modules contexts are removed from context cache (which is yet another static var cache and has max size of 2500), the contexts are not preloaded any more. If this happens, my "quick integrity check" will take ages because for each module it will have to query context from database in a separate DB query.

          Also since probability of the course modinfo corruption is not extremely high we don't need to risk this performance loss. Besides course cache can only get corrupted during the course editing, so the person who was editing the course will notice it first and since he/she is already on the course page, this patch will autofix it for him.

          Plans for master
          For 2.6 I plan to add a new field in course table that contains the unique token set when course cache is reset. Cached modinfo must contain the same token, or otherwise it is considered invalid and needs rebuilding. Also this will allow us to have course cache in localcachedir because it can be duplicated on each server. MDL-34397

          Show
          Marina Glancy added a comment - Hi Russell, Paul, thanks for your review. Actually you both answered your questions Reset or rebuild It would be less probable to have a cache corruption if we ensure that course cache is rebuilt and not only reset after each change. Still possible though in case when two people edit the course simultaneously. But total load for the server will increase significantly: 1. course cache may be reset several times during one operation (and it does not have to be bulk operation, even moving a module involves several steps - removing from section & adding to section, each of them resets the course cache). 2. several AJAX operations like changing visibility or group mode of several modules in the course may be performed one after another without rebuilding the whole cache each time. Don't forget that rebuilding the course cache may take lots of resources and time in the big course. I can assure you that we will have much more complains than thanks if we introduce mandatory rebuilding. Why check for current course only Russell, this comment that you asked to clarify IS actually the answer to your question why do we only check integrity for current course. If you remember recent issues about modinfo cache size and bulk operations, we had significant performance loss because function course_modinfo::__construct() was called several times for one course. Even though it is fixed in course overview block now, it still can be present somewhere else. In this case contexts are preloaded only during the first call to preload_course_contexts() (it has it's own static var to ensure it) and even if modules contexts are removed from context cache (which is yet another static var cache and has max size of 2500), the contexts are not preloaded any more. If this happens, my "quick integrity check" will take ages because for each module it will have to query context from database in a separate DB query. Also since probability of the course modinfo corruption is not extremely high we don't need to risk this performance loss. Besides course cache can only get corrupted during the course editing, so the person who was editing the course will notice it first and since he/she is already on the course page, this patch will autofix it for him. Plans for master For 2.6 I plan to add a new field in course table that contains the unique token set when course cache is reset. Cached modinfo must contain the same token, or otherwise it is considered invalid and needs rebuilding. Also this will allow us to have course cache in localcachedir because it can be duplicated on each server. MDL-34397
          Hide
          Russell Smith added a comment -

          Thanks Marina. I think I actually understand it now I've re-read all the code in light of your above comments.

          [Y] Syntax
          [Y] Whitespace
          [Y] Output
          [-] Language
          [Y] Databases
          [Y] Testing
          [Y] Security
          [N] Documentation
          [N] Git
          [Y] Sanity check

          Documentation: Based on the long explanation, I still think the comments require clarity;

          eg;

          • // This will also preload contexts for all modules inside the course.
          • // Modules are also cached here as long as it's the first time this course has been preloaded.
          • // Do not check for not current course (and not frontpage) because contexts cache may be full and presence of modules contexts is not guaranteed.
          • // Only verify current course (or frontpage) as pages with many courses may not have module contexts cached. Uncached modules will result in a very slow verification.

          Git: The component is missing from the commit message; MDL-36789 course: Quick course cache integrity check for deleted modules.

          Sanity check: Even though this does not solve the entire problem. I agree it's going to be too complex to back patch. Any improvement we can make to allow users to recover without admin intervention is a good step forward.

          So I'm happy with this if the very minor details of comments and commit message can be improved.

          Show
          Russell Smith added a comment - Thanks Marina. I think I actually understand it now I've re-read all the code in light of your above comments. [Y] Syntax [Y] Whitespace [Y] Output [-] Language [Y] Databases [Y] Testing [Y] Security [N] Documentation [N] Git [Y] Sanity check Documentation: Based on the long explanation, I still think the comments require clarity; eg; // This will also preload contexts for all modules inside the course. // Modules are also cached here as long as it's the first time this course has been preloaded. // Do not check for not current course (and not frontpage) because contexts cache may be full and presence of modules contexts is not guaranteed. // Only verify current course (or frontpage) as pages with many courses may not have module contexts cached. Uncached modules will result in a very slow verification. Git: The component is missing from the commit message; MDL-36789 course: Quick course cache integrity check for deleted modules. Sanity check: Even though this does not solve the entire problem. I agree it's going to be too complex to back patch. Any improvement we can make to allow users to recover without admin intervention is a good step forward. So I'm happy with this if the very minor details of comments and commit message can be improved.
          Hide
          Salvador Meneses added a comment -

          This workaround has worked with several clients:
          >>yourmoodlesite/course/edit.php?id=(courseid).
          >>Save changes without making any changes.

          Show
          Salvador Meneses added a comment - This workaround has worked with several clients: >>yourmoodlesite/course/edit.php?id=(courseid). >>Save changes without making any changes.
          Hide
          Marina Glancy added a comment -

          Thanks Russell, the comments are much clearer now.
          Submitting for integration

          Show
          Marina Glancy added a comment - Thanks Russell, the comments are much clearer now. Submitting for integration
          Hide
          Dan Poltawski added a comment -

          The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week.

          TIA and ciao

          Show
          Dan Poltawski added a comment - The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week. TIA and ciao
          Hide
          Dan Poltawski added a comment -

          The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week.

          TIA and ciao

          Show
          Dan Poltawski added a comment - The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week. TIA and ciao
          Hide
          Damyon Wiese added a comment -

          Thanks Marina,

          I also tested that for a healthy site this does not impact performance and it does not result in extra DB queries or other noticeable impact.

          Hopefully this fixes a few weird bugs.

          Integrated to 24, 25 and master.

          Show
          Damyon Wiese added a comment - Thanks Marina, I also tested that for a healthy site this does not impact performance and it does not result in extra DB queries or other noticeable impact. Hopefully this fixes a few weird bugs. Integrated to 24, 25 and master.
          Hide
          Rajesh Taneja added a comment -

          Thanks for fixing this Marina,

          Performance seems to be same as before patch and everything seems to work as expected.

          Passing ...

          Show
          Rajesh Taneja added a comment - Thanks for fixing this Marina, Performance seems to be same as before patch and everything seems to work as expected. Passing ...
          Hide
          Damyon Wiese added a comment -

          Thanks for your efforts. This issue is one of the outstanding issues that passed all our testing and was accepted into Moodle this week.

          Hurray!

          Show
          Damyon Wiese added a comment - Thanks for your efforts. This issue is one of the outstanding issues that passed all our testing and was accepted into Moodle this week. Hurray!

            People

            • Votes:
              9 Vote for this issue
              Watchers:
              19 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Agile