Moodle / MDL-42673

Cache issue? Server freezes + Failed to unserialise data from file

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.4.6
    • Fix Version/s: None
    • Component/s: Caching
    • Labels:

      Description

      I've been experiencing random Moodle server crashes and odd error messages while using Moodle as a teacher and admin.

      The server crashes came with the message "Fatal error: $CFG->dataroot is not writable, admin has to fix directory permissions! Exiting." The error messages seen while using Moodle read "... Failed to unserialise data from file. Either failed to read, or failed to write."

      I wrote about all of this on my blog, but it's a long post and I don't know whether I should link to it or paste the whole post here. Please let me know.

      In summary:

      a.- My server crashed 6 times at random (never at peak times), and I had to reboot it manually each time.

      Date of freeze -> Error shown when trying to access the Moodle web server

      Friday 4th October at 00:00h -> Fatal error: $CFG->dataroot is not writable, admin has to fix directory permissions! Exiting.
      Sunday 6th October at 04:00h -> Fatal error: $CFG->dataroot is not writable, admin has to fix directory permissions! Exiting.
      Wednesday 16th October at 04:00h -> Fatal error: $CFG->dataroot is not writable, admin has to fix directory permissions! Exiting.
      Saturday 26th October at 19:00h -> Fatal error: $CFG->dataroot is not writable, admin has to fix directory permissions! Exiting.
      Monday 29th October at 00:00h -> Fatal error: $CFG->dataroot is not writable, admin has to fix directory permissions! Exiting.

      As soon as the server rebooted, it started to rebuild the RAID 1 array and everything worked fine again... until the next incident.
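      The "not writable" message can come from plain permission problems or from a filesystem that has become read-only. Which of the two applied here is not established by this report. A minimal sketch of a probe that tells "can actually create a file" apart from "cannot", assuming Python is available on the server; the function name is hypothetical and this is not how Moodle performs its own check:

```python
import tempfile


def is_writable(directory):
    """Return True if a file can actually be created in `directory`."""
    try:
        # Creating (and automatically deleting) a real file catches a
        # read-only filesystem as well as ordinary permission problems.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers missing directory, permission denied, read-only filesystem.
        return False
```

      Calling is_writable() on the dataroot path as the web server user, at the moment of an incident, would show whether the directory really refuses writes.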

      b.- We also had incidents, reported by teachers and by me (as admin), where the error message was "Getting Error:Coding error detected, it must be fixed by a programmer: Failed to unserialise data from file. Either failed to read, or failed to write".

      Initially I managed to sort them out from Settings > Site administration > Development > Purge all caches, but the last such incident could not be sorted out the same way because the same error affected 'Purge all caches' too. I then tried from the console (CLI), but that failed as well. See the debugging message below:

      Error code: codingerror

      • line 468 of /cache/stores/file/lib.php: coding_exception thrown
      • line 371 of /cache/stores/file/lib.php: call to cachestore_file->prep_data_after_read()
      • line 295 of /cache/classes/loaders.php: call to cachestore_file->get()
      • line 1358 of /cache/classes/loaders.php: call to cache->get()
      • line 522 of /lib/dml/mysqli_native_moodle_database.php: call to cache_application->get()
      • line 1269 of /lib/dml/mysqli_native_moodle_database.php: call to mysqli_native_moodle_database->get_columns()
      • line 1565 of /lib/dml/moodle_database.php: call to mysqli_native_moodle_database->set_field_select()
      • line 1302 of /lib/moodlelib.php: call to moodle_database->set_field()
      • line 1205 of /lib/outputrequirementslib.php: call to set_config()
      • line 1503 of /lib/moodlelib.php: call to js_reset_all_caches()
      • line 51 of /admin/cli/purge_caches.php: call to purge_all_caches()
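      The trace shows the file cache store throwing while reading an entry back from disk. A minimal sketch for listing candidate damaged entries, assuming (unconfirmed by this report) that the damage shows up as empty or unreadable files; the function name and the directory passed in are hypothetical, and the sketch does not validate the serialised content itself:

```python
import os


def find_suspect_files(cache_dir):
    """Yield (path, reason) for files that are empty or cannot be read."""
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getsize(path) == 0:
                    yield path, "empty file"
                    continue
                # Reading the whole file surfaces I/O errors from the disk.
                with open(path, "rb") as handle:
                    handle.read()
            except OSError as exc:
                yield path, "unreadable: %s" % exc
```

      Running list(find_suspect_files(...)) over the file store directory inside the dataroot would give a list of entries to inspect or delete by hand when the purge itself fails.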

      Finally I went to Administration ► Plugins ► Caching ► Configuration. There I found the 'Configured store instances' section and its three links for purging content directly. Purging the cache stores solved the problem, at least so far. There have been no more server freezes either.

      Further investigation of the server logs at crash time (actually the last entries before each crash) showed nothing in particular in syslog, messages, or the apache2 logs. But in kern.log I noticed this:

      Oct 29 00:00:58 server kernel: [52934.011290] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 frozen
      Oct 29 00:00:58 server kernel: [52934.011297] ata1.01: failed command: FLUSH CACHE EXT
      Oct 29 00:00:58 server kernel: [52934.011304] ata1.01: cmd ea/00:00:00:00:00/00:00:00:00:00/b0 tag 0

      Cacti charts (Apache, MySQL, disk I/O, memory, ...), APC behaviour and smartmontools tests on our RAID 1 disks gave no evidence of failure.

      At this point I think all of this was caused by 'something' related to the Moodle cache, but I don't yet know how or when. I thought it could be valuable to share it here.

      I was really afraid of losing data, because every time I had to reboot the server manually it started to rebuild the RAID 1 array --> Panic!

      Thanks for reading and for working on this amazing software.

      Viva Moodle!

      Toni Soto

      • Votes: 1
      • Watchers: 5