  Moodle / MDL-18468

Restore XML parsing: Improve for speed by preprocessing and splitting moodle.xml


    • Type: Improvement
    • Resolution: Fixed
    • Priority: Minor
    • 1.9.5
    • 1.9.4
    • Backup
    • None

      While researching http://docs.moodle.org/en/Development:Backup_2.0_-_Improve_XML_parsing and some important bugs like MDL-15489, I found that one of the things planned for 2.0 is to split the current moodle.xml file into a bunch of smaller files, such as:

      • info
      • course_header
      • course_structure
      • users
      • modules (one file per activity)
      • blocks
      • ...

      This will allow the parser to do its job (reading contents from XML and pushing them to in-memory structures) much faster than the current approach, where the whole moodle.xml file is parsed up to 19 times (19 is the number of "TODO"s, the parts present in a moodle.xml that are parsed separately).
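
      To see why repeated full-file passes are costly, a toy cost model can count the elements a streaming parser visits. This is only a sketch in Python (Moodle's restore code is PHP): the tiny document, its tag names, and the helper names are made up for illustration, and a pass count of 3 stands in for the 19 mentioned above.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature backup with three sections; NOT the real moodle.xml layout.
DOC = (b"<BACKUP>"
       b"<INFO><NAME>demo</NAME></INFO>"
       b"<USERS><USER>a</USER><USER>b</USER></USERS>"
       b"<BLOCKS><BLOCK>x</BLOCK></BLOCKS>"
       b"</BACKUP>")


def elements_visited(xml_bytes):
    """Count the start-element events seen in one full streaming pass."""
    return sum(1 for _ in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)))


def cost_repeated_full_passes(xml_bytes, passes):
    """Cost when every separately parsed part triggers a pass over the whole file."""
    return passes * elements_visited(xml_bytes)


def cost_split_then_parse(xml_bytes):
    """Cost of one splitting pass plus one pass over each resulting small file."""
    split_cost = elements_visited(xml_bytes)
    # For brevity the split itself is done in memory here, not by streaming.
    root = ET.fromstring(xml_bytes)
    parts = [ET.tostring(child) for child in root]
    return split_cost + sum(elements_visited(part) for part in parts)


print(cost_repeated_full_passes(DOC, 3))  # 24
print(cost_split_then_parse(DOC))         # 15
```

      With this toy input, three full passes visit 24 elements, while one split pass plus one pass over each small file visits 15. The gap widens as the number of separately parsed parts grows.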

      While that split is going to be done in the Moodle 2.0 backup (so that restore can handle those smaller files directly), I thought it could also be worthwhile to perform a similar split in Moodle 1.9.

      After some (a lot!) coding and testing, and comparing results, I've ended up with:

      1) One parser (moodle_splitter_parser) able to split the moodle.xml file into 19 smaller files.
      2) A hack to the main restore parser (MoodleParser) so that it uses those split files instead of the original moodle.xml
      3) All of that code placed under one experimental $CFG->experimentalsplitrestore configuration setting
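
      The splitting idea in step 1 can be sketched roughly as follows. This is not the actual moodle_splitter_parser (which lives in Moodle's PHP code), just a minimal Python illustration that assumes every direct child of the root element becomes one output file. The function name, the file naming scheme, and the sample tags are all hypothetical.

```python
import io
import os
import tempfile
import xml.etree.ElementTree as ET


def split_top_level_sections(source, out_dir):
    """Write each direct child of the root element to its own XML file.

    Returns the list of file paths written, in document order.
    """
    written = []
    depth = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first element seen is the document root
            depth += 1
        else:
            depth -= 1
            if depth == 1:  # a direct child of the root has just closed
                elem.tail = None  # keep inter-section whitespace out of the file
                name = f"{len(written):02d}_{elem.tag.lower()}.xml"
                path = os.path.join(out_dir, name)
                ET.ElementTree(elem).write(path, encoding="utf-8",
                                           xml_declaration=True)
                written.append(path)
                root.clear()  # drop the finished section to bound memory use
    return written


# Hypothetical sample input; the real moodle.xml has many more sections.
sample = (b"<MOODLE_BACKUP><INFO><NAME>demo</NAME></INFO>"
          b"<USERS><USER>a</USER></USERS></MOODLE_BACKUP>")
with tempfile.TemporaryDirectory() as out_dir:
    for path in split_top_level_sections(io.BytesIO(sample), out_dir):
        print(os.path.basename(path))
```

      Each later parsing step can then open only the small file it needs instead of scanning the whole backup again.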

      While the results aren't noticeable in small backups, processing one 50MB file initially spent 50 seconds performing the split, but the whole restore then ended 350 seconds quicker, so we saved 300 seconds of non-useful parsing time. Note that results can vary a lot depending on the contents of the backup, but in any case the split time should always be smaller than the saved time.
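
      The break-even condition can be written down as a small check. This sketch reads the 350 seconds as parsing time saved and the 50 seconds as split overhead, which is an interpretation of the numbers above; the function names are made up for illustration.

```python
def net_saving(split_seconds, parse_seconds_saved):
    """Seconds gained overall once the split overhead has been paid."""
    return parse_seconds_saved - split_seconds


def split_pays_off(split_seconds, parse_seconds_saved):
    """The split is worthwhile only if it costs less than the parsing it avoids."""
    return net_saving(split_seconds, parse_seconds_saved) > 0


print(net_saving(50, 350))      # 300
print(split_pays_off(50, 350))  # True
```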

      So, I'm going to commit this both to 19_STABLE and HEAD (for the "legacy" Moodle 1.9 => 2.0 restore). It would be great to have people using and testing it, in order to get some feedback so that the split strategy can be promoted to always enabled. I've restored at least 10 different courses of all sorts of types and sizes and everything is working OK here. Let's see how it evolves.


            stronk7 Eloy Lafuente (stronk7)