Moodle
  1. Moodle
  2. MDL-9290

Restore takes too long and hangs when dealing with large backup files

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5, 1.5.1, 1.5.2, 1.5.3, 1.5.4, 1.6, 1.6.1, 1.6.2, 1.6.3, 1.6.4, 1.6.5, 1.7, 1.7.1, 1.7.2, 1.8
    • Fix Version/s: 1.8.4, 1.9, 2.0
    • Component/s: Backup
    • Labels:
      None
    • Affected Branches:
      MOODLE_15_STABLE, MOODLE_16_STABLE, MOODLE_17_STABLE, MOODLE_18_STABLE
    • Fixed Branches:
      MOODLE_18_STABLE, MOODLE_19_STABLE, MOODLE_20_STABLE
    • Rank:
      29597

      Description

      Restore process takes too long, more than the maximum specified in restore.php, and hangs.

      When the moodle.xml file of the backup is large.

        Issue Links

          Activity

          Hide
          Carlos Romero Mas added a comment -

          We have resolved the problem

          • commenting out the call to backup_flush() of every StartElement... and
          • including the code marked with + in the file restorelib.php at the end of the function restore_read_xml(...)

          + $lasttime = time(); // crmas
          while ($data = fread($fp, 4096) and !$moodle_parser->finished) {
          + if ((time() - $lasttime) > 5)

          { + $lasttime = time(); + backup_flush(1); + }

          xml_parse($xml_parser, $data, feof($fp))
          or die(sprintf("XML error: %s at line %d",
          xml_error_string(xml_get_error_code($xml_parser)),
          xml_get_current_line_number($xml_parser)));
          }
          fclose($fp);

          Show
          Carlos Romero Mas added a comment - We have resolved the problem commenting out the call to backup_flush() of every StartElement... and including the code marked with + in the file restorelib.php at the end of the function restore_read_xml(...) + $lasttime = time(); // crmas while ($data = fread($fp, 4096) and !$moodle_parser->finished) { + if ((time() - $lasttime) > 5) { + $lasttime = time(); + backup_flush(1); + } xml_parse($xml_parser, $data, feof($fp)) or die(sprintf("XML error: %s at line %d", xml_error_string(xml_get_error_code($xml_parser)), xml_get_current_line_number($xml_parser))); } fclose($fp);
          Hide
          Carlos Romero Mas added a comment -

          This modification has achieved a reduction in the execution time from >6000 seconds to 600 secs (moodle.xml was larger than 700.000 lines)

          Show
          Carlos Romero Mas added a comment - This modification has achieved a reduction in the execution time from >6000 seconds to 600 secs (moodle.xml was larger than 700.000 lines)
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Hi Carlos,

          thanks for your patch. I'm taking a look to it right now...stay tuned... ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Hi Carlos, thanks for your patch. I'm taking a look to it right now...stay tuned... ciao
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Adding Penny here to hear her opinion.

          I've found this could be a nice path to follow:

          1) Deprecate the second parameter of backup_flush($n=0,$time=false). I haven't found any usage and we could notice it for now and take it out for 2.0.

          2) I would leave all the current calls to backup_flush() in all the restore_create_xxxx(), restore_user_() because they can need such flush enabled there.

          3) Will delete all the backup_flush() from all the startElementXXX() handers as suggested by Carlos.

          4) Will add them to the main fread as suggested by carlos (controlling the SILENTY flag, of course)

          With this we avoid millions of calls to the backup_flush() function, adding content by time basis). How does this sound for you Penny? And one more question... why the backup_flush() function itself is using another different RESTORE_SILENTLY_NOFLUSH define? I think I asked this before but I don't remember the answer... (blush)

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Adding Penny here to hear her opinion. I've found this could be a nice path to follow: 1) Deprecate the second parameter of backup_flush($n=0,$time=false). I haven't found any usage and we could notice it for now and take it out for 2.0. 2) I would leave all the current calls to backup_flush() in all the restore_create_xxxx(), restore_user_() because they can need such flush enabled there. 3) Will delete all the backup_flush() from all the startElementXXX() handers as suggested by Carlos. 4) Will add them to the main fread as suggested by carlos (controlling the SILENTY flag, of course) With this we avoid millions of calls to the backup_flush() function, adding content by time basis). How does this sound for you Penny? And one more question... why the backup_flush() function itself is using another different RESTORE_SILENTLY_NOFLUSH define? I think I asked this before but I don't remember the answer... (blush) Ciao
          Hide
          Penny Leach added a comment -

          Sounds good to me!

          The reason for the different constant was... (I was trying to remember this the other day and I don't actually remember, I'm guessing) that there's a difference between printing stuff to STDOUT and flushing the output buffer. The two different cases are:

          wanting to call backup functions without actually having <li>doing blah blah</li> printed to the screen (eg one click backup), and avoiding browser timeouts. iirc the flushing is just to prevent browser timeouts?

          There are three different times backup happens:

          • interactively (want to flush buffer AND print the <li> stuff) still in browser
          • non interactively in browser (want to flush buffer but not print <li>)
          • non interactively in CLI (cron, scripts), don't want flush buffer or <li>s
          Show
          Penny Leach added a comment - Sounds good to me! The reason for the different constant was... (I was trying to remember this the other day and I don't actually remember, I'm guessing) that there's a difference between printing stuff to STDOUT and flushing the output buffer. The two different cases are: wanting to call backup functions without actually having <li>doing blah blah</li> printed to the screen (eg one click backup), and avoiding browser timeouts. iirc the flushing is just to prevent browser timeouts? There are three different times backup happens: interactively (want to flush buffer AND print the <li> stuff) still in browser non interactively in browser (want to flush buffer but not print <li>) non interactively in CLI (cron, scripts), don't want flush buffer or <li>s
          Hide
          Eloy Lafuente (stronk7) added a comment -

          I've reassigned it to Yu to see if it's safe finally. Sorry for the delay, 100% my fault (just re-discovered this some minutes ago).

          Show
          Eloy Lafuente (stronk7) added a comment - I've reassigned it to Yu to see if it's safe finally. Sorry for the delay, 100% my fault (just re-discovered this some minutes ago).
          Hide
          Yu Zhang added a comment -

          Hi, I have tested this using the feature's demo backup in MDL-9602.

          Before patch time taken to execute restore is 11 seconds
          After patch time taken to execute restore is 7 seconds

          I am not seeing a significant time improvement probably because this course isn't really that big. (BIG backup would help!)

          Restored course seems normal after patch.

          I don't mind putting this into 1.8, 1.9 and HEAD, waiting for Eloy's go ahead =)

          Cheers,

          Yu

          Show
          Yu Zhang added a comment - Hi, I have tested this using the feature's demo backup in MDL-9602 . Before patch time taken to execute restore is 11 seconds After patch time taken to execute restore is 7 seconds I am not seeing a significant time improvement probably because this course isn't really that big. (BIG backup would help!) Restored course seems normal after patch. I don't mind putting this into 1.8, 1.9 and HEAD, waiting for Eloy's go ahead =) Cheers, Yu
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Well done Yu! B-)

          Can people, please, test BIG restores under 1.8, 1.9 and 2.0 to see if they experiment nice speed-ups?

          TIA and ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Well done Yu! B-) Can people, please, test BIG restores under 1.8, 1.9 and 2.0 to see if they experiment nice speed-ups? TIA and ciao
          Hide
          Dan Poltawski added a comment -

          Tried to make an artificially big xml file, although wasn't that big, but there are definite improvement with the patch:

          before:
          26.933392 secs
          after:
          22.739394 secs

          Show
          Dan Poltawski added a comment - Tried to make an artificially big xml file, although wasn't that big, but there are definite improvement with the patch: before: 26.933392 secs after: 22.739394 secs
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Closing as fixed. Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Closing as fixed. Ciao
          Hide
          Pedro Liska added a comment -

          I ran into a problem around the same lines of code related to this issue: Restore failing because xml_parse() crashes.

          I don't think this issue is the cause of my problem but I was wondering if any of you guys could take a look at it:

          http://tracker.moodle.org/browse/MDL-18825

          I'm not seasoned enough in PHP to be able to figure this one out. Any help would be greatly appreciated. Thank you!

          Show
          Pedro Liska added a comment - I ran into a problem around the same lines of code related to this issue: Restore failing because xml_parse() crashes. I don't think this issue is the cause of my problem but I was wondering if any of you guys could take a look at it: http://tracker.moodle.org/browse/MDL-18825 I'm not seasoned enough in PHP to be able to figure this one out. Any help would be greatly appreciated. Thank you!
          Hide
          Nicolas Connault added a comment -

          This was fixed a long time ago.

          Show
          Nicolas Connault added a comment - This was fixed a long time ago.

            People

            • Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: