MDL-70631: Poor performance of zip_packer::extract_to_pathname()


    • MOODLE_310_STABLE
    • MOODLE_310_STABLE, MOODLE_39_STABLE
    • MDL-70631-master-unzip

      Exploratory testing welcome. Please try with a wide range of various ZIP files such as:

      • Moodle plugins - see MDLSITE-6129 for an example of one that was causing particular troubles
      • Personal data exported from a Moodle site
      • etc.

      You can use the attached unzip.php script to test the functionality and measure how long the extraction takes, e.g.:

      $ time php unzip.php test.zip /tmp
      

       

      Run the previous command both with and without the patch, using several different ZIP files in place of "test.zip". For instance (as mentioned before): Moodle plugins, exported personal data, large H5P files...

      Verify that the extraction time with the patch is shorter than without it. You can compare the results manually or with your favourite diff utility.

      (Optional) Bonus points for running your tests and comparisons with profiling enabled.
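The comparison described above can be scripted. The following is a minimal sketch, assuming a POSIX shell and that unzip.php takes the archive and the target directory as its two arguments, as in the command shown earlier. The function name time_extractions and the EXTRACT_CMD variable are hypothetical helpers introduced for illustration; they are not part of Moodle or of the attached script.

```shell
#!/bin/sh
# Assumption: the extraction command defaults to the attached unzip.php,
# invoked as "php unzip.php ARCHIVE TARGETDIR". Override EXTRACT_CMD if needed.
EXTRACT_CMD=${EXTRACT_CMD:-php unzip.php}

# Extract each given ZIP file into a fresh temporary directory and print
# one line per archive: name, status (ok/failed), elapsed whole seconds.
time_extractions() {
    for zip in "$@"; do
        target=$(mktemp -d) || return 1
        start=$(date +%s)
        # Word splitting of $EXTRACT_CMD is intentional (command plus arguments).
        if $EXTRACT_CMD "$zip" "$target" >/dev/null 2>&1; then
            status=ok
        else
            status=failed
        fi
        end=$(date +%s)
        printf '%s\t%s\t%ss\n' "$zip" "$status" "$((end - start))"
        rm -rf "$target"
    done
}

# Example (file names are placeholders):
#   time_extractions plugin.zip export.zip > without-patch.txt
```

Running the function once on the unpatched code and once on the patched code, each time redirecting the output to a separate file, gives two short reports that can be compared side by side. Timing resolution here is whole seconds, which is likely enough for archives that take long enough to cause timeouts; for small archives the `time` command shown earlier is more precise.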
       


      Extracting a ZIP archive takes extremely long, especially if the archive contains many files.

      This was originally raised as MDLSITE-6114 and MDLSITE-6129 where plugin developers experienced timeouts when they were submitting plugins to the Plugins directory. Moodle did not manage to extract the submitted ZIP and timed out.

      The comments in MDLSITE-6129 have the whole story; an executive summary follows.

      stronk7 correctly identified the bottleneck in the current implementation of zip_packer::extract_to_pathname(): it iterates over all files in the archive, obtains a stream resource for each file, reads from the stream in 256KB blocks, and writes them to the target location. ZipArchive::getStream() takes a significant share of the time in this chain, and if the archive contains many files (e.g. a plugin with a vendor folder), the overall slowdown becomes significant.

      It was suggested to switch to an alternative implementation that makes use of ZipArchive::extractTo(). This was confirmed to improve performance significantly. During development, a PHP bug in the ZipArchive extension was discovered and reported upstream.

      This issue brings a new version of the method which works significantly faster than the previous implementation and includes a workaround for that upstream PHP bug.

        1. image-2021-02-11-10-10-17-282.png (19 kB)
        2. screenshot-1.png (62 kB)
        3. unzip.php (0.8 kB)

            mudrd8mz David Mudrák (@mudrd8mz)
            mudrd8mz David Mudrák (@mudrd8mz)
            Victor Déniz Falcón
            Sara Arjona (@sarjona)
            Janelle Barcega
            Votes: 0
            Watchers: 9


                Estimated: Not Specified
                Remaining: 0m
                Logged: 1d 3h 45m
