MDL-57202

Large docs/PDFs cripple grading interface

    Details

    • Affected Branches:
      MOODLE_31_STABLE

      Description

      If a PDF has a very large number of pages (thousands), the conversion interface becomes crippled, and the user is effectively unable to grade.

      This was a less common problem before unoconv was introduced. Now, if a student uploads an Excel spreadsheet (or some other file) with a large number of rows, unoconv converts it to PDF, and Moodle then converts that PDF into page images for the PDF markup.

      I'm attaching some example PDFs. In my case, if I add all 3 files, the request times out while trying to load them.

      A few problems we have observed:

      1. The fragment/widget doesn't request an increased time limit, so if the setup takes longer than 30s (this includes the unoconv conversion, combining the files into a single PDF, and determining the page count), a timeout occurs. This timeout is silent to the user: the grading interface just shows spinning loading icons, and the user's session stays held. See document_services::page_number_for_attempt().
      2. The same applies to document_services::generate_page_images_for_attempt(): while converting the combined PDF to images, it doesn't request more time (this timeout is less likely, because exec() doesn't count against the time limit).
      3. The file conversion (unoconv), generating the combined PDF, and generating the images all hold the user's session open. The last one is particularly problematic: the interface has loaded and shows a progress bar, but autosave throws errors, and the user can't move to another user or save the grade/comment, all because the session is still held. IMO editpdf/ajax.php -> loadallimages could/should release the user's session.
      4. editpdf/ajax.php -> loadallimages returns no data until it is complete, and a separate ajax poll updates the on-screen progress bar. This will run into the load balancer timeout ($CFG->maxtimelimit). A better approach would be to stream the progress back using the progressbar class interface. That would reduce the number of connections and satisfy the requirement that data keeps flowing through the connection to hold it open.
      5. If the conversion to page images is interrupted (timeout, user navigates away, etc.), then on returning to grade that submission, rather than resuming the conversion, the interface displays only the pages that were finished and acts as if there are no more (opened MDL-57200 for this).
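
The resume behaviour asked for in item 5 can be sketched as follows. This is an illustrative Python sketch, not Moodle code: `missing_pages`, `resume_conversion`, and the `render_page` callback are hypothetical names. It assumes the total page count and the set of already-generated page numbers are both known.

```python
from typing import Callable, Iterable, List


def missing_pages(total_pages: int, existing: Iterable[int]) -> List[int]:
    """Return the zero-based page numbers that still need an image."""
    done = set(existing)
    return [page for page in range(total_pages) if page not in done]


def resume_conversion(total_pages: int, existing: Iterable[int],
                      render_page: Callable[[int], None]) -> List[int]:
    """Render only the pages an interrupted run left unfinished."""
    todo = missing_pages(total_pages, existing)
    for page in todo:
        # render_page is a stand-in for whatever produces one page image.
        render_page(page)
    return todo
```

The idea is that the stored page count, not the number of images found on disk, decides whether the conversion is complete.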

      It also dawned on me that it seems silly that we are generating and loading all the page PNGs up front, rather than doing it lazily (opened MDL-57201). Changing that would potentially fix problems 2, 3, and 4 above.
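
To make the lazy approach concrete, here is a minimal Python sketch of render-on-demand with a cache. It is not how editpdf works today and not a proposed patch: the `LazyPageImages` class and its `render_page` callback are hypothetical, and a real implementation would store the images in Moodle's file storage rather than in memory.

```python
from typing import Callable, Dict


class LazyPageImages:
    """Render page images on first request and cache them afterwards."""

    def __init__(self, page_count: int,
                 render_page: Callable[[int], bytes]) -> None:
        self.page_count = page_count
        self._render_page = render_page  # hypothetical single-page renderer
        self._cache: Dict[int, bytes] = {}

    def get(self, page: int) -> bytes:
        """Return the image for one zero-based page, rendering it if needed."""
        if not 0 <= page < self.page_count:
            raise IndexError(f"page {page} out of range")
        if page not in self._cache:
            self._cache[page] = self._render_page(page)
        return self._cache[page]

    def rendered_count(self) -> int:
        """Number of pages rendered so far."""
        return len(self._cache)
```

With this shape, opening a 1000-page submission costs one page render instead of a thousand, and each request is short enough that it does not need to hold the session or stream progress.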

        Attachments

        1. 1000pgs-1.pdf
          737 kB
        2. 1000pgs-2.pdf
          737 kB
        3. 1000pgs-3.pdf
          736 kB

              People

              Assignee:
              Unassigned
              Reporter:
              Eric Merrill (emerrill)
              Component watchers:
              Adrian Greeve, Jake Dallimore, Mathew May, Mihail Geshoski, Peter Dias
              Votes:
              9
              Watchers:
              17
