Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-76274

bump_submission_for_stale_conversions adhoc task runs in situations when it shouldn't

    XMLWordPrintable

Details

    • MOODLE_311_STABLE, MOODLE_400_STABLE, MOODLE_401_STABLE
    • MOODLE_400_STABLE, MOODLE_401_STABLE
    • MDL-76274-401
    • MDL-76274-master
    • Hide

      Delete the bump_submission_for_stale_conversions task from the mdl_task_adhoc table

      Show
      Delete the bump_submission_for_stale_conversions task from the mdl_task_adhoc table
    • Hide
      1. Install a clean Moodle 3.10 site
      2. Create a course with an assignment
      3. Enrol a student
      4. As the student submit something for the assignment
      5. Checkout the site to latest master
      6. Edit the file mod/assign/feedback/editpdf/classes/task/bump_submission_for_stale_conversions.php, adding an mtrace after the guard clause:

        if (isset($earliestconversion->min)) {
            mtrace('hello from bump stale conversions task');
            ...
        

      7. Run the upgrade:
        php admin/cli/upgrade.php
      8. Run adhoc tasks:
        php admin/cli/adhoc_task.php -e
      9. Verify that you do not see "hello from bump stale conversions task" anywhere in the output and that the task completes

      Regression testing

      Instructions adapted from MDL-68943

      Without the patch applied

      1. You need to make sure you're running a site WITHOUT the patch applied first. To do that:
        • Create a fresh instance, but don't install yet
        • Revert MDL-68943
          • For master and 4.1:

            git revert 1e98c4ad273

          • For 4.0:

            git revert 32094232c77
            

        • Install the site as normal
      2. Install and enable the dummy fileconverter plugin
        1. git clone git@github.com:catalyst/moodle-fileconverter_dummy.git files/converter/dummy
        2. Upgrade the site
        3. Browse to "Site administration > Plugins > Document converters > Manage document converters" and open the eye on the "Dummy" converter - ensure all other converters are disabled
      3. As Student A browse to the assignment and submit a text file with some small amount of content in it, e.g.,:

        myfileA.txt

        this is Student A's submission
        

      4. Do the same as Student B but with a different text file containing different content, e.g.,:

        myfileB.txt

        this is Student B's submission
        

      5. Run the relevant command to process the adhoc/scheduled task:
        • For master and 4.1:

          php admin/cli/adhoc_task.php -e
          

        • For 4.0:

          php admin/cli/scheduled_task.php --execute=\\assignfeedback_editpdf\\task\\convert_submissions
          

      6. Verify you see some output from the dummy converter showing conversions started for both students:

        Execute adhoc task: assignfeedback_editpdf\task\convert_submission
        Adhoc task id: 3
        Adhoc task custom data: {"submissionid":"2","submissionattempt":"0"}
        ... started 16:59:27. Current memory use 21.6 MB.
        Converting submission for user id 3
        Dummy conversion started.
        Conversion still in progress. Requeueing self to check again.
        ... used 42 dbqueries
        ... used 0.27109098434448 seconds
        Adhoc task complete: assignfeedback_editpdf\task\convert_submission
        Execute adhoc task: assignfeedback_editpdf\task\convert_submission
        Adhoc task id: 4
        Adhoc task custom data: {"submissionid":"3","submissionattempt":"0"}
        ... started 16:59:28. Current memory use 38.8 MB.
        Converting submission for user id 4
        Dummy conversion started.
        Conversion still in progress. Requeueing self to check again.
        ... used 31 dbqueries
        ... used 0.14274597167969 seconds
        Adhoc task complete: assignfeedback_editpdf\task\convert_submission
        

      7. Wait 60 seconds and run the adhoc/scheduled task again
      8. Verify you see some output from the dummy converter showing conversions completed for both students:

        Execute adhoc task: assignfeedback_editpdf\task\convert_submission
        Adhoc task id: 5
        Adhoc task custom data: {"submissionid":"2","submissionattempt":"0"}
        ... started 17:00:44. Current memory use 14.3 MB.
        Converting submission for user id 3
        Dummy conversion completed.
        The document has been successfully converted
        ... used 108 dbqueries
        ... used 1.248309135437 seconds
        Adhoc task complete: assignfeedback_editpdf\task\convert_submission
        Execute adhoc task: assignfeedback_editpdf\task\convert_submission
        Adhoc task id: 6
        Adhoc task custom data: {"submissionid":"3","submissionattempt":"0"}
        ... started 17:00:45. Current memory use 50.3 MB.
        Converting submission for user id 4
        Dummy conversion completed.
        The document has been successfully converted
        ... used 103 dbqueries
        ... used 1.1581900119781 seconds
        Adhoc task complete: assignfeedback_editpdf\task\convert_submission
        

      9. Tail your error logs as the next steps are carried out, e.g.,: tail -f /var/log/apache2/error.log
      10. As admin, navigate to the assignment and view all submissions
      11. Click the "Grade" button next to Student A
      12. Verify that the text shown in the grader interface is the same as the text in the submitted text file
      13. Repeat for Student B
      14. Verify there is nothing in the error logs about document conversions
      15. Edit Student A's text file locally (NB do not change the filename), adding some content; e.g.,

        myfile.txt

        this is Student A's submission
         
        with some extra content
        

      16. As the student, go back to the assignment submission page
      17. Press "Edit submission"
      18. Upload the same text file (that you just modified)
      19. You should be prompted to overwrite or rename the file, press "Overwrite"
      20. Press "Save changes"
      21. Run the scheduled/adhoc task again, no output from the dummy converter should be present (there should be no "Dummy conversion ..." lines)
      22. As admin, grade Student A's assignment, the content should not have changed (will will verify that the upgrade fixes this)

      With the patch applied

      1. Undo the revert:

        git reset --hard HEAD~1
        

      2. Run adhoc tasks:
        1. For master and 4.1:

          php admin/cli/adhoc_task.php -e
          

        2. For 4.0:

          php admin/cli/scheduled_task.php --execute=\\assignfeedback_editpdf\\task\\convert_submissions

      3. Verify you see some output indicating that one (and only one) submission has had its timemodified updated:

        Execute adhoc task: assignfeedback_editpdf\task\bump_submission_for_stale_conversions
        Adhoc task id: 6
        Adhoc task custom data:
        ... started 11:25:14. Current memory use 40.1 MB.
        Set submission 2 timemodified to 1661484205
        ... used 5 dbqueries
        ... used 0.037574052810669 seconds
        Adhoc task complete: assignfeedback_editpdf\task\bump_submission_for_stale_conversions
        

      4. Tail your error logs  (E.g. tail -f /var/log/apache2/error.log)
      5. As admin, grade Student A's submission
      6. Verify you see some output from the dummy converter in the error logs:

        [Tue Aug 23 17:11:04.326063 2022] [php7:notice] [pid 1076237] [client 127.0.0.1:45092] Dummy conversion started., referer: [http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=3]
        [Tue Aug 23 17:11:05.503660 2022] [php7:notice] [pid 1076237] [client 127.0.0.1:45092] Dummy conversion completed., referer: [http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=3]
        

      7. Verify the content has updated in the assignment grader and now matches the text file
      8. As admin, grade Student B's assignment
      9. Verify no output from the dummy converter is present in the error logs
      10. As Student C browse to the assignment and submit a text file with some small amount of content in it, e.g.,:

        myfileC.txt

        this is Student C's submission
        

      11. Run the relevant command to process the adhoc/scheduled task:
        • For master and 4.1:

          php admin/cli/adhoc_task.php -e
          

        • For 4.0:

          php admin/cli/scheduled_task.php --execute=\\assignfeedback_editpdf\\task\\convert_submissions
          

      12. Verify there is output from the dummy converter about Student C's submission (and no output about any other submissions):

        Adhoc task custom data: {"submissionid":"4","submissionattempt":"0"}
        ... started 17:19:01. Current memory use 21.5 MB.
        Converting submission for user id 5
        Dummy conversion started.
        Conversion still in progress. Requeueing self to check again.
        ... used 35 dbqueries
        ... used 0.218022108078 seconds
        Adhoc task complete: assignfeedback_editpdf\task\convert_submission
        

      13. Wait 60 seconds and run it again to complete the conversion
      14. As admin, grade Student C's submission
      15. Verify that the text shown in the grader interface is the same as the text in the submitted text file
      16. Edit Student C's text file locally (NB do not change the filename), adding some content; e.g.,

        myfileC.txt

        this is Student C's submission
         
        with some extra content
        

      17. As Student C, go back to the assignment submission page
      18. Press "Edit submission"
      19. Upload the same text file (that you just modified)
      20. You should be prompted to overwrite or rename the file, press "Overwrite"
      21. Press "Save changes"
      22. Tail your error logs  (E.g. tail -f /var/log/apache2/error.log)
      23. As admin, grade Student C's submission
      24. Verify you see some output from the dummy converter in the error logs:

        [Tue Aug 23 17:22:33.474669 2022] [php7:notice] [pid 2575696] [client 127.0.0.1:45202] Dummy conversion started., referer: http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=5
        [Tue Aug 23 17:22:34.640888 2022] [php7:notice] [pid 2717556] [client 127.0.0.1:45208] Dummy conversion completed., referer: http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=5
        

      25. Verify the content has updated in the assignment grader and now matches the text file
      Show
      Install a clean Moodle 3.10 site Create a course with an assignment Enrol a student As the student submit something for the assignment Checkout the site to latest master Edit the file mod/assign/feedback/editpdf/classes/task/bump_submission_for_stale_conversions.php , adding an mtrace after the guard clause: if (isset( $earliestconversion ->min)) { mtrace( 'hello from bump stale conversions task' ); ... Run the upgrade: php admin/cli/upgrade.php Run adhoc tasks: php admin/cli/adhoc_task.php -e Verify that you do not see "hello from bump stale conversions task" anywhere in the output and that the task completes Regression testing Instructions adapted from MDL-68943 Without the patch applied You need to make sure you're running a site WITHOUT the patch applied first. To do that: Create a fresh instance, but don't install yet Revert MDL-68943 For master and 4.1 : git revert 1e98c4ad273 For 4.0 : git revert 32094232c77 Install the site as normal Install and enable the dummy fileconverter plugin git clone git@github.com:catalyst/moodle-fileconverter_dummy.git files/converter/dummy Upgrade the site Browse to "Site administration > Plugins > Document converters > Manage document converters" and open the eye on the "Dummy" converter - ensure all other converters are disabled As Student A browse to the assignment and submit a text file with some small amount of content in it, e.g.,: myfileA.txt this is Student A's submission Do the same as Student B but with a different text file containing different content, e.g.,: myfileB.txt this is Student B's submission Run the relevant command to process the adhoc/scheduled task: For master and 4.1 : php admin/cli/adhoc_task.php -e For 4.0 : php admin/cli/scheduled_task.php --execute=\\assignfeedback_editpdf\\task\\convert_submissions Verify you see some output from the dummy converter showing conversions started for both students: Execute adhoc task: assignfeedback_editpdf\task\convert_submission Adhoc task id: 3 Adhoc task custom data: {"submissionid":"2","submissionattempt":"0"} ... started 16:59:27. Current memory use 21.6 MB. Converting submission for user id 3 Dummy conversion started. Conversion still in progress. Requeueing self to check again. ... used 42 dbqueries ... used 0.27109098434448 seconds Adhoc task complete: assignfeedback_editpdf\task\convert_submission Execute adhoc task: assignfeedback_editpdf\task\convert_submission Adhoc task id: 4 Adhoc task custom data: {"submissionid":"3","submissionattempt":"0"} ... started 16:59:28. Current memory use 38.8 MB. Converting submission for user id 4 Dummy conversion started. Conversion still in progress. Requeueing self to check again. ... used 31 dbqueries ... used 0.14274597167969 seconds Adhoc task complete: assignfeedback_editpdf\task\convert_submission Wait 60 seconds and run the adhoc/scheduled task again Verify you see some output from the dummy converter showing conversions completed for both students: Execute adhoc task: assignfeedback_editpdf\task\convert_submission Adhoc task id: 5 Adhoc task custom data: {"submissionid":"2","submissionattempt":"0"} ... started 17:00:44. Current memory use 14.3 MB. Converting submission for user id 3 Dummy conversion completed. The document has been successfully converted ... used 108 dbqueries ... used 1.248309135437 seconds Adhoc task complete: assignfeedback_editpdf\task\convert_submission Execute adhoc task: assignfeedback_editpdf\task\convert_submission Adhoc task id: 6 Adhoc task custom data: {"submissionid":"3","submissionattempt":"0"} ... started 17:00:45. Current memory use 50.3 MB. Converting submission for user id 4 Dummy conversion completed. The document has been successfully converted ... used 103 dbqueries ... used 1.1581900119781 seconds Adhoc task complete: assignfeedback_editpdf\task\convert_submission Tail your error logs as the next steps are carried out, e.g.,: tail -f /var/log/apache2/error.log As admin, navigate to the assignment and view all submissions Click the "Grade" button next to Student A Verify that the text shown in the grader interface is the same as the text in the submitted text file Repeat for Student B Verify there is nothing in the error logs about document conversions Edit Student A's text file locally ( NB do not change the filename), adding some content; e.g., myfile.txt this is Student A's submission   with some extra content As the student, go back to the assignment submission page Press "Edit submission" Upload the same text file (that you just modified) You should be prompted to overwrite or rename the file, press "Overwrite" Press "Save changes" Run the scheduled/adhoc task again, no output from the dummy converter should be present (there should be no "Dummy conversion ..." lines) As admin, grade Student A's assignment, the content should not have changed (will will verify that the upgrade fixes this) With the patch applied Undo the revert: git reset --hard HEAD~1 Run adhoc tasks: For master and 4.1 : php admin/cli/adhoc_task.php -e For 4.0 : php admin/cli/scheduled_task.php --execute=\\assignfeedback_editpdf\\task\\convert_submissions Verify you see some output indicating that one (and only one) submission has had its timemodified updated: Execute adhoc task: assignfeedback_editpdf\task\bump_submission_for_stale_conversions Adhoc task id: 6 Adhoc task custom data: ... started 11:25:14. Current memory use 40.1 MB. Set submission 2 timemodified to 1661484205 ... used 5 dbqueries ... used 0.037574052810669 seconds Adhoc task complete: assignfeedback_editpdf\task\bump_submission_for_stale_conversions Tail your error logs  (E.g. tail -f /var/log/apache2/error.log ) As admin, grade Student A's submission Verify you see some output from the dummy converter in the error logs: [Tue Aug 23 17:11:04.326063 2022] [php7:notice] [pid 1076237] [client 127.0.0.1:45092] Dummy conversion started., referer: [http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=3] [Tue Aug 23 17:11:05.503660 2022] [php7:notice] [pid 1076237] [client 127.0.0.1:45092] Dummy conversion completed., referer: [http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=3] Verify the content has updated in the assignment grader and now matches the text file As admin, grade Student B's assignment Verify no output from the dummy converter is present in the error logs As Student C browse to the assignment and submit a text file with some small amount of content in it, e.g.,: myfileC.txt this is Student C's submission Run the relevant command to process the adhoc/scheduled task: For master and 4.1 : php admin/cli/adhoc_task.php -e For 4.0 : php admin/cli/scheduled_task.php --execute=\\assignfeedback_editpdf\\task\\convert_submissions Verify there is output from the dummy converter about Student C's submission (and no output about any other submissions): Adhoc task custom data: {"submissionid":"4","submissionattempt":"0"} ... started 17:19:01. Current memory use 21.5 MB. Converting submission for user id 5 Dummy conversion started. Conversion still in progress. Requeueing self to check again. ... used 35 dbqueries ... used 0.218022108078 seconds Adhoc task complete: assignfeedback_editpdf\task\convert_submission Wait 60 seconds and run it again to complete the conversion As admin, grade Student C's submission Verify that the text shown in the grader interface is the same as the text in the submitted text file Edit Student C's text file locally ( NB do not change the filename), adding some content; e.g., myfileC.txt this is Student C's submission   with some extra content As Student C, go back to the assignment submission page Press "Edit submission" Upload the same text file (that you just modified) You should be prompted to overwrite or rename the file, press "Overwrite" Press "Save changes" Tail your error logs  (E.g. tail -f /var/log/apache2/error.log ) As admin, grade Student C's submission Verify you see some output from the dummy converter in the error logs: [Tue Aug 23 17:22:33.474669 2022] [php7:notice] [pid 2575696] [client 127.0.0.1:45202] Dummy conversion started., referer: http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=5 [Tue Aug 23 17:22:34.640888 2022] [php7:notice] [pid 2717556] [client 127.0.0.1:45208] Dummy conversion completed., referer: http://localhost/m/stable_master/mod/assign/view.php?id=2&rownum=0&action=grader&userid=5 Verify the content has updated in the assignment grader and now matches the text file

    Description

      The bump_submission_for_stale_conversions adhoc task contains a guard clause intended to prevent it from running on sites where document conversion has never been enabled. However it always returns true regardless of whether the $earliestconversion query returns a value or not. $earliestconversion is never null, it's always an object with a min member - it is this min member that can be null. So the guard clause should be: if (isset($earliestconversion->min))

      The value of $earliestconversion->min is used in a subsequent query on the files table - and it appears when this value is null, it causes some DBs with a large number of records in the files table to spend an absurd amount of time on this query.

      Original report

      evsoldatkin commented this on MDL-68943, and I decided to post this as an issue, because it seemed like an urgent matter to me:

       Seems that SQL request has no chance to finish for medium DB on SSD with 4 million files. 3 days hanging and restart. f1 with left join on f2 is too big load to execute. I suggest spliting sql in several requests. That resulted into speed increase on the tested DB from never to 2 min 57 sec. Also adding raise_memory_limit(MEMORY_HUGE) can be used for safety, but was not needed for the tested DB.

       

      public function execute() {
          global $DB;
          // Used to only get records after whenever document conversion was enabled for this site.
          $earliestconversion = $DB->get_record_sql("SELECT MIN(timecreated) AS min
                                                       FROM {files}
                                                      WHERE filearea = 'documentconversion'");
          if ($earliestconversion) {
              ['sql' => $extensionsql, 'params' => $extensionparams] = array_reduce(
                  ['doc', 'docx', 'rtf', 'xls', 'xlsx', 'ppt', 'pptx', 'html', 'odt', 'ods', 'png', 'jpg', 'txt', 'gif'],
                  function(array $c, string $ext) use ($DB): array {
                      return [
                          'sql' => $c['sql'] . ($c['sql'] ? ' OR ' : '') . $DB->sql_like('f1.filename', ':' . $ext),
                          'params' => $c['params'] + [$ext => '%.' . $ext]
                      ];
                  },
                  ['sql' => '', 'params' => []]
              );
              // A converted file has its filename set to the contenthash of the file it converted.
              // Find all files in the relevant file areas for which there is no corresponding
              // file with the contenthash as the file name.
              //
              // Also check if the file has a greater modified time than the submission, if it does
              // that means it is both stale (as per the above) and will never be reconverted.
              $sql = "SELECT * FROM mdl_files f1 WHERE f1.filearea = 'submission_files' AND ($extensionsql)";
              $files1 = $DB->get_records_sql($sql, $extensionparams);
              $files2 = $DB->get_records('files', ['component' => 'core', 'filearea' => 'documentconversion']);
              $files2 = array_map(
                  function($file2) {
                      return $file2->filename;    
                  },
                  $files2
              );
              $files = [];
              foreach ($files1 as $file1) {
                  if (!in_array($file1->contenthash, $files2)) {
                      $files[] = $file1;
                  }
              }
              $sql = "SELECT f3.id, f3.timemodified as fmodified, asu.id as submissionid
                  FROM {assign_submission} asu
                  JOIN {assign_grades} asg ON asg.userid = asu.userid AND asg.assignment = asu.assignment
                  JOIN {files} f3 ON f3.itemid = asg.id
                  WHERE asu.id = :itemid
                      AND f3.timecreated >= :earliest
                      AND f3.component = 'assignfeedback_editpdf'
                      AND f3.filearea = 'combined'
                      AND f3.filename = 'combined.pdf'
                      AND f3.timemodified >= asu.timemodified";
              foreach ($files as $file)
              {
                  $submission = $DB->get_record_sql($sql, ['earliest' => $earliestconversion->min, 'itemid' => $file->itemid]);
                  if ($submission)
                  {
                      // Set the submission modified time to one second later than the
                      // converted files modified time, this will cause assign to reconvert
                      // everything and delete the old files when the assignment grader is
                      // viewed. See get_page_images_for_attempt in document_services.php.
                      $newmodified = $submission->fmodified + 1;
                      $record = (object)[
                          'id' => $submission->submissionid,
                          'timemodified' => $newmodified
                      ];
                      mtrace('Set submission ' . $submission->submissionid . ' timemodified to ' . $newmodified);
                      $DB->update_record('assign_submission', $record);
                  }
              }
          }
      }
      

      I don't have a large instance to test this on, but mikhailgolenkov commented that the original step already took 18 minutes for him, so it seems definitely possible that it takes exceedingly long on larger instances.

      Attachments

        1. 310_to_400.png
          310_to_400.png
          118 kB
        2. adhoc_tasks.log
          2 kB
        3. image-2022-12-15-13-23-40-949.png
          image-2022-12-15-13-23-40-949.png
          25 kB
        4. image-2022-12-15-13-27-43-916.png
          image-2022-12-15-13-27-43-916.png
          18 kB
        5. image-2022-12-15-13-55-46-610.png
          image-2022-12-15-13-55-46-610.png
          36 kB
        6. regression_adhoc_tasks.log
          2 kB
        7. upgrade.log
          52 kB

        Issue Links

          Activity

            People

              cameron1729 cameron1729
              justusdieckmann Justus Dieckmann
              Erica Bithell Erica Bithell
              Sara Arjona (@sarjona) Sara Arjona (@sarjona)
              Amaia Anabitarte Amaia Anabitarte
              Votes:
              8 Vote for this issue
              Watchers:
              19 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 6 hours, 31 minutes
                  6h 31m

                  Clockify

                    Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.