Moodle / MDL-51707

Long running tasks can run twice

    Details

    • Database:
      MySQL
    • Testing Instructions:
      Temporarily modify two scheduled tasks so they take longer than usual; adding sleep(120) to each works well.

      \mod_forum\task\cron_task and \core\task\blog_cron_task are good candidates because they run every time.

      Run cron from the CLI. It should pause when it reaches the first of those two tasks.

      While it is paused, disable the second task via the admin interface.

      When the cron CLI finishes the first task, it should skip the second because it is now disabled.

    • Affected Branches:
      MOODLE_28_STABLE, MOODLE_29_STABLE
    • Fixed Branches:
      MOODLE_28_STABLE, MOODLE_29_STABLE
    • Pull Master Branch:

      Description

      There is a race condition in the cron system that can cause long-running tasks to run twice in a period when they should run only once. (Here, "long-running" just means a task that is still running when the next cron run starts, not necessarily one that runs for a long time.)

      The problem is in \core\task\manager::get_next_scheduled_task(), which queries the task_scheduled table for tasks that are due, then iterates over them to find the first one it can get a lock on (i.e. the first one that isn't already running). If the long-running task finishes and releases its lock between the database query and the lock request, it is returned and run again, even though it has just finished and isn't due to run again for a while.
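      The interleaving described above can be modelled with a small, deterministic Python sketch. This is not Moodle code: `FakeStore`, `Task`, `get_next_task` and the `after_query` hook are hypothetical names invented for illustration, and the "re-check after locking" variant is only one plausible mitigation (it matches the behaviour the testing instructions expect), not necessarily what the actual patch does.

```python
from dataclasses import dataclass, replace


@dataclass
class Task:
    name: str
    next_run: int           # time at which the task is next due
    disabled: bool = False


class FakeStore:
    """Hypothetical stand-in for the task_scheduled table plus a lock factory."""

    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}
        self.locked = set()

    def due_tasks(self, now):
        # Return copies, like rows from a query: later updates are not visible in them.
        return [replace(t) for t in self.tasks.values()
                if t.next_run <= now and not t.disabled]

    def try_lock(self, name):
        if name in self.locked:
            return False
        self.locked.add(name)
        return True

    def release(self, name):
        self.locked.discard(name)

    def finish(self, name, now, interval):
        # Another cron process completes the task: reschedule it, then unlock.
        self.tasks[name].next_run = now + interval
        self.release(name)


def get_next_task(store, now, recheck, after_query=lambda: None):
    candidates = store.due_tasks(now)
    after_query()  # the window between the query and the lock requests
    for row in candidates:
        if not store.try_lock(row.name):
            continue  # still running elsewhere
        if recheck:
            # Assumed mitigation: re-read the record once the lock is held.
            fresh = store.tasks[row.name]
            if fresh.disabled or fresh.next_run > now:
                store.release(row.name)
                continue
        return row.name
    return None


def scenario(recheck):
    store = FakeStore([Task("long_task", next_run=0)])
    store.try_lock("long_task")  # process A is still running the task
    # Process B queries at now=5; A finishes inside the window.
    return get_next_task(
        store, now=5, recheck=recheck,
        after_query=lambda: store.finish("long_task", now=5, interval=60))


buggy = scenario(recheck=False)   # stale row is trusted: task is picked again
fixed = scenario(recheck=True)    # fresh record shows it is no longer due
```

      In this model the stale snapshot is the whole problem: without the re-check the second process hands back `long_task` immediately after it finished, while with the re-check it sees the updated `next_run` and skips it. The same re-check would also skip a task that was disabled while the lock was being waited on.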

      Although this sounds more like a theoretical race condition than one that would manifest in practice, we discovered it because one task has frequently been running twice in the morning (a dozen times in the last month), so it is a real problem. We also continue to see the issue after moving from file locking to database locking (we initially assumed there was a problem with our filesystem setup).

      (Credit where it's due: the bug was identified by my colleague Michael Hughes, not me.)

              People

              • Votes:
                2
                Watchers:
                8

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Fix Release Date:
                  9/Nov/15