- Improvement
- Resolution: Fixed
- Minor
- 3.7.4, 3.8, 3.9
- MOODLE_37_STABLE, MOODLE_38_STABLE, MOODLE_39_STABLE
- MOODLE_37_STABLE, MOODLE_38_STABLE
- MDL-67486-swap-cron-lock
This tracker issue has evolved, and the original description was something of a red herring: the symptom it describes is an emergent property of a natural ceiling on how far cron scales.
(old description)
The task manager holds the core_cron lock to guarantee that only a single instance of a particular scheduled task, or of a particular ad hoc task, is allocated to any running cron process. In a very highly scaled environment there might be, e.g., 30 cron processes, each of which must wait for the global core_cron lock. The wait times out after 10 seconds, so there can be a lot of contention for the lock, and a process that hits the 10-second timeout exits, which lowers overall throughput. This can also cause an emergent behavior of cascading exits, leaving fewer running processes than if a lower number had been started in the first place. A simple approach is to increase the timeout from 10 seconds to something larger, which stops processes from shutting down but does not raise the maximum concurrency level.
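The wait-then-exit behavior can be illustrated with a minimal Python sketch. It is not Moodle's lock API: `try_allocate` is a hypothetical helper, a `threading.Lock` stands in for the global core_cron lock, and the timeout is shortened from 10 seconds so the example runs quickly.

```python
import threading

# Stand-in for the single global 'core_cron' lock resource (assumption:
# an in-process lock, not Moodle's real lock factory).
core_cron = threading.Lock()

def try_allocate(timeout_seconds):
    """Return True if the runner obtained the lock, False if it gave up."""
    if not core_cron.acquire(timeout=timeout_seconds):
        # In the scenario described above, the cron process exits here.
        return False
    try:
        # Picking the next task would happen here while the lock is held.
        return True
    finally:
        core_cron.release()

# Simulate contention: another runner already holds the lock.
core_cron.acquire()
blocked = try_allocate(timeout_seconds=0.05)
core_cron.release()

# With the lock free, allocation succeeds.
free = try_allocate(timeout_seconds=0.05)
```

A longer timeout only changes how long `try_allocate` waits before giving up; allocations are still serialized through the one lock.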
Out of 30 processes, a typical balance might be 10 running scheduled tasks and 20 running ad hoc tasks. The task manager only needs to guarantee atomic allocation of scheduled tasks as a group and of ad hoc tasks as a group; the two groups do not need to share a lock. Splitting them would give roughly +200% concurrency for scheduled tasks and roughly +50% for ad hoc tasks in this simple example.
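One way to sanity-check figures like these is a back-of-envelope model. The assumption here, which is not stated in the issue, is that allocations through a lock are shared evenly among every process contending for it, so a group's gain after the split is the ratio of total runners to the group's own runners. `concurrency_gain` is a hypothetical helper.

```python
def concurrency_gain(total_runners, group_runners):
    """Percentage gain for one group once it no longer shares the lock.

    Assumes lock throughput is divided evenly among all contenders.
    """
    return (total_runners / group_runners - 1) * 100

scheduled_gain = round(concurrency_gain(30, 10))  # 10 of 30 runners
adhoc_gain = round(concurrency_gain(30, 20))      # 20 of 30 runners
```

Under this model the 10 scheduled runners gain +200% and the 20 ad hoc runners gain +50%; a different contention model would give different numbers.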
These two resources can be split:
https://github.com/moodle/moodle/blob/master/lib/classes/task/manager.php#L611
https://github.com/moodle/moodle/blob/master/lib/classes/task/manager.php#L555
The proposal is to keep 'core_cron' for scheduled tasks and to create a new lock resource key, 'core_adhoc', for the ad hoc task queue.
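The effect of separate resource keys can be sketched in Python. This is illustrative only and does not reflect the code in manager.php: `allocate` is a hypothetical helper, and in-process `threading.Lock` objects stand in for the two lock resources.

```python
import threading

# One lock per resource key; 'core_adhoc' is the new key proposed above.
locks = {
    "core_cron": threading.Lock(),   # scheduled task allocation
    "core_adhoc": threading.Lock(),  # ad hoc task allocation
}

def allocate(resource, timeout_seconds=0.05):
    """Try to take the allocation lock for one queue; return it or None."""
    lock = locks[resource]
    return lock if lock.acquire(timeout=timeout_seconds) else None

scheduled = allocate("core_cron")
adhoc = allocate("core_adhoc")            # not blocked by the core_cron holder
second_scheduled = allocate("core_cron")  # still serialized within its group

for held in (scheduled, adhoc):
    held.release()
```

Each group keeps its atomic allocation guarantee, but a scheduled task runner holding 'core_cron' no longer delays ad hoc runners.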
- Discovered while testing
  - MDL-67597 Blocking cron tasks only effectively lock into the future not the past, ie race condition (Closed)
- has been marked as being related by
  - MDL-67483 Improve adhoc tasks quality of service at very high scale (Closed)
- has to be done after
  - MDL-67485 Release the task runner lock before throwing exception (Closed)
  - MDL-67596 Cron / adhoc task runners ramp up slowly for no reason (Closed)
  - MDL-67433 Update admin/tool/task/cli/adhoc_task.php to respect task_adhoc_concurrency_limit (Closed)
- Testing discovered
  - MDL-67596 Cron / adhoc task runners ramp up slowly for no reason (Closed)