Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
3.9
-
MOODLE_39_STABLE
-
MOODLE_39_STABLE
-
MDL-67483-qos-perf -
0) Install these in admin tools to make the testing easier:
https://github.com/catalyst/moodle-tool_testtasks
https://github.com/catalyst/moodle-tool_lockstats
Testing scenario 1. Test the case with a single task runner
1) Queue up 1000 one-second tasks:
php admin/tool/testtasks/cli/queue_adhoc_tasks.php -d=1 -n=1000
2) Behind this queue up a single 'another' type of adhoc task:
php admin/tool/testtasks/cli/queue_adhoc_tasks.php -d=1 -n=1 --class='tool_testtasks\task\another_timed_adhoc_task'
3) Peek into the task queue by class:
$ select count(*),classname from mdl_task_adhoc group by classname;
count | classname
-------+-----------------------------------------------
1000 | \tool_testtasks\task\timed_adhoc_task
1 | \tool_testtasks\task\another_timed_adhoc_task
(2 rows)
4) Now start processing the queue:
php admin/tool/task/cli/adhoc_task.php --execute
5) You should see that the first 2 tasks it picked off cycled through each type, so the single 'another' task is already gone and very quickly you are left with a number lower than 1000 (e.g. 997):
$ select count(*),classname from mdl_task_adhoc group by classname;
count | classname
-------+---------------------------------------
997 | \tool_testtasks\task\timed_adhoc_task
(1 row)
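To make the expected result of scenario 1 concrete, here is a minimal Python sketch of a fair pick order that cycles through task classes. It is a toy model only, not the Moodle implementation: the function name `fair_order` and the (id, classname) tuples are made up for illustration.

```python
from collections import OrderedDict, deque


def fair_order(queue):
    """Interleave queued tasks by class, keeping FIFO order within each class.

    `queue` is a list of (task_id, classname) tuples in queue order.
    Hypothetical helper for illustration; not part of Moodle.
    """
    per_class = OrderedDict()
    for task_id, classname in queue:
        per_class.setdefault(classname, deque()).append(task_id)

    ordered = []
    while per_class:
        # One task from each class per round; drop a class once it is empty.
        for classname in list(per_class):
            ordered.append((per_class[classname].popleft(), classname))
            if not per_class[classname]:
                del per_class[classname]
    return ordered


# 1000 tasks of one type, then a single task of another type behind them.
queue = [(i, "timed_adhoc_task") for i in range(1000)]
queue.append((1000, "another_timed_adhoc_task"))
first_two = fair_order(queue)[:2]
```

In this model the single 'another' task is picked second even though it was queued last, which matches what step 5 expects to see in the table.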
Testing scenario 2. A bunch of runners in parallel
1) First, allow this to work in config.php, and also set up the lock stats tool which we'll use later:
$CFG->task_adhoc_concurrency_limit = 1000;
$CFG->lock_factory = '\tool_lockstats\proxy_lock_factory';
$CFG->proxied_lock_factory = "auto";
2) Queue up 22000 of type A and then another 22000 of type B behind it:
$ php admin/tool/testtasks/cli/clear_adhoc_task_queue.php
$ php admin/tool/testtasks/cli/queue_adhoc_tasks.php -d=1 -n=22000
$ php admin/tool/testtasks/cli/queue_adhoc_tasks.php -d=1 -n=22000 --class='tool_testtasks\task\another_timed_adhoc_task'
3) Confirm the queues:
$ select count(*),classname from mdl_task_adhoc group by classname;
count | classname
-------+-----------------------------------------------
22000 | \tool_testtasks\task\timed_adhoc_task
22000 | \tool_testtasks\task\another_timed_adhoc_task
(2 rows)
6) Now let's fire up several task runners. Do this, say, 4 times (you may need to open several terminals in order to execute them):
php admin/tool/task/cli/adhoc_task.php --execute &
7) Recheck the queues and confirm they are being processed evenly (so both numbers in the count column are lower than in step #3, and roughly equal to each other):
$ select count(*),classname from mdl_task_adhoc group by classname;
count | classname
-------+-----------------------------------------------
21944 | \tool_testtasks\task\timed_adhoc_task
21944 | \tool_testtasks\task\another_timed_adhoc_task
(2 rows)
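If opening several terminals is awkward, the runners could also be started from a small script. The following is a sketch only: `launch_runners` and `wait_all` are hypothetical helper names, and the command is passed in by the caller rather than hard-coded.

```python
import subprocess


def launch_runners(command, count):
    """Start `count` copies of `command` in parallel and return the process handles."""
    return [subprocess.Popen(command) for _ in range(count)]


def wait_all(procs):
    """Block until every runner has exited and return their exit codes."""
    return [proc.wait() for proc in procs]
```

For this scenario you would call something like `launch_runners(["php", "admin/tool/task/cli/adhoc_task.php", "--execute"], 4)` from the Moodle root, assuming `php` is on the PATH.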
Testing scenario 3. NOT optional
8) Lastly, and most importantly, keep throwing more and more runners at it and confirm that the overall system is still scaling up linearly the more processes you throw at it.
i.e. fire up a number of processes and count how many processes you have running:
php admin/tool/task/cli/adhoc_task.php --execute &
Then, using the lock stats tool, you can see what is running right now in the GUI:
/admin/tool/lockstats/
or from a sql shell:
$ select count(*) from mdl_tool_lockstats_locks;
count
-------
6
(1 row)
Ideally you want to run this until it breaks to find the total maximum practical level of concurrency the system can handle. On my local box I saw something like:
Cron processes | Tasks being processed
----------------+-----------------------
5 | 5
10 | 10
20 | 19
30 | 28
35 | 32
40 | 18
When it hits the max threshold, each process may get a lock timeout and exit, so you'll get a sharp drop-off from linear back to something much smaller. This max concurrency is an issue with or without this patch, but we need to make sure it doesn't go backwards under similar conditions.
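One way to read the figures above is as tasks processed per cron process, looking for the first point where adding runners makes throughput go down. A small Python sketch, using only the numbers reported above (the helper names `efficiency` and `first_regression` are made up for illustration):

```python
# (cron processes, tasks being processed), as reported from the local box above.
observed = [(5, 5), (10, 10), (20, 19), (30, 28), (35, 32), (40, 18)]


def efficiency(rows):
    """Tasks in flight per cron process; 1.0 means perfectly linear scaling."""
    return [(procs, tasks / procs) for procs, tasks in rows]


def first_regression(rows):
    """Return the first process count where throughput drops below the previous row.

    Returns None if throughput never goes backwards.
    """
    for (_, prev_tasks), (procs, tasks) in zip(rows, rows[1:]):
        if tasks < prev_tasks:
            return procs
    return None


for procs, ratio in efficiency(observed):
    print(f"{procs:>3} processes: {ratio:.2f} tasks per process")
print("throughput first drops at:", first_regression(observed))
```

When comparing a run with and without the patch, the thing to check is that the point returned by `first_regression` does not move to a lower process count.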
Description
When I tested MDL-67363 I didn't scale it up high enough. The algorithm works, but when the queue gets really large it scales at O(n^2) and starts to choke.
This is a tweak to get the performance back to as close to linear as we can get.
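To give a feel for why O(n^2) chokes at the queue sizes used in the testing instructions, here is a toy cost model in Python. It assumes, purely for illustration, that a quadratic approach re-examines every remaining queued task on each pick; this is not a description of the actual Moodle code, and the function names are hypothetical.

```python
def rescan_cost(n):
    """Rows examined if every pick rescans all tasks still queued: n + (n-1) + ... + 1."""
    return sum(range(1, n + 1))


def single_pass_cost(n):
    """Rows examined if each queued task is looked at once."""
    return n


# 1000 is the scenario 1 queue size; 44000 is the combined scenario 2 queue.
for n in (1_000, 44_000):
    print(f"n={n}: quadratic={rescan_cost(n)} linear={single_pass_cost(n)}")
```

Under this assumption the gap grows with n itself, which is why a test at small scale can look fine while a queue of tens of thousands does not.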
Issue Links
- has a non-specific relationship to
-
MDL-67486 Minimize how long we hold the global cron lock for
-
- Closed
-
- has been marked as being related by
-
MDL-67363 Add a Quality of Service layer to the processing of the ad-hoc task queue
-
- Closed
-
- has to be done before
-
MDL-64610 Add support for per-task concurrency limits
-
- Closed
-