Front end File IO debugging
1.1) Turn debugging on
1.2) Hack any file such as /index.php and add an error after the config.php include:
print_error('some error ' . $CFG->dataroot);
|
1.3) Run this page in the browser and confirm the error shown has a redacted path:
error/some error [dirroot]
1.4) Turn perfdebug in admin/settings.php?section=debugging
1.5) In the footer confirm you see a new extra table with a breakdown of each major file path used by moodle and what sort of IO it has consumed
1.6) Purge the caches
php admin/cli/purge_caches.php
|
1.7) Reload the page and note the expected increase in file IO
1.8) Load the /my/ a couple times until the file IO has stabilized
1.9) Edit /my/index.php and add this line near the top which loads a ~1.8meg file
file_get_contents($CFG->libdir . '/tcpdf/fonts/freeserif.z');
|
1.10) Reload the page and confirm in the footer that the dirroot entry for Read has increased by 1.8meg. Remove the hack
1.11) Put in a similar hack in the same place doing a 1 meg write:
file_put_contents($CFG->dataroot . '/test.dat', random_bytes(1024 * 1024));
|
1.12) Confirm in the table you see 1024 kb Write in the dataroot row
Back end File IO logging
2.1) Tail the error logs and confirm there is no extra noise, keep the logs open
2.2) In config.php add config to watch for file reads, stats and writes on dataroot:
$CFG->debugfileio = 1 + 2 + 4;
|
2.3) Reload /my/ and confirm you get some lines in the error log eg:
Tue Jun 29 22:18:07.746672 2021] [php7:notice] [pid 1091995] [client 127.0.0.1:53006] FILEIO [4:read] bootstrap var/lib/sitedata/moodle.local/cache/core_component.php * line 313 of /lib/classes/component.php: call to sha1_file()
|
[Tue Jun 29 22:18:08.134110 2021] [php7:notice] [pid 1091995] [client 127.0.0.1:53006] FILEIO [2:miss] /my/ /var/lib/sitedata/moodle.local/pix/t/message.png * line 2262 of /lib/outputlib.php: call to file_exists()
|
[Tue Jun 29 22:18:08.134193 2021] [php7:notice] [pid 1091995] [client 127.0.0.1:53006] FILEIO [2:miss] /my/ /var/lib/sitedata/moodle.local/pix/t/message.gif * line 2264 of /lib/outputlib.php: call to file_exists()
|
...
|
(Note these are issues that will be fixed in MDL-68719)
2.4) To see exactly where the issues are set this in config.php
$CFG->debugfileiostacksize = 20;
|
2.5) Now reload /my/ and you should see a full stacktrace to help pin point where these errant reads are coming from
2.6) Set the level to -1 which is everything (get ready for spam!)
2.7) Confirm the error log is a firehose of very verbose IO logging. Turn this off again.
2.8) In the footer you should see that datadir is red and this is due to the component cache, lets move that to somewhere local:
$CFG->alternative_component_cache = '/tmp/component';
|
2.9) Reload twice and confirm the red warning row is now gone
Simulate remote file latency
3.1) Download attached test file kitten.jpg to your local wwwroot
3.2) Confirm it's sha1 hash:
$ sha1sum kitten.jpg
|
a3eb1d7cb63ddd6df5b2edf7720ca4631badd4f6 kitten.jpg
|
3.3) Turn on fileio debugging to watch for reads, writes, stats and misses on datadir:
$CFG->debugfileio = 1 + 2 + 4 + 8;
|
3.4) Download the test script MDL-72026-file-rename-race.php to wwwroot
3.4) Run the test script redirecting stderr to stdout so we can grep to just filedir logs:
php MDL-72026-file-rename-race.php 2>&1 | grep filedir
|
3.5) Confirm you saw 11 lines of debugging showing exactly what operations happened on the /filedir/
3.6) Also confirm the hash of the file matches the hash above
3.7) Run the command again, this time confirm that because the file already existed there is only 4 operations on filedir.
php MDL-72026-file-rename-race.php 2>&1 | grep filedir
|
3.8) Now delete the internal filedir file:
rm /var/your/path/to/sitedata/filedir/a3/eb/a3eb1d7cb63ddd6df5b2edf7720ca4631badd4f6
|
3.6) Now run the command above again in two terminals side by side - this is a little pointless but confirm that it is hard to reproduce race conditions and that both commands will very likely run correctly and won't collide because the execute too quickly.
3.7) Now configure artificial delays in the file system:
$CFG->debugfileiodelay = [
|
4 => 0.1, // Read
|
8 => 0.1, // Stat
|
16 => 0.1, // File read chunk of bytes
|
32 => 10, // File write chunk of bytes
|
];
|
3.8) Rerun the command, confirm this time it should take a couple seconds to complete. Mine took about 4.5 seconds:
time php MDL-72026-file-rename-race.php 2>&1 | grep filedir
|
3.9) Now delete the internal filedir file again
3.10) Now rerun the script again in two terminals at the same time
3.11) Confirm that either one or both terminals threw an exception due to the race condition we've now exposed and will be fixed in MDL-72026