-
Improvement
-
Resolution: Unresolved
-
Minor
-
None
-
3.9, 3.11.7, 4.0.6, 4.1.1
-
MOODLE_311_STABLE, MOODLE_39_STABLE, MOODLE_400_STABLE, MOODLE_401_STABLE
-
MDL-68166-master-2
-
Whilst looking at some performance stats I noticed that, on some of our systems, Moodle backups are the single biggest consumer of IOPS.
The reason is that when we write a larger file, we write it in chunks.
We have a buffer (default size 4096), and we write the file tag by tag.
Once the buffer fills up with enough data, we push the current buffer to the file, and then continue filling the buffer.
The same also happens with the mbz file itself: the initial file is opened, and then each part is appended to it.
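To make the write pattern concrete, here is a minimal Python sketch of the buffering described above. It is not the Moodle code (which is PHP); the `ChunkedWriter` class, the `count_flushes` helper and the tag contents are hypothetical and only illustrate how a 4096-byte buffer turns one file into many small writes.

```python
import io


class ChunkedWriter:
    """Hypothetical stand-in for a buffered XML writer: it collects small
    writes and pushes the buffer to the sink once it reaches buffer_size."""

    def __init__(self, sink, buffer_size=4096):
        self.sink = sink
        self.buffer_size = buffer_size
        self.buffer = bytearray()
        self.flushes = 0  # number of writes that actually reach the sink

    def write(self, data):
        self.buffer += data
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink.write(bytes(self.buffer))
            self.buffer.clear()
            self.flushes += 1


def count_flushes(tag_count, buffer_size=4096):
    """Write tag_count small tags and return (flushes, total_bytes)."""
    sink = io.BytesIO()
    writer = ChunkedWriter(sink, buffer_size)
    for i in range(tag_count):
        writer.write(b"<tag>%d</tag>\n" % i)
    writer.flush()  # push whatever is left at the end of the file
    return writer.flushes, len(sink.getvalue())
```

Calling `count_flushes(100000)` returns the number of separate writes that reach the file. That number grows in proportion to file size, and on a shared file system each of those writes is a remote operation.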
What this means is that backups containing large files are really write heavy.
For clustered environments this matters a lot. The files are written to the Moodle tempdir, which must be shared. Typically that sharing is over a system such as NFS, which does not handle many small appending writes well.
Where that remote system is a clustered file system such as GlusterFS or Ceph, a replication step is also required.
The ideal solution would be to stop writing backups to the shared file system, but that is not something that we can currently achieve.
However, we are able to make use of our separation between local cache and shared cache very easily.
Rather than writing the XML files in small (4k) chunks straight to the tempdir, we can easily write them in exactly the same way to a per-request directory, and when complete move them to the final destination.
Likewise we can make the same type of change for the mbz backup itself, writing it to the local cache and then moving it into place once complete.
On non-clustered systems where localcachedir and tempdir are on the same filesystem this will simply be an atomic move and incur no penalty.
On clustered systems, or those where a different filesystem is in use, this is an additional step; however, moving a single complete file costs far fewer IOPS than writing it in small chunks.
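The proposed write-locally-then-move approach could look roughly like the following Python sketch. This is an illustration under assumptions, not the proposed patch: `write_then_move`, `local_dir` and `final_path` are hypothetical names standing in for a per-request directory in the local cache and the final location in the shared tempdir.

```python
import os
import shutil
import tempfile


def write_then_move(chunks, local_dir, final_path):
    """Write all chunks to a scratch file in local_dir, then move the
    finished file to final_path in a single step."""
    os.makedirs(local_dir, exist_ok=True)
    fd, scratch_path = tempfile.mkstemp(dir=local_dir)
    try:
        # All the small chunked writes hit the local disk only.
        with os.fdopen(fd, "wb") as handle:
            for chunk in chunks:
                handle.write(chunk)
        os.makedirs(os.path.dirname(final_path) or ".", exist_ok=True)
        # shutil.move renames when source and destination share a
        # filesystem, and falls back to copy-then-delete otherwise.
        shutil.move(scratch_path, final_path)
    except BaseException:
        # Do not leave a partial scratch file behind on failure.
        if os.path.exists(scratch_path):
            os.remove(scratch_path)
        raise
    return final_path
```

With this shape, the shared file system only ever sees the completed file. When both paths are on the same filesystem the move is a rename; otherwise it is one sequential copy of the finished file.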
Right now we cannot just move backups out of tempdir - it's simply too big a change.
has been marked as being related by:
- MDL-66928 Cron throws exceptions during a cache purge as localcachedir is purged / get_request_storage_directory should use system temp (Closed)
- MDL-70243 Add a file system performance summary into the footer and file IO debug mode (Development in progress)
- MDL-80549 backuptempdir bug in course copy feature (Open)