If you are streaming audio or video files, the browser will request lots of small chunks of a file as you stream it or scrub through it.
These appear as multiple 206 responses in the access log in quick succession within the same session.
Each of these requests goes through the whole Moodle bootstrap and session lock, and then the plugin file access checks, before serving the file. But these requests are essentially identical except for the Range request header. Even in the network tab of the Chrome dev tools, the multiple requests are all shown as a single resource.
So I think there is an opportunity to skip a whole bunch of processing, speed up the serving of these files and also reduce contention on the session lock (conceptually similar to the read-only locks in MDL-58018):
1) The plugin file callback calls send_file. In send_file, detect that this was a range request and create a cache item which maps this unique URL + session id to the file that was actually served.
2) In pluginfile.php, on the second range request and before we even start the session, check whether there has been a recent request for this exact file from our session id. If there is a cache entry then we already know the raw file, so just serve it and skip a whole bunch of DB work, the session wait, lock and write, and also the plugin file callback, which will have additional DB lookups. We already know that a very recent permissions check passed for this file in this session, so this is essentially the same security as if the same person had downloaded the whole file in the original request just a few seconds ago. The cache entry timeout could be something like a couple of minutes, or perhaps proportional to the file size, and still provide a very high hit rate.
My back-of-envelope numbers are that this affects about 1% of traffic overall, but it is a much larger fraction of pluginfile traffic, and the traffic it does affect is by nature often short, intense bursts of many requests. When we have seen session lock contention start to escalate, it is very often these 206s which bunch up, get clogged queuing behind each other and continue to snowball.
I don't have perfect evidence (yet) for this, but I have a feeling that certain browsers 'sniff' range requests, i.e. if they don't get a response quickly then they can cancel the request and try a smaller range. Some browsers will request just bytes 0-1 to test support for range requests and then fire off more requests. And I know that some esoteric browsers will fire off multiple smaller ranges in parallel to download a large file faster. I've seen this with PDFs, which appear to also be 'streamed' by some readers.
This also goes hand in hand with MDL-65693, which adds proper / faster support for HTTP HEAD requests. In a lot of cases a HEAD request is made purely to grab the content length before making subsequent range requests. So in this case the cache item could also be set in the HEAD logic, so that it is ready for the first range request.
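As a rough illustration of priming the cache from HEAD, the sketch below (again Python for illustration only, not Moodle code) lets a HEAD request populate the entry so that the very first range request can take the fast path. The names primed, resolve_with_full_checks and handle are hypothetical, and expiry is left out to keep it short.

```python
# Hypothetical cache: (url, session_id) -> file reference found by the full check.
primed = {}


def resolve_with_full_checks(url, session_id):
    """Placeholder for the expensive path: bootstrap, session lock, access checks."""
    return "file-for:" + url


def handle(method, url, session_id, range_header=None):
    """Return (status, file_ref, used_fast_path) for a HEAD or GET request."""
    key = (url, session_id)
    if method == "GET" and range_header and key in primed:
        # Fast path: a recent HEAD or range request already passed the checks.
        return 206, primed[key], True
    file_ref = resolve_with_full_checks(url, session_id)
    if method == "HEAD" or range_header:
        # Prime the cache so the following range requests skip the slow path.
        primed[key] = file_ref
    status = 206 if (method == "GET" and range_header) else 200
    return status, file_ref, False
```

With this shape, a HEAD followed by a GET with a Range header from the same session hits the fast path on the GET, while the same URL from a different session still goes through the full checks.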