Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-78109

Cache: Remove harmful requirelockingwrite/requirelockingread options

    XMLWordPrintable

Details

    Description

      (This was originally combined with new issue MDL-78467 but I have split them apart as suggested.)

      I propose removing two cache definition options, requirelockingread and requirelockingwrite.

      • I am not removing requirelockingbeforewrite, which checks you currently hold a lock when writing the cache, and remains useful.
      • An equivalent requirelockingbeforeread option has been suggested. I am not implementing it at this stage but it could be added as a future MDL if anyone ever needs it.

      This was previously discussed in policy MDL-76087.

      What these options do

      If requirelockingwrite is set, then when you set a cache key to a value, instead of just setting it, this happens:

      • Acquire lock on the key
      • Set the key
      • Release lock on the key

      The requirelockingread option does similar but for reads.

      Where they are used

      These options are not used anywhere in core.

      Why this behaviour is useless

      The locks in these cases are only kept for the duration of setting (or getting) the item. In other words, they prevent two separate processes from simultaneously getting or setting an item.

      This could be useful if cache stores did not provide atomic behaviour for set and get. For example, if a cache store had behaviour like this:

      • The value of x is 'Frog'
      • Simultaneously, one user sets the value to 'Toad', and another user gets the value.
      • Because the user gets it simultaneously with setting, they get a mixed-up value such as 'Toog'

      This is not the case with any cache store in Moodle, as they all behave atomically - in the scenario above, the value will either be 'Frog' or 'Toad'.

      If it were the case with any cache store then we would not be able to safely use it with any existing cache definition, so there is no point having this as a per-definition option.

      In other words, the automatic locking behaviour was implemented at too high a level - it is useful in cache stores (where it's generally handled by the underlying stores such as Redis, memcache, etc.) but not in a definition, where turning these options on only adds an additional very slow layer of locking on top of the locking already provided by the store's inherent behaviour.

      Edge case _many

      The 'many' functions (set_many/get_many) do have a theoretical function with automatic locking because these work by locking all of the keys being set at once, before doing any of the sets. This means we can be assured that a whole collection of sets either take place, or do not take place, atomically.

      This is offset by the facts that:

      1. The set_many function doesn't work at all when requirelockingwrite is set at present, so clearly nobody has used it this way.
      2. It is safer to (manually) lock a single key in this type of case, rather than (automatically) lock a whole bunch of keys. Locking multiple keys can cause deadlocks unless you are careful to always lock keys in the same order.
      3. Both functions had weird behaviour if any lock failed.

      Example of useful locking

      The manual locking feature in cache can be useful. For example, when the course modinfo cache needs rebuilding, this can take a while e.g. 10 seconds. If a site is receiving 10 requests per second for the course, and there was no locking, then this could result in 100 attempts to rebuild the cache before any of them finish. By adding locking, we can ensure that the cache is only rebuilt once, and the other 99 just wait for the lock and then pick up the rebuilt code. The logic here is:

      • Lock the key
      • Get the value (to see if it still needs rebuilding)
      • If so, rebuild the cache and set the value
      • Release the lock

      The key difference here is that the lock spans a wider set of code, not just the cache get or set.

      Deprecation

      There is no need to use the formal deprecation process for these options, because they will now just do nothing, which is a performance improvement over the previous situation of making some network calls in order to do nothing.

      I have added debugging messages in case anyone actually uesd them.

      Attachments

        Issue Links

          Activity

            People

              quen Sam Marshall
              quen Sam Marshall
              Katie Ransom Katie Ransom
              Andrew Lyons Andrew Lyons
              CiBoT CiBoT
              Votes:
              2 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - 0 minutes
                  0m
                  Remaining:
                  Remaining Estimate - 0 minutes
                  0m
                  Logged:
                  Time Spent - 1 hour, 2 minutes
                  1h 2m

                  Clockify

                    Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.