  1. Moodle
  2. MDL-25826

HTMLPurifier rewrites definition cache on every request

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.0.3
    • Component/s: Performance
    • Environment:
      Produced on linux cluster under Apache httpd, PHP 5.2.x and PostgreSQL 8.4, but should be irrelevant.
    • Database:
      Any
    • Affected Branches:
      MOODLE_20_STABLE
    • Fixed Branches:
      MOODLE_20_STABLE

      Description

      During performance testing with HTMLPurifier enabled, we observed that Moodle 2.x rewrote the cached serialised HTMLPurifier definition file on every request that used the purifier. This severely degraded the performance of the entire application.

      This is a major source of contention, as all other pages being purified will block until the write completes. The effect is greatly exacerbated in an environment with shared storage or a clustered filesystem used across multiple webservers, where it is essentially the worst-case access pattern for a shared filesystem (multiple nodes concurrently writing to the same file in the same directory).

      I'm not sure whether this is the intended behaviour of HTMLPurifier (i.e. does it need to rewrite this file frequently?), but I suggest fixing it so that the serialised definition file is not rewritten and the already cached file is used where possible. Otherwise, alternative config options would need to be provided so that the cached serialised definition files can be written to separate directories (e.g. perhaps one per webserver).
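As a rough illustration of the "use the already cached file" suggestion above, the following sketch reads a serialised definition from disk when one exists and only writes on a cache miss. It is not Moodle's or HTMLPurifier's implementation: the function name `load_or_build_definition`, the builder callback, and the cache file path are all hypothetical.

```php
<?php
// Hypothetical helper, not part of Moodle or HTMLPurifier.
// Returns the cached value from $cachefile if it can be read and
// unserialised; otherwise calls $builder, stores the result, and returns it.
function load_or_build_definition($cachefile, $builder) {
    if (is_readable($cachefile)) {
        $cached = @unserialize(file_get_contents($cachefile));
        if ($cached !== false) {
            // Cache hit: nothing is written to disk.
            return $cached;
        }
        // A corrupt or empty file falls through and is rebuilt below.
    }

    // Cache miss: build the (expensive) definition once.
    $definition = call_user_func($builder);

    // Write to a uniquely named temporary file, then rename it into place,
    // so that concurrent readers do not see a partially written file.
    $tmp = $cachefile . '.' . uniqid('', true) . '.tmp';
    file_put_contents($tmp, serialize($definition));
    rename($tmp, $cachefile);

    return $definition;
}
```

With this shape, only the first request after the cache is cleared pays for the write; later requests only read. Rename semantics can differ on clustered filesystems, so this would need checking on the storage actually in use. For the per-webserver alternative, HTMLPurifier's `Cache.SerializerPath` directive sets the directory the serializer cache writes to, so it could be pointed at node-local storage.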

      The poorly performing call stack is:

      lib/weblib.php: purify_html()
      => lib/htmlpurifier/HTMLPurifier.php: HTMLPurifier->purify()
      => lib/htmlpurifier/HTMLPurifier/Generator.php: HTMLPurifier_Generator->__construct()
      => lib/htmlpurifier/HTMLPurifier/Config.php: HTMLPurifier_Config->getHTMLDefinition()
      => lib/htmlpurifier/HTMLPurifier/Config.php: HTMLPurifier_Config->getDefinition()
      => lib/htmlpurifier/HTMLPurifier/DefinitionCache/Decorator/Cleanup.php: HTMLPurifier_DefinitionCache_Decorator_Cleanup->set()
      => lib/htmlpurifier/HTMLPurifier/DefinitionCache/Decorator.php: HTMLPurifier_DefinitionCache_Decorator->set()
      => lib/htmlpurifier/HTMLPurifier/DefinitionCache/Serializer.php: HTMLPurifier_DefinitionCache_Serializer->set()
      => lib/htmlpurifier/HTMLPurifier/DefinitionCache/Serializer.php: HTMLPurifier_DefinitionCache_Serializer->_write()


      People

      • Votes: 3
      • Watchers: 5

      Dates

      • Fix Release Date: 5/May/11