Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-47962

Major room for performance improvement in filter_glossary

XMLWordPrintable

    • MOODLE_27_STABLE, MOODLE_34_STABLE, MOODLE_35_STABLE
    • MOODLE_36_STABLE
    • Hide
      1. Go to your glossary
      2. Click Glossary administration > Filters (or use cog menu in boost and click Filters)
      3. Turn off glossary auto linking.
        Other pages will still be auto linked, just not the glossary to itself
      Show
      Go to your glossary Click Glossary administration > Filters (or use cog menu in boost and click Filters) Turn off glossary auto linking. Other pages will still be auto linked, just not the glossary to itself
    • Hide

      The main thing to test here is that there are no regressions in relation to the glossary filter, so redoing MDLQA-11715 and MDLQA-11716 would be a good test.

      It would also be good to verify that there are not regressions in other filters that use the filter_phrases function (Activity names, Censorship and Database filters). As far as I can see (e.g. tracker queries project = MDLQA AND text ~ 'filter' AND component = "Database activity module" AND parent = MDLQA-11698 or project = MDLQA AND component = "Filters" AND parent = MDLQA-11698 there are no manual tests for this, which I take to mean that the automated tests for these are sufficient.

      Finally, you could either look at my test results below, and consider them sufficient, or replicate them yourself. You would need to:

      1. Create a course with many lables (or many activities set to show description on the course page).
      2. Create a main glossary containing may terms. (An import from qa.moodle.net is a good way to get this.)
      3. Ensure the glossary filter is turned on.
      4. Turn on display of performance info in the footer (or turn on profiling)
      5. Verify that without this fix the course page loads slowly, and with it it loads fast.
      Show
      The main thing to test here is that there are no regressions in relation to the glossary filter, so redoing MDLQA-11715 and MDLQA-11716 would be a good test. It would also be good to verify that there are not regressions in other filters that use the filter_phrases function (Activity names, Censorship and Database filters). As far as I can see (e.g. tracker queries project = MDLQA AND text ~ 'filter' AND component = "Database activity module" AND parent = MDLQA-11698 or project = MDLQA AND component = "Filters" AND parent = MDLQA-11698 there are no manual tests for this, which I take to mean that the automated tests for these are sufficient. Finally, you could either look at my test results below, and consider them sufficient, or replicate them yourself. You would need to: Create a course with many lables (or many activities set to show description on the course page). Create a main glossary containing may terms. (An import from qa.moodle.net is a good way to get this.) Ensure the glossary filter is turned on. Turn on display of performance info in the footer (or turn on profiling) Verify that without this fix the course page loads slowly, and with it it loads fast.

      I am doing profiling on a big course containing a glossary with 480 terms.

      I am comparing our custom filter_ouglossary, which does caching by caching the final result of fitering the input string, with standard filter_glossary

      With filter_ouglossary:

      Total page-load time: 1.6s
      Calls to filter_ouglossary::__contruct: 170ms (12 calls)
      Nothing else shows up, which means it total less than 36ms.

      With filter_glossary:

      Total page-load time: 3.2s
      filter_glossary::filter: 1900ms (14 calls) - so this is all the difference.

      This calls the following functions:
      html_writer::start_tag: 560ms (5760 calls)
      usort (calling filter_glossary::sort_entries_by_length): 420ms (12 calls, calling the comparator 49,000 times)
      filter_phrases: 290ms (14 calls)
      moodle_url::__construct: 260ms (5760 calls)
      mod_glossary\local\concept_cache::get_concepts: 220ms (12 calls)
      and everything else is small

      The problems with the code

      I think there are two fundamental problems here;

      1. filter_glossary caches the data in the class instance, but on this page there is text from 12 different contexts, so we create 12 instances of the class which each load their own data. This code was implemented without a proper understanding of how text filtering acutally works. We need to load this data only once per courseid per page load.
      2. filter_glossary::filter renders the HTML for a glossary link to each term in the glossary, whether on not that will be used. Since most pages only contain a tiny fraction of the glossary terms, the HTML for the link needs to be generated on-demand. We need to change the filter_phrases / filterobject API so that fields like $hreftagbegin are accessed through a getter, so we can do lazy initialisation.

        1. MDL-47962.PNG
          MDL-47962.PNG
          10 kB
        2. perf_after_4.png
          perf_after_4.png
          60 kB
        3. perf_after_all.png
          perf_after_all.png
          30 kB
        4. perf_before_4.png
          perf_before_4.png
          59 kB

            timhunt Tim Hunt
            timhunt Tim Hunt
            Damyon Wiese Damyon Wiese
            David Monllaó David Monllaó
            Anna Carissa Sadia Anna Carissa Sadia
            Votes:
            6 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved:

                Estimated:
                Original Estimate - 0 minutes
                0m
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 30 minutes
                30m

                  Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.