Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-60857

core_search: Improve cron log progress information

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.5
    • Fix Version/s: 3.5
    • Component/s: Global search
    • Labels:
    • Testing Instructions:
      Hide

      You will need a Moodle installation with working Apache Solr configured for Moodle global search. For best results, if you have a search area that contains a lot of content (so that it takes more than a couple of minutes to index), that would be helpful.

      1. Go to the search areas page. Delete the index for one or more search areas so that you will definitely have some content to index.
      2. Check the time limit for search is set to a reasonable value e.g. 2 minutes.
      3. Run the indexing task. You can do this from the CLI (search/cli/indexer.php), or by clicking 'Run now' from the scheduled tasks screen.
      4. Look at the progress logs that are generated:
        • EXPECTED: Logs should show each search area that was processed along with the number of documents processed and how long it took. (Same as previous.)
        • EXPECTED: In the 'Processed N records containing N documents' line, the time taken should be shown with one decimal place. (Previously three.)

      In addition to the items covered above, if indexing of a single area takes more than 30 seconds, you should see a line displayed about 30 seconds after the 'Processing area' line, which indicates how far it has processed (see screenshot). And finally, if search indexing did not entirely complete for an area, the 'not complete' text should now also include a date indicating how far it has processed.

      Note: Details not covered by this manual test are covered by the unit test so it's probably OK to skip them in manual testing.

      Show
      You will need a Moodle installation with working Apache Solr configured for Moodle global search. For best results, if you have a search area that contains a lot of content (so that it takes more than a couple of minutes to index), that would be helpful. Go to the search areas page. Delete the index for one or more search areas so that you will definitely have some content to index. Check the time limit for search is set to a reasonable value e.g. 2 minutes. Run the indexing task. You can do this from the CLI (search/cli/indexer.php), or by clicking 'Run now' from the scheduled tasks screen. Look at the progress logs that are generated: EXPECTED: Logs should show each search area that was processed along with the number of documents processed and how long it took. (Same as previous.) EXPECTED: In the 'Processed N records containing N documents' line, the time taken should be shown with one decimal place. (Previously three.) In addition to the items covered above, if indexing of a single area takes more than 30 seconds, you should see a line displayed about 30 seconds after the 'Processing area' line, which indicates how far it has processed (see screenshot). And finally, if search indexing did not entirely complete for an area, the 'not complete' text should now also include a date indicating how far it has processed. Note: Details not covered by this manual test are covered by the unit test so it's probably OK to skip them in manual testing.
    • Affected Branches:
      MOODLE_35_STABLE
    • Fixed Branches:
      MOODLE_35_STABLE
    • Pull Master Branch:
      MDL-60857-master

      Description

      During search indexing, the log contains information like this:

      Processing area: File
      Processed 1341 records containing 1341 documents, in 601.795 seconds (not complete).

      This is already useful information, but when indexing takes a long time (particularly during initial indexing), it would be useful to have more information. There are two particular things that are useful to know:

      • If indexing is not complete, how far did it get (i.e. what is the date of the latest indexed document)? This information is available on the scheduled tasks page but it can be useful to see it historically in cron logs, for example when estimating how long indexing will take.
      • How much of the total time was taken up waiting for the initial query for records to index, and how much by running through the results processing them? Only relevant if it takes a significant time to carry out the query.

      To see why the latter case is important, we currently are indexing a very large table and the initial query takes 7 minutes. Because we had the time limit on indexing set to the default 10 minutes, it actually only spent 3 minutes really indexing things each time! Noticing that the query takes a long time means we can (a) decide to set the time limit larger or (b) try to fix inefficiency in the query.

      In addition, it would also be useful to see progress when indexing within a single area takes a long time.

      I suggest the following changes:

      • Change the time report to 1 decimal place (just tidying up since other changes make the line longer).
      • If the initial query takes more than 5 seconds, add a line to display this.
      • If progress within a single area takes more than 30 seconds, add a line every 30 seconds with the current time and progress.
      • If the search area is not complete because it stops due to time limit, then also show how far the index reached i.e. the date of last indexed document.

      See screenshot for an example. (I added faked delays to get the 'Initial query took' line, but the rest is genuine.)

      Note: As background, the reason for this request is that while getting search indexing to work here, I added a hacked-together version of this type of logging, which was important in debugging problems that occurred. I intend to remove the hacks, but the logging was useful, and probably useful for other people too - so I've coded a less-hacky version here.

        Attachments

          Activity

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:
                Fix Release Date:
                17/May/18