MDL-60981

core_search: UI to gradually reindex a single area

    Details

    • Testing Instructions:
      Prerequisites
      1. Solr setup using docker.

        docker run --name solr6 -d -p 8983:8983 -t solr:6
        docker exec -it --user=solr solr6 bin/solr create_core -c MOODLEINDEX
        

        where MOODLEINDEX is the index name that you will use later when you configure Solr in Moodle (Site administration / Plugins / Search / Solr)

      2. You will need a site with working Moodle global search. The site will need to contain some content. Specifically, we're going to assume at least 2 forums and at least one other activity that is quick to duplicate.
      Test
      1. Run the search indexing task (e.g. from the Scheduled tasks page) until indexing is up to date.
      2. Go to the search areas page.
        • EXPECTED: At the bottom of the page, there should not be an Additional indexing queue heading.
      3. Find the forum posts search area and click the 'Gradual reindex' button on that row.
        • EXPECTED: You should get to a confirm page.
      4. Try clicking Cancel from the confirm page.
        • EXPECTED: You should get back to the search areas page.
      5. Repeat, but this time click Continue instead.
        • EXPECTED: If you have a very large number of forums you might see a progress bar (which will probably be at 0% until it completes, because we're using it to show indeterminate progress). Otherwise, you should just see a summary of the number of contexts added to the indexing queue.
      6. Click the Continue button.
        • EXPECTED: You should be back on the search areas page. At the bottom of the page, you should now see an Additional indexing queue heading and table.
        • EXPECTED: The heading should show the number of contexts you just set up for reindexing. The table should list up to 10 forums on your system, each with a link to the forum. If there are more than 10, there will be an additional last row which contains only '...'.
      7. Go to a course page and find an existing activity that is searchable and won't take long to duplicate (e.g. a Page).
      8. Duplicate the activity.
      9. Return to the search areas page.
        • EXPECTED: The indexing queue should now show an additional item for the activity you duplicated (this is because searching it won't work until it is reindexed). This additional item should be at the top, because restore reindexing is given a higher priority than potentially long reindex tasks.
      10. Run global search indexing again (e.g. from the scheduled tasks page).
      11. Look at the search areas page again.
        • EXPECTED: Depending on the number of forums, the time limit for indexing, etc., the indexing queue should either have disappeared (as the queue is empty), or should have fewer items on it, and/or should show some progress (indicated as a search area and date) against the first item.
    • Affected Branches:
      MOODLE_35_STABLE
    • Fixed Branches:
      MOODLE_35_STABLE
    • Pull Master Branch:
      MDL-60981-master

      Description

      In some cases it is necessary to reindex content from a specific search area. For example:

      • If there was a bug in indexing code for a search area that has now been fixed (this could be a core search area or one in a contributed plugin), or an improvement made to that search area, existing content may need reindexing.
      • If we make an addition to the schema (e.g. group support MDL-58885), the search areas that implement that schema addition may need reindexing.

      At present there is no UI to gradually reindex a single search area. Deleting the index does achieve this result (the scheduled task will gradually rebuild the index), but if the search area is a large one such as all forum posts, reindexing may take a long time (forum posts take some weeks to index on our system). During that time, the search engine will not find any results in newer forum posts. This is a bad experience for users.

      One option would be to simply reset the date on the search area so that it reindexes from the beginning, but without deleting the index. This would work, but because of the date ordering, no new content will be indexed until the reindex completes. Given that searching might be most important for more recent content, this could also be a bad experience for users.

      We already have the feature to add a single context to a queue for reindexing. I think the best solution would be to provide a feature to automatically add all contexts relating to a particular search area. This means that 'normal' indexing (of new forum posts, for instance) will continue.

      Within this solution we can either just add the system context, which should reindex everything (oldest to newest), or, probably more helpfully, add the individual contexts separately, e.g. so that the most recently updated forums get reindexed first and those which haven't been touched in years are done later. The latter approach is more work but seems like a more complete solution, so I'll have a go.
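
      As a rough illustration of the "most recently updated first" ordering, here is a minimal Python sketch. It is not Moodle code: the `ContextInfo` type, its fields, and the `order_for_reindex` helper are hypothetical names chosen for this example, and the timestamps are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextInfo:
    """Hypothetical record for one context in the search area (e.g. one forum)."""
    context_id: int
    last_modified: int  # Unix timestamp of the newest content in this context


def order_for_reindex(contexts):
    """Return context ids ordered so the most recently modified come first."""
    ranked = sorted(contexts, key=lambda c: c.last_modified, reverse=True)
    return [c.context_id for c in ranked]


# Illustrative data only: three forums with different last-modified times.
forums = [
    ContextInfo(context_id=101, last_modified=1_400_000_000),
    ContextInfo(context_id=102, last_modified=1_520_000_000),
    ContextInfo(context_id=103, last_modified=1_480_000_000),
]
ordered = order_for_reindex(forums)
```

      Each id in `ordered` would then be added to the reindex queue in that order, so that busy forums become searchable again before dormant ones.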

      While doing this, we should add UI to the search areas page to display information about this queue. The display will also include entries from the existing use of that queue (when courses are restored).

      We are also going to need to change the way queueing works so that the reindexing jobs (which may be long-lasting) are lower priority than the course-restore ones.
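
      The priority change could look something like the following Python sketch, assuming a simple in-memory queue. The class name `IndexQueue` and the constants `PRIORITY_RESTORE` and `PRIORITY_REINDEX` (including their values) are hypothetical and only illustrate the idea that restore requests are served before long-running reindex requests.

```python
import heapq
from itertools import count

# Hypothetical priority values: a higher number is processed first.
PRIORITY_RESTORE = 100   # contexts queued after a course restore or duplicate
PRIORITY_REINDEX = 50    # contexts queued by a gradual reindex of a search area


class IndexQueue:
    """Minimal priority queue of context ids awaiting reindexing."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a priority

    def add(self, context_id, priority):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), context_id))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = IndexQueue()
queue.add(201, PRIORITY_REINDEX)   # forum queued by gradual reindex
queue.add(202, PRIORITY_REINDEX)   # another forum
queue.add(301, PRIORITY_RESTORE)   # activity duplicated afterwards
order = [queue.pop() for _ in range(len(queue))]
```

      In this sketch the restored activity (301) is processed first even though it was queued last, which matches the behaviour the testing instructions expect on the search areas page.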

       

        Attachments

        1. mdl-60981.png (24 kB)
        2. mdl-60981-confirm.png (16 kB)
        3. mdl-60981-done.png (13 kB)

    • Fix Release Date:
      17/May/18