MDL-68729

Search: Allow query on one Solr server and indexing on another

Details

    • MOODLE_310_STABLE
    • MOODLE_310_STABLE
    • MDL-68729-master
    • Testing instructions:

      To carry out this test, you will need a working Apache Solr installation and to have configured global search for your Moodle site.

      Note: If you have a large amount of content on your site, then deleting the whole index might cause it to take a long time to reindex. In this case, it is sufficient for the test if you just delete the 'Forum posts' index.

      PART A - Switching search engine type

      This sequence simulates switching between two types of search engine (apart from the part where it takes a week to rebuild the index). Users can still get search results during the switch, but only for content that was indexed before the switch started.

      1. Configure the server to use Apache Solr to search.
      2. Create a new forum post containing the special word BLOOKAZOID.
      3. Run indexing (via the scheduled task or any other method).
      4. Do a search for BLOOKAZOID using the global search in header.
        • Expected: It finds your forum post.
      5. Ensure the 'Global search indexing' scheduled task is disabled or otherwise does not run automatically (because this might mess up the test).
      6. In the admin settings Plugins / Search / Manage global search page, change the search engine to 'Simple search'.
      7. In the admin settings Plugins / Search / Search areas page, click 'Delete all indexed contents' and confirm the prompt.
      8. Search again for BLOOKAZOID using the global search in header.
        • Expected: There are no results because the post has not been indexed using the current search engine.
      9. In the admin settings Plugins / Search / Manage global search page, set the 'Query-only search engine' to Solr.
      10. Search again for BLOOKAZOID using the global search in header.
        • Expected: You should now get a result again.
      11. Add a new forum post that also contains the word BLOOKAZOID.
      12. Run the 'Global search indexing' scheduled task, or otherwise update the search index.
      13. Search again for BLOOKAZOID using the global search in header.
        • Expected: You still get only one result, because it's using the query-only search engine which hasn't been updated.
      14. In the admin settings Plugins / Search / Manage global search page, set the 'Query-only search engine' to 'None'.
      15. Search again for BLOOKAZOID using the global search in header.
        • Expected: You now get both results, because it's using the simple db search engine which has now been indexed.

      PART B - Switching between two Solr instances

      1. Configure the server to use Apache Solr for search.
      2. For this test you will need a second Apache Solr instance, as well as the normal one that is currently configured.
        • You could use a genuinely different server...
        • But the easiest way to create a second independent instance on the same server is to run, from the Solr directory, bin/solr create -c frog (assuming you want the new collection to be called frog, and why wouldn't you).
        • Obviously if you already have a 'spare' collection you can just use that.
      3. To start off with you should be using your normal Apache Solr instance, not the new one you just configured.
      4. Create a new forum post containing the special word SMORGASBLOOT.
      5. Run indexing (via the scheduled task or any other method).
      6. Do a search for SMORGASBLOOT using the global search in header.
        • Expected: It finds your forum post.
      7. Ensure the 'Global search indexing' scheduled task is disabled or otherwise does not run automatically (because this might mess up the test).
      8. In the admin settings Plugins / Search / Solr page, change the Solr collection name from the real name to the new one you just created (e.g. frog).
      9. If this is a new collection, initialise it: go to Plugins / Search / Manage global search, then you will see a red 'Error validating Solr schema' message; click the 'follow this link' link next to the error.
      10. In the admin settings Plugins / Search / Search areas page, click 'Delete all indexed contents' and confirm the prompt.
      11. Search again for SMORGASBLOOT using the global search in header.
        • Expected: There are no results (because you haven't indexed yet in the new Solr instance).
      12. In the admin settings Plugins / Search / Solr page, scroll to the bottom and set up the 'Alternate settings' area to the same as your normal settings, except with the index name set back to your real collection name (not frog).
      13. In the admin settings Plugins / Search / Manage global search page, set 'Query-only search engine' to 'Solr (alternate settings)'.
      14. Search again for SMORGASBLOOT using the global search in header.
        • Expected: It now finds the results because it's using the original search index.
      15. Add a new forum post that also contains the word SMORGASBLOOT.
      16. Run the 'Global search indexing' scheduled task, or otherwise update the search index.
        • Now the 'new' (frog) search index is up to date.
      17. Search again for SMORGASBLOOT using the global search in header.
        • Expected: You still get only one result, because it's using the query-only search engine which hasn't been updated.
      18. In the admin settings Plugins / Search / Manage global search page, set the 'Query-only search engine' to 'None'.
      19. Search again for SMORGASBLOOT using the global search in header.
        • Expected: You now get both results, because it's using the new (frog) search index which is now up to date.
      20. You might want to put back your Solr settings to point to the normal collection.
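
      When running Part B it can help to confirm, outside Moodle, how many documents each collection holds. The following is a minimal sketch using Solr's standard select handler with q=*:* and rows=0. The base URL, the lack of authentication and the helper names are assumptions for illustration, and frog is simply the example collection from step 2.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(base_url: str, collection: str) -> str:
    """Build a select URL that matches every document but returns no rows."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{base_url.rstrip('/')}/solr/{collection}/select?{params}"


def num_found(response_body: str) -> int:
    """Extract the total hit count from a Solr JSON select response."""
    return json.loads(response_body)["response"]["numFound"]


def count_documents(base_url: str, collection: str) -> int:
    """Ask a live Solr server for its document count (needs network access)."""
    with urlopen(count_url(base_url, collection)) as reply:
        return num_found(reply.read().decode("utf-8"))
```

      For example, count_documents("http://localhost:8983", "frog") should grow after step 16, while the count for the original collection stays the same.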

      PART C - Showing information for users

      Note: This is also covered by Behat tests. (The other two parts are not.)

      1. In the Plugins / Search / Manage global search page, find the 'Search information' box, and type some text such as 'Due to current maintenance work, search results newer than last Friday are not being displayed at the moment. We expect to have full search facilities back tomorrow.'
      2. Also set the 'Display search information' checkbox.
      3. Do a search (for anything) using the header.
        • Expected: You should see the informational text displayed in a warning box at the top of the search results.
      4. Go back to Plugins / Search / Manage global search, and turn off 'Display search information' while leaving the text set (note: this is a convenient way for admins to be able to leave a standard message defined for when it is needed, rather than having to record it somewhere else).
      5. Do another search using the header.
        • Expected: You should not see the informational text.
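
      The behaviour checked in Part C amounts to a two-condition rule: the banner appears only when the checkbox is set and some text is stored. A minimal sketch follows; the function name and parameters are hypothetical and do not come from the Moodle code.

```python
from typing import Optional


def search_banner(info_text: str, display_enabled: bool) -> Optional[str]:
    """Return the banner text to show above results, or None to show nothing.

    The stored text is kept even when display is off, so an admin can
    leave a standard message defined and just toggle the checkbox.
    """
    text = info_text.strip()
    if display_enabled and text:
        return text
    return None
```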

      PART D - Trying to set the same search engine for querying as the main one

      1. Go to Admin -> Plugins -> Search -> Manage global search.
      2. Set "searchengine" to "Simple Search".
      3. Save changes.
      4. Set "searchenginequeryonly" to the same.
      5. Save changes.
      6. Verify that you see "Some settings were not changed due to an error." at the top of the page.
      7. Verify that in the "searchenginequeryonly" field (the one being saved last) you see the error "The query-only search engine and the main search engine cannot be set to the same value."
      8. Set "searchenginequeryonly" to "Solr".
      9. Save changes.
      10. Set "searchengine" to "Solr".
      11. Save changes.
      12. Verify that you see "Some settings were not changed due to an error." at the top of the page.
      13. Verify that in the "searchengine" field (the one being saved last) you see the error "The query-only search engine and the main search engine cannot be set to the same value."
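
      The rule exercised in Part D can be sketched as a check applied to whichever of the two settings is saved last. This is an illustration only, not the actual admin setting code: the function names are hypothetical, the engine values 'simpledb' and 'solr' are assumed, and an empty string stands in for 'None'.

```python
from typing import Dict, Optional

SAME_ENGINE_ERROR = (
    "The query-only search engine and the main search engine "
    "cannot be set to the same value."
)


def validate_engine_change(settings: Dict[str, str], name: str, new_value: str) -> Optional[str]:
    """Return an error if saving name=new_value would make both engines equal."""
    other = "searchenginequeryonly" if name == "searchengine" else "searchengine"
    # An empty value means 'None' (no query-only engine), which never clashes.
    if new_value and new_value == settings.get(other, ""):
        return SAME_ENGINE_ERROR
    return None


def save_setting(settings: Dict[str, str], name: str, new_value: str) -> Optional[str]:
    """Store the value only if validation passes; return the error otherwise."""
    error = validate_engine_change(settings, name, new_value)
    if error is None:
        settings[name] = new_value
    return error
```

      Because the check runs against the value already stored for the other setting, the error is attached to the field being saved last, which matches steps 7 and 13.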

    Description

      Sometimes it is necessary to move a Solr installation, for example when upgrading to a new major version. This can involve reindexing everything.

      Indexing can be a time-consuming operation, taking several days at least. (We think the current estimate on our site is about two weeks, although this is prior to MDL-68690.) It is not usually considered acceptable for search to stop working through the whole site for days or weeks, but at present there is no easy way around this.

      Current behaviour after switch to new server:

      • Initially students can search, but always get no results.
      • Over a period of minutes, days, or weeks depending on the size of the site, the index is gradually rebuilt, starting from the oldest data. Students gradually start to get more results to global search queries.
      • There is no advice to students about the situation.

      I propose adding the ability to query from one server while indexing on another. Code-wise this is relatively simple: we just need to ensure that a particular server address is used for the query operations, and a different one for the indexing operation.
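
      As a rough illustration of the idea (not the Moodle implementation), the routing can be sketched with two in-memory stand-ins for search servers. All class and method names here are hypothetical.

```python
from typing import Dict, List, Optional


class InMemoryEngine:
    """Stand-in for a search server: maps document id to text."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.docs: Dict[int, str] = {}

    def add_document(self, doc_id: int, text: str) -> None:
        self.docs[doc_id] = text

    def query(self, word: str) -> List[int]:
        return sorted(i for i, text in self.docs.items() if word in text)


class SearchRouter:
    """Sends indexing to one engine and queries to another, if one is set."""

    def __init__(self, index_engine: InMemoryEngine,
                 query_only_engine: Optional[InMemoryEngine] = None) -> None:
        self.index_engine = index_engine
        self.query_only_engine = query_only_engine

    def index(self, doc_id: int, text: str) -> None:
        # Indexing always goes to the main (new) engine.
        self.index_engine.add_document(doc_id, text)

    def search(self, word: str) -> List[int]:
        # Queries use the query-only engine while one is configured.
        engine = self.index_engine
        if self.query_only_engine is not None:
            engine = self.query_only_engine
        return engine.query(word)


old = InMemoryEngine("old")
old.add_document(1, "first BLOOKAZOID post")
router = SearchRouter(InMemoryEngine("new"), query_only_engine=old)
router.index(1, "first BLOOKAZOID post")   # reindexing on the new engine
router.index(2, "second BLOOKAZOID post")  # content added after the switch
stale = router.search("BLOOKAZOID")        # [1]: served by the old index
router.query_only_engine = None            # manual switch-over
fresh = router.search("BLOOKAZOID")        # [1, 2]: served by the new index
```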

      Proposed behaviour

      • Students can search and get all results as of the time indexing started (e.g. minutes, days or weeks ago).
      • Newer results, for example new forum posts, are not returned.
      • When indexing catches up, the system is manually switched over and normal service resumes.
      • A status message on search screens warns students about the current situation.

      The current architecture doesn't support multiple search engines of the same type (and there's no good reason it should) so I decided it would be simplest to implement this as 'alternate configuration settings' within search engines that support the feature. This requires minimal modification to the actual search engine - one simple added function, a slight change to the constructor, and a copy-paste of the relevant connection settings in settings.php. I have implemented this for the Solr search engine in core. But even without this modification, you can still use this feature to switch between different types of search engine (e.g. simpledb->something else) with less disruption for students.
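
      The 'alternate configuration settings' approach can be sketched as follows: the same connection settings are stored twice, once under their normal names and once with a prefix, and a constructor-style flag picks which set to read. The setting names, the prefix and the helper names below are hypothetical and chosen for illustration; they are not taken from the Solr plugin.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical setting names, chosen for illustration only.
CONNECTION_KEYS = ("server_hostname", "server_port", "indexname")
ALTERNATE_PREFIX = "alternate"


@dataclass(frozen=True)
class Connection:
    server_hostname: str
    server_port: str
    indexname: str


def has_alternate_configuration(config: Mapping[str, str]) -> bool:
    """True when every alternate connection setting has a non-empty value."""
    return all(config.get(ALTERNATE_PREFIX + key) for key in CONNECTION_KEYS)


def load_connection(config: Mapping[str, str], alternate: bool = False) -> Connection:
    """Read the normal settings, or the prefixed copies when alternate is set."""
    prefix = ALTERNATE_PREFIX if alternate else ""
    return Connection(**{key: config[prefix + key] for key in CONNECTION_KEYS})
```

      With this shape, the engine code that talks to the server does not need to know which set of settings it was given; only the loading step differs.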

      Attachments

        1. MDL-68729-mainadmin.png (33 kB)
        2. MDL-68729-searchbanner.png (17 kB)
        3. MDL-68729-solradmin.png (22 kB)
        4. Screenshot_1.png (126 kB)
        5. Screenshot_2.png (111 kB)
        6. Screenshot_3.png (111 kB)

            People

              quen Sam Marshall
              Mark Johnson
              Eloy Lafuente (stronk7)
              Janelle Barcega
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:
                9/Nov/20

                Time Tracking

                  Estimated: 0m
                  Remaining: 0m
                  Logged: 4h 40m