Moodle / MDL-59434

Global Search: implement content aware searching / alternate results sort orders

    Details

    • Testing Instructions:
      Steps 1-6 here are the same as the setup tasks from MDL-60880, so if you've tested that item then you probably can reuse the same setup:

      1. Create a new course called 'Search test squigflorp'.
      2. Add a forum called 'Forum about squigflorp'.
      3. In the forum, add a discussion that also contains the word 'squigflorp'.
      4. Create a second course called 'Search test squigflorp 2'.
      5. Add a forum called 'Forum about squigflorp 2'.
      6. Add a discussion with the word 'squigflorp'.
      7. Add a second discussion with the phrase 'squigflorp zoing floob'.
      8. Run search indexing (e.g. by running the search indexing scheduled task using the 'Run now' option on the scheduled tasks page).
      9. Go to the first course page and use the search box in the header to search for 'squigflorp'.
        • EXPECTED: The search page should have a 'Results order' dropdown, set to 'Most relevant results first'.
      10. Change the dropdown to 'Prioritise results related to Course: Search test squigflorp' and repeat the search.
        • EXPECTED: All the results within this first course should appear first, followed by all the results from the second course.
      11. Go into the forum in the first course.
      12. Search from the header for 'squigflorp'.
      13. Change the dropdown to 'Prioritise results related to Forum: Forum about squigflorp' and repeat the search.
        • EXPECTED: The two results within the forum (the forum itself, and the discussion) should appear first, followed by the other result from this course (the course itself), followed by all results from the other course.
      14. Change the search to 'squigflorp zoing floob' and repeat the search.
        • EXPECTED: The result from the other course containing the full phrase may move up the list, and should be the first of all the results from the other course, but will not necessarily be higher than that.
      15. Change the option to search by relevance and repeat.
        • EXPECTED: Now the result with the full phrase will definitely be first.
      16. Go to a location outside of any course, such as the dashboard page.
      17. Do a search from the header box.
        • EXPECTED: Because you didn't search from a specific context, there will be no dropdown offering a choice of ordering (it always orders by relevance).

      As well as completing the manual test it will be useful to manually run the search engine unit tests, because these do not necessarily run on the CI server. This requires some setup if you don't already have the search engine unit tests running:

      1. In your test Solr installation, create a core called 'unittest'. You can normally do this as follows:

      • At the command line, change to the Solr install directory
      • Run bin/solr create -c unittest (Unix) or bin\solr.cmd create -c unittest (Windows)

      2. Edit config.php to add the following (this assumes you are running a test Solr installation on the local machine; change the values if necessary):

      define('TEST_SEARCH_SOLR_HOSTNAME', '127.0.0.1');
      define('TEST_SEARCH_SOLR_PORT', '8983');
      define('TEST_SEARCH_SOLR_INDEXNAME', 'unittest');
      

      3. Now you can run the unit tests:

      vendor/bin/phpunit --testsuite=search_solr_testsuite
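
Before running the suite, it can help to confirm that the three constants from step 2 are actually defined. The following is a minimal sketch of such a check; the helper name `sketch_missing_constants` is hypothetical and not part of Moodle, and it is only meaningful when run from a script that has already loaded config.php.

```php
<?php
// Sketch only: a hypothetical pre-flight check, not part of Moodle.

/** Return the names of any required constants that are not defined. */
function sketch_missing_constants(array $names): array {
    return array_values(array_filter($names, function (string $name): bool {
        return !defined($name);
    }));
}

// The three constants named in step 2 of the setup instructions.
$required = ['TEST_SEARCH_SOLR_HOSTNAME', 'TEST_SEARCH_SOLR_PORT', 'TEST_SEARCH_SOLR_INDEXNAME'];
$missing = sketch_missing_constants($required);
echo $missing ? 'Missing: ' . implode(', ', $missing) : 'Solr test constants are set', "\n";
```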
      

    • Affected Branches:
      MOODLE_34_STABLE
    • Fixed Branches:
      MOODLE_35_STABLE
    • Pull Master Branch:
      MDL-59434-master

      Description

      We want search to be context aware: if you are inside a forum in a course and you search for 'fish', results containing 'fish' from inside that forum should come first, then those from sibling forums, then the section, then the course, then your other courses, then site-level content.

      Essentially, this uses the context id hierarchy in Moodle to determine how 'far' we are from the content.
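
One way to picture 'distance' is to count how many leading context ids the search location and a result have in common. The sketch below assumes context paths of the form '/1/3/45/67' (site, category, course, module); the function name `sketch_shared_depth` and the example ids are hypothetical, and this is an illustration rather than the approach the patch necessarily takes.

```php
<?php
// Sketch only: hypothetical helper, not part of Moodle's API.
// Context paths are assumed to look like '/1/3/45/67'.

/** Count how many leading context ids two context paths share. */
function sketch_shared_depth(string $originpath, string $resultpath): int {
    $origin = explode('/', trim($originpath, '/'));
    $result = explode('/', trim($resultpath, '/'));
    $shared = 0;
    $limit = min(count($origin), count($result));
    for ($i = 0; $i < $limit; $i++) {
        if ($origin[$i] !== $result[$i]) {
            break; // Paths diverge here; deeper ids cannot be shared.
        }
        $shared++;
    }
    return $shared;
}

$origin = '/1/3/45/67';                                 // Searching from inside a forum.
echo sketch_shared_depth($origin, '/1/3/45/67'), "\n";  // Same forum: 4.
echo sketch_shared_depth($origin, '/1/3/45/68'), "\n";  // Sibling forum in the same course: 3.
echo sketch_shared_depth($origin, '/1/3/90/91'), "\n";  // Another course in the same category: 2.
```

A larger shared depth means the result is closer to where the user searched from.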

      This isn't a new type of filtering; it's really just an alternate form of sort order, i.e. instead of using pure relevance to the search terms, the order is based on both relevance and proximity. Almost all engines internally support various kinds of sort order, so we would want this implemented generically at the engine API level, and then each engine can declare whether it supports any additional sort orders.

      A rough outline of what we think an approach could be:

      • Add a new method to the engine base class, supports_sort_orders(), which returns an array of supported sort orders. It would be a map of orders to lang pack keys. There would be a couple of defined options like SORT_ORDER_RELEVANCE and SORT_ORDER_CONTEXT. The base class would only return 'relevance'.
      • An engine could choose to return as many sort orders as it wished, i.e. it could return those two core ones as well as 'last mod date', 'filesize' or whatever.
      • If the selected engine supports CONTEXT, then the quick search form in the header would store an extra hidden field containing the current page's context hierarchy. Upon submission this is passed back to the engine. There would be a 'sort by' form field clearly showing that the results are ordered by context, which can be changed back to raw relevance.
      • If you go directly to the global search page then you will default to the relevance-based search. If there are other alternate orders available (besides context) then the sort order field will also be shown. However, context search is a special case: you can't simply swap from relevance back to context without providing a context. So in this case context would not be available in the sort order field.
      • The engine's execute_query() would have an additional argument for the sort order.
      • The core solr engine should implement at least one, but ideally both, other sort orders as a proof of concept. https://wiki.apache.org/solr/CommonQueryParameters#sort
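
The outline above could be sketched roughly as follows. This is not the real engine base class: the class names, the constant values, the lang pack keys and the `resolve_sort_order()` helper are all hypothetical, and only `supports_sort_orders()`, `SORT_ORDER_RELEVANCE` and `SORT_ORDER_CONTEXT` come from the proposal itself.

```php
<?php
// Sketch only: class names, constant values and lang keys are hypothetical.

class sketch_engine_base {
    const SORT_ORDER_RELEVANCE = 'relevance';
    const SORT_ORDER_CONTEXT = 'context';

    /** Map of supported sort orders to lang pack keys; the base only offers relevance. */
    public function supports_sort_orders(): array {
        return [self::SORT_ORDER_RELEVANCE => 'order_relevance'];
    }

    /** Return the requested order if it is usable, otherwise fall back to relevance. */
    public function resolve_sort_order(string $requested, ?string $contextpath): string {
        $supported = $this->supports_sort_orders();
        if (!isset($supported[$requested])) {
            return self::SORT_ORDER_RELEVANCE;
        }
        // Context ordering needs a context to measure proximity against.
        if ($requested === self::SORT_ORDER_CONTEXT && $contextpath === null) {
            return self::SORT_ORDER_RELEVANCE;
        }
        return $requested;
    }
}

class sketch_engine_solr extends sketch_engine_base {
    /** This engine adds context ordering on top of the base orders. */
    public function supports_sort_orders(): array {
        return parent::supports_sort_orders() + [self::SORT_ORDER_CONTEXT => 'order_context'];
    }
}

$engine = new sketch_engine_solr();
echo $engine->resolve_sort_order('context', '/1/3/45/67'), "\n"; // Context supplied.
echo $engine->resolve_sort_order('context', null), "\n";         // No context: falls back.
```

The fallback mirrors the special case described above: context ordering cannot be offered when no context was provided.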

      It would be up to the engine to choose how to balance relevance and proximity; this could either be hard coded into each engine, or the engine could provide admin settings to allow tuning of the boost levels.
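
As an illustration of one possible balance, the sketch below scales each result's relevance by a tunable boost per shared context level. The formula, the `$boost` setting and the function name `sketch_order_by_proximity` are assumptions made for this example; an engine may well combine the two signals differently.

```php
<?php
// Sketch only: the formula and the $boost tuning value are assumptions.

/**
 * Order results by relevance scaled up for each level of shared context.
 *
 * @param array $results Each item has 'title', 'relevance' (float) and 'shareddepth' (int).
 * @param float $boost   Extra weight per shared context level; 0 gives pure relevance order.
 */
function sketch_order_by_proximity(array $results, float $boost): array {
    usort($results, function (array $a, array $b) use ($boost): int {
        $scorea = $a['relevance'] * (1 + $boost * $a['shareddepth']);
        $scoreb = $b['relevance'] * (1 + $boost * $b['shareddepth']);
        return $scoreb <=> $scorea; // Highest combined score first.
    });
    return $results;
}

$results = [
    ['title' => 'Discussion in another course', 'relevance' => 2.0, 'shareddepth' => 1],
    ['title' => 'Discussion in this forum', 'relevance' => 1.0, 'shareddepth' => 4],
];
foreach (sketch_order_by_proximity($results, 1.0) as $result) {
    echo $result['title'], "\n";
}
```

With a boost of 1.0 the forum discussion scores 5 against 4 and is listed first; with a boost of 0 the order reverts to pure relevance. A sufficiently large boost approximates strict grouping by context.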

      See the rough UI mockup below.

      The preselected option would depend on how you searched:

      • Relevance would be shown if you searched from the site dashboard or navigated directly to the global search page
      • Contextual would be selected if you searched from a site subpage, e.g. a forum
      • Most recent would always need to be selected by the user
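
The preselection rules above amount to a small decision function. The sketch below is illustrative only: the function name `sketch_default_sort_order` and the order keys 'relevance', 'context' and 'recent' are hypothetical.

```php
<?php
// Sketch only: function name and order keys are hypothetical.

/**
 * Pick the preselected sort order for the results page.
 *
 * @param int|null    $contextid Context the search was launched from, or null when there is
 *                               none (site dashboard, or direct navigation to the search page).
 * @param string|null $userorder Order explicitly chosen by the user, if any.
 */
function sketch_default_sort_order(?int $contextid, ?string $userorder): string {
    if ($userorder !== null) {
        return $userorder; // An explicit choice, e.g. 'recent', always wins.
    }
    return $contextid === null ? 'relevance' : 'context';
}

echo sketch_default_sort_order(null, null), "\n";     // Dashboard or direct navigation.
echo sketch_default_sort_order(67, null), "\n";       // Searched from a subpage such as a forum.
echo sketch_default_sort_order(67, 'recent'), "\n";   // User picked 'most recent'.
```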

      We would also display the context the user is searching in below the global search heading, as shown in the mockup.

      Additionally, there may need to be an admin setting for how strongly we weight this, for instance we could say that a site-level support activity has a bit more weight.

      The very broad tasks that need to be done here are:

      • Have search boxes in Moodle pass the context they were invoked in when they pass the search terms - core change
      • Construct a context hierarchy with weightings (boost values) - core change, search plugins can override
      • Search "types" - core to define, plugins to override with ones they support
      • Alter and construct the search queries that go to the search engine backends to reflect the chosen sort types - search plugins
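
To make the second and fourth tasks more concrete, the sketch below builds a map of context ids to boost values from a context path and formats it as a boost query string. It assumes a dismax-style `bq` boost parameter and a `contextid` field in the index; both are assumptions about the backend, and the function names and the linear weighting are hypothetical.

```php
<?php
// Sketch only: assumes a dismax-style 'bq' boost parameter and a 'contextid' index field.

/**
 * Give each context on the path a boost that grows with depth (closer = larger).
 *
 * @param string $contextpath Context path such as '/1/3/45/67'.
 * @param float  $step        Boost added per level of depth.
 * @return array Map of context id => boost value.
 */
function sketch_context_boosts(string $contextpath, float $step = 1.0): array {
    $boosts = [];
    $depth = 0;
    foreach (explode('/', trim($contextpath, '/')) as $id) {
        $depth++;
        $boosts[(int)$id] = $depth * $step;
    }
    return $boosts;
}

/** Format a boost map as space-separated 'field:value^boost' clauses. */
function sketch_boost_query(array $boosts): string {
    $clauses = [];
    foreach ($boosts as $id => $boost) {
        $clauses[] = 'contextid:' . $id . '^' . $boost;
    }
    return implode(' ', $clauses);
}

echo sketch_boost_query(sketch_context_boosts('/1/3/45/67')), "\n";
```

Note that a document only matches the clause for its own context id, so boosting content in sibling activities would need an additional indexed field, such as a course id, which this sketch does not cover.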

        Attachments

        1. mdl-59434.png
          28 kB
          Sam Marshall
        2. Selection_040.png
          49 kB
          Matt Porritt

              People

              • Votes: 4
              • Watchers: 10

                Dates

                • Fix Release Date: 17/May/18