Details

    • Testing Instructions:
      Hide

      Unit test coverage is pretty good, but unfortunately we can not add behat tests as all this depends on external software so we need some manual testing.

      Installation

      1. Follow only 'Installation' section in https://docs.moodle.org/dev/Global_search#Installation (this page will be moved to 31/Global_search_solr)
        • After this you should have a working solr server with an index instance for moodle

      Setup

      1. Upgrade moodle
      2. Relax, it is not as bad as it seems
      3. Go to site admin -> plugins -> search
      4. Enable search should be disabled by default
      5. Open a new tab and go to your dashboard, press 'Customise this page' and add a global search block
      6. You should be able to add it and you should see the following message inside of it 'Global searching is not enabled.'
      7. Leave this tab open, return to Manage global search and enable it
      8. Return to your dashboard, now you should see a search form where you can filter by area
      9. Type something and press search
      10. You should be redirected to search/index.php and you should see 'No results'
      11. Close this tab and return to the previous one
      12. Go to site admin -> plugins -> search -> solr and save changes (basically set host to localhost, port to the default 8983 and indexname to moodle)
      13. Open a CLI and while in moodle's root execute php search/engine/solr/cli/setup_schema.php
      14. It should finish saying something like Adding moodle fields... and The schema is ready to be used
      15. Run again the same script, no exception should be triggered and you should see something like Adding moodle fields... and The schema is ready to be used

      Indexing & searching

      1. Go to reports -> search
      2. Use tool_generator to generate a few courses with forums if your site don't have much data (we will need forum posts)
      3. Tick recreate index and save changes. I hope you don't have many many forum_posts this will send them to the search engine, may take a bit of time if you have thousands and thousands posts, take a break, you deserve it
      4. At some point this will finish and will show a 'Indexing finished' and the screen last run numbers will be updated with some numbers
      5. Open your dashboard in a new tab and use the global search block to search something you know that your forum post contain, for example 'forum'
      6. You should see the list of results including that search string
      7. Click all links in one of these results, they should all be valid results
      8. When you have more than 10 results you will see pagination links, try that these pagination links work
      9. Search something else with valid results and check that you can see results
      10. Search something else with a crazy query string like dfsghuasdguhisdfgui and check that you can not see results
      11. If will be nice if you can look at the query field help and try some combinations
      12. Return to reports -> search browser tab and tick 'Delete' checkbox, 'Entire index' should be selected already, press "Save changes"
      13. Back to dashboard, search a valid result, you should not see results now
      14. Filter queries are unit tested but it will be nice if you can look at the results you get and try some query combinations checking that the results you get are correct

      Unit tests

      1. Go to the directory where you installed solr (/something/solr-5.4.1 probably)
      2. Run bin/solr create -c unittest
      3. Add TEST_SEARCH_SOLR_* vars in your config.php (see search/engine/solr/tests/engine_test.php php-doc block)
      4. Run vendor/bin/phpunit search/engine/solr/tests/engine_test.php and check that the tests are executed and not skipped, they should pass
      5. Run the whole phpunit suite, everything should pass
      Show
      Unit test coverage is pretty good, but unfortunately we can not add behat tests as all this depends on external software so we need some manual testing. Installation Follow only 'Installation' section in https://docs.moodle.org/dev/Global_search#Installation (this page will be moved to 31/Global_search_solr) After this you should have a working solr server with an index instance for moodle Setup Upgrade moodle Relax, it is not as bad as it seems Go to site admin -> plugins -> search Enable search should be disabled by default Open a new tab and go to your dashboard, press 'Customise this page' and add a global search block You should be able to add it and you should see the following message inside of it 'Global searching is not enabled.' Leave this tab open, return to Manage global search and enable it Return to your dashboard, now you should see a search form where you can filter by area Type something and press search You should be redirected to search/index.php and you should see 'No results' Close this tab and return to the previous one Go to site admin -> plugins -> search -> solr and save changes (basically set host to localhost, port to the default 8983 and indexname to moodle) Open a CLI and while in moodle's root execute php search/engine/solr/cli/setup_schema.php It should finish saying something like Adding moodle fields... and The schema is ready to be used Run again the same script, no exception should be triggered and you should see something like Adding moodle fields... and The schema is ready to be used Indexing & searching Go to reports -> search Use tool_generator to generate a few courses with forums if your site don't have much data (we will need forum posts) Tick recreate index and save changes. I hope you don't have many many forum_posts this will send them to the search engine, may take a bit of time if you have thousands and thousands posts, take a break, you deserve it At some point this will finish and will show a 'Indexing finished' and the screen last run numbers will be updated with some numbers Open your dashboard in a new tab and use the global search block to search something you know that your forum post contain, for example 'forum' You should see the list of results including that search string Click all links in one of these results, they should all be valid results When you have more than 10 results you will see pagination links, try that these pagination links work Search something else with valid results and check that you can see results Search something else with a crazy query string like dfsghuasdguhisdfgui and check that you can not see results If will be nice if you can look at the query field help and try some combinations Return to reports -> search browser tab and tick 'Delete' checkbox, 'Entire index' should be selected already, press "Save changes" Back to dashboard, search a valid result, you should not see results now Filter queries are unit tested but it will be nice if you can look at the results you get and try some query combinations checking that the results you get are correct Unit tests Go to the directory where you installed solr (/something/solr-5.4.1 probably) Run bin/solr create -c unittest Add TEST_SEARCH_SOLR_* vars in your config.php (see search/engine/solr/tests/engine_test.php php-doc block) Run vendor/bin/phpunit search/engine/solr/tests/engine_test.php and check that the tests are executed and not skipped, they should pass Run the whole phpunit suite, everything should pass
    • Affected Branches:
      MOODLE_30_STABLE
    • Fixed Branches:
      MOODLE_31_STABLE
    • Epic Name:
      Global search
    • Pull from Repository:
    • Pull Master Branch:
      MDL-31989_master
    • Story Points (Obsolete):
      20

      Description

      Global Search is currently removed from Moodle 2.2.

      As part of GSoC '13 project, I've re-written Tomasz Muras's Global Search implementation for the current 2.6 (dev) master branch.

      You may view the demo and play with it here: http://global-search.jmuras.com/
      ( Username:search_admin Password: Search_admin1 )

      The docs are here: http://docs.moodle.org/dev/Global_search

        Gliffy Diagrams

        1. Screen Shot 2016-02-22 at 16.53.33.png
          32 kB
        2. search-block.png
          12 kB
        3. search-results.png
          103 kB
        4. search-setup.png
          94 kB
        5. search-solr-setup.png
          150 kB

          Issue Links

            Issues in Epic

              Activity

              Hide
              nebgor Aparup Banerjee added a comment -

              Thanks for working on this Tomasz

              It might be useful to put up a git repository link here.
              (also because i'm wondering how its going )

              Show
              nebgor Aparup Banerjee added a comment - Thanks for working on this Tomasz It might be useful to put up a git repository link here. (also because i'm wondering how its going )
              Hide
              tmuras Tomasz Muras added a comment -

              I've added link to my git repo to the description.
              As I've promised in the forums, I am starting with the performance testing. I'm going to test:

              • performance of the indexing
              • performance of the search (split into Lucene performance and perfomance of security checking - that is, filtering down the search results depending on the permissions)
              • memory footprint of the above

              Although it's far from completion, I have now enough code to do that. I will post my finindings in the dev forum.

              Show
              tmuras Tomasz Muras added a comment - I've added link to my git repo to the description. As I've promised in the forums, I am starting with the performance testing. I'm going to test: performance of the indexing performance of the search (split into Lucene performance and perfomance of security checking - that is, filtering down the search results depending on the permissions) memory footprint of the above Although it's far from completion, I have now enough code to do that. I will post my finindings in the dev forum.
              Hide
              nebgor Aparup Banerjee added a comment -

              this isn't minor! --> major!

              Show
              nebgor Aparup Banerjee added a comment - this isn't minor! --> major!
              Hide
              tmuras Tomasz Muras added a comment -

              Checked indexing performance - moving on the the performance of the search. See http://moodle.org/mod/forum/discuss.php?d=199636 for discussion.

              Show
              tmuras Tomasz Muras added a comment - Checked indexing performance - moving on the the performance of the search. See http://moodle.org/mod/forum/discuss.php?d=199636 for discussion.
              Hide
              tmuras Tomasz Muras added a comment -

              Update on http://moodle.org/mod/forum/discuss.php?d=199636 on index optimization.

              Show
              tmuras Tomasz Muras added a comment - Update on http://moodle.org/mod/forum/discuss.php?d=199636 on index optimization.
              Hide
              nadavkav Nadav Kavalerchik added a comment -

              Hi Tomasz
              Is this ready for some testing? what is the timeline and when it is scheduled for upstream integration?

              Show
              nadavkav Nadav Kavalerchik added a comment - Hi Tomasz Is this ready for some testing? what is the timeline and when it is scheduled for upstream integration?
              Hide
              tmuras Tomasz Muras added a comment -

              Hi Nadav,

              Global Search 2 rewrite will not be ready before Moodle 2.4. As soon as I get some time, I'll continue my work on testing the search performance. I will be updating forum discussion with my findings.

              Show
              tmuras Tomasz Muras added a comment - Hi Nadav, Global Search 2 rewrite will not be ready before Moodle 2.4. As soon as I get some time, I'll continue my work on testing the search performance. I will be updating forum discussion with my findings.
              Hide
              nadavkav Nadav Kavalerchik added a comment -

              Hi Tomasz,
              Thank you for the clarification

              btw, I have already git-clone it from you and downloaded the latest Zend/Search library files
              I got the search_reset_index() to work, I guess. but not the search.php
              If you have any tips or suggestions... that would be nice

              Show
              nadavkav Nadav Kavalerchik added a comment - Hi Tomasz, Thank you for the clarification btw, I have already git-clone it from you and downloaded the latest Zend/Search library files I got the search_reset_index() to work, I guess. but not the search.php If you have any tips or suggestions... that would be nice
              Hide
              tmuras Tomasz Muras added a comment -

              The development is not finished there yet. While search should work already, the biggest work now (implementation-wise) is to do access control.
              Would you be interested in helping with the development/coding?

              Show
              tmuras Tomasz Muras added a comment - The development is not finished there yet. While search should work already, the biggest work now (implementation-wise) is to do access control. Would you be interested in helping with the development/coding?
              Hide
              nadavkav Nadav Kavalerchik added a comment -

              Like, you. I am very much busy with allot of work, which i already committed to.
              so I can not promises to help the development. but please feel free to ask anything, specific, when it arises.
              I can definitely help testing. ( I have set "watching" on the git repository )

              Show
              nadavkav Nadav Kavalerchik added a comment - Like, you. I am very much busy with allot of work, which i already committed to. so I can not promises to help the development. but please feel free to ask anything, specific, when it arises. I can definitely help testing. ( I have set "watching" on the git repository )
              Hide
              dougiamas Martin Dougiamas added a comment -

              I'm keeping an eye on this. Let us know where/how we can help.

              Show
              dougiamas Martin Dougiamas added a comment - I'm keeping an eye on this. Let us know where/how we can help.
              Hide
              tmuras Tomasz Muras added a comment -

              Just to update you: I'm very slow at the moment because of the lack of free time.
              My plan is to do some slow progress between now and end of September and then do most of the work until the end of October. That would leave November for the reviews and integration - so it can be included in Moodle 2.4 in December. I'd sill mark it as experimental feature and keep working on it in later releases.

              Show
              tmuras Tomasz Muras added a comment - Just to update you: I'm very slow at the moment because of the lack of free time. My plan is to do some slow progress between now and end of September and then do most of the work until the end of October. That would leave November for the reviews and integration - so it can be included in Moodle 2.4 in December. I'd sill mark it as experimental feature and keep working on it in later releases.
              Hide
              dougiamas Martin Dougiamas added a comment -

              Thanks, Tomasz!

              Code freeze for 2.4 is actually mid-October, and it really should already have had some HQ review by then too. it would be great if it gets that attention before then but otherwise it would have to wait for 2.5.

              Show
              dougiamas Martin Dougiamas added a comment - Thanks, Tomasz! Code freeze for 2.4 is actually mid-October, and it really should already have had some HQ review by then too. it would be great if it gets that attention before then but otherwise it would have to wait for 2.5.
              Hide
              tmuras Tomasz Muras added a comment -

              I will keep the ball rolling: http://moodle.org/mod/forum/discuss.php?d=208958 .
              I think it would be better to have even simper version available for 2.4 than to wait again until 2.5. I should get it done mid-October and I'll make sure that the implementation is not a surprise for any core dev.

              Show
              tmuras Tomasz Muras added a comment - I will keep the ball rolling: http://moodle.org/mod/forum/discuss.php?d=208958 . I think it would be better to have even simper version available for 2.4 than to wait again until 2.5. I should get it done mid-October and I'll make sure that the implementation is not a surprise for any core dev.
              Hide
              tmuras Tomasz Muras added a comment -

              Hi All,

              It's been another weekend when I planned to do some work on Global Search but couldn't. I hate to say it but the truth is that I don't have enough time to work on it and will not be able to deliver!
              At the very least I will dump all I've learnt so far onto the wiki page, to make it easier for another developer to start where I've stopped.

              Sorry about that, I've overestimated the resources I had at my disposal . If nobody picks it up, then it's very likely I will complete the coding at some point but absolutely don't count on it.

              cheers,
              Tomek

              Show
              tmuras Tomasz Muras added a comment - Hi All, It's been another weekend when I planned to do some work on Global Search but couldn't. I hate to say it but the truth is that I don't have enough time to work on it and will not be able to deliver! At the very least I will dump all I've learnt so far onto the wiki page, to make it easier for another developer to start where I've stopped. Sorry about that, I've overestimated the resources I had at my disposal . If nobody picks it up, then it's very likely I will complete the coding at some point but absolutely don't count on it. cheers, Tomek
              Hide
              libertymoodle Luis de Vasconcelos added a comment -

              Will this Global Search function be part of 2.4?

              Show
              libertymoodle Luis de Vasconcelos added a comment - Will this Global Search function be part of 2.4?
              Hide
              tmuras Tomasz Muras added a comment -

              No, it will not.

              Show
              tmuras Tomasz Muras added a comment - No, it will not.
              Hide
              libertymoodle Luis de Vasconcelos added a comment -

              wrensoft.com has a search engine that looks interesting. We plan to test it and see if it is a viable solution.

              How to create a search engine for your Moodle site
              http://www.wrensoft.com/zoom/support/tutorial_moodle.html

              How to index sites requiring authentication with Zoom
              http://www.wrensoft.com/zoom/support/auth.html

              Show
              libertymoodle Luis de Vasconcelos added a comment - wrensoft.com has a search engine that looks interesting. We plan to test it and see if it is a viable solution. How to create a search engine for your Moodle site http://www.wrensoft.com/zoom/support/tutorial_moodle.html How to index sites requiring authentication with Zoom http://www.wrensoft.com/zoom/support/auth.html
              Hide
              tsala Helen Foster added a comment -

              Hi Tomasz,

              I hope you don't mind me adding an infobox to http://docs.moodle.org/dev/Global_Search so people can easily see details of the project. I also removed the 2.3 template. Please amend as necessary.

              Show
              tsala Helen Foster added a comment - Hi Tomasz, I hope you don't mind me adding an infobox to http://docs.moodle.org/dev/Global_Search so people can easily see details of the project. I also removed the 2.3 template. Please amend as necessary.
              Hide
              aspark Alistair Spark added a comment -

              We have just finalised the implementation of the wrensoft search tool which works very well for us as a small institution with a limited number of long courses.

              The Version 7 alpha release is required - http://www.wrensoft.com/zoom/v7.html

              We have created student accounts for each course which index the appropriate Moodle courses which students will have access to.

              I wouldn't want to implement this on a site with 1500 courses. We essentially have 4 main courses which run for 3-5 years and have massive amounts of contents. For this, this software package is perfect.

              Show
              aspark Alistair Spark added a comment - We have just finalised the implementation of the wrensoft search tool which works very well for us as a small institution with a limited number of long courses. The Version 7 alpha release is required - http://www.wrensoft.com/zoom/v7.html We have created student accounts for each course which index the appropriate Moodle courses which students will have access to. I wouldn't want to implement this on a site with 1500 courses. We essentially have 4 main courses which run for 3-5 years and have massive amounts of contents. For this, this software package is perfect.
              Hide
              libertymoodle Luis de Vasconcelos added a comment -

              Alistair, why do you say you "wouldn't want to implement this on a site with 1500 courses"? Is that Zoom Search product not a good one for large Moodle installations? If not, can you describe why not? Thanks.

              Show
              libertymoodle Luis de Vasconcelos added a comment - Alistair, why do you say you "wouldn't want to implement this on a site with 1500 courses"? Is that Zoom Search product not a good one for large Moodle installations? If not, can you describe why not? Thanks.
              Hide
              aspark Alistair Spark added a comment -

              Hi Luis,

              The Zoom Search product is a great one and it can cope with 100 000s of files without any issues, their tech support is very good and ultra-responsive (free limited time trial, bug fix within 24 hours of emailing,...) - we were impressed by the cost&service.

              However, from an architectural standpoint we have gone through the route of creating student-like read-only accounts so that we have a search page for each student group (class/academic year combinations), those accounts have access to the exact same courses that the student group would have.

              Once you have created one working template for your site, creating more is just a question of changing the username&password combinations, scheduling the indexing and creating a new search box in the course (html block).

              Each of these search engines are then located in a different folder (eg http://domain.com/course1year1/search.php http://domain.com/course2year1/search.php) - this is automated by the FTP upload settings.

              However, we only needed to configure about 30 profiles in total. (ie Course 1 Year 1, Course 1 Year 2, Course 2 Year 1,....). If we had to have 1500 manual accounts in Moodle just for search, that would become a massive job to create accounts, configure all of the profiles, it becomes kind of a security risk, and maintaining this in the long run would be hard to keep track of.

              You could simply have a setup which would crawl your whole Moodle site and give you search results BUT if you have that many courses, it is more than likely that some will cover the same subject at various points, causing major user annoyance when they get the error message telling them they don't have permissions to view these results.

              Show
              aspark Alistair Spark added a comment - Hi Luis, The Zoom Search product is a great one and it can cope with 100 000s of files without any issues, their tech support is very good and ultra-responsive (free limited time trial, bug fix within 24 hours of emailing,...) - we were impressed by the cost&service. However, from an architectural standpoint we have gone through the route of creating student-like read-only accounts so that we have a search page for each student group (class/academic year combinations), those accounts have access to the exact same courses that the student group would have. Once you have created one working template for your site, creating more is just a question of changing the username&password combinations, scheduling the indexing and creating a new search box in the course (html block). Each of these search engines are then located in a different folder (eg http://domain.com/course1year1/search.php http://domain.com/course2year1/search.php ) - this is automated by the FTP upload settings. However, we only needed to configure about 30 profiles in total. (ie Course 1 Year 1, Course 1 Year 2, Course 2 Year 1,....). If we had to have 1500 manual accounts in Moodle just for search, that would become a massive job to create accounts, configure all of the profiles, it becomes kind of a security risk, and maintaining this in the long run would be hard to keep track of. You could simply have a setup which would crawl your whole Moodle site and give you search results BUT if you have that many courses, it is more than likely that some will cover the same subject at various points, causing major user annoyance when they get the error message telling them they don't have permissions to view these results.
              Hide
              dougiamas Martin Dougiamas added a comment -

              In the hackfest last week we came up with a pretty good design for completing this project.

              If this was a GSOC project, would you be available to mentor it Tomasz? I guess it would not be a lot of work for you (certainly much less than writing code!)

              Show
              dougiamas Martin Dougiamas added a comment - In the hackfest last week we came up with a pretty good design for completing this project. If this was a GSOC project, would you be available to mentor it Tomasz? I guess it would not be a lot of work for you (certainly much less than writing code!)
              Hide
              libertymoodle Luis de Vasconcelos added a comment -

              Thanks Martin. Is there any documentation for that design?

              Show
              libertymoodle Luis de Vasconcelos added a comment - Thanks Martin. Is there any documentation for that design?
              Hide
              nikunjness Nikunj Thakkar added a comment -

              Hello people,

              I am also interested in this project.One of our professor did a research for a similar kind of application and I got their research paper and currently I am working on prototype design for the project.I am first time applying for GSoC this year.So I have very little idea about approaching the project.But I am very much interested in this project.I will be updating the status of the application here.Any suggestions are appreciated.

              Show
              nikunjness Nikunj Thakkar added a comment - Hello people, I am also interested in this project.One of our professor did a research for a similar kind of application and I got their research paper and currently I am working on prototype design for the project.I am first time applying for GSoC this year.So I have very little idea about approaching the project.But I am very much interested in this project.I will be updating the status of the application here.Any suggestions are appreciated.
              Hide
              libertymoodle Luis de Vasconcelos added a comment -

              Martin, is that design that you created at the hackfest the same design described on: http://docs.moodle.org/dev/Global_search_%28GSoC2013%29

              Are you happy that that solution will only work with Moodle sites running on Apache? And other web servers, including IIS, are not supported?

              Show
              libertymoodle Luis de Vasconcelos added a comment - Martin, is that design that you created at the hackfest the same design described on: http://docs.moodle.org/dev/Global_search_%28GSoC2013%29 Are you happy that that solution will only work with Moodle sites running on Apache? And other web servers, including IIS, are not supported?
              Hide
              tmuras Tomasz Muras added a comment - - edited

              @Luis
              Is that because atm there is no DLL for solr extension? http://www.php.net/manual/en/solr.installation.php

              Show
              tmuras Tomasz Muras added a comment - - edited @Luis Is that because atm there is no DLL for solr extension? http://www.php.net/manual/en/solr.installation.php
              Hide
              xan Prateek Sachan added a comment -

              Here's a quick update of Moodle's Global Search feature. Currently, many of the things have been implemented and I would like to ask developers to try it and test it for any security leaks.

              The complete code is on Github here. The Global Search discussion is here.

              All the information on is given in the Global Search wiki:

              Steps on how to install and set-up Global Search prototype is given here: http://docs.moodle.org/dev/Global_search#Prototype:_version_1.0
              List of search features currently implemented: http://docs.moodle.org/dev/Global_search#Introduction
              Searchable content through Global Search: http://docs.moodle.org/dev/Global_search#Searchable_content_in_Global_Search

              I've also included a section Tests where there are many cases to test. Developers may test these cases or add their own. You may update the wiki with your own test cases and results of the test, which would help me in making Global Search better and prone to more security leaks.

              If you have any problem in setting-up Global Search, feel free to post here Testing Moodle's Global Search prototype or you may directly message me.

              Thanks.

              Show
              xan Prateek Sachan added a comment - Here's a quick update of Moodle's Global Search feature. Currently, many of the things have been implemented and I would like to ask developers to try it and test it for any security leaks. The complete code is on Github here . The Global Search discussion is here . All the information on is given in the Global Search wiki : Steps on how to install and set-up Global Search prototype is given here: http://docs.moodle.org/dev/Global_search#Prototype:_version_1.0 List of search features currently implemented: http://docs.moodle.org/dev/Global_search#Introduction Searchable content through Global Search: http://docs.moodle.org/dev/Global_search#Searchable_content_in_Global_Search I've also included a section Tests where there are many cases to test. Developers may test these cases or add their own. You may update the wiki with your own test cases and results of the test, which would help me in making Global Search better and prone to more security leaks. If you have any problem in setting-up Global Search, feel free to post here Testing Moodle's Global Search prototype or you may directly message me . Thanks.
              Show
              libertymoodle Luis de Vasconcelos added a comment - Tomasz, Will these compiled SOLR extensions not work on Windows? https://github.com/ecaron/php_solr.dll or http://downloads.php.net/pierre/php_solr-5.3-nts-svn20091122-vc9-x86.zip http://downloads.php.net/pierre/php_solr-5.3-svn20091122-vc9-x86.zip
              Hide
              nadavkav Nadav Kavalerchik added a comment -

              Installing php-solr-pecl extention on Debian/Ubuntu
              http://www.wijiti.com/help/37-jsolr/140-installing-apache-solr-php-extension

              Show
              nadavkav Nadav Kavalerchik added a comment - Installing php-solr-pecl extention on Debian/Ubuntu http://www.wijiti.com/help/37-jsolr/140-installing-apache-solr-php-extension
              Hide
              tmuras Tomasz Muras added a comment -

              Luis de Vasconcelos

              Luis,

              I'm using Debian and Ubuntu for the deployments, and this is what I'm testing. Tested instructions how to set it up there are on http://docs.moodle.org/dev/Global_search_%28GSoC2013%29 - I would recommend you to stick to LAMP environment. If you want to test it on Windows, then simply check those DLLs and let us know if they work for you.

              Tomek

              Show
              tmuras Tomasz Muras added a comment - Luis de Vasconcelos Luis, I'm using Debian and Ubuntu for the deployments, and this is what I'm testing. Tested instructions how to set it up there are on http://docs.moodle.org/dev/Global_search_%28GSoC2013%29 - I would recommend you to stick to LAMP environment. If you want to test it on Windows, then simply check those DLLs and let us know if they work for you. Tomek
              Hide
              xan Prateek Sachan added a comment - - edited

              Hi,
              Dan Poltawski, Eloy Lafuente (stronk7), Sam Hemelryk

              I wanted to add you all to this issue to help integrate Global Search into Moodle core.
              Aparup Banerjee advised me to add you.

              This is the original branch built on MOODLE_25 as base: https://github.com/prateeksachan/moodle/tree/gs2
              This is the rebased branch, (rebased off current master): https://github.com/prateeksachan/moodle/tree/gs2rebased

              You may find Global Search docs here: http://docs.moodle.org/dev/Global_search

              Thanks.

              Show
              xan Prateek Sachan added a comment - - edited Hi, Dan Poltawski , Eloy Lafuente (stronk7) , Sam Hemelryk I wanted to add you all to this issue to help integrate Global Search into Moodle core. Aparup Banerjee advised me to add you. This is the original branch built on MOODLE_25 as base: https://github.com/prateeksachan/moodle/tree/gs2 This is the rebased branch, (rebased off current master): https://github.com/prateeksachan/moodle/tree/gs2rebased You may find Global Search docs here: http://docs.moodle.org/dev/Global_search Thanks.
              Hide
              timhunt Tim Hunt added a comment -

              What about mod/quiz?

              How does this work? What do I need to do to enable Global search in a new module? I can't see anything on http://docs.moodle.org/dev/Global_search.

              Show
              timhunt Tim Hunt added a comment - What about mod/quiz? How does this work? What do I need to do to enable Global search in a new module? I can't see anything on http://docs.moodle.org/dev/Global_search .
              Hide
              xan Prateek Sachan added a comment - - edited

              Hi Tim.

              Thanks for the quick feedback. Actually the quiz module isn't covered under Global Search yet. I'll be writing the API for the remaining mods shortly. They have plenty of security issues and I restricted myself mainly to Resources and some smaller modules. These are the mods/resources for which I've written the GS APIs currently : http://docs.moodle.org/dev/Global_search#Searchable_content_in_Global_Search

              Edit: Every mod/resource has a different set of APIs to handle the content being indexed and its security which varies across different mods/resources.

              I've updated the Global Search docs. You may refer to http://docs.moodle.org/dev/Global_search_(GSoC2013)#Quick_setup_for_testing
              One needs to first download and install the pecl-php-solr extension, clone gs2rebased branch and set-up solr as mentioned in the docs.

              I've also included a readme file for the same purpose. Also, Tomasz installed installed GS on his own server (http://global-search.jmuras.com/) a few weeks back. (Currently, it hasn't been updated but you may view it. It works perfectly).

              Thanks.

              Show
              xan Prateek Sachan added a comment - - edited Hi Tim. Thanks for the quick feedback. Actually the quiz module isn't covered under Global Search yet. I'll be writing the API for the remaining mods shortly. They have plenty of security issues and I restricted myself mainly to Resources and some smaller modules. These are the mods/resources for which I've written the GS APIs currently : http://docs.moodle.org/dev/Global_search#Searchable_content_in_Global_Search Edit: Every mod/resource has a different set of APIs to handle the content being indexed and its security which varies across different mods/resources. I've updated the Global Search docs. You may refer to http://docs.moodle.org/dev/Global_search_(GSoC2013)#Quick_setup_for_testing One needs to first download and install the pecl-php-solr extension, clone gs2rebased branch and set-up solr as mentioned in the docs. I've also included a readme file for the same purpose. Also, Tomasz installed installed GS on his own server ( http://global-search.jmuras.com/ ) a few weeks back. (Currently, it hasn't been updated but you may view it. It works perfectly). Thanks.
              Hide
              drex Mark Drechsler added a comment -

              Noting that this doesn't have a fix version - anyone know when this is targeted for implementation?

              Show
              drex Mark Drechsler added a comment - Noting that this doesn't have a fix version - anyone know when this is targeted for implementation?
              Hide
              xan Prateek Sachan added a comment - - edited

              Hi Mark,

              The original branch was made on Moodle 2.5. I've now rebased it off the current master dev.

              My plan was to make it a part of core (2.6+) which would be released in November. I hope the feature gets integrated in the coming weeks. It is open for testing right now. You may see the forum entry: Moodle's Global Search

              Show
              xan Prateek Sachan added a comment - - edited Hi Mark, The original branch was made on Moodle 2.5. I've now rebased it off the current master dev. My plan was to make it a part of core (2.6+) which would be released in November. I hope the feature gets integrated in the coming weeks. It is open for testing right now. You may see the forum entry: Moodle's Global Search
              Hide
              dougiamas Martin Dougiamas added a comment -

              Hi Prateek,

              I don't want you to get your hopes up too much .... code freeze for 2.6 is two weeks or so away and there is a huge pile of work for us to deal with. I really doubt that there will be time for Global Search to get the review it needs. However, assuming the code is good then we should be able to land it early in the cycle for 2.7, probably January.

              Thanks for your work and I'm really keen to see it working!

              Cheers,
              Martin

              Show
              dougiamas Martin Dougiamas added a comment - Hi Prateek, I don't want you to get your hopes up too much .... code freeze for 2.6 is two weeks or so away and there is a huge pile of work for us to deal with. I really doubt that there will be time for Global Search to get the review it needs. However, assuming the code is good then we should be able to land it early in the cycle for 2.7, probably January. Thanks for your work and I'm really keen to see it working! Cheers, Martin
              Hide
              xan Prateek Sachan added a comment -

              Sure..maybe 2.7 next year then.

              Thanks.

              Show
              xan Prateek Sachan added a comment - Sure..maybe 2.7 next year then. Thanks.
              Hide
              tmuras Tomasz Muras added a comment -

              OK, So it looks to me, that the best way to go, is to wait for 2.6 release and new master for 2.7. As soon as we have a new branch, we should seize the opportunity and send the code for integration.

              Show
              tmuras Tomasz Muras added a comment - OK, So it looks to me, that the best way to go, is to wait for 2.6 release and new master for 2.7. As soon as we have a new branch, we should seize the opportunity and send the code for integration.
              Hide
              xan Prateek Sachan added a comment - - edited

              Yes. I hope it gets the review required by the next stable moodle (2.7) release. In the remaining months of this year, I'll try and write GS APIs for the remaining modules and then asking moodle integrators to review it.

              Thanks.

              Show
              xan Prateek Sachan added a comment - - edited Yes. I hope it gets the review required by the next stable moodle (2.7) release. In the remaining months of this year, I'll try and write GS APIs for the remaining modules and then asking moodle integrators to review it. Thanks.
              Hide
              dougiamas Martin Dougiamas added a comment - - edited

              The first reviewers at HQ will be the Moodle HQ BACKEND team.

              Of course anyone from the community is welcome to pitch in with reviews and comments here anytime.

              Show
              dougiamas Martin Dougiamas added a comment - - edited The first reviewers at HQ will be the Moodle HQ BACKEND team. Of course anyone from the community is welcome to pitch in with reviews and comments here anytime.
              Hide
              libertymoodle Luis de Vasconcelos added a comment -

              This new Global Search requires Apache Solr, so what will happen to Moodle sites that run under IIS on Windows? Will Global Search not work under IIS?

              Show
              libertymoodle Luis de Vasconcelos added a comment - This new Global Search requires Apache Solr, so what will happen to Moodle sites that run under IIS on Windows? Will Global Search not work under IIS?
              Hide
              xan Prateek Sachan added a comment -

              Hi Luis,

              As pointed out earlier, GS uses php solr extension. As long as you have php solr extension installed on your server, GS will work. It's just another php extension, so it won't affect Moodle sites. (I've just tested out GS on LAMP). You may find this link useful which underlines all the dependencies and their installation of php solr extension.

              Show
              xan Prateek Sachan added a comment - Hi Luis, As pointed out earlier, GS uses php solr extension. As long as you have php solr extension installed on your server, GS will work. It's just another php extension, so it won't affect Moodle sites. (I've just tested out GS on LAMP). You may find this link useful which underlines all the dependencies and their installation of php solr extension.
              Hide
              itamart Itamar Tzadok added a comment -

              Prateek, I haven't had the chance to examine thoroughly so off hand do you have any idea how easy or otherwise it would be to make this work with Solr php client? For windows installations it may not be straightforward to add a php_solr extension but the client may be a viable option.

              Show
              itamart Itamar Tzadok added a comment - Prateek, I haven't had the chance to examine thoroughly so off hand do you have any idea how easy or otherwise it would be to make this work with Solr php client? For windows installations it may not be straightforward to add a php_solr extension but the client may be a viable option.
              Hide
              xan Prateek Sachan added a comment - - edited

              Hi Itamar Tzadok,

              I researched a bit for choosing the client for interacting with solr. Making it work using the Solr Php client would be easy and do-able, though may require many modifications in the current implementation. I concentrated towards using the php_solr extension.

              There are many php_solr dlls available for Windows. I haven't checked them though. I used LAMP for GS implementation.

              https://github.com/ecaron/php_solr.dll
              http://downloads.php.net/pierre/php_solr-5.3-nts-svn20091122-vc9-x86.zip
              http://downloads.php.net/pierre/php_solr-5.3-svn20091122-vc9-x86.zip

              If you want, you may test it on Windows using the dlls, and notify me. It would be great.

              Edit: I installed php_solr extension using the github dll given above on Windows. The extension is working.

              Thanks.

              Show
              xan Prateek Sachan added a comment - - edited Hi Itamar Tzadok , I researched a bit for choosing the client for interacting with solr. Making it work using the Solr Php client would be easy and do-able, though may require many modifications in the current implementation. I concentrated towards using the php_solr extension. There are many php_solr dlls available for Windows. I haven't checked them though. I used LAMP for GS implementation. https://github.com/ecaron/php_solr.dll http://downloads.php.net/pierre/php_solr-5.3-nts-svn20091122-vc9-x86.zip http://downloads.php.net/pierre/php_solr-5.3-svn20091122-vc9-x86.zip If you want, you may test it on Windows using the dlls, and notify me. It would be great. Edit: I installed php_solr extension using the github dll given above on Windows. The extension is working. Thanks.
              Hide
              juerg.hoerner Juerg Hoerner added a comment -

              in your testmoodle I tried to find words in the attached pdf in the glossary. As example: "Introduction" . Maybe that this does not work yet.

              can I delete the index and reindexing all content, like in the old module?

              Thanks for your very important Global search, one of the most important module.

              Show
              juerg.hoerner Juerg Hoerner added a comment - in your testmoodle I tried to find words in the attached pdf in the glossary. As example: "Introduction" . Maybe that this does not work yet. can I delete the index and reindexing all content, like in the old module? Thanks for your very important Global search, one of the most important module.
              Hide
              xan Prateek Sachan added a comment -

              Hi Juerg Hoerner,

              Yes. You may delete your index and re-index content from cron. All specifications are giving in Global Search docs.

              Thanks.

              Show
              xan Prateek Sachan added a comment - Hi Juerg Hoerner , Yes. You may delete your index and re-index content from cron. All specifications are giving in Global Search docs. Thanks.
              Hide
              mandrews Mark Andrews added a comment -

              Hi Prateek,
              I've just had a play with this and it's great! - really pleased to hear it might make 2.7

              couple of questions/comments from testing though;

              • As it's a global search will it also be searching; people & courses? would help to make it truly global search. - Perhaps as an option that admin's could enable/disable
              • Also I wonder if there needs to be some tweaking done to the 'filter query' area as fields like 'From records having this author' could mean different things to different users; a participant looking for a particular forum post by a class mate to the up loader of a resource.

              Cheers
              Mark

              Show
              mandrews Mark Andrews added a comment - Hi Prateek, I've just had a play with this and it's great! - really pleased to hear it might make 2.7 couple of questions/comments from testing though; As it's a global search will it also be searching; people & courses? would help to make it truly global search. - Perhaps as an option that admin's could enable/disable Also I wonder if there needs to be some tweaking done to the 'filter query' area as fields like 'From records having this author' could mean different things to different users; a participant looking for a particular forum post by a class mate to the up loader of a resource. Cheers Mark
              Hide
              xan Prateek Sachan added a comment -

              Hi Mark,
              Thanks for the reviews.

              As it was the first version of GS, hence I restricted myself to only mods/resources. I still have some mods for which I would be writing GS APIs in next months. Only then, I could think about including users/courses.

              "From records having this author" means "all records that have an author as x". These are mainly uploaded PDFs/PPTs/etc. that have an "author" field in their metadata. I guess I should change the string a bit to suit the users. Thanks for offering your thoughts on this one.

              Show
              xan Prateek Sachan added a comment - Hi Mark, Thanks for the reviews. As it was the first version of GS, hence I restricted myself to only mods/resources. I still have some mods for which I would be writing GS APIs in next months. Only then, I could think about including users/courses. "From records having this author" means "all records that have an author as x". These are mainly uploaded PDFs/PPTs/etc. that have an "author" field in their metadata. I guess I should change the string a bit to suit the users. Thanks for offering your thoughts on this one.
              Hide
              dougiamas Martin Dougiamas added a comment -

              This is now in the backlog of the BACKEND team to review.

              Show
              dougiamas Martin Dougiamas added a comment - This is now in the backlog of the BACKEND team to review.
              Hide
              timdrescher Tim Drescher added a comment -

              We would really love a global search. It would greatly assist our staff/students in finding the course or information they need.

              Show
              timdrescher Tim Drescher added a comment - We would really love a global search. It would greatly assist our staff/students in finding the course or information they need.
              Hide
              nebgor Aparup Banerjee added a comment - - edited

              Please note there could be other plugins utilizing solr directly
              (such as https://moodle.org/plugins/view.php?plugin=tool_coursesearch)
              so steps could be made wrt allowing migration of those to use the global search api to access the installed search engine (the api in moodle core after this patch).

              or to just keep those separate (ie: self managed instance(s) of solr), but this would lead to more inefficient usage of solr.

              ps: it would be great if someone could lend even a glancing eye to this for a review..

              Show
              nebgor Aparup Banerjee added a comment - - edited Please note there could be other plugins utilizing solr directly (such as https://moodle.org/plugins/view.php?plugin=tool_coursesearch ) so steps could be made wrt allowing migration of those to use the global search api to access the installed search engine (the api in moodle core after this patch). or to just keep those separate (ie: self managed instance(s) of solr), but this would lead to more inefficient usage of solr. ps: it would be great if someone could lend even a glancing eye to this for a review..
              Hide
              xan Prateek Sachan added a comment -

              The Global Search implementation differs from that of the coursesearch plugin as they are using completely different Solr php-client-libraries for interacting with the Solr server through Moodle. Therefore, the API in moodle core after this patch wouldn't affect course search.

              Show
              xan Prateek Sachan added a comment - The Global Search implementation differs from that of the coursesearch plugin as they are using completely different Solr php-client-libraries for interacting with the Solr server through Moodle. Therefore, the API in moodle core after this patch wouldn't affect course search.
              Hide
              libertymoodle Luis de Vasconcelos added a comment - - edited

              Adam Morris and antriver have developed the moodle-block_search block. See https://github.com/antriver/moodle-block_search and https://moodle.org/mod/forum/discuss.php?d=239646.

              How does that search block compare with this Global Search 2 feature?

              Show
              libertymoodle Luis de Vasconcelos added a comment - - edited Adam Morris and antriver have developed the moodle-block_search block. See https://github.com/antriver/moodle-block_search and https://moodle.org/mod/forum/discuss.php?d=239646 . How does that search block compare with this Global Search 2 feature?
              Hide
              amirelion Amir Elion added a comment - - edited

              Luis, I dd some initial testing with the suggested block.
              it does look like a good start.
              It searches course and activity names and description, and can be configured to choose which ones to search.
              It also seems to handle permissions (with ability to show grayed out/not show users inaccessible search results), wildcards, "quoted" search terms, exclude words.
              Also classifies and allows quick navigation by types of search results and

              It lacks search inside activity content - e.g. page content, glossary terms, wiki pages, forum posts, etc.
              If the above can be added, I think it would be a good candidate for this.

              Show
              amirelion Amir Elion added a comment - - edited Luis, I dd some initial testing with the suggested block. it does look like a good start. It searches course and activity names and description, and can be configured to choose which ones to search. It also seems to handle permissions (with ability to show grayed out/not show users inaccessible search results), wildcards, "quoted" search terms, exclude words. Also classifies and allows quick navigation by types of search results and It lacks search inside activity content - e.g. page content, glossary terms, wiki pages, forum posts, etc. If the above can be added, I think it would be a good candidate for this.
              Hide
              tmuras Tomasz Muras added a comment - - edited

              As far as I can see this search implementation does not do any relevancy search. It will simply return all rows with a given keyword, not neceserily the relevant first. This will not work well for more than 10 results.

              As this is simply implemented as a DB query with like %keyword% clause it will not scale - this can only be used on smaller sites.

              You can also compare it with solr features: http://lucene.apache.org/solr/features.html - most of them will be missing.

              Show
              tmuras Tomasz Muras added a comment - - edited As far as I can see this search implementation does not do any relevancy search. It will simply return all rows with a given keyword, not neceserily the relevant first. This will not work well for more than 10 results. As this is simply implemented as a DB query with like %keyword% clause it will not scale - this can only be used on smaller sites. You can also compare it with solr features: http://lucene.apache.org/solr/features.html - most of them will be missing.
              Hide
              xan Prateek Sachan added a comment -

              As Moodle sites become big, it becomes very difficult (concerning speed and optimization) to do direct DB searches. The whole concept behind "Global Search" was to make the entire content within modules and resources searchable alongwith providing complex search queries. I highly doubt if these could be implemented via direct DB searches in an optimized way.

              That's why indexing was used which is the obvious solution for making huge chunks of data searchable.

              Also, the kinds of queries that one would use for searching will put a heavy load on your database server.

              Thanks.

              Show
              xan Prateek Sachan added a comment - As Moodle sites become big, it becomes very difficult (concerning speed and optimization) to do direct DB searches. The whole concept behind "Global Search" was to make the entire content within modules and resources searchable alongwith providing complex search queries . I highly doubt if these could be implemented via direct DB searches in an optimized way. That's why indexing was used which is the obvious solution for making huge chunks of data searchable. Also, the kinds of queries that one would use for searching will put a heavy load on your database server. Thanks.
              Hide
              binare Eugene Venter added a comment - - edited

              We're currently planning on using this implementation for one of our clients. The current state of the code seems to need more work and optimisation and we've started fixing up a few things (with a lot more to come).

              For a start, we'll be pushing our contributable updates here: https://github.com/catalyst/moodle

              Show
              binare Eugene Venter added a comment - - edited We're currently planning on using this implementation for one of our clients. The current state of the code seems to need more work and optimisation and we've started fixing up a few things (with a lot more to come). For a start, we'll be pushing our contributable updates here: https://github.com/catalyst/moodle
              Hide
              dougiamas Martin Dougiamas added a comment -

              Pushed back to BACKEND to look at in this sprint. Needs a basic review before going to integration.

              Show
              dougiamas Martin Dougiamas added a comment - Pushed back to BACKEND to look at in this sprint. Needs a basic review before going to integration.
              Hide
              jonathan Jonathan Harker added a comment - - edited

              Here is my 2¢

              1. I'd say that as it is, it is not sufficiently loosely coupled for use with anything other than Solr. A rewrite would be required for pluggable backends for Elastic et al.
              2. The Solr implementation should make use of dynamic fields, which requires much less dependence on Solr schema.xml configuration. Dynamic fields are in the default example configuration, and work fine with 1.4, 3.x, and 4.x
              3. The code needs a bunch of clean-up - e.g. everything in search/documents is unreachable.
              4. Pagination should be done via the Solr API rather than fetching a hundred documents and hoping for the best.
              5. We've done some work to display facets, which should be a part of this solution.

              Show
              jonathan Jonathan Harker added a comment - - edited Here is my 2¢ 1. I'd say that as it is, it is not sufficiently loosely coupled for use with anything other than Solr. A rewrite would be required for pluggable backends for Elastic et al. 2. The Solr implementation should make use of dynamic fields, which requires much less dependence on Solr schema.xml configuration. Dynamic fields are in the default example configuration, and work fine with 1.4, 3.x, and 4.x 3. The code needs a bunch of clean-up - e.g. everything in search/documents is unreachable. 4. Pagination should be done via the Solr API rather than fetching a hundred documents and hoping for the best. 5. We've done some work to display facets, which should be a part of this solution.
              Hide
              tmuras Tomasz Muras added a comment -

              1. I wouldn't worry about plug-ability too much. I think that chances for anyone to implement any other backend are minimal, let's worry when someone tries.
              2. Yeah, why not.
              3. It's most likely an old core - you're right, it should be cleaned up.
              4. It's going to be very difficult to replicate all Moodle logic (permissions check) on the solr side. I suggest first version with keeping it in Moodle, then building from there. First step could be to filter out obvious non-accessible documents and double check remaining ones with Moodle.
              5. Phase 2 maybe? Let's get at least minimal working version into Moodle core.

              Show
              tmuras Tomasz Muras added a comment - 1. I wouldn't worry about plug-ability too much. I think that chances for anyone to implement any other backend are minimal, let's worry when someone tries. 2. Yeah, why not. 3. It's most likely an old core - you're right, it should be cleaned up. 4. It's going to be very difficult to replicate all Moodle logic (permissions check) on the solr side. I suggest first version with keeping it in Moodle, then building from there. First step could be to filter out obvious non-accessible documents and double check remaining ones with Moodle. 5. Phase 2 maybe? Let's get at least minimal working version into Moodle core.
              Hide
              jonathan Jonathan Harker added a comment - - edited

              I think it should be possible to do the permissions thing. There appear to be two tricks to doing permissions right:

              1. make each module responsible at index time for ensuring that anything needed to filter on permissions is indexed in a field in the module document (e.g. course ID, hidden, visible, deleted, etc), and
              2. make each module responsible at query time for figuring out what $USER can and cannot see, and apply the right filter (using the above).

              Using dynamic fields, different document types (e.g. course, lesson, forum) can have fields in common, one of which should probably be course_id - this will at the very least enable us to restrict results to the courses $USER is enrolled in by using something like

               'course_id:(' . join(',', $enrolled_courses) . ')' 

              It may be a week or two before we get time to collapse the work we've done here into something that deserves a PULL request... there are some curly bespoke bits we'll have to remove first!

              Show
              jonathan Jonathan Harker added a comment - - edited I think it should be possible to do the permissions thing. There appear to be two tricks to doing permissions right: 1. make each module responsible at index time for ensuring that anything needed to filter on permissions is indexed in a field in the module document (e.g. course ID, hidden, visible, deleted, etc), and 2. make each module responsible at query time for figuring out what $USER can and cannot see, and apply the right filter (using the above). Using dynamic fields, different document types (e.g. course, lesson, forum) can have fields in common, one of which should probably be course_id - this will at the very least enable us to restrict results to the courses $USER is enrolled in by using something like 'course_id:(' . join(',', $enrolled_courses) . ')' It may be a week or two before we get time to collapse the work we've done here into something that deserves a PULL request... there are some curly bespoke bits we'll have to remove first!
              Hide
              tmuras Tomasz Muras added a comment -

              That makes sense. It will require a lot of testing but it would be a Holy Grail of Search if you can put that logic into solr side.

              Show
              tmuras Tomasz Muras added a comment - That makes sense. It will require a lot of testing but it would be a Holy Grail of Search if you can put that logic into solr side.
              Hide
              xan Prateek Sachan added a comment -

              Hi all,

              This is just an update. The official php-pecl-solr extensions for Solr 4.x[1] have been released. I'll be testing it out and updating the Docs in the coming week.

              Thanks.

              [1]: http://pecl.php.net/package/solr

              Show
              xan Prateek Sachan added a comment - Hi all, This is just an update. The official php-pecl-solr extensions for Solr 4.x [1] have been released. I'll be testing it out and updating the Docs in the coming week. Thanks. [1] : http://pecl.php.net/package/solr
              Hide
              danielneis Daniel Neis added a comment -

              Hello,

              i've managed to make it work on top of current master (2.7 beta):
              https://github.com/danielneis/moodle/tree/MDL-31989

              I've tested it creating a medium test course (with 500 forum posts + 200 pages) and it worked well.
              Also, the current php-pecl-solr extension version 2.0.0b works well with the Solr 4.8.0

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, i've managed to make it work on top of current master (2.7 beta): https://github.com/danielneis/moodle/tree/MDL-31989 I've tested it creating a medium test course (with 500 forum posts + 200 pages) and it worked well. Also, the current php-pecl-solr extension version 2.0.0b works well with the Solr 4.8.0 Kind regards, Daniel
              Hide
              nebgor Aparup Banerjee added a comment -

              Thanks Daniel, good to know core hasn't changed that much for this.

              2.7 is due for release next week so;
              It'll be great if Jonathan Harker's excellent suggestions on index/query time permissions API were ready so that this could be incorporated early into 2.8's development. Scrutiny on security seems the major point on getting this integrated.

              Show
              nebgor Aparup Banerjee added a comment - Thanks Daniel, good to know core hasn't changed that much for this. 2.7 is due for release next week so; It'll be great if Jonathan Harker 's excellent suggestions on index/query time permissions API were ready so that this could be incorporated early into 2.8's development. Scrutiny on security seems the major point on getting this integrated.
              Hide
              martinlanghoff Martín Langhoff added a comment -

              Looking at using this (and helping polish it along the way. Jonathan Harker - you mention you have done some work on facets. Is that available anywhere ? Thanks!

              Show
              martinlanghoff Martín Langhoff added a comment - Looking at using this (and helping polish it along the way. Jonathan Harker - you mention you have done some work on facets. Is that available anywhere ? Thanks!
              Hide
              dmonllao David Monllaó added a comment -

              Reviewing this now

              Show
              dmonllao David Monllaó added a comment - Reviewing this now
              Hide
              dmonllao David Monllaó added a comment -

              Assigning back to Prateek as he expressed his will to continue helping here

              Show
              dmonllao David Monllaó added a comment - Assigning back to Prateek as he expressed his will to continue helping here
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for working on this Prateek; it looks very good and works, although before submitting it to integration we would need to consider a few things.

              1. Multiple search plugins support: Even if supporting multiple plugins is deferred to a future release there are parts of the code that would be better to abstract now, before 3rd parties begins adding support for global search in their own plugins, otherwise they will have to update a big bunch of code (even abstracting now they will probably have to change something in future)
                1. To abstract before it gets integrated
                  1. We shouldn't allow hardcoded solr calls in plugins code, we should provide a common API a core_search\document for example each search plugin can extend it, but the modules will reference the generic document.
                  2. global_search_engine should be abstracted too, we can have a \core_search\engine + a \core_search\solr\engine (which would be moved to \search_solr\engine in future)
                2. Leaving to abstract in future:
                  1. Directory structure with new plugin type (search/PLUGINSDIR/solr, search/PLUGINSDIR/elastic...)
                  2. Administration settings pages reading a dynamic list of plugins from search/PLUGINSDIR
                  3. Settings values; move them to the future search_solr (example name) plugin space
              2. Modules (or plugins/components) implementing FEATURE_GLOBAL_SEARCH: You can easily abstract most of $MODULENAME_search_access(), including groups, activities visibility... You can abstract even the 'mod/$PLUGINNAME:view' capability name in most cases, also in this case is more appropriate to use has_capability() than require_capability + try & catch; all this would be easier using OOP, related with the next point
              3. I would move search-related stuff to plugin_or_component/db/search.php, which would be common for all types of plugins; IMO this patch adds too much logic into lib.php files and in future we might be interested in adding support for other plugin types or components like in https://github.com/catalyst/moodle/commit/5aba875be1ddc2690a6e2ffd67f766ccb51b8fc0#diff-3 search_get_iterators()
              4. In general, I think that we can move a few functions files to classes according to their behaviour, proposed changes below
              5. Again, related with support for other plugin types or components; I would rename module and modulelink fields to component and component link
              6. The iterators are using timemodified/modified fields intensively, some of the used fields are not indexed in the database, I'm not sure about the current policies to add new index to db
              7. Coding styles (in my dev branch I've fixed what git --check proposes me, but local_moodlecheck and local_codechecker should also pass with a 100%)
              8. Unit tests should use another collection, they are currently affecting the "production" one
              9. We will also need acceptance tests as this is a new feature http://docs.moodle.org/dev/Acceptance_testing but I don't think it will be easy to do...
              10. A renderer.php should be used for all display-related stuff
              11. Silly question, I can't see how will search/install.php be accessed, probably I am missing something
              12. Use fullname() rather than $user->firstname . ' ' . $user->lastname
              13. You could try using get_fast_modinfo($course)->instances[$moduletype][$instanceid] in $MODULENAME_search_get_documents() and $MODULENAME_search_access(), you can probably save some db queries
              14. The new MDL-31089 functions can be used in forum_search_access() to control the discussions groups.
              15. Silly question again, about gmdate('Y-m-d\TH:i:s\Z', $page->timemodified). Are you required to use this specific format?
              16. As a personal preference I try to avoid files like search/solr/connection.php, including files inside other files can overwrite other vars, a function would be an alternative.
              17. No hardcoded text in scripts please, all in lang strings
              18. Class files should only include 1 class
              19. Any reason for set_time_limit(576000);? Can't just be 0?
              20. If possible would also be good to address/comment the pagination and access points Jonathan Harker commented about, I'm not a solr expert so I guess you all can judge better than me

              Proposed code organization changes:

              • classes/document.php (representation of a document)
              • classes/engine.php (ABSTRACT)
                • search_get_modules() -> search_get_components() to include all plugin types + core
                • search_get_iterators()
                • search_reset_config()
                • search_get_config()
                • search_get_config_file()
                • Generic abstract methods declaration
              • classes/solr/engine.php (EXTENDS classes/engine.php)
                • Renaming methods to fit classes/engine.php common API methods
                • search/solr/connection.php
                • search/solr/lib.php
                • solr_check_server() (STATIC)
                • solr_installed() (STATIC)
                • solr_execute_query()
                • solr_set_query()
                • solr_set_highlight()
                • solr_prepare_filter()
                • solr_add_fields()
                • solr_add_highlight_content()
                • solr_merge_highlight_field_values()
                • solr_query_response()
                • solr_post_file()
              • classes/indexer.php (DEPENDS ON classes/engine.php, all specific solr logic should be in classes/solr/engine)
                • search_optimize_index()
                • search_index()
                • search_index_files()
                • search_delete_index()
                • search_delete_index_by_id()
              • classes/search_form.php
                • class core_search_search_form
              • renderer.php
                • search_display_results()
                • search_get_user_url()
                • //search_display_form() IMO that function is unnecessary

              I've created a dev/testing branch where I've:

              • Rebased and updated the current implementation on top of latest 2.8dev
              • Considered that the 2.0 PECL solr package was released and 2.x -> solr 4.x and 1.x -> solr 3.x
              • Included catalyst and Daniel fixes
              • Removed styles (ignore me completely)

              *https://github.com/dmonllao/moodle/compare/MDL-31989_master*

              Show
              dmonllao David Monllaó added a comment - Thanks for working on this Prateek; it looks very good and works, although before submitting it to integration we would need to consider a few things. Multiple search plugins support: Even if supporting multiple plugins is deferred to a future release there are parts of the code that would be better to abstract now, before 3rd parties begins adding support for global search in their own plugins, otherwise they will have to update a big bunch of code (even abstracting now they will probably have to change something in future) To abstract before it gets integrated We shouldn't allow hardcoded solr calls in plugins code, we should provide a common API a core_search\document for example each search plugin can extend it, but the modules will reference the generic document. global_search_engine should be abstracted too, we can have a \core_search\engine + a \core_search\solr\engine (which would be moved to \search_solr\engine in future) Leaving to abstract in future: Directory structure with new plugin type (search/PLUGINSDIR/solr, search/PLUGINSDIR/elastic...) Administration settings pages reading a dynamic list of plugins from search/PLUGINSDIR Settings values; move them to the future search_solr (example name) plugin space Modules (or plugins/components) implementing FEATURE_GLOBAL_SEARCH: You can easily abstract most of $MODULENAME_search_access(), including groups, activities visibility... You can abstract even the 'mod/$PLUGINNAME:view' capability name in most cases, also in this case is more appropriate to use has_capability() than require_capability + try & catch; all this would be easier using OOP, related with the next point I would move search-related stuff to plugin_or_component/db/search.php, which would be common for all types of plugins; IMO this patch adds too much logic into lib.php files and in future we might be interested in adding support for other plugin types or components like in https://github.com/catalyst/moodle/commit/5aba875be1ddc2690a6e2ffd67f766ccb51b8fc0#diff-3 search_get_iterators() In general, I think that we can move a few functions files to classes according to their behaviour, proposed changes below Again, related with support for other plugin types or components; I would rename module and modulelink fields to component and component link The iterators are using timemodified/modified fields intensively, some of the used fields are not indexed in the database, I'm not sure about the current policies to add new index to db Coding styles (in my dev branch I've fixed what git --check proposes me, but local_moodlecheck and local_codechecker should also pass with a 100%) Unit tests should use another collection, they are currently affecting the "production" one We will also need acceptance tests as this is a new feature http://docs.moodle.org/dev/Acceptance_testing but I don't think it will be easy to do... A renderer.php should be used for all display-related stuff Silly question, I can't see how will search/install.php be accessed, probably I am missing something Use fullname() rather than $user->firstname . ' ' . $user->lastname You could try using get_fast_modinfo($course)->instances [$moduletype] [$instanceid] in $MODULENAME_search_get_documents() and $MODULENAME_search_access(), you can probably save some db queries The new MDL-31089 functions can be used in forum_search_access() to control the discussions groups. Silly question again, about gmdate('Y-m-d\TH:i:s\Z', $page->timemodified) . Are you required to use this specific format? As a personal preference I try to avoid files like search/solr/connection.php, including files inside other files can overwrite other vars, a function would be an alternative. No hardcoded text in scripts please, all in lang strings Class files should only include 1 class Any reason for set_time_limit(576000); ? Can't just be 0? If possible would also be good to address/comment the pagination and access points Jonathan Harker commented about, I'm not a solr expert so I guess you all can judge better than me Proposed code organization changes: classes/document.php (representation of a document) classes/engine.php (ABSTRACT) search_get_modules() -> search_get_components() to include all plugin types + core search_get_iterators() search_reset_config() search_get_config() search_get_config_file() Generic abstract methods declaration classes/solr/engine.php (EXTENDS classes/engine.php) Renaming methods to fit classes/engine.php common API methods search/solr/connection.php search/solr/lib.php solr_check_server() (STATIC) solr_installed() (STATIC) solr_execute_query() solr_set_query() solr_set_highlight() solr_prepare_filter() solr_add_fields() solr_add_highlight_content() solr_merge_highlight_field_values() solr_query_response() solr_post_file() classes/indexer.php (DEPENDS ON classes/engine.php, all specific solr logic should be in classes/solr/engine) search_optimize_index() search_index() search_index_files() search_delete_index() search_delete_index_by_id() classes/search_form.php class core_search_search_form renderer.php search_display_results() search_get_user_url() //search_display_form() IMO that function is unnecessary I've created a dev/testing branch where I've: Rebased and updated the current implementation on top of latest 2.8dev Considered that the 2.0 PECL solr package was released and 2.x -> solr 4.x and 1.x -> solr 3.x Included catalyst and Daniel fixes Removed styles (ignore me completely) * https://github.com/dmonllao/moodle/compare/MDL-31989_master*
              Hide
              dmonllao David Monllaó added a comment -

              As commented in BACKEND meeting, as there is still a bit of work to do before this can be integrated we can move this issue to an epic and open subtasks, I will begin tomorrow.

              Show
              dmonllao David Monllaó added a comment - As commented in BACKEND meeting, as there is still a bit of work to do before this can be integrated we can move this issue to an epic and open subtasks, I will begin tomorrow.
              Hide
              xan Prateek Sachan added a comment -

              Hi all,

              I'm happy to work on this as suggested by David and have actually started the implementation. (It might take some time though)

              Thanks.

              Show
              xan Prateek Sachan added a comment - Hi all, I'm happy to work on this as suggested by David and have actually started the implementation. (It might take some time though) Thanks.
              Hide
              dmonllao David Monllaó added a comment -

              Hi Prateek,

              Good to hear that! As commented by private message, today I will create linked issues to this epic so we can split the work in reasonable pieces, feel free to assign issues to you.

              Show
              dmonllao David Monllaó added a comment - Hi Prateek, Good to hear that! As commented by private message, today I will create linked issues to this epic so we can split the work in reasonable pieces, feel free to assign issues to you.
              Hide
              salvetore Michael de Raadt added a comment -

              I've taken this out of development so that Prateek, or anyone else interested, can work on the tasks David has defined.

              Show
              salvetore Michael de Raadt added a comment - I've taken this out of development so that Prateek, or anyone else interested, can work on the tasks David has defined.
              Hide
              danielneis Daniel Neis added a comment -

              Hello, David

              thanks for your review and for integrating my changes!
              I've made some more on top of your work:

              https://github.com/danielneis/moodle/tree/MDL-31989_master

              Hope that you like,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, David thanks for your review and for integrating my changes! I've made some more on top of your work: https://github.com/danielneis/moodle/tree/MDL-31989_master Hope that you like, Daniel
              Hide
              danielneis Daniel Neis added a comment -

              Hello,

              i just added a commit that moves the forms to search/classes directory as David suggested.

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, i just added a commit that moves the forms to search/classes directory as David suggested. Kind regards, Daniel
              Hide
              dmonllao David Monllaó added a comment -

              Hi Daniel,

              Thanks for your contribution, I will probably not work on this issue though, Prateek Sachan expressed his interest in working on it so probably he will take a look at them.

              Regards,
              David

              Show
              dmonllao David Monllaó added a comment - Hi Daniel, Thanks for your contribution, I will probably not work on this issue though, Prateek Sachan expressed his interest in working on it so probably he will take a look at them. Regards, David
              Hide
              salvetore Michael de Raadt added a comment -

              It's good that progress has been made on this issue. I'm taking it out of the sprint so we can close the sprint. It would be good if progress can continue now that some changes have been suggested. I'll leave that to Prateek in the short-term.

              Show
              salvetore Michael de Raadt added a comment - It's good that progress has been made on this issue. I'm taking it out of the sprint so we can close the sprint. It would be good if progress can continue now that some changes have been suggested. I'll leave that to Prateek in the short-term.
              Hide
              tuekorsgaard Tue Korsgaard added a comment -

              Any news on whether development will resume on t this issue?

              Show
              tuekorsgaard Tue Korsgaard added a comment - Any news on whether development will resume on t this issue?
              Hide
              nbruley Nathan Bruley added a comment -

              Where do I go to download the global search files solconfig.xml and schema.xml mentioned at (https://docs.moodle.org/dev/Global_search_%28GSoC2013%29#Installing_PHP_Solr_extension) as well as the plugin files if I don't use git?
              Thanks.

              Show
              nbruley Nathan Bruley added a comment - Where do I go to download the global search files solconfig.xml and schema.xml mentioned at ( https://docs.moodle.org/dev/Global_search_%28GSoC2013%29#Installing_PHP_Solr_extension ) as well as the plugin files if I don't use git? Thanks.
              Show
              tmuras Tomasz Muras added a comment - Nathan Bruley try those: https://github.com/danielneis/moodle/tree/MDL-31989_master/search/solr/solr_conf
              Hide
              td0s toby saunders added a comment -

              I was wondering if there was a timescale for merging this into the stable branch at the moment?

              Show
              td0s toby saunders added a comment - I was wondering if there was a timescale for merging this into the stable branch at the moment?
              Hide
              dmonllao David Monllaó added a comment -

              For reference, I've rebased the review branch on top of latest weekly master https://github.com/dmonllao/moodle/tree/MDL-31989_master

              Show
              dmonllao David Monllaó added a comment - For reference, I've rebased the review branch on top of latest weekly master https://github.com/dmonllao/moodle/tree/MDL-31989_master
              Hide
              danielneis Daniel Neis added a comment -

              Hello,

              I've started some code to support elastic search:
              https://github.com/danielneis/moodle/commit/d24d6e165e440c69240a3f2e465c65ba3e60a16a

              I've first tackled with changes needed to add the another search engine.
              Next step is implementing the functions to add documents to index.
              As elastic search has a nice restful interface, similar to the existing solr api, it should be easy to implement =)

              Show
              danielneis Daniel Neis added a comment - Hello, I've started some code to support elastic search: https://github.com/danielneis/moodle/commit/d24d6e165e440c69240a3f2e465c65ba3e60a16a I've first tackled with changes needed to add the another search engine. Next step is implementing the functions to add documents to index. As elastic search has a nice restful interface, similar to the existing solr api, it should be easy to implement =)
              Hide
              skodak Petr Skoda added a comment -

              The searching part is easy - the problem is to do the access control properly.

              Show
              skodak Petr Skoda added a comment - The searching part is easy - the problem is to do the access control properly.
              Hide
              mchaney2010 Mark Chaney added a comment -

              Hi all, is there an update regarding the search functionality and any perceived Moodle version when it may be released (even if its as an experimental setting).

              Show
              mchaney2010 Mark Chaney added a comment - Hi all, is there an update regarding the search functionality and any perceived Moodle version when it may be released (even if its as an experimental setting).
              Hide
              danielneis Daniel Neis added a comment - - edited

              Hello,

              I've got Global Search basic indexing and searching function working with elastic search:
              https://github.com/danielneis/moodle/tree/MDL-31989_master

              The searching part is easy, and that is why elastic search is cool =)
              Also, the creation of the index and the setup is easy too, as you can see at
              https://www.elastic.co/guide/en/elasticsearch/reference/1.5/_installation.html

              You just need a working JVM and do not need to bother with configuration XML,
              until now you also need to know nothing about tokenizing, stemming, common words and so.

              About the access control, it is already being done: may not be the best, but it may work as a function is
              called for each document to test if user can see it. Each module implements it's own "search_access" function,
              that at least tests capabilities like 'mod/book:read' in the context of the document.

              Any suggestions to improve this are welcome =)

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - - edited Hello, I've got Global Search basic indexing and searching function working with elastic search: https://github.com/danielneis/moodle/tree/MDL-31989_master The searching part is easy, and that is why elastic search is cool =) Also, the creation of the index and the setup is easy too, as you can see at https://www.elastic.co/guide/en/elasticsearch/reference/1.5/_installation.html You just need a working JVM and do not need to bother with configuration XML, until now you also need to know nothing about tokenizing, stemming, common words and so. About the access control, it is already being done: may not be the best, but it may work as a function is called for each document to test if user can see it. Each module implements it's own "search_access" function, that at least tests capabilities like 'mod/book:read' in the context of the document. Any suggestions to improve this are welcome =) Kind regards, Daniel
              Hide
              danielneis Daniel Neis added a comment -

              Hello,

              here is an updated version of global search.
              Engines (solr and elasticsearch) are now plugins and there is a new class core_search that is used by the block and the renderers.
              Also, all configs are now under "Plugins - Global Search" sothere is no more top level Global Search.
              Still have to move some strings from lang/en/admin.php to lang/en/search.php and some other cleanups but things are getting better organized at least =)

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, here is an updated version of global search. Engines (solr and elasticsearch) are now plugins and there is a new class core_search that is used by the block and the renderers. Also, all configs are now under "Plugins - Global Search" sothere is no more top level Global Search. Still have to move some strings from lang/en/admin.php to lang/en/search.php and some other cleanups but things are getting better organized at least =) Kind regards, Daniel
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for working on this Daniel, we will try to review it properly as soon as possible.

              Show
              dmonllao David Monllaó added a comment - Thanks for working on this Daniel, we will try to review it properly as soon as possible.
              Hide
              jumb0eadg A B added a comment - - edited

              Thanks for the work Daniel. You mentioned https://github.com/danielneis/moodle/tree/MDL-31989_master,
              but MDL-31989_master branch has been deleted. Can you please verify. Does the branch MDL-31989 have the same code?
              I installed using MDL-31989, but when using solr, when I click on the statistics link, I get the following error:

              Show
              jumb0eadg A B added a comment - - edited Thanks for the work Daniel. You mentioned https://github.com/danielneis/moodle/tree/MDL-31989_master , but MDL-31989 _master branch has been deleted. Can you please verify. Does the branch MDL-31989 have the same code? I installed using MDL-31989 , but when using solr, when I click on the statistics link, I get the following error:
              Hide
              danielneis Daniel Neis added a comment -

              Hello, A B

              you are usig the right branch that is MDL-31989.
              I've refactored the code but have not tested that much with the Solr engine.
              Just yesterday someone made a pull request adding a missing } to the solr/classes/engine.php .
              Feel free to fix the code and submit a pull request.

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, A B you are usig the right branch that is MDL-31989 . I've refactored the code but have not tested that much with the Solr engine. Just yesterday someone made a pull request adding a missing } to the solr/classes/engine.php . Feel free to fix the code and submit a pull request. Kind regards, Daniel
              Hide
              jumb0eadg A B added a comment -

              Thanks for the info. Actually I am the one who created the pull request adding a missing }. I am trying to make it work all I can.
              Thanks again.

              Show
              jumb0eadg A B added a comment - Thanks for the info. Actually I am the one who created the pull request adding a missing }. I am trying to make it work all I can. Thanks again.
              Hide
              dmonllao David Monllaó added a comment -

              Assigning to myself; thanks for the changes Daniel, I will continue working on top of your changes, I see a few things that we need to change before this can be submitted for integration, I will comment later / tomorrow.

              Show
              dmonllao David Monllaó added a comment - Assigning to myself; thanks for the changes Daniel, I will continue working on top of your changes, I see a few things that we need to change before this can be submitted for integration, I will comment later / tomorrow.
              Hide
              dmonllao David Monllaó added a comment - - edited

              This needs more work and cleanup, unit tests... I already started but it will take time.

              Being performance and access control together on the search engine side (at least as much as possible) a critical point, I've started looking at different solutions and I would appreciate feedback from Tomasz Muras or any one that knows how this monster engines works internally and how they perform better.

              1. I don't think that we should restrict the search components to modules + course; actually, looking at the current code and how course search is hardcoded would be better to allow other components than modules to implement the required search interfaces. I'm starting trying the most optimistic solution (with elasticsearch at the moment) thinking that the search engine performance will be good enough to return data based on a search query containing the list of contexts where the user has access to (get_fast_modinfo). The components implementing global search would then depend on the context level where they live and in any case, the component will have always the last word, I mean some like https://github.com/dmonllao/moodle/blob/MDL-31989_master/lib/classes/search.php#L250 (all messy and untested) Only activities visible by the user would be returned (extra checkings module-dependant) on course-level components like course or questions we would use the course context and check its visibility + specific capabilities in any case.... If the search engines don't perform well we could lower to courseid, but in modules case we would need to add extra checkings in moodle-side once we get the results. I hope I explained it well.
                • In elasticsearch, filters seems appropriate for this component_name => [context1, context2...] mapping, as the results of applying the filters are cached in the engine for future queries before executing the query. We don't expect the list of contexts a user have access to to change often, so it might work. I'm still filling a big dataset to see how it performs.
                • No idea in solr yet Any idea how the contextids thing would perform?
              2. I've just started looking at solr, as after the last refactoring it was completely broken, I still have many stuff to read, but I don't initially think that hardcoding moodle's own fields in a schema.xml is a good idea, even less knowing that we want 3rd party plugins to implement global search interfaces; dynamic types sounds better. No idea at all of what facets are, will look
              3. We need to allow users to specify both elastic instance name and solr collection, we will also need TEST_ELASTIC_INSTANCE and TEST_SOLR_COLLECTION for unit testing

              I still think that this is too big for a single issue, but instead of the subtasks I created last year when I reviewed Prateek's patch (all interdependant to get integrated) I will try to push a single issue with the basic stuff working and add more stuff (files search, different modules...) later, because otherwise I'm afraid that this will take ages to get integrated (in any case would NOT be ready for 3.0) What a massive job you did here Prateek or Tomasz, congrats. Thanks Daniel too for the initial plugin system, but please, if you work on top of other people's changes, keep their contributions, we always have time later to clean up, squash...

              I would not recommend to anyone to use the current proposal in a production site.

              Show
              dmonllao David Monllaó added a comment - - edited This needs more work and cleanup, unit tests... I already started but it will take time. Being performance and access control together on the search engine side (at least as much as possible) a critical point, I've started looking at different solutions and I would appreciate feedback from Tomasz Muras or any one that knows how this monster engines works internally and how they perform better. I don't think that we should restrict the search components to modules + course; actually, looking at the current code and how course search is hardcoded would be better to allow other components than modules to implement the required search interfaces. I'm starting trying the most optimistic solution (with elasticsearch at the moment) thinking that the search engine performance will be good enough to return data based on a search query containing the list of contexts where the user has access to (get_fast_modinfo). The components implementing global search would then depend on the context level where they live and in any case, the component will have always the last word, I mean some like https://github.com/dmonllao/moodle/blob/MDL-31989_master/lib/classes/search.php#L250 (all messy and untested) Only activities visible by the user would be returned (extra checkings module-dependant) on course-level components like course or questions we would use the course context and check its visibility + specific capabilities in any case.... If the search engines don't perform well we could lower to courseid, but in modules case we would need to add extra checkings in moodle-side once we get the results. I hope I explained it well. In elasticsearch, filters seems appropriate for this component_name => [context1, context2...] mapping, as the results of applying the filters are cached in the engine for future queries before executing the query. We don't expect the list of contexts a user have access to to change often, so it might work. I'm still filling a big dataset to see how it performs. No idea in solr yet Any idea how the contextids thing would perform? I've just started looking at solr, as after the last refactoring it was completely broken, I still have many stuff to read, but I don't initially think that hardcoding moodle's own fields in a schema.xml is a good idea, even less knowing that we want 3rd party plugins to implement global search interfaces; dynamic types sounds better. No idea at all of what facets are, will look We need to allow users to specify both elastic instance name and solr collection, we will also need TEST_ELASTIC_INSTANCE and TEST_SOLR_COLLECTION for unit testing I still think that this is too big for a single issue, but instead of the subtasks I created last year when I reviewed Prateek's patch (all interdependant to get integrated) I will try to push a single issue with the basic stuff working and add more stuff (files search, different modules...) later, because otherwise I'm afraid that this will take ages to get integrated (in any case would NOT be ready for 3.0) What a massive job you did here Prateek or Tomasz, congrats. Thanks Daniel too for the initial plugin system, but please, if you work on top of other people's changes, keep their contributions, we always have time later to clean up, squash... I would not recommend to anyone to use the current proposal in a production site.
              Hide
              nebgor Aparup Banerjee added a comment -

              I just popped by wondering what the state of this was, good to see it stirring.

              HQ, you might want to contact Prateek over skype just in case contact was lost during organisational changes at HQ. He did all of the initial work as part of GSoC'13 and skype was the preferred means to discuss. Cheers!

              Show
              nebgor Aparup Banerjee added a comment - I just popped by wondering what the state of this was, good to see it stirring. HQ, you might want to contact Prateek over skype just in case contact was lost during organisational changes at HQ. He did all of the initial work as part of GSoC'13 and skype was the preferred means to discuss. Cheers!
              Hide
              xan Prateek Sachan added a comment -

              Hi everyone,

              Good to finally see some developments towards this project after such a long time. I'll be happy to work on this as and when required.

              Thanks.

              Show
              xan Prateek Sachan added a comment - Hi everyone, Good to finally see some developments towards this project after such a long time. I'll be happy to work on this as and when required. Thanks.
              Hide
              tmuras Tomasz Muras added a comment -

              Maybe we need to prioritize features here, otherwise it will never be done . I think it's more important to get even very basic version into core, than to implement some features now.

              I suggest:
              1. Don't bother with search engine side optimization in version 1. We can simply return more results (say 1000) from search engine and then filter it down on Moodle side. If we get down to 0 valid results, well - let it be! Hopefully we can then get feedback and real data from users - and decide on best optimization strategy.
              2. Do not implement support for more than one search engine, let's not "fragment" users and not try to support both solr and elasticsearch. Support for future search engines is OK but roll out with 1 only (and with unit tests for 1). Pick whichever one you're more comfortable with.
              3. I think it's OK for now to support only modules. In the future it'd be great to extend it (e.g. use search engine to search through courses, users, ...).
              4. Maybe I've missed it - but I don't see implementation for deleting resources. We can either implement it with SEARCH_ACCESS_DELETED or say that index must be re-indexed every now & then to get rid of deleted resources.

              cheers,
              Tomek

              Show
              tmuras Tomasz Muras added a comment - Maybe we need to prioritize features here, otherwise it will never be done . I think it's more important to get even very basic version into core, than to implement some features now. I suggest: 1. Don't bother with search engine side optimization in version 1. We can simply return more results (say 1000) from search engine and then filter it down on Moodle side. If we get down to 0 valid results, well - let it be! Hopefully we can then get feedback and real data from users - and decide on best optimization strategy. 2. Do not implement support for more than one search engine, let's not "fragment" users and not try to support both solr and elasticsearch. Support for future search engines is OK but roll out with 1 only (and with unit tests for 1). Pick whichever one you're more comfortable with. 3. I think it's OK for now to support only modules. In the future it'd be great to extend it (e.g. use search engine to search through courses, users, ...). 4. Maybe I've missed it - but I don't see implementation for deleting resources. We can either implement it with SEARCH_ACCESS_DELETED or say that index must be re-indexed every now & then to get rid of deleted resources. cheers, Tomek
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for commenting Tomasz, I completely agree with you that we should reduce the scope of this issue, but I think that we should abstract and prepare the APIs now, so in future when we want to add a new search engine or a new component to search in we don't have to change all our APIs, for this we need to:

              1. Abstract all solr engine related logic into a plugin scope (Daniel has done most of it, although it needs some refinements)
              2. Avoid module-tied logic

              About your comments
              1) Yeah, we can defer further improvements for later, but as commented, we can not release a pluginable system and break backwards compatibility in 6 months or be tied to a slow legacy system because we didn't spend a couple of weeks more of testing and development.
              2) You are right, + 2 search engines would slow us down
              3) Sounds good to implement search on some modules as a starting point, but the interfaces we define now should be reusable across other plugin types and other moodle subsystems.
              4) No idea sorry, haven't checked yet the whole implementation

              Show
              dmonllao David Monllaó added a comment - Thanks for commenting Tomasz, I completely agree with you that we should reduce the scope of this issue, but I think that we should abstract and prepare the APIs now, so in future when we want to add a new search engine or a new component to search in we don't have to change all our APIs, for this we need to: Abstract all solr engine related logic into a plugin scope (Daniel has done most of it, although it needs some refinements) Avoid module-tied logic About your comments 1) Yeah, we can defer further improvements for later, but as commented, we can not release a pluginable system and break backwards compatibility in 6 months or be tied to a slow legacy system because we didn't spend a couple of weeks more of testing and development. 2) You are right, + 2 search engines would slow us down 3) Sounds good to implement search on some modules as a starting point, but the interfaces we define now should be reusable across other plugin types and other moodle subsystems. 4) No idea sorry, haven't checked yet the whole implementation
              Hide
              dmonllao David Monllaó added a comment -

              Thanks Aparup Banerjee, and hi Prateek, we were in contact more than 1 year ago, we created subtasks and Prateek was keen to work on them, but no news from him until 2 days ago, good to have you back here Prateek, I guessed that you was busy with other stuff.

              Let me clean the current patch and prepare a serious proposal, I will need some time as I'm most of the time in integration team though.

              Show
              dmonllao David Monllaó added a comment - Thanks Aparup Banerjee , and hi Prateek, we were in contact more than 1 year ago, we created subtasks and Prateek was keen to work on them, but no news from him until 2 days ago, good to have you back here Prateek, I guessed that you was busy with other stuff. Let me clean the current patch and prepare a serious proposal, I will need some time as I'm most of the time in integration team though.
              Hide
              xan Prateek Sachan added a comment -

              Hi David Monllaó. I was busy with a lot of things at that time and comparatively free currently. Moreover, seeing the recent developments for this project, I would be really happy to work on it again.

              Show
              xan Prateek Sachan added a comment - Hi David Monllaó . I was busy with a lot of things at that time and comparatively free currently. Moreover, seeing the recent developments for this project, I would be really happy to work on it again.
              Hide
              dmonllao David Monllaó added a comment -

              Hi, sorry for the delay, I'm pushing a complete patch for peer review.

              We can limit this issue's scope to agree about the APIs (search engine, search component and document), add a search_solr plugin and implement search in a core components. I would really appreciate if someone experienced on different search engines, could comment about the engine APIs.

              Some comments:

              1. Tomasz, delete was working through 'Statistics' link, I've moved it to a new report_search in 'Reports' -> 'Search'
              2. I'm proposing a field format using solr fields naming, elasticsearch or other engines naming differs a bit indexed -> index, stored -> store... We can change this and set a new engine agnostic format, and force all search engines to map to their own format, but the naming used in solr is perfectly understandable, so I would say that it is not worth to force an different naming just to make it different
              3. I propose a list of required fields and 2 optional for each document. I was initially thinking about allowing components implementing search to add more fields, but this would create a complex dependency from search components to search engines, that would require us to add a bunch of new methods, I am not sure if it is worth. If anyone can think of an example that could not be possible with the proposed list of fields we can create an issue for it or discuss it. I've proposed a distribution (https://github.com/dmonllao/moodle/compare/MDL-31989_master~6...MDL-31989_master#diff-af77ee00a0c44b2a2b51e63c62e06db7R69) The optional fields are named 'name' and 'intro', based on module naming. But I am not 100% convinced we necessarily need them with these names, they could be renamed to extratitle & extracontent for example so they would also make sense for other components that need to inject extra data
              4. Using dynamic fields would not be ideal if we want to allow multiple search engines, their default format is too solr-specific, and if you have to change them in schema.xml anyway there is no benefit in using them. We can though set the required moodle fields in the schema through Solr Schema API (https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig). Search engines are responsible of reading the fields definition (a php array - \core\search\document) and set them. I've added a CLI script to set the required and optional fields in solr through its Schema REST API so people don't need to worry about overwriting solr files to use moodle, and deal with different solr versions conflicts... If there is anyone with experience on solr and performance and you think I am missing any important point please comment, I've read that schemaless indexes are not good for performance, but I haven't found any comment regarding negative performance impact of a managed schema definition, once the fields are added through the script mutable can be set to false again in case this may affect performance. In any case running the php CLI script is not required, but makes things easier.

              Changes I've introduced to the code resulting from GSOC + Daniel Neis additions:

              1. Allow not only modules but moodle components (all plugin types + core subsystems)
              2. Force the interface that components implement to define the context levels (system, course, module..) where they live so we can partially automate access control. Plugins and subsystems implementing search can always extend these basic access controls. For example, mod_forum contains data at module level, we automatically check cm_info->uservisible
              3. Fields contents validation according to the field type
              4. Move all config stuff from core to each plugin/subsystem
              5. Restrict the data that modules can add to the search engine and make it general enough to be shared between any plugin type and core subsystems. Things like modulelink should be responsability of the component implementing search, different components adding different stuff would make harder to filter, to retrieve fields and to add compatibility with other search engines
              6. New task to index docs instead of cron.php
              7. Constants to \core_search, some of them could be converted to settings, we can create followup issues for it
              8. After discussing it with Eloy, being search both a core API to implement by other components and a subsystem, I moved all classes structure to search/classes, instead of a lib/classes/core_search we have a search/classes/manager
              9. Cache key generated from all search filters and defined in lib/db/caches.php, invalidated on new indexing. It is MODE_SESSION so invalidation is quite aggressive. We can look at different approaches
              10. Performance stuff: get_fast_modinfo, static cached stuff, reduce DB queries... It took around 3h 20min (local solr server) to index 668090 docs (only forum posts) and executed 946965 db queries to do it. I will continue testing this, but IMO these results would be acceptable.
              11. Decoupled time format stuff from base document as the way it is stored depends on the search engine, timestamps is the default format (I've overwritten it with the solr one)
              12. Language strings cleanup
              13. I've created a new renderer from scratch and restyle it, the previous one was a mix of db queries, logic and UI rendering, broken pagination.... I've added a mustache template for a result. I would like to get rid of the global search block and replace it with a tiny search box in the navigation bar loading results through AJAX (would require a new result_short mustache template) I've talked about it with Ryan Wyllie, we will try to come with a proposal
              14. Unit test for each different component

              More comments:

              1. I am playing with a pretty big database (19GB) and indexing only forum posts (around 670000) the results so far are pretty good. With the access control at context level + specific forum post checkings (most other modules or components will not require other checkings than modinfo->uservisible) I get the results + render them (10 per page) faster than any of this site's main course pages. I'm limiting the search engine returned results to 100, this helps a lot, it is up to the search engine to decide what is more relevant. Important to mention that the solr server I'm using is in localhost. I want to test with a bigger dataset, I will implement search in a couple more components to see how it behaves with a couple of million records.
              2. What to do with the current lib/searchlib.php? I don't think we can get rid of it. I see that it is used in forum search (search_lexer at least) but I don't know the state of it. Andrew Nicols might have an idea. In any case, this global search requires external systems to work, we can not replace forum search with this
              3. I would vote for moving files indexing to another issue on top of this one. We should not hardcode in search API which components can add files
              4. Another followup issue could be to allow users to get results from courses opened to guests, at the moment users can only access their own courses + frontpage
              5. After some changes, highlighting is working but the returned highlighted fields don't work like I would expect, probably is due to my ignorance in the area, more eyes are welcomed
              6. get more results like this & facets, I would only include it as part of this issue if adding it later will make us break backwards compatibility with the APIs we are defining now. Any strong point / ideas about it? As long as they are in search_solr's scope (like highlighting) we should not have any problem
              7. I agree with Tomasz, search_elastic support should be also moved to another issue, I'm planning to mentor a GSOC project this year, search_elastic or files support could be good candidates
              8. Haven't checked if all search_solr server stuff related is working, just localhost dev, it should just work as it is using SolrClient, but would be nice if anyone interested on using global search can play with it
              9. Haven't checked Windows
              10. About solr supported versions... I've been playing with 5.x branch so no idea about previous versions
              11. We would need to compare pagination on solr side vs on moodle side with PHP, I've been more inclined by PHP side based on https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results#PaginationofResults-PerformanceProblemswith"DeepPaging"
              12. Also compare addDocuments vs addDocument + final single commit performance, not sure if it should be the same or there is a performance impact. Related with this, we call addDocument with commitWithin to 15 secs. The only issue this may lead to is that, if there is any exception during indexing and moodle can not store the lastindexrun value some documents will be indexed again as there is no synchronization between what is already commited to the engine and what moodle thinks it is, it is not a big issue, they would be overwritten and done.
              13. I am naming the whole moodle indexed stuff a 'collection' as in solr cloud, but this implementation can also work on solr standalone mode, so it should probably be renamed to index, any opinion?

              Information to test this in https://github.com/dmonllao/moodle/blob/MDL-31989_master/search/engine/solr/README.md, it is compatible with both cloud and standalone modes.

              I had to clean up the current code to move forward, I've pushed the code before the clean up to https://github.com/dmonllao/moodle/tree/MDL-31989-before-cleanup, I'm not throwing away this code at all, just storing it there as we will need it in future for follow up issues.

              Many people is contributing here, I think that everybody should be credited, these are the names I've found, please comment if I'm missing someone:

              Show
              dmonllao David Monllaó added a comment - Hi, sorry for the delay, I'm pushing a complete patch for peer review. We can limit this issue's scope to agree about the APIs (search engine, search component and document), add a search_solr plugin and implement search in a core components. I would really appreciate if someone experienced on different search engines, could comment about the engine APIs. Some comments: Tomasz, delete was working through 'Statistics' link, I've moved it to a new report_search in 'Reports' -> 'Search' I'm proposing a field format using solr fields naming, elasticsearch or other engines naming differs a bit indexed -> index, stored -> store... We can change this and set a new engine agnostic format, and force all search engines to map to their own format, but the naming used in solr is perfectly understandable, so I would say that it is not worth to force an different naming just to make it different I propose a list of required fields and 2 optional for each document. I was initially thinking about allowing components implementing search to add more fields, but this would create a complex dependency from search components to search engines, that would require us to add a bunch of new methods, I am not sure if it is worth. If anyone can think of an example that could not be possible with the proposed list of fields we can create an issue for it or discuss it. I've proposed a distribution ( https://github.com/dmonllao/moodle/compare/MDL-31989_master~6...MDL-31989_master#diff-af77ee00a0c44b2a2b51e63c62e06db7R69 ) The optional fields are named 'name' and 'intro', based on module naming. But I am not 100% convinced we necessarily need them with these names, they could be renamed to extratitle & extracontent for example so they would also make sense for other components that need to inject extra data Using dynamic fields would not be ideal if we want to allow multiple search engines, their default format is too solr-specific, and if you have to change them in schema.xml anyway there is no benefit in using them. We can though set the required moodle fields in the schema through Solr Schema API ( https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig ). Search engines are responsible of reading the fields definition (a php array - \core\search\document) and set them. I've added a CLI script to set the required and optional fields in solr through its Schema REST API so people don't need to worry about overwriting solr files to use moodle, and deal with different solr versions conflicts... If there is anyone with experience on solr and performance and you think I am missing any important point please comment, I've read that schemaless indexes are not good for performance, but I haven't found any comment regarding negative performance impact of a managed schema definition, once the fields are added through the script mutable can be set to false again in case this may affect performance. In any case running the php CLI script is not required, but makes things easier. Changes I've introduced to the code resulting from GSOC + Daniel Neis additions: Allow not only modules but moodle components (all plugin types + core subsystems) Force the interface that components implement to define the context levels (system, course, module..) where they live so we can partially automate access control. Plugins and subsystems implementing search can always extend these basic access controls. For example, mod_forum contains data at module level, we automatically check cm_info->uservisible Fields contents validation according to the field type Move all config stuff from core to each plugin/subsystem Restrict the data that modules can add to the search engine and make it general enough to be shared between any plugin type and core subsystems. Things like modulelink should be responsability of the component implementing search, different components adding different stuff would make harder to filter, to retrieve fields and to add compatibility with other search engines New task to index docs instead of cron.php Constants to \core_search, some of them could be converted to settings, we can create followup issues for it After discussing it with Eloy, being search both a core API to implement by other components and a subsystem, I moved all classes structure to search/classes, instead of a lib/classes/core_search we have a search/classes/manager Cache key generated from all search filters and defined in lib/db/caches.php, invalidated on new indexing. It is MODE_SESSION so invalidation is quite aggressive. We can look at different approaches Performance stuff: get_fast_modinfo, static cached stuff, reduce DB queries... It took around 3h 20min (local solr server) to index 668090 docs (only forum posts) and executed 946965 db queries to do it. I will continue testing this, but IMO these results would be acceptable. Decoupled time format stuff from base document as the way it is stored depends on the search engine, timestamps is the default format (I've overwritten it with the solr one) Language strings cleanup I've created a new renderer from scratch and restyle it, the previous one was a mix of db queries, logic and UI rendering, broken pagination.... I've added a mustache template for a result. I would like to get rid of the global search block and replace it with a tiny search box in the navigation bar loading results through AJAX (would require a new result_short mustache template) I've talked about it with Ryan Wyllie , we will try to come with a proposal Unit test for each different component More comments: I am playing with a pretty big database (19GB) and indexing only forum posts (around 670000) the results so far are pretty good. With the access control at context level + specific forum post checkings (most other modules or components will not require other checkings than modinfo->uservisible) I get the results + render them (10 per page) faster than any of this site's main course pages. I'm limiting the search engine returned results to 100, this helps a lot, it is up to the search engine to decide what is more relevant. Important to mention that the solr server I'm using is in localhost. I want to test with a bigger dataset, I will implement search in a couple more components to see how it behaves with a couple of million records. What to do with the current lib/searchlib.php? I don't think we can get rid of it. I see that it is used in forum search (search_lexer at least) but I don't know the state of it. Andrew Nicols might have an idea. In any case, this global search requires external systems to work, we can not replace forum search with this I would vote for moving files indexing to another issue on top of this one. We should not hardcode in search API which components can add files Another followup issue could be to allow users to get results from courses opened to guests, at the moment users can only access their own courses + frontpage After some changes, highlighting is working but the returned highlighted fields don't work like I would expect, probably is due to my ignorance in the area, more eyes are welcomed get more results like this & facets, I would only include it as part of this issue if adding it later will make us break backwards compatibility with the APIs we are defining now. Any strong point / ideas about it? As long as they are in search_solr's scope (like highlighting) we should not have any problem I agree with Tomasz, search_elastic support should be also moved to another issue, I'm planning to mentor a GSOC project this year, search_elastic or files support could be good candidates Haven't checked if all search_solr server stuff related is working, just localhost dev, it should just work as it is using SolrClient, but would be nice if anyone interested on using global search can play with it Haven't checked Windows About solr supported versions... I've been playing with 5.x branch so no idea about previous versions We would need to compare pagination on solr side vs on moodle side with PHP, I've been more inclined by PHP side based on https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results#PaginationofResults-PerformanceProblemswith "DeepPaging" Also compare addDocuments vs addDocument + final single commit performance, not sure if it should be the same or there is a performance impact. Related with this, we call addDocument with commitWithin to 15 secs. The only issue this may lead to is that, if there is any exception during indexing and moodle can not store the lastindexrun value some documents will be indexed again as there is no synchronization between what is already commited to the engine and what moodle thinks it is, it is not a big issue, they would be overwritten and done. I am naming the whole moodle indexed stuff a 'collection' as in solr cloud, but this implementation can also work on solr standalone mode, so it should probably be renamed to index, any opinion? Information to test this in https://github.com/dmonllao/moodle/blob/MDL-31989_master/search/engine/solr/README.md , it is compatible with both cloud and standalone modes. I had to clean up the current code to move forward, I've pushed the code before the clean up to https://github.com/dmonllao/moodle/tree/MDL-31989-before-cleanup , I'm not throwing away this code at all, just storing it there as we will need it in future for follow up issues. Many people is contributing here, I think that everybody should be credited, these are the names I've found, please comment if I'm missing someone: https://github.com/prateeksachan https://github.com/tmuras https://github.com/danielneis https://github.com/doctorlard https://github.com/eugeneventer
              Hide
              cibot CiBoT added a comment -

              Fails against automated checks.

              Checked MDL-31989 using repository: https://github.com/dmonllao/moodle

              More information about this report

              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle Testing instructions are missing. master (11 errors / 9 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , php (6/8) , js (0/0) , css (0/0) , phpdoc (5/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , More information about this report
              Hide
              cibot CiBoT added a comment -

              Fails against automated checks.

              Checked MDL-31989 using repository: https://github.com/dmonllao/moodle

              More information about this report

              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle Testing instructions are missing. master (0 errors / 2 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , php (0/1) , js (0/0) , css (0/0) , phpdoc (0/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , More information about this report
              Hide
              lameze Simey Lameze added a comment - - edited

              Hello David Monllaó, thanks for catching up with this issue.

              I've managed to make the global search work, which is really cool, I'm learning from the code and new things that I never seen before, such solr.
              So far, I must say well done! Great job from you and the other devs that spent time working to make this happen, I think it's pretty close to get this in for 3.1, I would strongly consider this new feature, it's highly voted and would make many people happy out there.

              As we discussed, I went through half of the patch, there are some things that I spotted and some questions to ask, so I'll post here now so you won't be blocked waiting my review.

              1. During the upgrade I've got the following debug message, not sure if it was some caching issue, but I think worth having a look:

                Invalid get_string() identifier: 'solroptional' or component 'admin'. Perhaps you are missing $string['solroptional'] = ''; in lang/en/admin.php?
                line 349 of /lib/classes/string_manager_standard.php: call to debugging()
                line 6673 of /lib/moodlelib.php: call to core_string_manager_standard->get_string()
                line 1487 of /lib/environmentlib.php: call to get_string()
                line 1898 of /admin/renderer.php: call to environment_results->strToReport()
                line 172 of /admin/renderer.php: call to core_admin_renderer->environment_check_table()
                line 354 of /admin/index.php: call to core_admin_renderer->upgrade_environment_page()
                

              2. It states it's just supporting solr at the moment, looking the settings, I had the impression that I'm able to add another search engine easily, how easily that would be? I guess we would have to create another issue in the future to support other search engines on core by default (if many people ask for it).
              3. In the Component dropdown, I just saw a setting for forum, what about other modules? Are they covered in this first moment.
              4. I would suggest improve the lang strings, to make it more nice. Furthermore, I would ask Helen advice on that.
                • Strings such: You must have, at least, one search engine installed and ready to use. could be reworded IMO.
                • solr not installed, Solr PHP extension is not installed, please follow the documentation. are another case.
                • Available search components could be reworded to Available components for search.
              5. All components/forum, why a separate option for forum? Is the global search really doing the search in all components?
              6. Would be really nice something automatic, one you are done with the setup would take the user to the page to recreate index, would give the new feature a much more professional look. Imagine generating the index and displaying a progress bar (like backup). Just a thought, not to be done now.
              7. Is this indentation correct, seems it should be 8 spaces in the next line.
              8. You could add use namespace core_search, instead of using \core_search\ everywhere on the indexer.php, In my opinion would improve the readability of the file.
              9. When I did my first search, got this error message:

                Warning: strip_tags() expects parameter 1 to be string, array given in /home/moodle/moodles/stable_review/moodle/search/classes/document.php on line 208
                

              10. The period of date displayed on the filter, seems non sense.. I would let a default period of 1 week ago to now.
              11. If you choose a modified date before (let's say I want search something before 31 of January), it doesn't retrieve any results.
              12. How's going to work with the new cloud tag introduced on 3.0?
              13. I think the whole global search could be a bit more standard, most of the blocks names has _ to space between words, but the globalsearch block does not.
                • The task name is global_search_task, in other places, such events are using just search. I think we should standardize that.
              14. The report > search really needs more work I think, each option should have an explanation and I think it should be placed with the
              15. The result is trimming the last letter of the user name in the search results. (Admin use, Herbert Garriso), more than that. I think the indexes are being generated like that. Because if you try to search as Admin User, it doesn't find any records, but if you search for Admin use, it displays the results.
              16. I would change to display the search results like a list, with at hr as a line separator, changing the color to grey when the mouse is over the line.
              really minors, feel free to ignore me.
              1. The search/manager class instance comment block says it triggers \moodle_exception, but it also throws \core_search\engine_exception.
              2. Missing period here and other comments block on this class are missing the period.
              3. Unused global $DB; variable here.
              4. Unused $CFG here.
              5. I would remove this space.
              6. Incorrect parameter declaration, it can be bool as well here
              7. Unused variables here

              I kept looking today(Monday), but my overall feeling is a good addition to moodle. Something more integrated with the moodle, would be even better.. the backend part seems really well done, just thought the frontend part could be looked with a bit more of detail. The impression I had to visit 3 different places to make the whole thing work (specially the report > search) seems not the right place for that feature, as it's not just a report.

              I had the impression that this is a good feature, but can end up hidden like many others. Unfortunately blocks structure stuck the usability and the user experience.
              But, I think we can try to improve the usability using what we have now, it's early 3.1 dev so many things can be add/improved.

              Sorry if I asked some questions that you already explained, I could read the whole history of the issue.

              Thanks.

              Show
              lameze Simey Lameze added a comment - - edited Hello David Monllaó , thanks for catching up with this issue. I've managed to make the global search work, which is really cool, I'm learning from the code and new things that I never seen before, such solr. So far, I must say well done! Great job from you and the other devs that spent time working to make this happen, I think it's pretty close to get this in for 3.1, I would strongly consider this new feature, it's highly voted and would make many people happy out there. As we discussed, I went through half of the patch, there are some things that I spotted and some questions to ask, so I'll post here now so you won't be blocked waiting my review. During the upgrade I've got the following debug message, not sure if it was some caching issue, but I think worth having a look: Invalid get_string() identifier: 'solroptional' or component 'admin'. Perhaps you are missing $string['solroptional'] = ''; in lang/en/admin.php? line 349 of /lib/classes/string_manager_standard.php: call to debugging() line 6673 of /lib/moodlelib.php: call to core_string_manager_standard->get_string() line 1487 of /lib/environmentlib.php: call to get_string() line 1898 of /admin/renderer.php: call to environment_results->strToReport() line 172 of /admin/renderer.php: call to core_admin_renderer->environment_check_table() line 354 of /admin/index.php: call to core_admin_renderer->upgrade_environment_page() It states it's just supporting solr at the moment, looking the settings, I had the impression that I'm able to add another search engine easily, how easily that would be? I guess we would have to create another issue in the future to support other search engines on core by default (if many people ask for it). In the Component dropdown, I just saw a setting for forum, what about other modules? Are they covered in this first moment. I would suggest improve the lang strings, to make it more nice. Furthermore, I would ask Helen advice on that. Strings such: You must have, at least, one search engine installed and ready to use. could be reworded IMO. solr not installed, Solr PHP extension is not installed, please follow the documentation. are another case. Available search components could be reworded to Available components for search . All components/forum, why a separate option for forum? Is the global search really doing the search in all components? Would be really nice something automatic, one you are done with the setup would take the user to the page to recreate index, would give the new feature a much more professional look. Imagine generating the index and displaying a progress bar (like backup). Just a thought, not to be done now. Is this indentation correct, seems it should be 8 spaces in the next line. You could add use namespace core_search, instead of using \core_search\ everywhere on the indexer.php, In my opinion would improve the readability of the file. When I did my first search, got this error message: Warning: strip_tags() expects parameter 1 to be string, array given in /home/moodle/moodles/stable_review/moodle/search/classes/document.php on line 208 The period of date displayed on the filter, seems non sense.. I would let a default period of 1 week ago to now. If you choose a modified date before (let's say I want search something before 31 of January), it doesn't retrieve any results. How's going to work with the new cloud tag introduced on 3.0? I think the whole global search could be a bit more standard, most of the blocks names has _ to space between words, but the globalsearch block does not. The task name is global_search_task, in other places, such events are using just search. I think we should standardize that. The report > search really needs more work I think, each option should have an explanation and I think it should be placed with the The result is trimming the last letter of the user name in the search results. (Admin use, Herbert Garriso), more than that. I think the indexes are being generated like that. Because if you try to search as Admin User, it doesn't find any records, but if you search for Admin use, it displays the results. I would change to display the search results like a list, with at hr as a line separator, changing the color to grey when the mouse is over the line. really minors, feel free to ignore me. The search/manager class instance comment block says it triggers \moodle_exception , but it also throws \core_search\engine_exception . Missing period here and other comments block on this class are missing the period. Unused global $DB; variable here . Unused $CFG here . I would remove this space . Incorrect parameter declaration, it can be bool as well here Unused variables here I kept looking today(Monday), but my overall feeling is a good addition to moodle. Something more integrated with the moodle, would be even better.. the backend part seems really well done, just thought the frontend part could be looked with a bit more of detail. The impression I had to visit 3 different places to make the whole thing work (specially the report > search) seems not the right place for that feature, as it's not just a report. I had the impression that this is a good feature, but can end up hidden like many others. Unfortunately blocks structure stuck the usability and the user experience. But, I think we can try to improve the usability using what we have now, it's early 3.1 dev so many things can be add/improved. Sorry if I asked some questions that you already explained, I could read the whole history of the issue. Thanks.
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for the review Simey, I can imagine it was not specially pleasant given how big the patch is.

              1. Thanks! I missed that one
              2. Yes, that is the idea, it wouldn't be specially hard, although I am not sure if would be better to keep alternatives in core or as 3rd party plugins. In any case it is important to have a generic search engine API instead of limiting ourselves to Solr. You can even implement a search_postgres and set everything under a single table, for small sites or for testing purposes.
              3. The search API allows every moodle component (plugins and core subsystems) to add data to the search engine. Forum is just first component to implement search, but if this is accepted for 3.1 I would add search to, at least, glossary and databases before the freeze, wouldn't be very useful to search only in forums, well, maybe for sites like moodle.org, but not for an average institution.
              4. Yeah, I agree with you Helen Foster's help would be much appreciated as I have mostly see it from a technical point of view. I've added screenshots, in any case, and being only master we could defer language improvements to another issue if she is busy these weeks. I've changed what you have proposed.
              5. Related with previous #3 point, this patch only implements search for forums, each component should add their own search classes, this should be done as part of a different issue, IMO it is better if we can integrate the base and after it add search to more and more components
              6. Sorry, I didn't explain it properly, in production sites search will usually run through cron, but a note after the set up would not hurt anyone (I've added it) Now that you have gone through all the setup process having no previous idea of what this "search thing" is, what would you automate? I mean
              7. I've changed it to 8
              8. According to the coding style we should not import entire namespaces, and even if we do it it will be core_search, IMO it is not worth the change
              9. As we saw together you indexed data before setting up the schema, this means that your fields in solr had multivalued attribute to true, which stores the data in an array, if you got the warning when searching the execution flow should come from set_data_from_engine returning an array instead of a scalar. I added an exception for this case while you was reviewing it. I've added the same exception in \core_search\document::set.
              10. By default modified before and modified after are disabled, I don't think I can set the default to a date and disable it, if you know how to do it please let me know
              11. I fixed this while you was reviewing the patch It should be fixed
              12. I am not sure I get your point, tags is a different core subsystem, although tags could implement search, if tags implement search and search is limited to tags component then it should be similar to tags subsystem own search system. Ping me if it is not clear.
              13. I am not sure about this point; I intentionally changed the block name to globalsearch as I remember discussions about blocks allowing underscores on their names and that it was a bad decision, although it is unlikely that we change this now I don't like to use underscores in plugins names; I've tended to prefix a 'global_' in some cases because search is a pretty common name but I might have over prefixed stuff like in the task case, I've removed global_ from the task name
              14. I don't expect people to use report_search much, to see the server status they should check the server. This is more an admin tool for small sites and to force indexing. I don't think it is worth spending much time on this.
              15. hehee, funny one, I fixed this already, I was calling trim with '\r\n' instead of "\r\n"
              16. Yeah, you are right, I've added a hr, looks better now

              Minor stuff:

              1. engine_exception extends moodle_exception so I though it was enough, I set them both
              2. I thought it was compulsory but it is not I've added anyway
              3. Done
              4. Done
              5. I've removed the &, php 5 is already assigning object by reference
              6. Done
              7. Done

              About something more integrated in moodle, I agree with you, but you had to visit 3 different places because you are in a dev environment, in production environment people would use cron. They would just need to:

              1. Install solr (it is a separate service, we can't do much about it)
              2. Set up solr in moodle
              3. Add global search block to frontpage

              We could skip the third step replacing the block with a tiny search icon in .navbar if search is enabled (should not take much space, should be responsive) something like http://developer.android.com/index.html (top-right side) and using autocomplete while typing to show results calling an external function, it would also require a new template. I opted to submit the block solution following the initial proposal but I would be very keen to replace it with something more modern and integrated to moodle if integrators agree with.

              I've found an issue (characters escaping related) that I need to fix before moving this forward and I'd also like to wait a bit to give people a chance to comment before sending this to integration. In any case, this is master only and we can change stuff later. Thanks again for the review and feel free to comment about any problem you found during the setup.

              Show
              dmonllao David Monllaó added a comment - Thanks for the review Simey, I can imagine it was not specially pleasant given how big the patch is. Thanks! I missed that one Yes, that is the idea, it wouldn't be specially hard, although I am not sure if would be better to keep alternatives in core or as 3rd party plugins. In any case it is important to have a generic search engine API instead of limiting ourselves to Solr. You can even implement a search_postgres and set everything under a single table, for small sites or for testing purposes. The search API allows every moodle component (plugins and core subsystems) to add data to the search engine. Forum is just first component to implement search, but if this is accepted for 3.1 I would add search to, at least, glossary and databases before the freeze, wouldn't be very useful to search only in forums, well, maybe for sites like moodle.org, but not for an average institution. Yeah, I agree with you Helen Foster 's help would be much appreciated as I have mostly see it from a technical point of view. I've added screenshots, in any case, and being only master we could defer language improvements to another issue if she is busy these weeks. I've changed what you have proposed. Related with previous #3 point, this patch only implements search for forums, each component should add their own search classes, this should be done as part of a different issue, IMO it is better if we can integrate the base and after it add search to more and more components Sorry, I didn't explain it properly, in production sites search will usually run through cron, but a note after the set up would not hurt anyone (I've added it) Now that you have gone through all the setup process having no previous idea of what this "search thing" is, what would you automate? I mean I've changed it to 8 According to the coding style we should not import entire namespaces, and even if we do it it will be core_search, IMO it is not worth the change As we saw together you indexed data before setting up the schema, this means that your fields in solr had multivalued attribute to true, which stores the data in an array, if you got the warning when searching the execution flow should come from set_data_from_engine returning an array instead of a scalar. I added an exception for this case while you was reviewing it. I've added the same exception in \core_search\document::set. By default modified before and modified after are disabled, I don't think I can set the default to a date and disable it, if you know how to do it please let me know I fixed this while you was reviewing the patch It should be fixed I am not sure I get your point, tags is a different core subsystem, although tags could implement search, if tags implement search and search is limited to tags component then it should be similar to tags subsystem own search system. Ping me if it is not clear. I am not sure about this point; I intentionally changed the block name to globalsearch as I remember discussions about blocks allowing underscores on their names and that it was a bad decision, although it is unlikely that we change this now I don't like to use underscores in plugins names; I've tended to prefix a 'global_' in some cases because search is a pretty common name but I might have over prefixed stuff like in the task case, I've removed global_ from the task name I don't expect people to use report_search much, to see the server status they should check the server. This is more an admin tool for small sites and to force indexing. I don't think it is worth spending much time on this. hehee, funny one, I fixed this already, I was calling trim with '\r\n' instead of "\r\n" Yeah, you are right, I've added a hr, looks better now Minor stuff: engine_exception extends moodle_exception so I though it was enough, I set them both I thought it was compulsory but it is not I've added anyway Done Done I've removed the &, php 5 is already assigning object by reference Done Done About something more integrated in moodle , I agree with you, but you had to visit 3 different places because you are in a dev environment, in production environment people would use cron. They would just need to: Install solr (it is a separate service, we can't do much about it) Set up solr in moodle Add global search block to frontpage We could skip the third step replacing the block with a tiny search icon in .navbar if search is enabled (should not take much space, should be responsive) something like http://developer.android.com/index.html (top-right side) and using autocomplete while typing to show results calling an external function, it would also require a new template. I opted to submit the block solution following the initial proposal but I would be very keen to replace it with something more modern and integrated to moodle if integrators agree with. I've found an issue (characters escaping related) that I need to fix before moving this forward and I'd also like to wait a bit to give people a chance to comment before sending this to integration. In any case, this is master only and we can change stuff later. Thanks again for the review and feel free to comment about any problem you found during the setup.
              Hide
              cibot CiBoT added a comment -

              Fails against automated checks.

              Checked MDL-31989 using repository: https://github.com/dmonllao/moodle

              More information about this report

              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master [branch: MDL-31989_master | CI Job ] Info: the branch is based off moodle.git Info: base commit c9d91bb7347d8d35d90a7cbf8e4cdb518230c428 being used as initial commit. Error: The MDL-31989 _master branch at https://github.com/dmonllao/moodle does not apply clean to origin/master More information about this report
              Hide
              cibot CiBoT added a comment -
              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master (4 errors / 2 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , php (4/1) , js (0/0) , css (0/0) , phpdoc (0/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , More information about this report
              Hide
              dmonllao David Monllaó added a comment -

              Simey, I've added a search box to the navigation bar.

              Show
              dmonllao David Monllaó added a comment - Simey, I've added a search box to the navigation bar.
              Hide
              dmonllao David Monllaó added a comment -

              Oh, forgot to mention, Yesterday I also fixed the escape characters issue I mentioned.

              Show
              dmonllao David Monllaó added a comment - Oh, forgot to mention, Yesterday I also fixed the escape characters issue I mentioned.
              Hide
              lameze Simey Lameze added a comment -

              Wow just tried that new search bar, looks AWESOME!!! That's one thing that take this new feature to another level.. completely integrated with moodle.

              That bar is just going to display if global search is enabled? Otherwise can be a bit confusing with the search we have on the bottom of the menu (regular moodle search), for a user that is using moodle for the very first time.

              But I must say, looks waaaaaaay better now. Well done!

              Show
              lameze Simey Lameze added a comment - Wow just tried that new search bar, looks AWESOME!!! That's one thing that take this new feature to another level.. completely integrated with moodle. That bar is just going to display if global search is enabled? Otherwise can be a bit confusing with the search we have on the bottom of the menu (regular moodle search), for a user that is using moodle for the very first time. But I must say, looks waaaaaaay better now. Well done!
              Hide
              dmonllao David Monllaó added a comment -

              lol, yeah, looks good, I talked about it with Fred and he showed me http://developer.android.com/index.html example.

              Show
              dmonllao David Monllaó added a comment - lol, yeah, looks good, I talked about it with Fred and he showed me http://developer.android.com/index.html example.
              Hide
              dmonllao David Monllaó added a comment -

              I'm sending this to integration, copying interesting comments from https://moodle.org/mod/forum/discuss.php?d=189023 that are still applicable to this solution:

              • Would be interesting to filter by course and group results by course
              • This solution is based on cron + timestamps as using events for it will affect user experience once we start adding files, the limitation of this timestamps based system as Sam Marshall commented, is restore, data is restored using the original timemodified, which may make the indexer script skip these records, we need to find a way to fix this, will require each component to "tag" the restored indexable items to later be added by cron. Although I agree we should provide an easy way for components to do it, I would create a separate issue for it.
              • The other known limitation is that we are including the user name as part of the indexed document content, so we can filter by author, the problem is that users may change their names and there is no sync process to update all the user documents in the search engine. The best alternative I can think of is removing the author fullname from the search engine and try to get a user id from the user input in the author input, another alternative is to get rid of filtering by author
              Show
              dmonllao David Monllaó added a comment - I'm sending this to integration, copying interesting comments from https://moodle.org/mod/forum/discuss.php?d=189023 that are still applicable to this solution: Would be interesting to filter by course and group results by course This solution is based on cron + timestamps as using events for it will affect user experience once we start adding files, the limitation of this timestamps based system as Sam Marshall commented, is restore, data is restored using the original timemodified, which may make the indexer script skip these records, we need to find a way to fix this, will require each component to "tag" the restored indexable items to later be added by cron. Although I agree we should provide an easy way for components to do it, I would create a separate issue for it. The other known limitation is that we are including the user name as part of the indexed document content, so we can filter by author, the problem is that users may change their names and there is no sync process to update all the user documents in the search engine. The best alternative I can think of is removing the author fullname from the search engine and try to get a user id from the user input in the author input, another alternative is to get rid of filtering by author
              Hide
              cibot CiBoT added a comment -

              Moving this issue to current integration cycle, will be reviewed soon. Thanks for the hard work!

              Show
              cibot CiBoT added a comment - Moving this issue to current integration cycle, will be reviewed soon. Thanks for the hard work!
              Hide
              cibot CiBoT added a comment -
              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master (4 errors / 4 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , phpcs (4/3) , js (0/0) , css (0/0) , phpdoc (0/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , More information about this report
              Hide
              marina Marina Glancy added a comment -

              It's a pity David that you did not participate in our huge hooks discussions last Friday and today. This implementation looks like it could really benefit from MDL-44078

              Show
              marina Marina Glancy added a comment - It's a pity David that you did not participate in our huge hooks discussions last Friday and today. This implementation looks like it could really benefit from MDL-44078
              Hide
              dmonllao David Monllaó added a comment - - edited

              The problem using hooks would be similar than what Tomasz commented about events, time required to send files to the search engine will be too much, you can not make users wait.

              Show
              dmonllao David Monllaó added a comment - - edited The problem using hooks would be similar than what Tomasz commented about events, time required to send files to the search engine will be too much, you can not make users wait.
              Hide
              marina Marina Glancy added a comment -

              It would be exactly the same overhead as from using core_search\manager - you pretty much replicated hook dispatcher there

              Show
              marina Marina Glancy added a comment - It would be exactly the same overhead as from using core_search\manager - you pretty much replicated hook dispatcher there
              Hide
              dmonllao David Monllaó added a comment -

              Indexing runs through cron, it does not affect user experience. There is no agreement about hooks api, it has been like this for years and it is not in core, there is nothing to replicate

              Show
              dmonllao David Monllaó added a comment - Indexing runs through cron, it does not affect user experience. There is no agreement about hooks api, it has been like this for years and it is not in core, there is nothing to replicate
              Hide
              tsala Helen Foster added a comment -

              Thanks David for asking me to take a look at the language strings.

              There are a few strings with too many words capitalized. (Our policy is to only capitalize the first word in a phrase.) Also, there are a few desc strings which are possibly not necessary and can be removed.

              Here are my suggestions:

              In lang/en/block_globalsearch.php

              globalsearch:addinstance

              'Add a new global search block'

              globalsearch:myaddinstance

              'Add a new global search block to Dashboard'

              In lang/en/admin.php

              choosesearchengine

              'Search engine'

              No need for the string choosesearchengine_desc unless there is extra info about the setting?

              enableglobalsearch_desc

              'If enabled, data will be indexed and syncronised by a scheduled task.'

              It doesn't seem like globalsearchsupported_desc is necessary?

              solroptional

              'Global search requires the Apache Solr PHP extension to be installed.'

              In lang/en/search.php

              engineserverstatus

              'The search engine is not available. Please contact your administrator.'

              (just removing an unnecessary comma!)

              For the global search page as shown in the screenshot search-results.png, I'm wondering why there are two collapsable sections rather than one e.g. titled 'Filter' or 'New filter' (as in admin/user.php)?

              What is the lang string for the search query help popup? (I couldn't find a string enteryoursearchquery_help.)

              enteryoursearchquery

              'Search term'

              (I'm not sure this is the best wording; maybe someone has a better suggestion? Normally the main search field does not have a label.)

              filterqueryheader

              'Filter'

              globalsearch

              'Global search'

              In lang/en/report_search.php

              entireindex

              'Entire index'

              In lang/en/search_solr.php

              addingfields

              'Adding Moodle fields to the collection schema'

              extensionerror

              'The Apache Solr PHP extension is not installed. Please check the documentation.'

              missingconfig

              'Solr is not yet configured in Moodle.'

              searchinfo_help - where is the help popup displayed?

              Re. solrauthuser_desc solrauthpassword_desc and solrhttpconnectionport_desc does extra info need to be provided about the settings, or can these strings be removed?

              solrcollectionname_desc - sorry I don't understand this!

              solrhttpconnectiontimeout_desc

              'The HTTP connection timeout is the maximum time in seconds allowed for the HTTP data transfer operation.'

              solrsecuremode

              'Secure mode'

              solrsetting

              'Solr settings'

              solrsslkeypassword

              bq' 'SSL key password'

              solrversion_desc

              'The Apache Solr PHP extension installed is <code>{$a}</code>.'

              Show
              tsala Helen Foster added a comment - Thanks David for asking me to take a look at the language strings. There are a few strings with too many words capitalized. (Our policy is to only capitalize the first word in a phrase.) Also, there are a few desc strings which are possibly not necessary and can be removed. Here are my suggestions: In lang/en/block_globalsearch.php globalsearch:addinstance 'Add a new global search block' globalsearch:myaddinstance 'Add a new global search block to Dashboard' In lang/en/admin.php choosesearchengine 'Search engine' No need for the string choosesearchengine_desc unless there is extra info about the setting? enableglobalsearch_desc 'If enabled, data will be indexed and syncronised by a scheduled task.' It doesn't seem like globalsearchsupported_desc is necessary? solroptional 'Global search requires the Apache Solr PHP extension to be installed.' In lang/en/search.php engineserverstatus 'The search engine is not available. Please contact your administrator.' (just removing an unnecessary comma!) For the global search page as shown in the screenshot search-results.png, I'm wondering why there are two collapsable sections rather than one e.g. titled 'Filter' or 'New filter' (as in admin/user.php)? What is the lang string for the search query help popup? (I couldn't find a string enteryoursearchquery_help.) enteryoursearchquery 'Search term' (I'm not sure this is the best wording; maybe someone has a better suggestion? Normally the main search field does not have a label.) filterqueryheader 'Filter' globalsearch 'Global search' In lang/en/report_search.php entireindex 'Entire index' In lang/en/search_solr.php addingfields 'Adding Moodle fields to the collection schema' extensionerror 'The Apache Solr PHP extension is not installed. Please check the documentation.' missingconfig 'Solr is not yet configured in Moodle.' searchinfo_help - where is the help popup displayed? Re. solrauthuser_desc solrauthpassword_desc and solrhttpconnectionport_desc does extra info need to be provided about the settings, or can these strings be removed? solrcollectionname_desc - sorry I don't understand this! solrhttpconnectiontimeout_desc 'The HTTP connection timeout is the maximum time in seconds allowed for the HTTP data transfer operation.' solrsecuremode 'Secure mode' solrsetting 'Solr settings' solrsslkeypassword bq' 'SSL key password' solrversion_desc 'The Apache Solr PHP extension installed is <code>{$a}</code>.'
              Hide
              poltawski Dan Poltawski added a comment -

              Hi Everyone,

              Thanks for the persistence with this change, with what i've seen after hours looking through the code, I generally like the approach and general achitecure and I think its important to get this base in place now despite not implemented in the entirety. Getting tired now so commenting my in progress comments so far:

              1) Major issue is that I can't test this, upon running the cron task:

              Scheduled task failed: Global search indexing (core\task\search_task),Coding error detected, it must be fixed by a programmer: Missing "contentformat" field in document with id "mod_forum-2"
              Backtrace:
              * line 473 of /search/classes/manager.php: call to core_search\document->export_for_engine()
              * line 55 of /lib/classes/task/search_task.php: call to core_search\manager->index()
              * line 137 of /admin/tool/task/cli/schedule_task.php: call to core\task\search_task->execute()e:
              

              2) I would like to see this be a bit more admin friendly to be able to tell if the search system is setup, the schema created and so on from the web UI. It's a bit of a barrier at the moment and especially if we end up releasing with only solr, lots of people will be interested in this. The report is not obvious to find for example.. Maybe on the search settings page we could have a summary of the things which need setting up, a way to see if the solr connection is in place and so on. Index status etc. It was not very clear to me

              3) I love the search bar - but I think its a bit premature when we've only got mod_forum implemented.

              4) I think the environmental 'recommendation' for solr might be too strong, especially because it not only needs the php extension to work, but also need the server installed and listening. We don't have the same thing in environment.xml fro memcached for example. Also, I like to hope we would eventually have a basic local database engine to make ti work out of the box.

              5) get_component_type_visible_name $lazyload paramater looks like garbage to me. Its being passed as $a to get_string() isn't it?

              6) I don't like get_config_var_name() - its akward and seems a bit too magic and also the use of it is weird. Wouldn't it be cleaner to make a getter and setting of config in the base class and use that, rather than make the users have to

              7) I don't think we should keep the inline markdown file - docs to the wiki, lets remove it before it becomes outdated.

              8) In the block, i'd move the component picker into the advanced options and just keep that as a search box (somehwat related to the search bar i guess)

              9) We should run this branch against hte moodle.org dataset. If ever there was a case to test this, that is it!

              Show
              poltawski Dan Poltawski added a comment - Hi Everyone, Thanks for the persistence with this change, with what i've seen after hours looking through the code, I generally like the approach and general achitecure and I think its important to get this base in place now despite not implemented in the entirety. Getting tired now so commenting my in progress comments so far: 1) Major issue is that I can't test this, upon running the cron task: Scheduled task failed: Global search indexing (core\task\search_task),Coding error detected, it must be fixed by a programmer: Missing "contentformat" field in document with id "mod_forum-2" Backtrace: * line 473 of /search/classes/manager.php: call to core_search\document->export_for_engine() * line 55 of /lib/classes/task/search_task.php: call to core_search\manager->index() * line 137 of /admin/tool/task/cli/schedule_task.php: call to core\task\search_task->execute()e: 2) I would like to see this be a bit more admin friendly to be able to tell if the search system is setup, the schema created and so on from the web UI. It's a bit of a barrier at the moment and especially if we end up releasing with only solr, lots of people will be interested in this. The report is not obvious to find for example.. Maybe on the search settings page we could have a summary of the things which need setting up, a way to see if the solr connection is in place and so on. Index status etc. It was not very clear to me 3) I love the search bar - but I think its a bit premature when we've only got mod_forum implemented. 4) I think the environmental 'recommendation' for solr might be too strong, especially because it not only needs the php extension to work, but also need the server installed and listening. We don't have the same thing in environment.xml fro memcached for example. Also, I like to hope we would eventually have a basic local database engine to make ti work out of the box. 5) get_component_type_visible_name $lazyload paramater looks like garbage to me. Its being passed as $a to get_string() isn't it? 6) I don't like get_config_var_name() - its akward and seems a bit too magic and also the use of it is weird. Wouldn't it be cleaner to make a getter and setting of config in the base class and use that, rather than make the users have to 7) I don't think we should keep the inline markdown file - docs to the wiki, lets remove it before it becomes outdated. 8) In the block, i'd move the component picker into the advanced options and just keep that as a search box (somehwat related to the search bar i guess) 9) We should run this branch against hte moodle.org dataset. If ever there was a case to test this, that is it!
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for the review Dan, I agree with most of what you said, just some comments about it, I will try to get some time for this during next week.

              1) Did you run the schema_setup.php script? It might be a bug because I added a couple for xxxxformat fields, but they should be created with the installation. May you have pulled a previous version of this patch and created the schema before Yesterday's review?
              2) Agree
              3) Agree, I will move it to a separate issue, once the search api is agreed it should be quick to add search to a few more components before the next major version
              4) Yeah, I thought too much about this one, makes sense to follow the same approach than for memcached
              5) grrr, should be the 4th get_string argument, IIRC I tried to force getting a new lang_string instance.
              6) np in having a getter setter in the base class, but we need the get_config_var_name for settings in admin/settings/plugins.php as admin_settings requires the config var name
              7) Agree
              8) ok, fine for me, I would love to get rid of this block in favour of 3) I deeply dislike moodle's blocks
              9) yeah, I've used a 19GB dataset with around 650.000 forum posts (results above), but moodle.org would be nice

              Dan, do you have any opinion about the author filter thing? I proposed to "guess" in php (moodle side) the user id from the string passed as author filter and send it to solr to filter by userid. Being a query filter not a part of the query, solr it is not basing the results score on it, just discarding documents with a user different than the provided one. Would be a pain to update all records where a user was involved once they change a part of their fullname. The other similar problem we have is that we should mark all modules context with changes on their name or intro as we include that info in the indexed doc, we can use the same system than from restore, basically a list of context to "force" indexing of documents depending on them (e.g. I edit a glossary activity name -> I mark its context so later I can get that activity glossary entries that were indexed and update them in the search engine)

              Show
              dmonllao David Monllaó added a comment - Thanks for the review Dan, I agree with most of what you said, just some comments about it, I will try to get some time for this during next week. 1) Did you run the schema_setup.php script? It might be a bug because I added a couple for xxxxformat fields, but they should be created with the installation. May you have pulled a previous version of this patch and created the schema before Yesterday's review? 2) Agree 3) Agree, I will move it to a separate issue, once the search api is agreed it should be quick to add search to a few more components before the next major version 4) Yeah, I thought too much about this one, makes sense to follow the same approach than for memcached 5) grrr, should be the 4th get_string argument, IIRC I tried to force getting a new lang_string instance. 6) np in having a getter setter in the base class, but we need the get_config_var_name for settings in admin/settings/plugins.php as admin_settings requires the config var name 7) Agree 8) ok, fine for me, I would love to get rid of this block in favour of 3) I deeply dislike moodle's blocks 9) yeah, I've used a 19GB dataset with around 650.000 forum posts (results above), but moodle.org would be nice Dan, do you have any opinion about the author filter thing? I proposed to "guess" in php (moodle side) the user id from the string passed as author filter and send it to solr to filter by userid. Being a query filter not a part of the query, solr it is not basing the results score on it, just discarding documents with a user different than the provided one. Would be a pain to update all records where a user was involved once they change a part of their fullname. The other similar problem we have is that we should mark all modules context with changes on their name or intro as we include that info in the indexed doc, we can use the same system than from restore, basically a list of context to "force" indexing of documents depending on them (e.g. I edit a glossary activity name -> I mark its context so later I can get that activity glossary entries that were indexed and update them in the search engine)
              Hide
              poltawski Dan Poltawski added a comment -

              1) Yes I ran the schema setup script - and no i've never run any previous version of this branch. I will try again on a clean install later but there does seem to be some type of bug.

              5) Ahh, I was looking at wrong get_string definition, never seen that lazy lang_string() fallback before.

              6) Mmm, well its not a life changing detail anyway.

              8) I don't dislike blocks like you and I it might be useful to have the block to replace the forum search block for example or raising the prominence of search. They don't 'cost' much.

              9) Sorry I missed that. I think that moodle.org come after it lands then.

              10) Regarding author filtering, I mean to comment about it yesterday. Do we have a strong use case for it, can we 'de-scope' it for now? The only use case I can think of is that 'I want to search for a post which David Monllaó wrote about chocolate' - it doesn't seem super compelling to me.

              By guessing in php, I guess you mean doing an SQL search and then passing a list of ids as a filter? To me that seems quite a practical approach. Alternatively providing some sort of autocomplete user selector (which is a minefield of issues) might fit the case I can think of.

              Basically I think it makes sense to put the userid in for filtering, but we could defer how we actually provide access to that filtering.

              11) New one: something worth thinking about is the path to getting rid of the current (slow and horrible) forum search functionality

              Show
              poltawski Dan Poltawski added a comment - 1) Yes I ran the schema setup script - and no i've never run any previous version of this branch. I will try again on a clean install later but there does seem to be some type of bug. 5) Ahh, I was looking at wrong get_string definition, never seen that lazy lang_string() fallback before. 6) Mmm, well its not a life changing detail anyway. 8) I don't dislike blocks like you and I it might be useful to have the block to replace the forum search block for example or raising the prominence of search. They don't 'cost' much. 9) Sorry I missed that. I think that moodle.org come after it lands then. 10) Regarding author filtering, I mean to comment about it yesterday. Do we have a strong use case for it, can we 'de-scope' it for now? The only use case I can think of is that 'I want to search for a post which David Monllaó wrote about chocolate' - it doesn't seem super compelling to me. By guessing in php, I guess you mean doing an SQL search and then passing a list of ids as a filter? To me that seems quite a practical approach. Alternatively providing some sort of autocomplete user selector (which is a minefield of issues) might fit the case I can think of. Basically I think it makes sense to put the userid in for filtering, but we could defer how we actually provide access to that filtering. 11) New one: something worth thinking about is the path to getting rid of the current (slow and horrible) forum search functionality
              Hide
              poltawski Dan Poltawski added a comment -

              12) One thing that struck me in the shower and want to write it down before I forget . Do we have a way to 'expire' and indexing - I mean say we implement mod_resource indexexing first on the intro/description only and then later adding file indexing. Or a new field added to a module which we want to index.

              Show
              poltawski Dan Poltawski added a comment - 12) One thing that struck me in the shower and want to write it down before I forget . Do we have a way to 'expire' and indexing - I mean say we implement mod_resource indexexing first on the intro/description only and then later adding file indexing. Or a new field added to a module which we want to index.
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for commenting Dan.

              10) auto completion is a very good idea
              11) as long as we depend on solr I don't think we can get rid of it unless we wrote a search_db engine to write into an moodle db table (https://tracker.moodle.org/browse/MDL-31989?focusedCommentId=392990&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-392990 #2) but would only work for small sites and not even in that case is a recommendable choice, many redundant data
              12) apart from setting / removing fields in solr, in both cases we would need to go though all resources, would be enough with a run checking against timemodified > 0

              Show
              dmonllao David Monllaó added a comment - Thanks for commenting Dan. 10) auto completion is a very good idea 11) as long as we depend on solr I don't think we can get rid of it unless we wrote a search_db engine to write into an moodle db table ( https://tracker.moodle.org/browse/MDL-31989?focusedCommentId=392990&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-392990 #2) but would only work for small sites and not even in that case is a recommendable choice, many redundant data 12) apart from setting / removing fields in solr, in both cases we would need to go though all resources, would be enough with a run checking against timemodified > 0
              Hide
              dmonllao David Monllaó added a comment -

              Many thanks for the language strings improvements Helen, I will apply them once I return to this issue.

              Regarding For the global search page as shown in the screenshot search-results.png, I'm wondering why there are two collapsable sections rather than one e.g. titled 'Filter' or 'New filter' (as in admin/user.php)? the first one is expanded by default and it only contains the text box for content search, the second one if for extra filters that can be applied over the first query results.

              The search query for enteryoursearchquery_help is searchinfo_help, in search/engine/solr/lang/en/search_solr.php, it depends on the selected search engine. I am not sure about using "Search term" because we are able to process a lot more than just a term, for example the text pasted below is solr's search engine searchinfo_help string value:

              Features you can use while performing search queries. Search queries are contained within []:
               
              * Fields: You can specify which fields you want results from.
              [title:("moodle" + "perth")]: returns all records that contains both "moodle" and "perth" in the title.
              Available fields: title, name, content, user, author.
              * Boolean Operators ("AND", "OR", "NOT"): <br>[("moodle" AND "perth") OR ("moodle" AND "australia")]
              * Wildcards ("&#42;", "?"): <br>["mo??dl&#42;"] returns both "moodle" and "moodledata".
              * Proximity Searches ("~"): ["mood"~2] returns "moodle". <br>(2 alphabets away from "mood").<br>
              ["moodle australia"~3] returns results containing "moodle hq at perth australia" (the queried terms were within 3 words of each other)
              * Boosting Terms ("^"): To boost certain words/phrases. <br>
              ["perth australia"^5 "australia"] will make results with the phrase "perth australia" more relevant.
              

              Re solrauthuser_desc solrauthpassword_desc and solrhttpconnectionport_desc does extra info need to be provided about the settings, or can these strings be removed yeah, np

              Thanks!

              Show
              dmonllao David Monllaó added a comment - Many thanks for the language strings improvements Helen, I will apply them once I return to this issue. Regarding For the global search page as shown in the screenshot search-results.png, I'm wondering why there are two collapsable sections rather than one e.g. titled 'Filter' or 'New filter' (as in admin/user.php)? the first one is expanded by default and it only contains the text box for content search, the second one if for extra filters that can be applied over the first query results. The search query for enteryoursearchquery_help is searchinfo_help, in search/engine/solr/lang/en/search_solr.php, it depends on the selected search engine. I am not sure about using "Search term" because we are able to process a lot more than just a term, for example the text pasted below is solr's search engine searchinfo_help string value: Features you can use while performing search queries. Search queries are contained within []:   * Fields: You can specify which fields you want results from. [title:("moodle" + "perth")]: returns all records that contains both "moodle" and "perth" in the title. Available fields: title, name, content, user, author. * Boolean Operators ("AND", "OR", "NOT"): <br>[("moodle" AND "perth") OR ("moodle" AND "australia")] * Wildcards ("&#42;", "?"): <br>["mo??dl&#42;"] returns both "moodle" and "moodledata". * Proximity Searches ("~"): ["mood"~2] returns "moodle". <br>(2 alphabets away from "mood").<br> ["moodle australia"~3] returns results containing "moodle hq at perth australia" (the queried terms were within 3 words of each other) * Boosting Terms ("^"): To boost certain words/phrases. <br> ["perth australia"^5 "australia"] will make results with the phrase "perth australia" more relevant. Re solrauthuser_desc solrauthpassword_desc and solrhttpconnectionport_desc does extra info need to be provided about the settings, or can these strings be removed yeah, np Thanks!
              Hide
              poltawski Dan Poltawski added a comment -

              re. 1) I have tried a clean install and been unable to reproduece it.

              Show
              poltawski Dan Poltawski added a comment - re. 1) I have tried a clean install and been unable to reproduece it.
              Hide
              poltawski Dan Poltawski added a comment -

              Reproduced and the problem is the 'empty' check which fails on 0 value:

              diff --git a/search/classes/document.php b/search/classes/document.php
              index b8a6eda..fe0427e 100644
              --- a/search/classes/document.php
              +++ b/search/classes/document.php
              @@ -436,7 +436,7 @@ class document implements \renderable, \templatable {
                       foreach (static::$requiredfields as $fieldname => $field) {
               
                           // We also check that we have everything we need; they all need a value, !isset is not enough.
              -            if (empty($data[$fieldname])) {
              +            if (!isset($data[$fieldname])) {
                               throw new \coding_exception('Missing "' . $fieldname . '" field in document with id "' . $this->data['id'] . '"');
              

              Show
              poltawski Dan Poltawski added a comment - Reproduced and the problem is the 'empty' check which fails on 0 value: diff --git a/search/classes/document.php b/search/classes/document.php index b8a6eda..fe0427e 100644 --- a/search/classes/document.php +++ b/search/classes/document.php @@ -436,7 +436,7 @@ class document implements \renderable, \templatable { foreach (static::$requiredfields as $fieldname => $field) { // We also check that we have everything we need; they all need a value, !isset is not enough. - if (empty($data[$fieldname])) { + if (!isset($data[$fieldname])) { throw new \coding_exception('Missing "' . $fieldname . '" field in document with id "' . $this->data['id'] . '"');
              Hide
              poltawski Dan Poltawski added a comment -

              Hi David,

              I tried to understand this more by implementing global search in mod_url and I have concluded this is not yet ready for the prime time.

              Maybe it's not the use case you might think of straight away, but I imagine the 'resource' example is quite a good one. To put it in a single line:

              As a student, when revising for an exam, I want to be able to search for all previous resources in my courses for photosynthesis.

              I found it not possible to do this with the current API. At the moment it seems to me like the API works for that forum_post example but just trying to achieve this for mod_url showed a few problems:

              • The url resource doens't have a created data
              • The url resource isn't tied to a user

              So I tried hacking it a bit to support this:
              https://github.com/danpoltawski/moodle/commit/fd419e06ebd2bf1e2935a8456295e2948c6aaab5

              And ended up with an indexer like this:
              https://github.com/danpoltawski/moodle/commit/3d3f2e675f51d1917a8acb20f84167f5a0656b6d

              It also made me think that restricting the indexing to only one 'type' per component might be problematic. For example - I want to index all forum posts, but also I might want to search for a forum instance. That might not be super compelling, but it seems short sighted of us to only think there is only one type of object per component?

              Some other smaller things:
              Moving to 'advanced features' https://github.com/danpoltawski/moodle/commit/f510b4a120cacf42db9d4919bee2012e36d16fd5

              And aforementioned '0' handling case:
              https://github.com/danpoltawski/moodle/commit/50a90721ee4a369620ba28cb5159ce84e048afee

              Show
              poltawski Dan Poltawski added a comment - Hi David, I tried to understand this more by implementing global search in mod_url and I have concluded this is not yet ready for the prime time. Maybe it's not the use case you might think of straight away, but I imagine the 'resource' example is quite a good one. To put it in a single line: As a student, when revising for an exam, I want to be able to search for all previous resources in my courses for photosynthesis. I found it not possible to do this with the current API. At the moment it seems to me like the API works for that forum_post example but just trying to achieve this for mod_url showed a few problems: The url resource doens't have a created data The url resource isn't tied to a user So I tried hacking it a bit to support this: https://github.com/danpoltawski/moodle/commit/fd419e06ebd2bf1e2935a8456295e2948c6aaab5 And ended up with an indexer like this: https://github.com/danpoltawski/moodle/commit/3d3f2e675f51d1917a8acb20f84167f5a0656b6d It also made me think that restricting the indexing to only one 'type' per component might be problematic. For example - I want to index all forum posts, but also I might want to search for a forum instance. That might not be super compelling, but it seems short sighted of us to only think there is only one type of object per component? Some other smaller things: Moving to 'advanced features' https://github.com/danpoltawski/moodle/commit/f510b4a120cacf42db9d4919bee2012e36d16fd5 And aforementioned '0' handling case: https://github.com/danpoltawski/moodle/commit/50a90721ee4a369620ba28cb5159ce84e048afee
              Hide
              poltawski Dan Poltawski added a comment -

              (ps. I think we are v.close, I think that we would probably get some benefit by implementing support to a number of modules)

              Show
              poltawski Dan Poltawski added a comment - (ps. I think we are v.close, I think that we would probably get some benefit by implementing support to a number of modules)
              Hide
              cibot CiBoT added a comment -

              Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.

              Show
              cibot CiBoT added a comment - Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.
              Hide
              dmonllao David Monllaó added a comment - - edited

              Thanks for commenting Dan.

              When I started rewriting the patch I allowed multiple types per component, but I remove it because I thought it would be too complex, Yesterday I shared the proposal with Frédéric Massart and we agree that we need multiple types per component, the benefits of it against the current proposal are important enough:

              1. Easier to get better matches when searching, the current patch includes module name and module intro as part of a the forum post document, if I search for "Forum instance name" all forum posts will be retrieved without a "winner", the score will probably on forum posts containing the search terms, would be better to show as a first result "Forum instance name" module
              2. No dependencies between different models data, so no need to sync forum posts when the activity is enabled
              3. Components allowed to index more stuff
              4. Search components can be replaced by search types, search by type sounds better than search by component as the component is somehow more technical than type would be

              I propose something like \plugintype_pluginname\search\indexer\type_name, in forum example we could have \mod_forum\search\indexer\forum_posts and \mod_forum\search\indexer\activity. We could generalize most of the activity one so would be fast to index other modules data. It would require a few changes in the current patch here and there as we should enable/disable "types" instead of components, some langs strings to describe this types contents, a pluginpath/db/search.php to define which ... An alternative would be to just use \plugintype_pluginname\search\type_name without the \indexer level, I would prefer to add indexer in case there is more stuff required in future. Looking at the mod_url example you created (https://github.com/danpoltawski/moodle/commit/3d3f2e675f51d1917a8acb20f84167f5a0656b6d) I think that we should rename name and intro fields to extratitle and extracontent, because we are not indexing the url, mod_page case would be more representative of the problem, we need both page content and page intro.

              About created and userid fields, the proposed patch requires the component to set whatever time field the search 'type' has in both, but I think that we could just remove created. About userid I am not that convinced, there is always a userid, we should think what is the expected outcome if I want to search for teacher X in course Y without a search query, a retrieve everything. If the user field is "author" I agree that userid should be optional and not set when indexing mod_url instances, but if it is "user" then it should be always set. I'm more inclined to agree with you and make userid optional, a student does not "need" to know who created which activity. But we should think about this, we might miss some use cases.

              Talking with Fred Yesterday he also pointed out that the contextid used to store files does not necessarily match the context where the content is created, it is completely up to the developer, we couldn't think of any example in core, but if this is the case we might need an extra contentcontextid. This also highlights an undocumented restriction, content field contents are restricted to a filearea + itemid, as we need to convert html links to moodle files when showing the document contents. The proposal I've sent to integration forces the indexer to set the document unique id (unique inside that indexer scope, it is later prefixed to make it unique across components) to the content filearea itemid, but in some cases the itemid is just 0 like for course modules intro, this was not a problem in the current proposal as intro was part of something else like a forum post, but if we are using separate 'types' to store forum posts and forum activities (just an example) we need to allow search components to set a new field; we can help them with a function to set the behaviour (no itemid, copy from id...) but always allowing them to overwrite the value if they are interested.

              I'm scared about https://github.com/danpoltawski/moodle/commit/50a90721ee4a369620ba28cb5159ce84e048afee#diff-af77ee00a0c44b2a2b51e63c62e06db7R439 I would bet I fixed that, I hope I haven't messed up the branch or forget to push latest changes... I will check when I have time to return to this.

              About the author filter, would be possible to defer it to another issue like the per-course filter. I haven't created any follow-up issue because I'd like to have some agreement before. Both could use autocomplete, but the user one is not that trivial as there are cross-context capability checkings involved.

              Show
              dmonllao David Monllaó added a comment - - edited Thanks for commenting Dan. When I started rewriting the patch I allowed multiple types per component, but I remove it because I thought it would be too complex, Yesterday I shared the proposal with Frédéric Massart and we agree that we need multiple types per component, the benefits of it against the current proposal are important enough: Easier to get better matches when searching, the current patch includes module name and module intro as part of a the forum post document, if I search for "Forum instance name" all forum posts will be retrieved without a "winner", the score will probably on forum posts containing the search terms, would be better to show as a first result "Forum instance name" module No dependencies between different models data, so no need to sync forum posts when the activity is enabled Components allowed to index more stuff Search components can be replaced by search types, search by type sounds better than search by component as the component is somehow more technical than type would be I propose something like \plugintype_pluginname\search\indexer\type_name , in forum example we could have \mod_forum\search\indexer\forum_posts and \mod_forum\search\indexer\activity . We could generalize most of the activity one so would be fast to index other modules data. It would require a few changes in the current patch here and there as we should enable/disable "types" instead of components, some langs strings to describe this types contents, a pluginpath/db/search.php to define which ... An alternative would be to just use \plugintype_pluginname\search\type_name without the \indexer level, I would prefer to add indexer in case there is more stuff required in future. Looking at the mod_url example you created ( https://github.com/danpoltawski/moodle/commit/3d3f2e675f51d1917a8acb20f84167f5a0656b6d ) I think that we should rename name and intro fields to extratitle and extracontent, because we are not indexing the url, mod_page case would be more representative of the problem, we need both page content and page intro. About created and userid fields, the proposed patch requires the component to set whatever time field the search 'type' has in both, but I think that we could just remove created. About userid I am not that convinced, there is always a userid, we should think what is the expected outcome if I want to search for teacher X in course Y without a search query, a retrieve everything. If the user field is "author" I agree that userid should be optional and not set when indexing mod_url instances, but if it is "user" then it should be always set. I'm more inclined to agree with you and make userid optional, a student does not "need" to know who created which activity. But we should think about this, we might miss some use cases. Talking with Fred Yesterday he also pointed out that the contextid used to store files does not necessarily match the context where the content is created, it is completely up to the developer, we couldn't think of any example in core, but if this is the case we might need an extra contentcontextid. This also highlights an undocumented restriction, content field contents are restricted to a filearea + itemid, as we need to convert html links to moodle files when showing the document contents. The proposal I've sent to integration forces the indexer to set the document unique id (unique inside that indexer scope, it is later prefixed to make it unique across components) to the content filearea itemid, but in some cases the itemid is just 0 like for course modules intro, this was not a problem in the current proposal as intro was part of something else like a forum post, but if we are using separate 'types' to store forum posts and forum activities (just an example) we need to allow search components to set a new field; we can help them with a function to set the behaviour (no itemid, copy from id...) but always allowing them to overwrite the value if they are interested. I'm scared about https://github.com/danpoltawski/moodle/commit/50a90721ee4a369620ba28cb5159ce84e048afee#diff-af77ee00a0c44b2a2b51e63c62e06db7R439 I would bet I fixed that, I hope I haven't messed up the branch or forget to push latest changes... I will check when I have time to return to this. About the author filter, would be possible to defer it to another issue like the per-course filter. I haven't created any follow-up issue because I'd like to have some agreement before. Both could use autocomplete, but the user one is not that trivial as there are cross-context capability checkings involved.
              Hide
              matteo Matteo Scaramuccia added a comment -

              Is there any chance to see this awesome and great piece of work within a prototype running on a clone of moodle.org?
              Too excited to see how it would perform .

              TIA,
              Matteo

              Show
              matteo Matteo Scaramuccia added a comment - Is there any chance to see this awesome and great piece of work within a prototype running on a clone of moodle.org? Too excited to see how it would perform . TIA, Matteo
              Hide
              poltawski Dan Poltawski added a comment -

              Some quick responses:

              • Glad that we agreed about multiple indexers per component. Do we need that extra 'indexer' level (only ask beacuse I imagine it will be a layer of empty directories which just makes thinks seem more complex to the dev).
              • +1 to remove created
              • Do not worry about my patches too much, I wasn't creating them for you to pull, just to demonstrate things
              • I really think that userid is an optional field in most cases - for searching resources - students don't see which teacher authored the content when they are browsing a course. A wiki page could be collaborative and not have a single user. In fact I think the case for 'public' searchable user content might be restricted to only mod_forum (maybe glossary, but its weak). Can you think of other cases where userid is important? I see this search tool as a tool for finding content (and not private content) not for searching for assignments submitted by a student, for example. I think userid is a big distraction to be frank. It's useful for forum and will be nice to have it, but I don't think its key here for the common use case of search.
              Show
              poltawski Dan Poltawski added a comment - Some quick responses: Glad that we agreed about multiple indexers per component. Do we need that extra 'indexer' level (only ask beacuse I imagine it will be a layer of empty directories which just makes thinks seem more complex to the dev). +1 to remove created Do not worry about my patches too much, I wasn't creating them for you to pull, just to demonstrate things I really think that userid is an optional field in most cases - for searching resources - students don't see which teacher authored the content when they are browsing a course. A wiki page could be collaborative and not have a single user. In fact I think the case for 'public' searchable user content might be restricted to only mod_forum (maybe glossary, but its weak). Can you think of other cases where userid is important? I see this search tool as a tool for finding content (and not private content) not for searching for assignments submitted by a student, for example. I think userid is a big distraction to be frank. It's useful for forum and will be nice to have it, but I don't think its key here for the common use case of search.
              Hide
              poltawski Dan Poltawski added a comment -

              Matteo Scaramuccia, I agree this will be a great thing to do created MDLSITE-4410 for that.

              Show
              poltawski Dan Poltawski added a comment - Matteo Scaramuccia , I agree this will be a great thing to do created MDLSITE-4410 for that.
              Hide
              danielneis Daniel Neis added a comment -

              Hello,

              good to see people working on this (even better, it does not even have a "partner" label

              A funny thing I've found is that there is no more elastic search plugin. Things changed a lot from in the begging having a solr plugin, than I've made the elastic search plugin but may have broke the solr plugin due to the re-engineering of everything and now we have only the solr plugin again. I did not look at the code yet, but I hope thing not changed much so it will not be that hard to recover the elasticsearch plugin and make it work again for the new code that is supposed to go mainstream, right?

              Also do you prefer solr over elastic search or you just did it because it was the first tool used to solve this issue or another reason ? Because I've worked with both and elasticsearch have many advantages over solr and both are "services" on top of the same "apache lucene" library, so may be any licensing reasons or anything else?

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, good to see people working on this (even better, it does not even have a "partner" label A funny thing I've found is that there is no more elastic search plugin. Things changed a lot from in the begging having a solr plugin, than I've made the elastic search plugin but may have broke the solr plugin due to the re-engineering of everything and now we have only the solr plugin again. I did not look at the code yet, but I hope thing not changed much so it will not be that hard to recover the elasticsearch plugin and make it work again for the new code that is supposed to go mainstream, right? Also do you prefer solr over elastic search or you just did it because it was the first tool used to solve this issue or another reason ? Because I've worked with both and elasticsearch have many advantages over solr and both are "services" on top of the same "apache lucene" library, so may be any licensing reasons or anything else? Kind regards, Daniel
              Hide
              dmonllao David Monllaó added a comment -

              Hi Daniel,

              The reason behind removing elasticsearch was that after years having this issue in 'development in progress' I thought that would be good to see some advance and the more stuff we initially add the more problems we would have to get this integrated, the patch was massive and both search engine plugins were far from being ready for integration, same with the rest of the apis. Only basic querying was working for elastic search and solr was quite closer than elastic to be ready for integration as Prateek in GSOC spend a lot of time on it, we have pecl solr extension to manage the connection with the server, SSL.... we need all that stuff, you can check out solr 5 apis, they are really close to elasticsearch ones. I agree the first impression is that elasticsearch is simpler than solr and it just works and the way it structures data may help with performance when searching in specific , but the implementation is quite more complex than what we had, it will be easy to add it back in the sense that the API is not hard to understand. You did an good job abstracting the search engine API (note that I kept all your credits) Many thanks Daniel.

              Show
              dmonllao David Monllaó added a comment - Hi Daniel, The reason behind removing elasticsearch was that after years having this issue in 'development in progress' I thought that would be good to see some advance and the more stuff we initially add the more problems we would have to get this integrated, the patch was massive and both search engine plugins were far from being ready for integration, same with the rest of the apis. Only basic querying was working for elastic search and solr was quite closer than elastic to be ready for integration as Prateek in GSOC spend a lot of time on it, we have pecl solr extension to manage the connection with the server, SSL.... we need all that stuff, you can check out solr 5 apis, they are really close to elasticsearch ones. I agree the first impression is that elasticsearch is simpler than solr and it just works and the way it structures data may help with performance when searching in specific , but the implementation is quite more complex than what we had, it will be easy to add it back in the sense that the API is not hard to understand. You did an good job abstracting the search engine API (note that I kept all your credits) Many thanks Daniel.
              Hide
              dmonllao David Monllaó added a comment -

              I've updated the pull branch with a WIP branch, I should be able to send it back to integration during this week.

              Show
              dmonllao David Monllaó added a comment - I've updated the pull branch with a WIP branch, I should be able to send it back to integration during this week.
              Hide
              cibot CiBoT added a comment -

              Fails against automated checks.

              Checked MDL-31989 using repository: https://github.com/dmonllao/moodle

              More information about this report

              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master [branch: MDL-31989_master | CI Job ] Error: The MDL-31989 _master branch at https://github.com/dmonllao/moodle does not apply clean to origin/master More information about this report
              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master (3 errors / 11 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , phpcs (0/10) , js (0/0) , css (0/0) , phpdoc (3/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , travis (0/0) , More information about this report
              Hide
              dmonllao David Monllaó added a comment -

              I've fixed all points mentioned by Dan in https://tracker.moodle.org/browse/MDL-31989?focusedCommentId=393995&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-393995, https://tracker.moodle.org/browse/MDL-31989?focusedCommentId=394250&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-394250 that are still applicable after moving from search by component to search by area:

              1. Search is by area, we can define multiple areas per component
              2. Setup helper in 'Manage global search'
              3. More unit tests
              4. Abstracted moodle activities search in \core_search\area\base_activity so now all activities data is indexed (does not include internal data like forum posts or database entries, they should be other areas)
              5. Glossary entries search area
              6. I've created some issues:
                • Search filters by course and by user
                • Iterator by context for course restores
              Show
              dmonllao David Monllaó added a comment - I've fixed all points mentioned by Dan in https://tracker.moodle.org/browse/MDL-31989?focusedCommentId=393995&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-393995 , https://tracker.moodle.org/browse/MDL-31989?focusedCommentId=394250&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-394250 that are still applicable after moving from search by component to search by area: Search is by area, we can define multiple areas per component Setup helper in 'Manage global search' More unit tests Abstracted moodle activities search in \core_search\area\base_activity so now all activities data is indexed (does not include internal data like forum posts or database entries, they should be other areas) Glossary entries search area I've created some issues: Search filters by course and by user Iterator by context for course restores
              Hide
              cibot CiBoT added a comment -

              Fails against automated checks.

              Checked MDL-31989 using repository: https://github.com/dmonllao/moodle

              More information about this report

              Show
              cibot CiBoT added a comment - Fails against automated checks. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master [branch: MDL-31989_master | CI Job ] Error: The MDL-31989 _master branch at https://github.com/dmonllao/moodle does not apply clean to origin/master More information about this report
              Hide
              cibot CiBoT added a comment -
              Show
              cibot CiBoT added a comment - Code verified against automated checks with warnings. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master (0 errors / 3 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , phpcs (0/2) , js (0/0) , css (0/0) , phpdoc (0/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , travis (0/0) , More information about this report
              Hide
              cibot CiBoT added a comment -

              Moving this issue to current integration cycle, will be reviewed soon. Thanks for the hard work!

              Show
              cibot CiBoT added a comment - Moving this issue to current integration cycle, will be reviewed soon. Thanks for the hard work!
              Hide
              cibot CiBoT added a comment -
              Show
              cibot CiBoT added a comment - Code verified against automated checks with warnings. Checked MDL-31989 using repository: https://github.com/dmonllao/moodle master (0 errors / 3 warnings) [branch: MDL-31989_master | CI Job ] phplint (0/0) , phpcs (0/2) , js (0/0) , css (0/0) , phpdoc (0/0) , commit (0/1) , savepoint (0/0) , thirdparty (0/0) , grunt (0/0) , shifter (0/0) , travis (0/0) , More information about this report
              Hide
              poltawski Dan Poltawski added a comment -

              Hi David,

              We seem close now I like the look of things

              1) All the search areas are not being displayed in the admin settings. I did a clean install and saw the same:

              2) Related to that.. in core_search\manager seems like an excessive amount of static caches going on.
              3) Also in core_search\document and core_search\manager - in various places we are using late static binding (static:: rather than self:. This always spikes my interest because I kind of feel like its use should be rare, is that all necessary/correct? I find it a very 'obscure' phpish thing
              4) test_get_component_classes_int_namespace looks extremely fragile (Also "/ Prefix with backslash if it doesn\'t come prefixed." made me laugh)
              5) In the admin settings linking to $url = new moodle_url('/admin/settings.php?section=manageglobalsearch#id_s_mod_assign_enablesearch_activity'); seems extremely fragile (and I always prefer the param setters)
              6) editor_input_to_text - I actually think thats not related to editors and should be in weblib instead (along with html_2o_text() and markdown_to_html()).
              7) I got a good load of php warnings from solr when doing a search, but haven't yet elimiated previous bad data
              8) Is it the right decision to force every activity to extend base_activity rather than just implement it for them?
              9) Still lis dlsike get_config_var_name()
              10) I think a Helen review of new strings is necessary, but sure it'll be easier post-eintgration anyway

              Show
              poltawski Dan Poltawski added a comment - Hi David, We seem close now I like the look of things 1) All the search areas are not being displayed in the admin settings. I did a clean install and saw the same: 2) Related to that.. in core_search\manager seems like an excessive amount of static caches going on. 3) Also in core_search\document and core_search\manager - in various places we are using late static binding (static:: rather than self: . This always spikes my interest because I kind of feel like its use should be rare, is that all necessary/correct? I find it a very 'obscure' phpish thing 4) test_get_component_classes_int_namespace looks extremely fragile (Also "/ Prefix with backslash if it doesn\'t come prefixed." made me laugh) 5) In the admin settings linking to $url = new moodle_url('/admin/settings.php?section=manageglobalsearch#id_s_mod_assign_enablesearch_activity'); seems extremely fragile (and I always prefer the param setters) 6) editor_input_to_text - I actually think thats not related to editors and should be in weblib instead (along with html_2o_text() and markdown_to_html()). 7) I got a good load of php warnings from solr when doing a search, but haven't yet elimiated previous bad data 8) Is it the right decision to force every activity to extend base_activity rather than just implement it for them? 9) Still lis dlsike get_config_var_name() 10) I think a Helen review of new strings is necessary, but sure it'll be easier post-eintgration anyway
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for the review Dan,

              1. Grrr, fixed, I removed an adminlib change before sending to integration (I thought it wasn't required any more) I shouldn't have done it
              2. Do you mean manager or search area base classes? There are 2 static caches in \core_search\manager, one for enabled search areas and another one for all search areas, I kept them separated because we don't need to fill both in all cases, also they do not contain much data. About \core_search\area\base and family, we want to avoid db queries during indexing, if we remove static caches it will be slower and there is no need to query again. MUC is not a good alternative here, overhead and everything should continue working without cache. Indexing will run through cron, which runs in CLI which should have an unlimited or pretty high memory limit, I don't see any problem with this static caches. Which class did you mean? Are you worried about something in particular?
              3. static makes static stuff to work like in any other language, self is the phpish thing it only would make sense to use self (apart from some cases where it is the intended behaviour) if we would support php < 5.3. I've looked for performance implications of late static bindings and I couldn't find anything.
              4. lol, why? Does it sound sexual? Sorry, I have no idea what you talk about. I've changed the strpos thing for a ltrim \. I'm open to suggestions for further changes.
              5. Agree, I discarded that initial proposal and I've been thinking about alternatives, I've gone with the IMO less ugly option (3rd one):
                • No link (we would also have to remove it from 'Select search engine') -> pity
                • Add a fake hidden admin_setting to point to its id -> horrible
                • Use 'Select search engine' as an anchor -> Not likely that search engine heading will have more setting underneath, with only one setting and a big 'Available areas for search' heading below it, I think people will know what they are supposed to do
              6. Done, I was thinking of also updating the name of the function, but I left it as it, I couldn't find a better alternative and it is really editor input what it expects, not "text", can be misleading
              7. Delete & create it again, it is normal that it is not working as expected
              8. I am not sure I understand you, not all activities can be indexed (MDL-53149) and not all of them are indexed in the same way (page, url...) some of them might not be interested in indexing stuff (3rd party activities)... A class with 5 lines of code is not much work for devs if they want their modules to be indexed
              9. I don't like it either although if we want to respect the principle of storing plugins config data in config_plugins instead of in config we don't have much choice, we need it in admin/settings/plugins.php, plugins write to config_plugins, core subsystems write to config. To me this weird function interface is not a problem because, although it is public (we need it for admin settings), it is internal to the search subsystem and devs are not supposed to use it. I added a @access private to the function's php doc block and I've added new get_config method to searcharea base class to provide an "easy going" alternative (used in reset_config now)
              Show
              dmonllao David Monllaó added a comment - Thanks for the review Dan, Grrr, fixed, I removed an adminlib change before sending to integration (I thought it wasn't required any more) I shouldn't have done it Do you mean manager or search area base classes? There are 2 static caches in \core_search\manager, one for enabled search areas and another one for all search areas, I kept them separated because we don't need to fill both in all cases, also they do not contain much data. About \core_search\area\base and family, we want to avoid db queries during indexing, if we remove static caches it will be slower and there is no need to query again. MUC is not a good alternative here, overhead and everything should continue working without cache. Indexing will run through cron, which runs in CLI which should have an unlimited or pretty high memory limit, I don't see any problem with this static caches. Which class did you mean? Are you worried about something in particular? static makes static stuff to work like in any other language, self is the phpish thing it only would make sense to use self (apart from some cases where it is the intended behaviour) if we would support php < 5.3. I've looked for performance implications of late static bindings and I couldn't find anything. lol, why? Does it sound sexual? Sorry, I have no idea what you talk about. I've changed the strpos thing for a ltrim \. I'm open to suggestions for further changes. Agree, I discarded that initial proposal and I've been thinking about alternatives, I've gone with the IMO less ugly option (3rd one): No link (we would also have to remove it from 'Select search engine') -> pity Add a fake hidden admin_setting to point to its id -> horrible Use 'Select search engine' as an anchor -> Not likely that search engine heading will have more setting underneath, with only one setting and a big 'Available areas for search' heading below it, I think people will know what they are supposed to do Done, I was thinking of also updating the name of the function, but I left it as it, I couldn't find a better alternative and it is really editor input what it expects, not "text", can be misleading Delete & create it again, it is normal that it is not working as expected I am not sure I understand you, not all activities can be indexed ( MDL-53149 ) and not all of them are indexed in the same way (page, url...) some of them might not be interested in indexing stuff (3rd party activities)... A class with 5 lines of code is not much work for devs if they want their modules to be indexed I don't like it either although if we want to respect the principle of storing plugins config data in config_plugins instead of in config we don't have much choice, we need it in admin/settings/plugins.php, plugins write to config_plugins, core subsystems write to config. To me this weird function interface is not a problem because, although it is public (we need it for admin settings), it is internal to the search subsystem and devs are not supposed to use it. I added a @access private to the function's php doc block and I've added new get_config method to searcharea base class to provide an "easy going" alternative (used in reset_config now)
              Hide
              dmonllao David Monllaó added a comment -

              I've made some more changes:

              1. Added a capabilty to restrict search to specific roles (even guest allowed by default)
              2. \n\t cleanup
              3. Removed search/engine/solr/cli/setup_schema.php (running through web search/engine/solr/setup_schema.php)
              4. Keyboard navigation improvement for the search widget

              All changes are along with the previous comment ones in the last commit.

              Show
              dmonllao David Monllaó added a comment - I've made some more changes: Added a capabilty to restrict search to specific roles (even guest allowed by default) \n\t cleanup Removed search/engine/solr/cli/setup_schema.php (running through web search/engine/solr/setup_schema.php) Keyboard navigation improvement for the search widget All changes are along with the previous comment ones in the last commit.
              Hide
              poltawski Dan Poltawski added a comment -

              Hi David,

              My comments are not so important (which I almost wrote on each point):
              2. Thanks, you explained your choices.
              3. I don't really agree, but lets not become distracted by it.
              4. That auth_cas and core_user classes match counts and will break unit tests if someone adds a class in the namesapces is the cause for the fragility. What made me laugh was the escapaing of your php comment which doesn't need backslash escape (And funny it would be in a method dealing with namespace escape characters)
              6. I don't understand what you think is so editor-ish about it. Its just a text format (defined at the top of weblib) - we don't call them editor inputs..
              8. I'm not sure its clear to developers why their activites can not be indexed when the implementation contains no code and somewhat depends on the standard mod conventions.
              9. As I said before, I'd make it be less-magic guessing and more explicit on the plugin side. I'm sure your way works better for easier plugin implemetnation.

              Show
              poltawski Dan Poltawski added a comment - Hi David, My comments are not so important (which I almost wrote on each point): 2. Thanks, you explained your choices. 3. I don't really agree , but lets not become distracted by it. 4. That auth_cas and core_user classes match counts and will break unit tests if someone adds a class in the namesapces is the cause for the fragility. What made me laugh was the escapaing of your php comment which doesn't need backslash escape (And funny it would be in a method dealing with namespace escape characters) 6. I don't understand what you think is so editor-ish about it. Its just a text format (defined at the top of weblib) - we don't call them editor inputs.. 8. I'm not sure its clear to developers why their activites can not be indexed when the implementation contains no code and somewhat depends on the standard mod conventions. 9. As I said before, I'd make it be less-magic guessing and more explicit on the plugin side. I'm sure your way works better for easier plugin implemetnation.
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for the quick reply Dan.

              4. lol, I get it now. I tried to select namespaces (in the test) that are not likely to break, and if they do it would not be hard to find out why tests are failing, we really need a unit test for multilevel namespaces, if you have any better candidate in mind let me know, I'm happy to change it.
              6. No problem for me in changing it, it is just that I don't have a good alternative, if you have any suggestion I'm happy to change the name
              8. We need them to explicitly say that they want to index their activity stuff and what they want to index, in most of the cases will be name and description as title and content, but they are not "strict" standard conventions, they may be empty or duplicated and they are not compulsory fields, same with time field used to calculate the last modification, it is not a strict convention; looking at it from another point of view, forcing them to create a class is a better alternative than a boilerplate db/search.php with just a $search = array('activity'); which is more moodle way, but not necessarily better. I can add more comments in \core_search\area\base_activity with this explanation if you think this will help
              9. We could add a get_enabled_settingname function in searcharea base class help so search areas always have the last word.

              Waiting for feedback before proceeding.

              Show
              dmonllao David Monllaó added a comment - Thanks for the quick reply Dan. 4. lol, I get it now. I tried to select namespaces (in the test) that are not likely to break, and if they do it would not be hard to find out why tests are failing, we really need a unit test for multilevel namespaces, if you have any better candidate in mind let me know, I'm happy to change it. 6. No problem for me in changing it, it is just that I don't have a good alternative, if you have any suggestion I'm happy to change the name 8. We need them to explicitly say that they want to index their activity stuff and what they want to index, in most of the cases will be name and description as title and content, but they are not "strict" standard conventions, they may be empty or duplicated and they are not compulsory fields, same with time field used to calculate the last modification, it is not a strict convention; looking at it from another point of view, forcing them to create a class is a better alternative than a boilerplate db/search.php with just a $search = array('activity'); which is more moodle way, but not necessarily better. I can add more comments in \core_search\area\base_activity with this explanation if you think this will help 9. We could add a get_enabled_settingname function in searcharea base class help so search areas always have the last word. Waiting for feedback before proceeding.
              Hide
              poltawski Dan Poltawski added a comment -

              4. I don't propose to change this.
              6. We discussed in chat the proposal content_to_text()
              8. We discussed in chat a bit about enforcing the activity to define the timemodified field to make it a bit clearer. It does just add duplicated code - i'm not sure its worth it. But I don't love the empty class either. Lets see. This is master, we can change in future. Better to get it landed as is.
              9. I don't propose to change this. It's just my bugbear in a mountain of code.

              Show
              poltawski Dan Poltawski added a comment - 4. I don't propose to change this. 6. We discussed in chat the proposal content_to_text() 8. We discussed in chat a bit about enforcing the activity to define the timemodified field to make it a bit clearer. It does just add duplicated code - i'm not sure its worth it. But I don't love the empty class either. Lets see. This is master, we can change in future. Better to get it landed as is. 9. I don't propose to change this. It's just my bugbear in a mountain of code.
              Hide
              dmonllao David Monllaó added a comment -

              Yeah, content_to_text sounds better.

              I agree with you about the empty class, but having autoload and using the search area name as the class name they are not "required" to set anything else if they are happy with name and description as title and content. It is probably faster than a db/search.php too.

              Show
              dmonllao David Monllaó added a comment - Yeah, content_to_text sounds better. I agree with you about the empty class, but having autoload and using the search area name as the class name they are not "required" to set anything else if they are happy with name and description as title and content. It is probably faster than a db/search.php too.
              Hide
              poltawski Dan Poltawski added a comment -

              Integrated to master, thanks David. Note I had to rebase your changes ontop master as your branch had some curious mistaken duplicated relelase-related commits which didn't match integration.

              I have some followup comments which I think we can devolve to new issues (i'm sure more will come from usage an testing, which is why its important to me to get this landed):

              9) OK, coming back to my bugbear, I can't drop it. I still don't like this. I don't think it fits into admin setting checkboxes like you've done here. When I see all the activities in place - having them defined in admin settings like this seems 'wrong' (and especially when the admin gets prompted to set them all on upgrade). It will quickly get out of hand as we add more areas. I'm not sure this is the right apporach - I can't think of other examples of us doing the same thing. When a new scheduled task is added, we don't get the admin to 'enable' each one. Its a bad example, but similar sort of situation. More and more I just think that admin settings fit the way you've done it here.

              10) I think the base_activity subclasses string naming is bad in many places, some examples:

              • What is 'Label activities'? We don't really think of labels as activities, it's just a label.
              • We call Book a resource, but here we are reffering to it as an activity.
              • It's not clear that 'Wiki activites' doesn't contain the entiriety of the wiki - the naming makes it sound like its the entire activity. People might expect everything to be indexed.

              I think we need to alter those strings for each module and in particular make clear when the area being indexed is just the module metadata and not the module content. I don't have good suggestions here I just think its not clear right now. It's sort of like name/description vs content.

              There are other strings needing review too. It might be worth creating a specific issue to do this ASAP.

              Show
              poltawski Dan Poltawski added a comment - Integrated to master, thanks David. Note I had to rebase your changes ontop master as your branch had some curious mistaken duplicated relelase-related commits which didn't match integration. I have some followup comments which I think we can devolve to new issues (i'm sure more will come from usage an testing, which is why its important to me to get this landed): 9) OK, coming back to my bugbear, I can't drop it. I still don't like this. I don't think it fits into admin setting checkboxes like you've done here. When I see all the activities in place - having them defined in admin settings like this seems 'wrong' (and especially when the admin gets prompted to set them all on upgrade). It will quickly get out of hand as we add more areas. I'm not sure this is the right apporach - I can't think of other examples of us doing the same thing. When a new scheduled task is added, we don't get the admin to 'enable' each one. Its a bad example, but similar sort of situation. More and more I just think that admin settings fit the way you've done it here. 10) I think the base_activity subclasses string naming is bad in many places, some examples: What is 'Label activities'? We don't really think of labels as activities, it's just a label. We call Book a resource, but here we are reffering to it as an activity. It's not clear that 'Wiki activites' doesn't contain the entiriety of the wiki - the naming makes it sound like its the entire activity. People might expect everything to be indexed. I think we need to alter those strings for each module and in particular make clear when the area being indexed is just the module metadata and not the module content. I don't have good suggestions here I just think its not clear right now. It's sort of like name/description vs content. There are other strings needing review too. It might be worth creating a specific issue to do this ASAP.
              Hide
              poltawski Dan Poltawski added a comment - - edited

              Sending straight to failed, due to problems with survey.

              Error retrieving mod_survey-activity 1 document, not all required data is available: Can not find data record in database table course.
              line 81 of /search/classes/area/base_activity.php: call to debugging()
              line ? of unknownfile: call to core_search\area\base_activity->get_document()
              

              (etc)

              Coming from:

              id course template days timecreated timemodified name intro introformat questions
              1 0 0 0 985017600 985017600 collesaname collesaintro 0 25,26,27,28,29,30,43,44
              2 0 0 0 985017600 985017600 collespname collespintro 0 31,32,33,34,35,36,43,44
              3 0 0 0 985017600 985017600 collesapname collesapintro 0 37,38,39,40,41,42,43,44
              4 0 0 0 985017600 985017600 attlsname attlsintro 0 65,67,68
              5 0 0 0 985017600 985017600 ciqname ciqintro 0 69,70,71,72,73
              6 4 1 0 1456224228 1456224228 test <p>test</p> 1 25,26,27,28,29,30,43,44

              (Looks like those first records are templates and this is not right)

              Show
              poltawski Dan Poltawski added a comment - - edited Sending straight to failed, due to problems with survey. Error retrieving mod_survey-activity 1 document, not all required data is available: Can not find data record in database table course. line 81 of /search/classes/area/base_activity.php: call to debugging() line ? of unknownfile: call to core_search\area\base_activity->get_document() (etc) Coming from: id course template days timecreated timemodified name intro introformat questions 1 0 0 0 985017600 985017600 collesaname collesaintro 0 25,26,27,28,29,30,43,44 2 0 0 0 985017600 985017600 collespname collespintro 0 31,32,33,34,35,36,43,44 3 0 0 0 985017600 985017600 collesapname collesapintro 0 37,38,39,40,41,42,43,44 4 0 0 0 985017600 985017600 attlsname attlsintro 0 65,67,68 5 0 0 0 985017600 985017600 ciqname ciqintro 0 69,70,71,72,73 6 4 1 0 1456224228 1456224228 test <p>test</p> 1 25,26,27,28,29,30,43,44 (Looks like those first records are templates and this is not right)
              Hide
              dmonllao David Monllaó added a comment -

              Thanks Dan.

              Patch for the survey failure:
              git pull git://github.com/dmonllao/moodle.git MDL-31989_survey-fix
              https://github.com/dmonllao/moodle/commit/c625b01a4c5487e591c9923dd9678fea7eb8b2b9

              Show
              dmonllao David Monllaó added a comment - Thanks Dan. Patch for the survey failure: git pull git://github.com/dmonllao/moodle.git MDL-31989 _survey-fix https://github.com/dmonllao/moodle/commit/c625b01a4c5487e591c9923dd9678fea7eb8b2b9
              Hide
              poltawski Dan Poltawski added a comment - - edited

              Thanks, pulled. Sending back to testing

              Show
              poltawski Dan Poltawski added a comment - - edited Thanks, pulled. Sending back to testing
              Hide
              dmonllao David Monllaó added a comment -

              I will create issues for the lang strings review and the administration UI improvements and link them as part of this epic.

              The release commits thing is strange, I've been working on top of latest github.com/moodle/moodle master branches)

              Show
              dmonllao David Monllaó added a comment - I will create issues for the lang strings review and the administration UI improvements and link them as part of this epic. The release commits thing is strange, I've been working on top of latest github.com/moodle/moodle master branches)
              Hide
              poltawski Dan Poltawski added a comment -

              I have created MDL-53222 about the admin settings.

              Show
              poltawski Dan Poltawski added a comment - I have created MDL-53222 about the admin settings.
              Hide
              dmonllao David Monllaó added a comment -

              The performance comparison tool detected a performance regression (http://integration.moodle.org/view/nightly/job/P.08.%20Compare%20performance%20between%20Moodle%20master%20stable%20and%20Moodle%20master%20integration/526/console) I missed a $hassiteadmin checking in admin settings so they were loaded for all users.

              I'm running it again. Posting the patch (on top of integration) in the meanwhile (note the w=1 in the diff)
              git pull git://github.com/dmonllao/moodle.git MDL-31989_admin-performance
              https://github.com/dmonllao/moodle/compare/MDL-31989_admin-performance~...MDL-31989_admin-performance?w=1

              Show
              dmonllao David Monllaó added a comment - The performance comparison tool detected a performance regression ( http://integration.moodle.org/view/nightly/job/P.08.%20Compare%20performance%20between%20Moodle%20master%20stable%20and%20Moodle%20master%20integration/526/console ) I missed a $hassiteadmin checking in admin settings so they were loaded for all users. I'm running it again. Posting the patch (on top of integration) in the meanwhile (note the w=1 in the diff) git pull git://github.com/dmonllao/moodle.git MDL-31989 _admin-performance https://github.com/dmonllao/moodle/compare/MDL-31989_admin-performance~...MDL-31989_admin-performance?w=1
              Hide
              dobedobedoh Andrew Nicols added a comment -

              Thanks David,

              Pulled your patch. Tester: Please update your branches.

              Show
              dobedobedoh Andrew Nicols added a comment - Thanks David, Pulled your patch. Tester: Please update your branches.
              Hide
              abgreeve Adrian Greeve added a comment -

              Tested on master.
              Searching works, unit tests pass. All good.
              Test passed.

              Show
              abgreeve Adrian Greeve added a comment - Tested on master. Searching works, unit tests pass. All good. Test passed.
              Hide
              dobedobedoh Andrew Nicols added a comment -

              Thanks for your contributions! This change is now available from the main moodle.git repository and will shortly be available on download.moodle.org.

              On this day in 1943, George Harrison MBE, lead guitarist of the Beatles was born.

              Show
              dobedobedoh Andrew Nicols added a comment - Thanks for your contributions! This change is now available from the main moodle.git repository and will shortly be available on download.moodle.org. On this day in 1943, George Harrison MBE, lead guitarist of the Beatles was born.
              Hide
              danielneis Daniel Neis added a comment -

              Great! Thank you everybody!

              Show
              danielneis Daniel Neis added a comment - Great! Thank you everybody!
              Hide
              emerrill Eric Merrill added a comment - - edited

              For some information, I ran an index against our production dataset. It produced 1.5M records, took 4.25 hours to complete, used 1.8M DB queries, and the resulting index is 1.9GB in size. You can see the per-area information below:

                  Time (s) Docs Records Ignored
              Assignment activities Sunday, February 28, 2016, 10:17 PM 400 43443 43443 0
              Book activities Sunday, October 11, 2015, 7:42 PM 13 985 985 0
              Chat activities Thursday, May 30, 2013, 2:13 PM 14 1323 1323 0
              Choice activities Friday, February 19, 2016, 9:08 PM 4 473 473 0
              Feedback activities Sunday, November 1, 2015, 12:22 PM 15 1172 1172 0
              Folder activities Monday, September 21, 2015, 9:14 PM 118 14615 14615 0
              Forum activities Tuesday, September 22, 2015, 4:15 PM 536 53262 53262 0
              Forum posts Tuesday, February 23, 2016, 11:16 AM 10742 922110 922110 0
              Glossary activities Wednesday, October 28, 2015, 2:31 PM 0 81 81 0
              Glossary entries Monday, July 20, 2015, 4:54 PM 3 573 573 0
              IMS content package activities Saturday, April 28, 2012, 12:27 PM 2 551 551 0
              Label activities Friday, September 11, 2015, 5:59 PM 583 58494 58494 0
              Lesson activities Tuesday, November 24, 2015, 1:05 PM 6 665 665 0
              LTI activities Tuesday, February 16, 2016, 10:43 PM 2 186 186 0
              Page activities Tuesday, September 22, 2015, 5:33 PM 470 33690 33690 0
              Quiz activities Tuesday, February 16, 2016, 6:36 PM 393 25092 25092 0
              Resource activities Tuesday, September 22, 2015, 5:58 PM 1694 286607 286607 0
              SCORM package activities Never 4 451 451 0
              Survey activities Never 0 0 0 0
              URL activities Friday, February 20, 2015, 2:02 PM 461 72783 72783 0
              Wiki activities Tuesday, September 22, 2015, 2:25 PM 7 549 549 0
              Workshop activities Friday, November 20, 2015, 2:43 PM 1 131 131 0
                  15468 1517236 1517236 0
              Show
              emerrill Eric Merrill added a comment - - edited For some information, I ran an index against our production dataset. It produced 1.5M records, took 4.25 hours to complete, used 1.8M DB queries, and the resulting index is 1.9GB in size. You can see the per-area information below:     Time (s) Docs Records Ignored Assignment activities Sunday, February 28, 2016, 10:17 PM 400 43443 43443 0 Book activities Sunday, October 11, 2015, 7:42 PM 13 985 985 0 Chat activities Thursday, May 30, 2013, 2:13 PM 14 1323 1323 0 Choice activities Friday, February 19, 2016, 9:08 PM 4 473 473 0 Feedback activities Sunday, November 1, 2015, 12:22 PM 15 1172 1172 0 Folder activities Monday, September 21, 2015, 9:14 PM 118 14615 14615 0 Forum activities Tuesday, September 22, 2015, 4:15 PM 536 53262 53262 0 Forum posts Tuesday, February 23, 2016, 11:16 AM 10742 922110 922110 0 Glossary activities Wednesday, October 28, 2015, 2:31 PM 0 81 81 0 Glossary entries Monday, July 20, 2015, 4:54 PM 3 573 573 0 IMS content package activities Saturday, April 28, 2012, 12:27 PM 2 551 551 0 Label activities Friday, September 11, 2015, 5:59 PM 583 58494 58494 0 Lesson activities Tuesday, November 24, 2015, 1:05 PM 6 665 665 0 LTI activities Tuesday, February 16, 2016, 10:43 PM 2 186 186 0 Page activities Tuesday, September 22, 2015, 5:33 PM 470 33690 33690 0 Quiz activities Tuesday, February 16, 2016, 6:36 PM 393 25092 25092 0 Resource activities Tuesday, September 22, 2015, 5:58 PM 1694 286607 286607 0 SCORM package activities Never 4 451 451 0 Survey activities Never 0 0 0 0 URL activities Friday, February 20, 2015, 2:02 PM 461 72783 72783 0 Wiki activities Tuesday, September 22, 2015, 2:25 PM 7 549 549 0 Workshop activities Friday, November 20, 2015, 2:43 PM 1 131 131 0     15468 1517236 1517236 0
              Hide
              senikuro Nils Kurowsky added a comment -

              Hi David,

              are you also planning to implement a function to index file contents (e.g. pdf, docx, html)? This project actually appears in the GSoC 2013 page (https://docs.moodle.org/dev/Global_search_%28GSoC2013%29#Introduction) but isn't supported in the latest version. So is this project still on your road map or have you decided to discard it instead?

              Thank you!

              Best regards,

              Nils

              Show
              senikuro Nils Kurowsky added a comment - Hi David, are you also planning to implement a function to index file contents (e.g. pdf, docx, html)? This project actually appears in the GSoC 2013 page ( https://docs.moodle.org/dev/Global_search_%28GSoC2013%29#Introduction ) but isn't supported in the latest version. So is this project still on your road map or have you decided to discard it instead? Thank you! Best regards, Nils
              Hide
              emerrill Eric Merrill added a comment - - edited

              This (file indexing) is going to be something I'm going to be looking into if someone else doesn't get to it first. It would be a big boon, particularly for single file resources, like when teachers attach a syllabus, or PDF docs to courses.

              There are a few hurdles we need to work on, like pecl-solr doesn't support it, so you have to setup your own curl object to do the connection, error handling, etc.

              I've created MDL-53318 for it.

              Show
              emerrill Eric Merrill added a comment - - edited This (file indexing) is going to be something I'm going to be looking into if someone else doesn't get to it first. It would be a big boon, particularly for single file resources, like when teachers attach a syllabus, or PDF docs to courses. There are a few hurdles we need to work on, like pecl-solr doesn't support it, so you have to setup your own curl object to do the connection, error handling, etc. I've created MDL-53318 for it.
              Hide
              xan Prateek Sachan added a comment -

              This is great work. Thanks for getting the project integrated into Moodle core finally. I was on a holiday so it'll take me sometime to get on track. Too many updates and developments.

              Show
              xan Prateek Sachan added a comment - This is great work. Thanks for getting the project integrated into Moodle core finally. I was on a holiday so it'll take me sometime to get on track. Too many updates and developments.
              Show
              dmonllao David Monllaó added a comment - Removing dev_docs_required: https://docs.moodle.org/dev/Search_API and https://docs.moodle.org/dev/Search_engines
              Hide
              danielneis Daniel Neis added a comment -

              Hello, everyone

              I've made the changes needed to my old cold to create a new Elasticsearch plugin:
              https://github.com/danielneis/moodle-search_elasticsearch

              It would be very nice if Eric Merrill could test with the database you used on your previous test with Solr =)

              Kind regards,
              Daniel

              Show
              danielneis Daniel Neis added a comment - Hello, everyone I've made the changes needed to my old cold to create a new Elasticsearch plugin: https://github.com/danielneis/moodle-search_elasticsearch It would be very nice if Eric Merrill could test with the database you used on your previous test with Solr =) Kind regards, Daniel
              Hide
              dmonllao David Monllaó added a comment - - edited

              Thanks for sharing this Daniel, note that, although it may "work" and it is a nice proof of concept to play with in dev environments, it is far from being ready and very insecure.

              Show
              dmonllao David Monllaó added a comment - - edited Thanks for sharing this Daniel, note that, although it may "work" and it is a nice proof of concept to play with in dev environments, it is far from being ready and very insecure.
              Hide
              poltawski Dan Poltawski added a comment -

              David commented on the dev forums on why: https://moodle.org/mod/forum/discuss.php?d=227805#p1327561

              Show
              poltawski Dan Poltawski added a comment - David commented on the dev forums on why: https://moodle.org/mod/forum/discuss.php?d=227805#p1327561
              Hide
              emerrill Eric Merrill added a comment -

              I've opened MDL-53535 to discuss setting a minimum Solr version. So far I haven't been able to get the current code to run against 4.8, but that might be me.

              Show
              emerrill Eric Merrill added a comment - I've opened MDL-53535 to discuss setting a minimum Solr version. So far I haven't been able to get the current code to run against 4.8, but that might be me.
              Hide
              tugbadogan Tuğba Doğan added a comment -

              Hi I'm Tuğba Doğan. I'm interested in developing elasticsearch plugin for Moodle as a GSOC project and I upload proposal draft to GSoC web site. I am working on bug MDL-53433. Should I do anything else besides the proposal? Can you give feedback for the proposal draft?

              Show
              tugbadogan Tuğba Doğan added a comment - Hi I'm Tuğba Doğan. I'm interested in developing elasticsearch plugin for Moodle as a GSOC project and I upload proposal draft to GSoC web site. I am working on bug MDL-53433 . Should I do anything else besides the proposal? Can you give feedback for the proposal draft?
              Hide
              dmonllao David Monllaó added a comment -

              Hi Tuğba,

              Thanks for your interest. We will go soon through the proposals list. Cheers.

              Show
              dmonllao David Monllaó added a comment - Hi Tuğba, Thanks for your interest. We will go soon through the proposals list. Cheers.
              Hide
              rajeshtaneja Rajesh Taneja added a comment -

              MDLQA-9240 has been added to QA list to test Global search.

              Show
              rajeshtaneja Rajesh Taneja added a comment - MDLQA-9240 has been added to QA list to test Global search.
              Hide
              nbruley Nathan Bruley added a comment -

              Can someone clarify where the best place to go to get global search is now? Can it be used with Moodle 2.9?

              Show
              nbruley Nathan Bruley added a comment - Can someone clarify where the best place to go to get global search is now? Can it be used with Moodle 2.9?
              Hide
              emerrill Eric Merrill added a comment -

              It is 3.1 (not out yet) only. It might be possible for someone to backport it to an older version of Moodle if they really had their heart set on it, but it would be very hard, and in no way supported.

              Show
              emerrill Eric Merrill added a comment - It is 3.1 (not out yet) only. It might be possible for someone to backport it to an older version of Moodle if they really had their heart set on it, but it would be very hard, and in no way supported.
              Hide
              marycooch Mary Cooch added a comment -

              Removing docs _required label as in addition to the dev docs pages we also have https://docs.moodle.org/31/en/Global_search

              Show
              marycooch Mary Cooch added a comment - Removing docs _required label as in addition to the dev docs pages we also have https://docs.moodle.org/31/en/Global_search
              Hide
              tsala Helen Foster added a comment -

              Just noting that I have made a few improvements and grammatical fixes to the search_solr language strings in en_fix: https://lang.moodle.org/local/amos/view.php?t=1463426587&v=3100&l=en_fix&c=search_solr&s&d

              In particular, I have simplified searchinfo_help so that it now reads:

              'The field to be searched may be specified by prefixing the search query with 'title:', 'content:', 'name:', or 'intro:'. For example, searching for 'title:news' would return results with the word 'news' in the title. Boolean operators ('AND', 'OR', 'NOT') may be used to combine or exclude keywords. Wildcard characters ('*' or '?' ) may be used to represent characters in the search query.'

              Please comment if you can suggest any further improvements to the language strings.

              Show
              tsala Helen Foster added a comment - Just noting that I have made a few improvements and grammatical fixes to the search_solr language strings in en_fix: https://lang.moodle.org/local/amos/view.php?t=1463426587&v=3100&l=en_fix&c=search_solr&s&d In particular, I have simplified searchinfo_help so that it now reads: 'The field to be searched may be specified by prefixing the search query with 'title:', 'content:', 'name:', or 'intro:'. For example, searching for 'title:news' would return results with the word 'news' in the title. Boolean operators ('AND', 'OR', 'NOT') may be used to combine or exclude keywords. Wildcard characters ('*' or '?' ) may be used to represent characters in the search query.' Please comment if you can suggest any further improvements to the language strings.
              Hide
              tsala Helen Foster added a comment -

              PS I have added the additional information about proximity searches and boosting terms to the user docs Global search.

              Show
              tsala Helen Foster added a comment - PS I have added the additional information about proximity searches and boosting terms to the user docs Global search .
              Hide
              dmonllao David Monllaó added a comment -

              Thanks for the changes Helen but note that this syntax is specific to Solr, not all search engines will support all these features and even if they do they may use other syntax.

              Show
              dmonllao David Monllaó added a comment - Thanks for the changes Helen but note that this syntax is specific to Solr, not all search engines will support all these features and even if they do they may use other syntax.
              Hide
              tsala Helen Foster added a comment -

              Thanks David, I have amended the user docs accordingly.

              Show
              tsala Helen Foster added a comment - Thanks David, I have amended the user docs accordingly.
              Hide
              emerrill Eric Merrill added a comment -

              Some other notes: name and intro are not fields. Plus there are description1 and description2, which are really just additional content fields that different plugins use in different ways. Unfortunately the schema isn't terribly consistent with user expectations, and I wonder if we should even advertise field specific searches.

              Show
              emerrill Eric Merrill added a comment - Some other notes: name and intro are not fields. Plus there are description1 and description2, which are really just additional content fields that different plugins use in different ways. Unfortunately the schema isn't terribly consistent with user expectations, and I wonder if we should even advertise field specific searches.

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Fix Release Date:
                  23/May/16