Improvement
Resolution: Fixed
Minor
3.10
MOODLE_310_STABLE
MOODLE_310_STABLE
MDL-68726-master
Moodle automatically runs the Solr 'optimize' operation every night via a scheduled task. This is a bad thing. In most versions of Solr, at least, optimize rewrites the entire index (which may consist of multiple 'segment' files) into a single file. This has the following impacts:
- In theory it might save disk space (the single file will not contain any space still held by deleted documents), but in practice you need 2x the disk space each time optimize runs, so I'm not sure what the point of saving the space is, given that you need to keep it free anyway.
- In theory it might make searches faster, but Solr is perfectly fast with multiple segments.
- With a sufficiently large index, it stops working anyway because it hits a Moodle time limit (120 seconds probably).
- If you ever stop running optimise, you are left with one massive segment (e.g. 35GB). Solr will create and use other segments, but will never free up the massive one unless 32.5GB of the 35GB gets deleted.
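For what it's worth, the 32.5GB figure in the last point matches some rough arithmetic. This is only a sketch: it assumes the merge policy caps merged segments at about 5GB and will not reconsider an oversized segment until its live data drops below half of that cap. Both values are assumptions (they match commonly cited defaults for Solr's TieredMergePolicy, but check your own version and config), and the function and parameter names are made up for illustration.

```python
def deletions_needed_gb(segment_gb, max_merged_segment_gb=5.0, eligible_fraction=0.5):
    """Estimate how much of an oversized segment must be deleted before
    the merge policy will touch it again.

    Assumption: a segment becomes eligible for merging only once its live
    data is below eligible_fraction * max_merged_segment_gb.
    """
    threshold_gb = max_merged_segment_gb * eligible_fraction
    # A segment already under the threshold needs no deletions at all.
    return max(0.0, segment_gb - threshold_gb)


# The 35GB segment from the OU example.
print(deletions_needed_gb(35.0))  # 32.5
```

In other words, under those assumptions almost the whole segment has to be deleted before Solr will merge it away on its own.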
It seems to be generally held that optimise is not usually necessary, and is perhaps only a benefit if you have a static index (e.g. your website is only updated every 3 months; you update it and then hit optimise). From experience at the OU (struggling to get the 35GB segment deleted), I concur with this belief.
Anyway, I would suggest that deciding to optimise, which is quite a drastic step, is something you should only do with full understanding and due consideration. That indicates to me that the application shouldn't do it automatically!
The scheduled task is used by other search engine plugins, so rather than literally removing the task, I propose simply deleting the optimize implementation in Moodle's Solr search engine plugin.
Along with this change, I'm also going to make the default implementation (which does nothing) output an mtrace message, so that admins who are wondering 'does this task do anything?' can tell that their search engine has not implemented optimise.
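To illustrate the shape of the proposal, here is a minimal sketch. It is written in Python rather than Moodle's PHP, and every name in it (`Engine`, `SolrEngine`, `OtherEngine`, `run_optimize_task`, the message text) is hypothetical; the `trace` callable just stands in for mtrace. It is not the actual patch.

```python
class Engine:
    """Stand-in for the base search engine class (hypothetical name)."""

    name = "base"

    def optimize(self, trace=print):
        # Default implementation: do nothing, but say so, so that an admin
        # reading the task log can see that the task was a no-op.
        trace(f"The {self.name} search engine does not implement optimise; nothing to do.")
        return False


class SolrEngine(Engine):
    """Solr engine with its optimize override removed, as proposed."""

    name = "solr"
    # No optimize() here: Solr now inherits the traced no-op.


class OtherEngine(Engine):
    """Some other engine plugin that still wants the nightly task."""

    name = "other"

    def optimize(self, trace=print):
        trace("Optimising the other index...")
        return True


def run_optimize_task(engine, trace=print):
    """Stand-in for the scheduled task, which stays in place for all engines."""
    return engine.optimize(trace)
```

The scheduled task itself is untouched, so engines that override optimize keep working, while Solr falls through to the default and the log shows why nothing happened.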