I've been looking at the logs today and have seen that a lot of forum/search.php requests are putting heavy load on the DB (confirmed by the MySQL slow query log).
Here are some facts about that:
- The current behaviour executes the query twice (once to count the results and once more to fetch the ones for the current page).
- The count is SLOW (very SLOW) because it forces the fulltext search to be calculated completely.
- There are tons of requests for page 600, 800 and other big numbers. Those requests are also especially (very) slow, and it seems to make no sense to search for "moodle" and then navigate to page 600 of the results.
- Running some numbers against the last access_log file, I found that out of a total of 24640 forum searches, 15585 were performed by Googlebot!
- A lot of those Googlebot searches are the ones using BIG page numbers (SLOW).
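For anyone wanting to reproduce this kind of count, here is a minimal Python sketch. It assumes the access_log is in Apache "combined" format, that the results page is passed as a `page` query parameter, and that "big" means page 100 or higher; all three are assumptions for illustration, and `summarize` is a hypothetical helper, not the script used to get the numbers above.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed Apache "combined" log format: request line, status, size,
# referer and user agent, each in the usual position.
LINE_RE = re.compile(
    r'"(?:GET|POST) (?P<url>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def summarize(lines, deep_page=100):
    """Count forum searches, those made by Googlebot, and the Googlebot
    ones asking for a page number >= deep_page (threshold is an assumption)."""
    stats = {"searches": 0, "googlebot": 0, "googlebot_deep": 0}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a line in the assumed format
        parts = urlsplit(match.group("url"))
        if not parts.path.endswith("/forum/search.php"):
            continue
        stats["searches"] += 1
        if "Googlebot" not in match.group("agent"):
            continue
        stats["googlebot"] += 1
        # "page" as the parameter name is an assumption.
        page = parse_qs(parts.query).get("page", ["0"])[0]
        if page.isdigit() and int(page) >= deep_page:
            stats["googlebot_deep"] += 1
    return stats

if __name__ == "__main__":
    # Hypothetical path; point it at the real log file.
    with open("access_log", encoding="utf-8", errors="replace") as handle:
        print(summarize(handle))
```

Matching on the user agent string alone trusts whatever the client claims to be, which is good enough for a rough estimate like this one.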
I think we could try a few different solutions to minimize the cost of those continuous searches:
1) robots.txt: prevent bots from crawling mod/forum/search.php completely. I think it's ok to allow crawling of discussions, with their pages and so on, but IMO searches shouldn't be crawled.
2) Limit the number of records returned by the search to, say, 1000. It could be a new forum module option (or a hidden setting for now, as you prefer). Change the COUNT(*) to use a subselect so that the count can be limited to 1000 too.
3) When the limit is reached, instead of showing "There are 123456 results" it will show "There are more than 1000 results", and navigation will only be allowed within those 1000 results.
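To make the three ideas concrete, here is a rough Python sketch, not Moodle code. Everything in it is an assumption for illustration: the `posts` table, the `LIKE` match standing in for the real fulltext search, the example.org URLs, and the helper names. Only the mod/forum/search.php path and the 1000 limit come from the proposal. The robots.txt rule assumes Moodle is installed at the web root.

```python
import sqlite3
from urllib.robotparser import RobotFileParser

# Option 1: a robots.txt rule that blocks the search script only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mod/forum/search.php
"""

def may_crawl(url, agent="Googlebot", robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Option 2: count through a subselect so the database can stop early.
def capped_count(conn, term, cap=1000):
    """Count matches, but never look at more than cap + 1 rows.
    A result of cap + 1 means "more than cap"."""
    sql = ("SELECT COUNT(*) FROM ("
           "SELECT 1 FROM posts WHERE message LIKE ? LIMIT ?)")
    (count,) = conn.execute(sql, (f"%{term}%", cap + 1)).fetchone()
    return count

# Option 3: message and navigation restricted to the capped results.
def result_summary(count, cap=1000):
    if count > cap:
        return f"There are more than {cap} results"
    return f"There are {count} results"

def last_page(count, cap=1000, per_page=50):
    """Zero-based index of the last page that navigation should offer."""
    shown = min(count, cap)
    return max(0, (shown - 1) // per_page)

def make_demo_db(rows):
    """In-memory database with a hypothetical posts table, for trying it out."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, message TEXT)")
    conn.executemany("INSERT INTO posts (message) VALUES (?)",
                     [(f"moodle post {i}",) for i in range(rows)])
    return conn

if __name__ == "__main__":
    conn = make_demo_db(25)
    found = capped_count(conn, "moodle", cap=10)
    print(result_summary(found, cap=10), "- last page:", last_page(found, 10, 5))
    print(may_crawl("https://example.org/mod/forum/search.php?search=moodle"))
```

Asking for cap + 1 rows is what lets the code tell "exactly 1000" apart from "more than 1000" without computing the full count. Whether the real fulltext query can stop early in the same way depends on the database, so this would need measuring.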
We would need to compare the results on moodle.org, but I think it could help, mainly because search seems to be the biggest consumer of server resources (apart from cron).
FYC, ciao :-)