Moodle

Poor performance when deleting questions on large sites, > 250000 questions

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 1.9
  • Fix Version/s: 1.7.5, 1.8.6, 1.9.1
  • Component/s: Questions
  • Labels:
    None
  • Affected Branches:
    MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_17_STABLE, MOODLE_18_STABLE, MOODLE_19_STABLE

Description

We have upgraded our production 1.8.4+ site on a test install to "1.9 + (Build: 20080312) (2007101509)". We have over 250,000 questions and over 6000 quizzes. Some other details:
Solaris 10 and Coolstack 1.1
We have separate web and db servers.
Courses: 13299
Users: 24834
Role assignments: 154804
Teachers: 1402
Questions: 264090

Deleting questions has a roughly linear behavior, taking 7 seconds per question when selecting delete from the question bank (/question/edit.php). It appears to be limited completely on the DB side as there is no change in CPU load on the web server but significant change in CPU load on the DB.

The performance has improved over 1.8. The confirmation step "Are you absolutely sure you want to delete the following questions?", is <<1 sec, while it takes a significant amount of time on 1.8. The CPU load on the DB is less.

Can the queries when checking wether questions are used in a quiz be improved in 1.9 or is it a case of beef-up/tune the DB server? What information can we provide to help?

Issue Links

Activity

Hide
Tim Hunt added a comment -

Just to be clear, is it showing the confirm page that is slow, or actually doing the delete after confirming?

And I would encourage you to trace through the code, and work out what it is doing, but my guess would be that most of the work is in quiz_question_list_instances in /mod/quiz/lib.php.

However, that should be really fast, since it is a single query, and all the necessary indices should exist. However, it would be nice to confirm that.

Of course, it is silly that this function returns a list of quizzes, and we later just check whether it is empty or not.

Anyway, it would be great to investigate this.

Show
Tim Hunt added a comment - Just to be clear, is it showing the confirm page that is slow, or actually doing the delete after confirming? And I would encourage you to trace through the code, and work out what it is doing, but my guess would be that most of the work is in quiz_question_list_instances in /mod/quiz/lib.php. However, that should be really fast, since it is a single query, and all the necessary indices should exist. However, it would be nice to confirm that. Of course, it is silly that this function returns a list of quizzes, and we later just check whether it is empty or not. Anyway, it would be great to investigate this.
Hide
Michael Spall added a comment -

> Just to be clear, is it showing the confirm page that is slow, or actually doing the delete after confirming?

Yes, it is slow to get to the confirmation page. The actual delete is fast. The delete in 1.8 is/was slow.

> And I would encourage you to trace through the code, and work out what it is doing, but my guess would be that most of the work is in quiz_question_list_instances in /mod/quiz/lib.php.

We will do this. Any tips for tracing php code besides adding print and var_dump() statements. Do you use xdebug?

Show
Michael Spall added a comment - > Just to be clear, is it showing the confirm page that is slow, or actually doing the delete after confirming? Yes, it is slow to get to the confirmation page. The actual delete is fast. The delete in 1.8 is/was slow. > And I would encourage you to trace through the code, and work out what it is doing, but my guess would be that most of the work is in quiz_question_list_instances in /mod/quiz/lib.php. We will do this. Any tips for tracing php code besides adding print and var_dump() statements. Do you use xdebug?
Hide
Tim Hunt added a comment -

I tend to use the moodle-specific debugging and print_object instead of print and var_dump, simply because they are styled nicely in the moodle standard theme. Also, the former outputs a stack dump too.

The confirm and delete code is in /question/editlib.php between lines 470 and 550.

Hmm. That code isn't using question list instances. That is bad. Also it accesses $_REQUEST directly. It should use optional_param. So it probably needs a rewrite anyway.

What is very odd is that the confirm and delete code are very similar. Perhaps that suggests that your DB is doing a good job of caching repeat queries?

Show
Tim Hunt added a comment - I tend to use the moodle-specific debugging and print_object instead of print and var_dump, simply because they are styled nicely in the moodle standard theme. Also, the former outputs a stack dump too. The confirm and delete code is in /question/editlib.php between lines 470 and 550. Hmm. That code isn't using question list instances. That is bad. Also it accesses $_REQUEST directly. It should use optional_param. So it probably needs a rewrite anyway. What is very odd is that the confirm and delete code are very similar. Perhaps that suggests that your DB is doing a good job of caching repeat queries?
Hide
Michael Spall added a comment -

Thank you for the reminder on debugging and print_object.

> What is very odd is that the confirm and delete code are very similar. Perhaps that suggests that your DB is doing a good job of caching repeat queries?

I think that you are correct. I set up a question delete and had to leave my computer. When I got back I was logged out of Moodle and had to start the delete again. Both halves were fast. It made me almost believe in magic.

Show
Michael Spall added a comment - Thank you for the reminder on debugging and print_object. > What is very odd is that the confirm and delete code are very similar. Perhaps that suggests that your DB is doing a good job of caching repeat queries? I think that you are correct. I set up a question delete and had to leave my computer. When I got back I was logged out of Moodle and had to start the delete again. Both halves were fast. It made me almost believe in magic.
Hide
Michael Spall added a comment - - edited

Tim,
We can confirm that you are correct on where the bottleneck is. We have nailed it down to inside
function question_showbank(... inside if (preg_match...
...
Lines 537-548
foreach ($rawquestions as $key => $value) { // Parse input for question ids
if (preg_match('!^q([0-9]+)$!', $key, $matches)) {
$key = $matches[1]; $questionlist .= $key.',';
question_require_capability_on($key, 'edit');
if (record_exists('quiz_question_instances', 'question', $key) or
record_exists('question_states', 'originalquestion', $key)) { $questionnames .= '* '; $inuse = true; }
$questionnames .= get_field('question', 'name', 'id', $key).'<br />';
}
}

Due to server caching or page rendering, I can't see where exactly, but I can look some more tomorrow. Or do you have enough to go on. I wonder if the repeated calls to question_require_capability_on($key, 'edit'); are part of the problem. The capability to edit questions is based on the category. If you are not showing sub categories, that would only need to be done once. If you were showing subcategories, it could be done more efficiently. The other guess is the logic to tell if a question is in use.

It is funny that you mentioned $_REQUEST. When I first was profiling this on our test machines $_REQUEST was loading slowly on its first use through the delete process; on the order of 3-8 seconds. But then magically, it loaded without problem. We had turned off APC cahce earlier on the box for testing.

Show
Michael Spall added a comment - - edited Tim, We can confirm that you are correct on where the bottleneck is. We have nailed it down to inside function question_showbank(... inside if (preg_match... ... Lines 537-548 foreach ($rawquestions as $key => $value) { // Parse input for question ids if (preg_match('!^q([0-9]+)$!', $key, $matches)) { $key = $matches[1]; $questionlist .= $key.','; question_require_capability_on($key, 'edit'); if (record_exists('quiz_question_instances', 'question', $key) or record_exists('question_states', 'originalquestion', $key)) { $questionnames .= '* '; $inuse = true; } $questionnames .= get_field('question', 'name', 'id', $key).'<br />'; } } Due to server caching or page rendering, I can't see where exactly, but I can look some more tomorrow. Or do you have enough to go on. I wonder if the repeated calls to question_require_capability_on($key, 'edit'); are part of the problem. The capability to edit questions is based on the category. If you are not showing sub categories, that would only need to be done once. If you were showing subcategories, it could be done more efficiently. The other guess is the logic to tell if a question is in use. It is funny that you mentioned $_REQUEST. When I first was profiling this on our test machines $_REQUEST was loading slowly on its first use through the delete process; on the order of 3-8 seconds. But then magically, it loaded without problem. We had turned off APC cahce earlier on the box for testing.
Hide
Tim Hunt added a comment -

What you have found out so far is very helpful, but it would be very helpful if you could pin down in even more detail which call is causing the slowness.

question_has_capability_on has some caching built in, but it does not cache the final answer, so maybe it could be improved, but we would need to confirm that it is really the problem first.

Show
Tim Hunt added a comment - What you have found out so far is very helpful, but it would be very helpful if you could pin down in even more detail which call is causing the slowness. question_has_capability_on has some caching built in, but it does not cache the final answer, so maybe it could be improved, but we would need to confirm that it is really the problem first.
Hide
Michael Spall added a comment -

I have found where the problem is.

if (record_exists('quiz_question_instances', 'question', $questionid) or
481 : record_exists('question_states', 'originalquestion', $questionid)) {

if (record_exists('quiz_question_instances', 'question', $key) or
542 : record_exists('question_states', 'originalquestion', $key)) {

The record_exists calls on lines 481&542 take up ~99% of the execution time, except when the earlier record_exists evaluates true and boolean short-circuit logic causes the statements not to be evaluated. The confirmation page will take a while to run in cases where record_exists('quiz_question_instances', 'question', $questionid) is false and there isn't short-circuit logic. In my testing I was never able to get record_exists('question_states', 'originalquestion', $key) to evaluate true even when I rewrote the code so both lines always evaluated. Line 481 never evaluated true when line 480 was false. Is this checking for a condition that no longer can occur in 1.9.

I tested deleting under these conditions:
unused question
questions in an exam (which hides them instead of deletes them)
removing attempts then removing question from exam then deleting both hidden questions and questions that weren't hidden
(currently you can't remove a question from an exam if it still has attempts associated with it)
deleting exam then deleting questions, both hidden and not

I will attach a patch.

Show
Michael Spall added a comment - I have found where the problem is. if (record_exists('quiz_question_instances', 'question', $questionid) or 481 : record_exists('question_states', 'originalquestion', $questionid)) { if (record_exists('quiz_question_instances', 'question', $key) or 542 : record_exists('question_states', 'originalquestion', $key)) { The record_exists calls on lines 481&542 take up ~99% of the execution time, except when the earlier record_exists evaluates true and boolean short-circuit logic causes the statements not to be evaluated. The confirmation page will take a while to run in cases where record_exists('quiz_question_instances', 'question', $questionid) is false and there isn't short-circuit logic. In my testing I was never able to get record_exists('question_states', 'originalquestion', $key) to evaluate true even when I rewrote the code so both lines always evaluated. Line 481 never evaluated true when line 480 was false. Is this checking for a condition that no longer can occur in 1.9. I tested deleting under these conditions: unused question questions in an exam (which hides them instead of deletes them) removing attempts then removing question from exam then deleting both hidden questions and questions that weren't hidden (currently you can't remove a question from an exam if it still has attempts associated with it) deleting exam then deleting questions, both hidden and not I will attach a patch.
Hide
Michael Spall added a comment -

Patch removes two record_exists calls that seem to never evaluate true and are DB intensive.

Show
Michael Spall added a comment - Patch removes two record_exists calls that seem to never evaluate true and are DB intensive.
Hide
Tim Hunt added a comment -

Brilliant. I bet that question_states.originalquestion does not have an index on it, and question_states is a huge table. Actually, it is obsolete. It was part of a failed attempt to introduce question versioning.

Actually, I will probably not use your patch. Instead I will rewrite this code in terms of question_list_instances.

Show
Tim Hunt added a comment - Brilliant. I bet that question_states.originalquestion does not have an index on it, and question_states is a huge table. Actually, it is obsolete. It was part of a failed attempt to introduce question versioning. Actually, I will probably not use your patch. Instead I will rewrite this code in terms of question_list_instances.
Hide
Michael Spall added a comment -

I can confirm that question_states.originalquestion does not have an index on it. Also for our production site with over 4,000,000 entries in question_states, originalquestion is always = 0, so I can confirm that it is not used and that both record_exists('question_states', 'originalquestion',... never evaluate true and therefore are currently not needed.

Because Tim's rewrite will probably be applied only to Moodle 1.9, and understandably so, I have attached another patch for Moodle 1.8 to remove those calls from showbank.php. Our testing on test sites has shown no problems with this. An alternative solution would be to add an index manually to question_states.originalquestion.

Caveats: I have only tested and looked at 1.8 and 1.9 code. I have searched in contrib for 1.8 and 1.9 and didn't find originalquestion, but there could be third party modules that do use this field. So before applying this patch, check your modifications, and check to make sure question_states.originalquestion is always 0. Also before applying the 1.9 patch, keep in mind that Tim will rewriting code and not using the 1.9 patch.

Show
Michael Spall added a comment - I can confirm that question_states.originalquestion does not have an index on it. Also for our production site with over 4,000,000 entries in question_states, originalquestion is always = 0, so I can confirm that it is not used and that both record_exists('question_states', 'originalquestion',... never evaluate true and therefore are currently not needed. Because Tim's rewrite will probably be applied only to Moodle 1.9, and understandably so, I have attached another patch for Moodle 1.8 to remove those calls from showbank.php. Our testing on test sites has shown no problems with this. An alternative solution would be to add an index manually to question_states.originalquestion. Caveats: I have only tested and looked at 1.8 and 1.9 code. I have searched in contrib for 1.8 and 1.9 and didn't find originalquestion, but there could be third party modules that do use this field. So before applying this patch, check your modifications, and check to make sure question_states.originalquestion is always 0. Also before applying the 1.9 patch, keep in mind that Tim will rewriting code and not using the 1.9 patch.
Hide
Tim Hunt added a comment -

Now fixed in all branches from 1.7.x onwards.

In 1.9 and HEAD, I also took the opportunity to eliminate the direct access to $_REQUEST and $_POST that this code was doing in contravention of the Moodle coding guidelines.

Thanks for working out this fix Michael.

Show
Tim Hunt added a comment - Now fixed in all branches from 1.7.x onwards. In 1.9 and HEAD, I also took the opportunity to eliminate the direct access to $_REQUEST and $_POST that this code was doing in contravention of the Moodle coding guidelines. Thanks for working out this fix Michael.
Hide
Petr Škoda (skodak) added a comment -

code reviewed, closing

Show
Petr Škoda (skodak) added a comment - code reviewed, closing

People

Vote (1)
Watch (6)

Dates

  • Created:
    Updated:
    Resolved: