Moodle

Activity reports (outline and complete) are very slow

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Fixed
  • Affects Version/s: 1.6, 1.9
  • Fix Version/s: 1.9
  • Component/s: Administration
  • Labels:
    None
  • Environment:
    All
  • Database:
    MySQL
  • Affected Branches:
    MOODLE_16_STABLE, MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_19_STABLE

Description

With a busy site there are a lot of rows in the log table, and the activity report does a lot of WHEREs that are hitting unindexed columns.

Index the mdl_logs table column info and watch the magic of Outline report It's faster by several orders of magnitude. The extra index adds less overhead than one (at least I) would think.

Issue Links

Activity

Hide
Martin Dougiamas added a comment -

From Samuli Karevaara (samuli.karevaara at lamk.fi) Thursday, 13 October 2005, 07:57 PM:

On moodle.org the reports come up pretty quickly, but I don't see this column being indexed in the latest CVS (1.5 and dev) versions. Is there something else going on too? On all of our test sites the queries that hit WHERE in the unindexed info column take very long...

From (liling at bitsde.com) Tuesday, 7 March 2006, 07:40 PM:

I met this problem too. So I have created two new index

1. userid(userid)

2. infomodule(info, module)

From Martin Dougiamas (martin at moodle.com) Tuesday, 7 March 2006, 09:08 PM:

Hey database optimisation team can you put this into 1.6?

From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 03:56 AM:

Can you give me an example of where the info column is in the WHERE part of the query? I want an example query to run through the query planner

From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 04:22 AM:

nevermind, I found some.

I would think that info & userid both benefit from being indexed, probably separately.

I don't think that having an index on infomodule together is the best thing to do as there's already two multi column indexes including module:

mdl_log_timecoursemoduleaction_idx

mdl_log_coursemoduleaction_idx

which relate to the most often used queries that include module.

From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 05:04 AM:

I added the info index and for the first query I found that used info in the WHERE clause was for get_recent_enrolments (query follows), the query planner didn't actually USE the info index, it decided to use mdl_log_timecoursemoduleaction_idx instead.

Incidently, for all the queries that use info and join it onto something else like l.info = u.id, in postgres at least, casting l.info to integer first helps performance quite a lot. That may be something that's worth fixing although it's tricky because the syntax is different for mysql & postgres, the closest we get is:

cast(l.info as unsigned) - mysql

cast(l.info as integer) - postgres

:/

Anyway, the queries I am trying that I've found that use info don't use the info index in the query planner - can someone provide me with an example of a query where performance improves after adding this index so I can test?

Also - I assume this is for mysql - are you using innodb or myisam?

From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 05:05 AM:

The query I found that would not use the info index was:

explain analyze select u.id, u.firstname, u.lastname, l.time

FROM mdl_user u,

mdl_user_students s,

mdl_log l

WHERE l.time > '1141589542'

AND l.course = '32967'

AND l.module = 'course'

AND l.action = 'enrol'

AND l.info = u.id

AND u.id = s.userid

AND s.course = '32967'

From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 05:56 AM:

ok, after a lot of messing around with query planners with postgres and mysql, I am going to add two new single column indexes, one for info and one for userid.

Also, adding the info index means that the cast to integer on the info field doesn't make sense anymore, as it's a text index.

From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 06:16 AM:

in HEAD now.

From Samuli Karevaara (samuli.karevaara at lamk.fi) Wednesday, 8 March 2006, 07:03 PM:

The example query couldn't use the info index, but there are queries in the outline report like

SELECT * FROM mdl_log WHERE userid='1' AND module='resource' AND action='view' AND info='5900' ORDER BY time ASC;

which obviously can, so adding it is a good thing

The above query is quite quick with or without the index, but with the index (ymmv): 520 rows scanned, without: 19672 rows scanned. The report page generates a lot (dozens) of these, so the combined effect might flip the mysql to the swap/trash mode even with the single queries running along nicely.

From (penny at catalyst.net.nz) Thursday, 9 March 2006, 04:59 AM:

magic. That resource query was the exact same one I was playing with and got the improvements for

From Samuli Karevaara (samuli.karevaara at lamk.fi) Friday, 10 March 2006, 02:58 PM:

Should the indexes be added to the original CREATE TABLE SQL commands too?

From (penny at catalyst.net.nz) Monday, 13 March 2006, 02:28 AM:

yes, they should. I am a fool

I will add them today.

From (penny at catalyst.net.nz) Monday, 13 March 2006, 08:48 AM:

there now! sorry for silliness

From (penny at catalyst.net.nz) Monday, 13 March 2006, 09:02 AM:

closing!

Show
Martin Dougiamas added a comment - From Samuli Karevaara (samuli.karevaara at lamk.fi) Thursday, 13 October 2005, 07:57 PM: On moodle.org the reports come up pretty quickly, but I don't see this column being indexed in the latest CVS (1.5 and dev) versions. Is there something else going on too? On all of our test sites the queries that hit WHERE in the unindexed info column take very long... From (liling at bitsde.com) Tuesday, 7 March 2006, 07:40 PM: I met this problem too. So I have created two new index 1. userid(userid) 2. infomodule(info, module) From Martin Dougiamas (martin at moodle.com) Tuesday, 7 March 2006, 09:08 PM: Hey database optimisation team can you put this into 1.6? From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 03:56 AM: Can you give me an example of where the info column is in the WHERE part of the query? I want an example query to run through the query planner From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 04:22 AM: nevermind, I found some. I would think that info & userid both benefit from being indexed, probably separately. I don't think that having an index on infomodule together is the best thing to do as there's already two multi column indexes including module: mdl_log_timecoursemoduleaction_idx mdl_log_coursemoduleaction_idx which relate to the most often used queries that include module. From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 05:04 AM: I added the info index and for the first query I found that used info in the WHERE clause was for get_recent_enrolments (query follows), the query planner didn't actually USE the info index, it decided to use mdl_log_timecoursemoduleaction_idx instead. Incidently, for all the queries that use info and join it onto something else like l.info = u.id, in postgres at least, casting l.info to integer first helps performance quite a lot. That may be something that's worth fixing although it's tricky because the syntax is different for mysql & postgres, the closest we get is: cast(l.info as unsigned) - mysql cast(l.info as integer) - postgres :/ Anyway, the queries I am trying that I've found that use info don't use the info index in the query planner - can someone provide me with an example of a query where performance improves after adding this index so I can test? Also - I assume this is for mysql - are you using innodb or myisam? From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 05:05 AM: The query I found that would not use the info index was: explain analyze select u.id, u.firstname, u.lastname, l.time FROM mdl_user u, mdl_user_students s, mdl_log l WHERE l.time > '1141589542' AND l.course = '32967' AND l.module = 'course' AND l.action = 'enrol' AND l.info = u.id AND u.id = s.userid AND s.course = '32967' From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 05:56 AM: ok, after a lot of messing around with query planners with postgres and mysql, I am going to add two new single column indexes, one for info and one for userid. Also, adding the info index means that the cast to integer on the info field doesn't make sense anymore, as it's a text index. From (penny at catalyst.net.nz) Wednesday, 8 March 2006, 06:16 AM: in HEAD now. From Samuli Karevaara (samuli.karevaara at lamk.fi) Wednesday, 8 March 2006, 07:03 PM: The example query couldn't use the info index, but there are queries in the outline report like SELECT * FROM mdl_log WHERE userid='1' AND module='resource' AND action='view' AND info='5900' ORDER BY time ASC; which obviously can, so adding it is a good thing The above query is quite quick with or without the index, but with the index (ymmv): 520 rows scanned, without: 19672 rows scanned. The report page generates a lot (dozens) of these, so the combined effect might flip the mysql to the swap/trash mode even with the single queries running along nicely. From (penny at catalyst.net.nz) Thursday, 9 March 2006, 04:59 AM: magic. That resource query was the exact same one I was playing with and got the improvements for From Samuli Karevaara (samuli.karevaara at lamk.fi) Friday, 10 March 2006, 02:58 PM: Should the indexes be added to the original CREATE TABLE SQL commands too? From (penny at catalyst.net.nz) Monday, 13 March 2006, 02:28 AM: yes, they should. I am a fool I will add them today. From (penny at catalyst.net.nz) Monday, 13 March 2006, 08:48 AM: there now! sorry for silliness From (penny at catalyst.net.nz) Monday, 13 March 2006, 09:02 AM: closing!
Hide
Samuli Karevaara added a comment -

Fixing MDL-13209 re-opened this one. The activity reports produce a lot of queries like
SELECT * FROM mdl_log WHERE userid='1' AND module='resource' AND action='view' AND info='36888' ORDER BY time ASC

With the current indexes the query has to use the action index. However, this column has a lot of repetition and produces a lot of rows to scan (468896 rows in one test).

One way to ease this is to implement MDL-4667 (Change the column order of the mdl_log index 'courseuserid'). That was deferred when the userid index was added. With the above test the number of rows to scan was reduced to 869 rows with the switch of the index order. See MDL-4667 for more details.

The info column is probably the one with less repetition, so an index on that column would produce the least amount of rows to scan, but the column order switch in MDL-4667 produced nice results also.

Show
Samuli Karevaara added a comment - Fixing MDL-13209 re-opened this one. The activity reports produce a lot of queries like SELECT * FROM mdl_log WHERE userid='1' AND module='resource' AND action='view' AND info='36888' ORDER BY time ASC With the current indexes the query has to use the action index. However, this column has a lot of repetition and produces a lot of rows to scan (468896 rows in one test). One way to ease this is to implement MDL-4667 (Change the column order of the mdl_log index 'courseuserid'). That was deferred when the userid index was added. With the above test the number of rows to scan was reduced to 869 rows with the switch of the index order. See MDL-4667 for more details. The info column is probably the one with less repetition, so an index on that column would produce the least amount of rows to scan, but the column order switch in MDL-4667 produced nice results also.
Hide
Petr Škoda (skodak) added a comment -

working on this, thanks!!

Show
Petr Škoda (skodak) added a comment - working on this, thanks!!
Hide
Petr Škoda (skodak) added a comment -

fixed, thanks!!

Show
Petr Škoda (skodak) added a comment - fixed, thanks!!
Hide
Petr Škoda (skodak) added a comment -

file a new bug MDL-13548 for various other problems spotted in log related reports code, please feel free to add more problems there

Show
Petr Škoda (skodak) added a comment - file a new bug MDL-13548 for various other problems spotted in log related reports code, please feel free to add more problems there
Hide
Petr Škoda (skodak) added a comment -

thinking a bit more this ( see MDL-13549 ) the use of info was not correct, we have to use cmid instead to get correct data - adding index on cmid might be a good idea, what do you think?

Show
Petr Škoda (skodak) added a comment - thinking a bit more this ( see MDL-13549 ) the use of info was not correct, we have to use cmid instead to get correct data - adding index on cmid might be a good idea, what do you think?

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated:
    Resolved: