Details
Description
the function get_my_courses of lib/datalib.php is very very slow (more than 1 minute) with a database containing a lot of courses (2700 for this exemple).
Attachments
-
- datalib.diff
- 17/May/07 10:58 PM
- 4 kB
- Matthew Davidson
-
- datalib.diff
- 16/May/07 8:04 PM
- 4 kB
- Matthew Davidson
-
- datalib.diff
- 04/May/07 9:42 PM
- 4 kB
- Matthew Davidson
-
- datalib.diff
- 04/May/07 8:27 PM
- 3 kB
- Matthew Davidson
-
- datalib.php
- 17/May/07 11:01 PM
- 57 kB
- Matthew Davidson
-
- datalib.php
- 17/May/07 10:59 PM
- 57 kB
- Matthew Davidson
Issue Links
Activity
- All
- Comments
- History
- Activity
- Source
- Test Sessions
I am not sure what exactly happens when this function is called but when ever I am waiting for the a page to finish showing me a list of courses for a person I see the following proccess running in mySQL.
Copying to tmp table
SELECT rc.capability, c1.id AS id1, c2.id AS id2, (
c1.contextlevel *100 + c2.contextlevel
) AS aggrl
I know from past experience that when mySQL uses a tmp table performance suffers tremendously I am guessing that some SQL is being generated poorly by the function call.
Also I am noticing that course creators of the site for some reason are marked as participants in all the courses when their profile is viewed but administrators are not. Is their some reason for that?
The get_my_courses function is taking a list of all courses within the site and then testing for user priviliges via the has_capability function.
We're currently attempting to leave courses live in the system for the full school year and we have a bunch of course sites that are created even though people may not actually be using them (one for each course). At the moment that's about 1800 sites. It takes several minutes to loop through all of those courses testing them for permissions.
I'll bet if we culled unused sites it would be much faster, but I'm hoping that it's possible to optimize this query somehow so we won't have to change they way people are expecting to use the system ( they log in and all of their courses are there automatically with complete enrollment ).
Why does it loop thought the courses instead of just using the role_assignments table? To loop through all the courses and have it check on each one seams like to me that will just end of querying the role_assignments table once for each course and being very inefficient. Can it be made so that it just queries the right the user is assigned. Here is an example
SELECT *
FROM mdl_role_assignments, mdl_course, mdl_context
WHERE userid =551
AND mdl_role_assignments.contextid = mdl_context.id
AND contextlevel =50
AND mdl_context.instanceid = mdl_course.id
that gives me accurate output and it runs in a faction of a second. The only thing is might not be able to do is if a person is an admin to give them access to other course but I believe that would be another context and you can just parse that in PHP and optionally run another query to get the full list of classes. I would prefer to only see courses they are assigned to and not have them inherited if they are a course creator/admin. It just clutters up the participants list and people receive emails to courses they don't really participate in.
We have recently updated our server and migrated from 1.5.4. Our site has:
Courses: 609
Users: 1062
Role assignments: 23925
Teachers: 72
Posts: 294
Questions: 507
Resources: 745
Enabling Performance data shows that to load a profile page:
27.042793 secs RAM: 4.8MB Included 55 files ticks: 2705 user: 505 sys: 39 cuser: 0 csys: 0 Load average: 0.32
Also seems to take very long time when adding or removing site wide roles.
We have 11000 users and about 3000 courses (windows) - some (not all) queries are taking between 20s & 1m since 1.7.1+ upgrade.
Slow areas include:
Profile
Forum Posts
Assign Roles (attempting to remove multiple course creators stops the system altogether and requires a server re-boot!)
Also, whenever someone Enrols on course
Edit Profile though runs at normal pace even though this query must get quite a lot of information from the database.
Perhaps someone with the database knowledge can see a commonality here.
I really want to get this fixed.
Those with problems, can you all post what versions of Moodle, OS and database you are using?
Michael, thanks for the patch. Can you tell us exactly what version it relates to? A diff of your changed version against the original one would be useful.
Assigning to Martin Langhoff because he has been looking at this
Here you go:
Moodle version: 1.7
OS: Mac OS X 10.4.9 (built-in apache v1.3)
Dbase: mysql (built-in mysql v4)
Added Matt Clarkson too.
We are looking at this - Matt's patch to MDL-9617 does change things quite significantly for us, and I'm keen on applying it to 18_STABLE, HEAD and 17_STABLE (if possible).
- Matthew - can you post your changes as a unified diff?
- MartinD - be aware that we may have to make changes to accesslib to fix this up – scared yet?

- Matthew - can you post your changes as a unified diff?
- MartinD - be aware that we may have to make changes to accesslib to fix this up – scared yet?

I have a series of patches to get_my_courses() listed here that improve the behaviour of get_my_courses() significantly - less DB queries for admin, tighter/better cacheability, and deals with less data.
http://git.catalyst.net.nz/gitweb?p=moodle-r2.git;a=shortlog;h=mdl18-local
The guest login is very slow with many courses and this is due to get_my_courses().
From reading the code it appears that get_my_courses() will never return any courses for the guest account, so I have make an optimisation based on this - Can someone confirm that this is the correct behaviour?
When Michael Brown is talking about:
> I am not sure what exactly happens when this function is called but when ever I am waiting > for the a page to finish showing me a list of courses for a person I see the following proccess > running in mySQL.
I can repro that problem easily – but with the current has_capability() implementation I don't think I can fix it without changing some of the assumptions behind has_capability(). Further discussion at http://moodle.org/mod/forum/discuss.php?d=70739
Matthew -
I've reviewed the SQL patch. I see you've moved the UNION to a subselect. A few questions
- Does this actually make things faster for you?
- If you execute the old SQL and the new SQL, what timings do you get from mySQL?
If this SQL makes things faster, I'm all for it. But it's not the only thing we need to address in get_my_courses()... did you have a chance to try the patches Matt and I have been working on?
- Does this actually make things faster for you?
- If you execute the old SQL and the new SQL, what timings do you get from mySQL?
Ok - I've merged all the patches related to get_my_courses() from Matt and me - and a few additional fixes too ![]()
If Matthew can report a bit on the performance of that SQL change, and the numbers look good, it should go in as well.
The SQL at the bottom, timed about .02 secs faster than my SQL did...almost every time. I put the extra checks inside it also to not show course if only enrolled as a guest ,etc.
You can't depend on that roleid 6 being the guest user.
e.g. We use auto_increment_increment / auto_increment_offset for replication purposes in mysql
But also, anyone is at liberty to change the permissions of that role.
I think the second part of your comment is really up to the moodle guys. They're previous code didn't show any courses in which the person was only enrolled as a guest. If they want to change that behavior then that is up to you guys. Personally, I don't know why, if the user is in fact enrolled in any capacity, you wouldn't want the course to show up....but I wasn't about changing the results, just the method of this code.
What's the status of this currently? ... is the current code in CVS good enough for 1.8.1?
Newest version...includes the dynamic guest roleid, instead of static "6" in the SQL;
Yes - I think it's good for 1.8.1. Will test Matthew's alternative SQL (and a few other bits and pieces) for post 1.8.1.
Just committed a change that Yu had proposed and I had forgotten to merge into CVS. In 18_STABLE and HEAD now.
(BTW, MartinD - Jira finally picked up the commits I made)
This diff is built off of the latest datalib.php The SQL removes the need for the has_capability() calls that drastically slow down the function. I also just realized that there is an extra a.id on line 795. It doesn't cause an error, but it is unneeded.
The latest datalib.php does NOT solve to speed issues. My change fixes the speed issues....but please test it hard to make sure it is getting the correct results every time. I haven't had any issues yet, but more testers are always welcome.
Here is the whole file, for the people who want a drag and drop solution.
EDIT.....darn debug statements.....sorry people....get the latest file.
Petr or Martin....can you guys delete the 3 old diffs and the 1st datalib.php? Thanks.
Matthew - I agree that get_my_courses() is still not as fast as I'd like... but your pure SQL approach is not dealing with some cases that we need to cover:
- enrolments to nested categories
- enrolment capabilities "cascading" through
- enrolment capabilities being overriden
This is hard, hard hard to do in pure SQL - I've been hitting my head against this issue quite a bit. ![]()
In any case, get_my_courses() is significantly better than it was before - so it's ok to tag 1.8.1 with it. And let's keep working on improving this.
- enrolments to nested categories
- enrolment capabilities "cascading" through
- enrolment capabilities being overriden
Martin,
I have about 1100 users, 1500 courses. As admin, when I pull up a user's profile it takes about 30 seconds. What information do you need to continue working on this.
29.922262 secs RAM: 1MB Included 61 files ticks: 2992 user: 33 sys: 2 cuser: 0 csys: 0 Load average: 10.87 Record cache hit/miss ratio : 0/0
I am getting a similar delay with mod/assignment/index.php although I have not tracked down the offending SQL.
Peace - Anthony
Enable the DB stats (part of perf stats). The most likely reason is not 1 bad SQL query but a few hundred or thousand little stupid queries. If you can try Matthew's patch and report on whether it's faster (and still correct) much appreciated...
Martin,
I was under the impression the Matthew's patches were committed to CVS already. I will go back and read a little more carefully and see if I can give it a shot. At this point, when I pull up the user info page I get the following for /user/view.php I get:
27.594947 secs RAM: 1.7MB Included 61 files DB queries 612 Log writes 1 ticks: 2759 user: 36 sys: 1 cuser: 0 csys: 0 Load average: 1.12 Record cache hit/miss ratio : 0/0
Just a followup on the /mod/assignment/index.php page performance info:
65.052276 secs RAM: 1.8MB Included 54 files DB queries 1421 Log writes 1 ticks: 6505 user: 3798 sys: 16 cuser: 0 csys: 0 Load average: 1.29 Record cache hit/miss ratio : 0/0
Martin - On my test server I tried out the /user/view.php page. With the latest CVS I get:
10.315002 secs RAM: 2.3MB Included 59 files DB queries 633 Log writes 1 ticks: 1031 user: 91 sys: 5 cuser: 0 csys: 0 Load average: 0.29 Record cache hit/miss ratio : 0/0
However, with Matthew's code I get for the same page:
1.079814 secs RAM: 1.9MB Included 60 files DB queries 86 Log writes 1 ticks: 108 user: 37 sys: 1 cuser: 0 csys: 0 Load average: 0.41
Bonjour à tous,
Indeed, time for displaying user's profiles is very variable: it ranges from 2 seconds to 3... minutes!
My context:
- 34,000 users
- 700 courses
- 11 different roles
- 130 course categories and up to 6 levels of categories (nested categories)
- 43,000 records in mdl_roles_assignments
Linux 2.6.9-42.0.3.plus.c4smp + MySQL 4.1.20 + latest MOODLE_18_STABLE
1) Connected as admin, I look at IreneL's profile:
- she is assigned as a teacher at a main category level that gives her access to 387 courses
- she is also assigned as a teacher at 32 courses (not included in the 387 courses above)
~~> time for displaying her profile: 18 seconds
2) I assign her a 33th course (no matter what course it is):
~~> time for displaying her profile: 160 seconds !?
I try to debug but it's not easy...
Hope we can soon solve this critical bug.
Arno
- 34,000 users
- 700 courses
- 11 different roles
- 130 course categories and up to 6 levels of categories (nested categories)
- 43,000 records in mdl_roles_assignments Linux 2.6.9-42.0.3.plus.c4smp + MySQL 4.1.20 + latest MOODLE_18_STABLE
- she is assigned as a teacher at a main category level that gives her access to 387 courses
- she is also assigned as a teacher at 32 courses (not included in the 387 courses above) ~~> time for displaying her profile: 18 seconds 2) I assign her a 33th course (no matter what course it is): ~~> time for displaying her profile: 160 seconds !?
OK, we just noticed that for the /course/index.php, at around line #276 /// Find any orphan courses that don't yet have a valid category and set to default, this "if" statement is not efficient at all and would generate tons of redundant sql queries if you have large number of courses. If you comment out this if statement, it would increase the performance tremendously. Just a quick FYI
May be is this better for listing the courses of a category.
But this has no link with get_my_courses or viewing user's profile, I suppose.
Arnaud, yes it's a separate issue, not directly link to get_my_courses issue but it would REALLY speed up listing the courses of a category and I just thought that people here want to know this. ![]()
made an SQL fix that was causing all overrides of course:viewable to show up in everyones list of sites. Sorry bout that.
$SQL = "
SELECT
$fields
FROM
{$CFG->prefix}course c
WHERE
(
(c.id IN
(SELECT x.instanceid
FROM
{$CFG->prefix}role_assignments ra
INNER JOIN {$CFG->prefix}context x ON x.id=ra.contextid AND x.contextlevel=50
WHERE ra.userid=$userid AND ra.roleid != $guestrole->id)
OR c.id IN
(SELECT z.id
FROM
{$CFG->prefix}role_assignments ra
INNER JOIN {$CFG->prefix}context x ON x.id=ra.contextid
INNER JOIN {$CFG->prefix}course_categories a ON (a.path LIKE ".sql_concat('"%/"', sql_concat('x.instanceid', '"/%"'))." OR a.id=x.instanceid)
INNER JOIN {$CFG->prefix}course z ON z.category=a.id
WHERE ra.userid=$userid AND x.contextlevel=40)
OR c.id IN
(SELECT x.instanceid
FROM
{$CFG->prefix}role_capabilities rc
INNER JOIN {$CFG->prefix}context x ON x.id=rc.contextid AND x.contextlevel=50
INNER JOIN {$CFG->prefix}role_assignments a ON a.contextid=rc.contextid
WHERE a.userid=$userid AND rc.contextid <> {$sitecontext->id} AND (rc.permission=1 AND (c.visible=1 AND rc.capability='moodle/course:view') OR (c.visible=0 AND rc.capability = 'moodle/course:viewhiddencourses')))
)
AND NOT c.id IN
(SELECT x.instanceid
FROM
{$CFG->prefix}role_capabilities rc
INNER JOIN {$CFG->prefix}context x ON x.id=rc.contextid AND x.contextlevel=50
INNER JOIN {$CFG->prefix}role_assignments a ON a.contextid=rc.contextid
WHERE a.userid=$userid AND rc.contextid <> {$sitecontext->id} AND (rc.permission!=1 AND (c.visible=1 AND rc.capability='moodle/course:view') OR (c.visible=0 AND rc.capability = 'moodle/course:viewhiddencourses')))
AND NOT c.category IN
(SELECT cc.id
FROM
{$CFG->prefix}role_capabilities rc
INNER JOIN {$CFG->prefix}context x ON x.id=rc.contextid AND x.contextlevel=40
INNER JOIN {$CFG->prefix}course_categories cc ON cc.path LIKE CONCAT('%/',x.instanceid,'/%') OR cc.id = x.instanceid
INNER JOIN {$CFG->prefix}role_assignments a ON a.contextid=rc.contextid
WHERE a.userid=$userid AND rc.contextid <> {$sitecontext->id} AND (rc.permission!=1 AND (rc.capability='moodle/course:view') OR (rc.capability = 'moodle/course:viewhiddencourses')))
)
ORDER BY $sort";
Thought I would give the update to this sql statement method. This was posted in MDL-9238, but it really also goes here. This also fixes the problem of people who are enrolled at a category level, or subcategory level...and getting the proper courses in that instance as well. The only thing...(I know of) that this SQL doesn't take into account is what is called (role summing)...I have yet to get a firm grasp on how and why that is done...but if someone could help with that, maybe we could finally finish fixing this get_my_courses() issue.
We are still testing this right now...but it looks like our problem has been fixed. If you are having problems with this, check to see if your admin role has viewcourses set to allow. It is not supposed to be, for admins. This seems to have fixed our slow loading profiles, so please check your role definitions. Thanks to Yu for pointing out this setting.
This being said, I will continue development on the full SQL version of this function because of it's performance enhancements, but (at least for us) this bug can be lowered in priority.
Note that is no longer a problem in Moodle 1.9 because the general roles-related improvements.
This is resolved in HEAD/1.9 - the solution involved a sizable rewrite of lib/accesslib, so it is too risky for the 1.8 series. With the new accesslib code logins and get_my_courses() are instantaneous on sites with >60K courses.
Closing this bug as fixed-in-19.
The moodle site is slower each time a page wich call get_my_courses is displayed. But it is more slower for the user's profile page.