Moodle

Bulk user actions -> Download as Excel/ODS fails with custom fields

Details

  • Type: Bug Bug
  • Status: Open Open
  • Priority: Critical Critical
  • Resolution: Unresolved
  • Affects Version/s: 1.9.5
  • Fix Version/s: None
  • Component/s: Administration
  • Database:
    MySQL
  • Affected Branches:
    MOODLE_19_STABLE

Description

When trying to download users in an ODS or Excel file, the site comes to a halt and eventually the download fails.

There don't seem to be any indexes on mdl_user_info_data, which may contribute to the problem.

The site this is an issue with has about 6200 users and over 17K rows in the user_info_data table.Just let me know what you need.

Issue Links

Activity

Hide
Anthony Borrow added a comment -

Thanks Dakota for creating this issue. I am now watching this and hope to look into what needs to be done to resolve it. Peace - Anthony

Show
Anthony Borrow added a comment - Thanks Dakota for creating this issue. I am now watching this and hope to look into what needs to be done to resolve it. Peace - Anthony
Hide
Anthony Borrow added a comment -

I look a quick look and did not notice anything obvious to me as to what may be causing this to lock up as it is. I suspect there is a way to improve the performance but at this point I am going to move on to other issues. Peace - Anthony

Show
Anthony Borrow added a comment - I look a quick look and did not notice anything obvious to me as to what may be causing this to lock up as it is. I suspect there is a way to improve the performance but at this point I am going to move on to other issues. Peace - Anthony
Hide
Anthony Borrow added a comment -

While not necessarily related to the database activity module, here is another example of where sql inefficiency could bring a site to a halt (and has). In this case however it is only the admin that can really do it so it is less of a security concern. Nevertheless, if someone wants to take a crack at improving getting the user profile fields that would be great. I would think that for exporting purposes we should be able to get away with a single query to user_info_data per user and to have an $data_fields variable holding the contents of user_info_field and work through things differently. It has been a while since I have looked at the current algorithm but Dakota Duff of Remote-Learner did some testing for one of my sites and confirmed that exporting the users was creating an excessively high number of queries on a site with about 6000 users. Peace - Anthony

Show
Anthony Borrow added a comment - While not necessarily related to the database activity module, here is another example of where sql inefficiency could bring a site to a halt (and has). In this case however it is only the admin that can really do it so it is less of a security concern. Nevertheless, if someone wants to take a crack at improving getting the user profile fields that would be great. I would think that for exporting purposes we should be able to get away with a single query to user_info_data per user and to have an $data_fields variable holding the contents of user_info_field and work through things differently. It has been a while since I have looked at the current algorithm but Dakota Duff of Remote-Learner did some testing for one of my sites and confirmed that exporting the users was creating an excessively high number of queries on a site with about 6000 users. Peace - Anthony
Hide
Jonathan Robson added a comment -

Hello, I work with Dakota and I've been doing some testing on this. It looks like this is directly related to the number of custom profile fields a site has.

On the site Dakota mentioned in the description, they have 18 custom profile fields. When downloading 2137 users (below the point at which it starts failing) in Excel format, the MySQL server received 81385 queries in 26 seconds. In the immediately following 5 minutes, there were only 7174 queries. The queries per second since startup is 8.96 for about 31 days, but during those 26 seconds, the queries per second was about 3130!

On another site, with 11 custom profile fields, I downloaded 2700 users (also below the point at which it starts failing) in Excel format. The MySQL server received 59480 queries in 45 seconds. And in the immediately following 5 minutes, there were only 2340 queries. That server has also been up for about 31 days, during which time it has received 6.21 queries per second. It received about 1322 queries per seconds during the test.

I didn't really stress test the first site, but the second site I mentioned starts failing at about 2722 users.

Show
Jonathan Robson added a comment - Hello, I work with Dakota and I've been doing some testing on this. It looks like this is directly related to the number of custom profile fields a site has. On the site Dakota mentioned in the description, they have 18 custom profile fields. When downloading 2137 users (below the point at which it starts failing) in Excel format, the MySQL server received 81385 queries in 26 seconds. In the immediately following 5 minutes, there were only 7174 queries. The queries per second since startup is 8.96 for about 31 days, but during those 26 seconds, the queries per second was about 3130! On another site, with 11 custom profile fields, I downloaded 2700 users (also below the point at which it starts failing) in Excel format. The MySQL server received 59480 queries in 45 seconds. And in the immediately following 5 minutes, there were only 2340 queries. That server has also been up for about 31 days, during which time it has received 6.21 queries per second. It received about 1322 queries per seconds during the test. I didn't really stress test the first site, but the second site I mentioned starts failing at about 2722 users.
Hide
Anthony Borrow added a comment -

Jonathan - Thanks for the data. Better understanding what is happening and the impact on the database is helpful. Speaking theoretically and without looking at code, I think I would like to see that we obtain all of the fields and data for a given user with a single query and then process the data in the code. We are are retrieving the user data in blocks (say 100 users at a time), then we could have one query from mdl_user and another to get the user profile data for all of those users. Memory wise it will be more expensive but database wise we can make some improvements. I'll put this back on to my active radar of things to play with when I have time since it sounds like you have other folks running into this issue as well. Peace - Anthony

Show
Anthony Borrow added a comment - Jonathan - Thanks for the data. Better understanding what is happening and the impact on the database is helpful. Speaking theoretically and without looking at code, I think I would like to see that we obtain all of the fields and data for a given user with a single query and then process the data in the code. We are are retrieving the user data in blocks (say 100 users at a time), then we could have one query from mdl_user and another to get the user profile data for all of those users. Memory wise it will be more expensive but database wise we can make some improvements. I'll put this back on to my active radar of things to play with when I have time since it sounds like you have other folks running into this issue as well. Peace - Anthony
Hide
Anthony Borrow added a comment -

I've looked at this again and the ugly part of the code is the profile_load_data function in /admin/user/lib.php. For each user we are getting the profile data which goes out and gets all of the profile fileds, for each of those it goes out and loads the data for each field. If I am reading the code correctly the number of queries will be approximately #users + ((#users * #profile_fields)*2). This needs to be reduced but I need to think about how best to accomplish it. Peace - Anthony

Show
Anthony Borrow added a comment - I've looked at this again and the ugly part of the code is the profile_load_data function in /admin/user/lib.php. For each user we are getting the profile data which goes out and gets all of the profile fileds, for each of those it goes out and loads the data for each field. If I am reading the code correctly the number of queries will be approximately #users + ((#users * #profile_fields)*2). This needs to be reduced but I need to think about how best to accomplish it. Peace - Anthony
Hide
Jason Hollowell added a comment -

I just locked up our server yesterday trying to run a filter via Bulk User Actions and was reminded on this issue. Just got on the Tracker here and found this post. I haven't looked at the Moodle core (admin/user/lib.php) file yet but I'm guessing Anthony must be right or at least the the problem is somewhere in the Moodle code because I was able to run a query to obtain the same information I was trying to retrieve, via PHPMyAdmin, and it took less than a second to return the information.

I ended up having to restart MySQL to get out of the jam the filter created Don't know how I can help but if there is anything I can do, let me know. Always willing to spread myself thin and add something to the plate

Jason

Show
Jason Hollowell added a comment - I just locked up our server yesterday trying to run a filter via Bulk User Actions and was reminded on this issue. Just got on the Tracker here and found this post. I haven't looked at the Moodle core (admin/user/lib.php) file yet but I'm guessing Anthony must be right or at least the the problem is somewhere in the Moodle code because I was able to run a query to obtain the same information I was trying to retrieve, via PHPMyAdmin, and it took less than a second to return the information. I ended up having to restart MySQL to get out of the jam the filter created Don't know how I can help but if there is anything I can do, let me know. Always willing to spread myself thin and add something to the plate Jason
Hide
Anthony Borrow added a comment -

Jason - Sorry to hear you got hit with this. We need to look at the code and see if we can come up with a more efficient algorithm; however, I have been hesitant to push for this until 2.0 is released. Peace - Anthony

Show
Anthony Borrow added a comment - Jason - Sorry to hear you got hit with this. We need to look at the code and see if we can come up with a more efficient algorithm; however, I have been hesitant to push for this until 2.0 is released. Peace - Anthony
Hide
Jenny Gray added a comment -

I also have a site experiencing this problem. Users have reported that it goes wrong with as few as 10 users. We're running on Postgres and the database server is pretty high-powered (openlearn also runs on the same box and that gets a lot of hits).

While the file has been updated in Moodle 2.0, its only for basic DB and output handling changes. I doubt we can expect the key core developers to fix this one for us

Show
Jenny Gray added a comment - I also have a site experiencing this problem. Users have reported that it goes wrong with as few as 10 users. We're running on Postgres and the database server is pretty high-powered (openlearn also runs on the same box and that gets a lot of hits). While the file has been updated in Moodle 2.0, its only for basic DB and output handling changes. I doubt we can expect the key core developers to fix this one for us
Hide
Jenny Gray added a comment -

I managed to get data out of my 3000 user system with 18 custom profile fields by replacing the get_record('user') and load_profile_data calls with this kind of query in a get_recordset_select.

$sql = 'select u.id,u.username,u.email,u.firstname,u.lastname,u.idnumber,u.institution,u.department,u.phone1,u.phone2,u.city,u.url,u.icq,u.skype,u.aim,u.yahoo,u.msn,u.country,f.shortname,d.data from '.$CFG->prefix.'user u
JOIN '.$CFG->prefix.'user_info_data d on userid=u.id
JOIN '.$CFG->prefix.'user_info_field f on f.id=d.fieldid
WHERE u.id='.$userid;

Requires quite a bit of jigging around of the rest of the file though particularly as you get multiple rows for each user, rather than 1. Maybe this is a step in the right direction though? Or even just gives Anthony something to pick holes in while he thinks about the best algorithm.

Show
Jenny Gray added a comment - I managed to get data out of my 3000 user system with 18 custom profile fields by replacing the get_record('user') and load_profile_data calls with this kind of query in a get_recordset_select. $sql = 'select u.id,u.username,u.email,u.firstname,u.lastname,u.idnumber,u.institution,u.department,u.phone1,u.phone2,u.city,u.url,u.icq,u.skype,u.aim,u.yahoo,u.msn,u.country,f.shortname,d.data from '.$CFG->prefix.'user u JOIN '.$CFG->prefix.'user_info_data d on userid=u.id JOIN '.$CFG->prefix.'user_info_field f on f.id=d.fieldid WHERE u.id='.$userid; Requires quite a bit of jigging around of the rest of the file though particularly as you get multiple rows for each user, rather than 1. Maybe this is a step in the right direction though? Or even just gives Anthony something to pick holes in while he thinks about the best algorithm.

People

Vote (3)
Watch (4)

Dates

  • Created:
    Updated: