Moodle

Backup process generate big memory footprint and would stop when have huge number of users

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 1.5, 1.5.1, 1.5.2, 1.5.3, 1.5.4, 1.6
  • Fix Version/s: 1.8.5, 1.9
  • Component/s: Backup
  • Labels:
    None
  • Affected Branches:
    MOODLE_15_STABLE, MOODLE_16_STABLE
  • Fixed Branches:
    MOODLE_18_STABLE, MOODLE_19_STABLE

Description

On our production site our staff programmer Cliff found this issue, when we added the winter/spring 2008 students to the DB, then course level manual backup will not complete. We did some debugging and found that the function user_check_backup in /backup/backuplib.php were not implemented efficiently and would generate big memory footprint:

function user_check_backup($course,$backup_unique_code,$backup_users,$backup_messages) {
//$backup_users=0-->all
// 1-->course (needed + enrolled)
// 2-->none

global $CFG;
global $db;

$count_users = 0;

//If we've selected none, simply return 0
if ($backup_users == 0 or $backup_users == 1) {

//Calculate needed users (calling every xxxx_get_participants function + scales users)
$needed_users = backup_get_needed_users($course, $backup_messages);

//Calculate enrolled users (students + teachers)
$enrolled_users = backup_get_enrolled_users($course);

//Calculate all users (every record in users table)
$all_users = backup_get_all_users();

//Calculate course users (needed + enrolled)
//First, needed
$course_users = $needed_users;

//Now, enrolled
if ($enrolled_users) {
foreach ($enrolled_users as $enrolled_user) { $course_users[$enrolled_user->id]->id = $enrolled_user->id; }
}

//Now, depending of parameters, create $backupable_users
if ($backup_users == 0) { $backupable_users = $all_users; } else { $backupable_users = $course_users; }

We rewrote it a little bit and are still testing the patch (patch is not included here in this ticket right now), but would like to create a bug ticket here so that people who are running large sites would aware of this issue. Thanks!

  1. backuplib.diff
    20/Dec/07 4:35 AM
    1 kB
    Wen Hao Chuang
  2. CSDiff.htm
    20/Dec/07 4:35 AM
    84 kB
    Wen Hao Chuang

Issue Links

Activity

Hide
Wen Hao Chuang added a comment -

OK, the key is to avoid that line of "$all_users = backup_get_all_users();" when you does NOT need to, as this would create a big problem when you have LOTS of users (like 60,000+) and the backup process would break. I have attached a diff file here, we simply rewrite the logic a little bit. Hope this make sense...

Show
Wen Hao Chuang added a comment - OK, the key is to avoid that line of "$all_users = backup_get_all_users();" when you does NOT need to, as this would create a big problem when you have LOTS of users (like 60,000+) and the backup process would break. I have attached a diff file here, we simply rewrite the logic a little bit. Hope this make sense...
Hide
Anthony Borrow added a comment -

Wen - The footprint will still be large when you are backing up all users; however, it sounds like you are wanting to be able to do course backups where it would be unlikely that you have all 60000+ students enrolled. It does seem to me to be more efficient memory-wise for the script to populate the $all_users array only when it is in fact needed. In other words, the script should only get the data that it needs which makes sense to me. I wonder if for certain functions like this one if it would be possible/worthwhile to check the amount of memory available for a script and estimate how much memory it would take. Might we be able to warn an administrator when they are getting into danger or a possible area of conflict? Peace - Anthony

Show
Anthony Borrow added a comment - Wen - The footprint will still be large when you are backing up all users; however, it sounds like you are wanting to be able to do course backups where it would be unlikely that you have all 60000+ students enrolled. It does seem to me to be more efficient memory-wise for the script to populate the $all_users array only when it is in fact needed. In other words, the script should only get the data that it needs which makes sense to me. I wonder if for certain functions like this one if it would be possible/worthwhile to check the amount of memory available for a script and estimate how much memory it would take. Might we be able to warn an administrator when they are getting into danger or a possible area of conflict? Peace - Anthony
Hide
Wen Hao Chuang added a comment -

Anthony yes you are right about the idea of avoiding populate the $all_users array only when it is in fact REALLY needed. I also like your idea to check the amount of memory available for a script and estimate how much memory it would take, and warn an administrator when they are getting into danger or a possible area of conflict. But I think this would probably take another moodle tracker ticket for "improvement", as I don't see such function in moodlelib.php right now... Merry Christmas!

Show
Wen Hao Chuang added a comment - Anthony yes you are right about the idea of avoiding populate the $all_users array only when it is in fact REALLY needed. I also like your idea to check the amount of memory available for a script and estimate how much memory it would take, and warn an administrator when they are getting into danger or a possible area of conflict. But I think this would probably take another moodle tracker ticket for "improvement", as I don't see such function in moodlelib.php right now... Merry Christmas!
Hide
Dan Marsden added a comment -

this was fixed in 1.8.5 and 1.9 by MDL-10721

Dan

Show
Dan Marsden added a comment - this was fixed in 1.8.5 and 1.9 by MDL-10721 Dan

People

Vote (1)
Watch (4)

Dates

  • Created:
    Updated:
    Resolved: