Moodle

Add some exclusion locks to cron so that only one runs at any time

Details

  • Type: Bug
  • Status: Open
  • Priority: Minor
  • Resolution: Unresolved
  • Affects Version/s: 1.8.2, 1.9
  • Fix Version/s: STABLE backlog
  • Component/s: Backup
  • Labels:
    None
  • Affected Branches:
    MOODLE_18_STABLE, MOODLE_19_STABLE

Description

1. There is no record of whether the last cron job was completed in its entirety or not.

Bugs like MDL-11170 cause the cron job to fail silently (unless someone is frequently reading the output of it, which is unlikely). In the case of MDL-11170, the cron job crashed before any backups had the chance to run, which makes this kind of problem dangerous.

Timestamping and recording the running status of the cron job (the same way as with scheduled backups) will allow the detection of an incomplete cron job the next time one is started. This will make it possible to notify the administrators that something is wrong.
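
The status record described above could be sketched roughly as follows. This is a minimal illustration in Python rather than Moodle's PHP, and the file name and the functions `begin_run` and `end_run` are hypothetical, not part of Moodle. It only detects a run that never finished; it does not prevent two runs from overlapping.

```python
import json
import os
import tempfile
import time


def _write(status_path, record):
    # Write to a temp file, then rename, so a crash never leaves a half-written record.
    directory = os.path.dirname(status_path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(record, fh)
    os.replace(tmp, status_path)


def begin_run(status_path):
    """Mark a run as started; return the previous record if that run never finished."""
    previous = None
    if os.path.exists(status_path):
        with open(status_path) as fh:
            previous = json.load(fh)
    _write(status_path, {"state": "running", "started": time.time(), "finished": None})
    if previous is not None and previous["state"] == "running":
        return previous  # the last run did not reach end_run()
    return None


def end_run(status_path):
    """Mark the current run as finished and timestamp it."""
    with open(status_path) as fh:
        record = json.load(fh)
    record["state"] = "finished"
    record["finished"] = time.time()
    _write(status_path, record)
```

A caller would notify the administrators whenever `begin_run` returns a record, since that means the previous run started but never reported completion.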

2. Every individual cron activity has a race condition, and so any cron activity might be running simultaneously with itself.

This is a real problem if cron jobs are run very frequently (every minute in my case) and the server is under heavy load (while running all the nightly cron jobs, in my case). Under these load conditions database requests took extremely long, so long that several cron job invocations inadvertently became synchronized and even managed to start backing up the same individual courses simultaneously.

There are at least three race conditions/critical sections in the code that is supposed to prevent simultaneous backups (the other cron activities have similar ones):

  • While checking for a correctly terminated run of backups, in schedule_backup_cron between lines 26 and 47
  • While restarting an incorrectly terminated run of backups, between schedule_backup_cron line 36 and the first invocation of schedule_backup_launch_backup line 217
  • While backing up each course, in schedule_backup_cron between lines 89 and 137

These sections should be protected from being run by several processes at once with a proper mutual exclusion lock.
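
As a rough illustration of such a mutual exclusion lock, here is a minimal Python sketch (not Moodle code) built on an advisory `flock` lock. The helper name `exclusive_section` and the lock path are assumptions made for the example, and `fcntl` is only available on Unix-like systems.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_section(lock_path):
    """Yield True if this caller holds the lock, False if someone else does."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB: fail immediately instead of waiting for the holder.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A critical section would be wrapped as `with exclusive_section(path) as held:` and skipped when `held` is false. Because the kernel drops an `flock` lock when the holding process exits, a crashed run does not leave the lock behind, at least on a local file system.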

Activity

Martin Dougiamas added a comment -

A mutual exclusion lock sounds like a good idea. It would need an age, so that if a bug occurred and cron died without finishing, a new run would be allowed to start after some given time (like 6 hours or so).
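
A lock with an age limit might look something like the following Python sketch. The function names and the lock file location are hypothetical, and the six-hour limit is simply the figure mentioned above.

```python
import os
import time

MAX_LOCK_AGE = 6 * 60 * 60  # seconds; the "6 hours or so" mentioned above


def acquire_lock(lock_path, max_age=MAX_LOCK_AGE):
    """Return True if the lock was taken, False if a fresh lock already exists."""
    try:
        # O_EXCL makes creation atomic: exactly one process can succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        try:
            age = time.time() - os.path.getmtime(lock_path)
        except FileNotFoundError:
            return False  # the holder released it just now; try again on the next run
        if age < max_age:
            return False
        # Stale lock: assume the previous run died and refresh the timestamp.
        # Two processes noticing staleness at the same moment could both get
        # here, so this takeover step is only best-effort.
        os.utime(lock_path, None)
        return True
    os.close(fd)
    return True


def release_lock(lock_path):
    """Remove the lock file; a missing file is not an error."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

The age limit has to be longer than the longest legitimate run, otherwise a slow but healthy run would have its lock taken over.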

Martin Dougiamas added a comment -

Sam has some code here: http://moodle.org/mod/forum/discuss.php?d=97457#p431440

Clinton Graham added a comment -

I would add a vote for actually verifying the existence of the process that created the lock rather than just aging the lockfile. This is obviously trivial in Linux, but I'm not familiar enough with PHP on Windows to suggest a cross-platform solution.
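
On POSIX systems the check could be sketched as below, assuming the lock file stores the PID of the process that created it (an assumption for this example; the function names are hypothetical too). It does not answer the cross-platform question, because `os.kill` behaves differently on Windows.

```python
import os


def process_exists(pid):
    """Best-effort check on POSIX systems; signal 0 tests for existence only."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_is_stale(lock_path):
    """A lock does not block if it is missing, unreadable, or names a dead process."""
    try:
        with open(lock_path) as fh:
            pid = int(fh.read().strip())
    except (FileNotFoundError, ValueError):
        return True
    return not process_exists(pid)
```

PIDs are eventually reused, so a live PID does not prove that the original cron run is still the one holding it; combining this check with an age limit covers that case.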

Rosario Carcò added a comment - edited

I just ran into the same problems with our SUSE SLES 11 SP1 server (8 cores, 8 GB RAM): cron.php jobs started to overlap, hang and slow down the whole machine. We currently have more than 4'700 courses and 450 GB in the Moodle data dir. Assuming that half of the data consists of the backup zip files themselves, we have to back up/zip 225 GB. So, backing up and retaining only one backup zip file per course should take exactly the time needed to zip 225 GB. I did not test or calculate this, but the last backup that finished without my killing it took 27 hours!!
And people are complaining because some students receive the posts of a given forum and others do not. How should I track this down? I just emptied the mdl_backup_log table, which had grown to over 12 million records in the 15 months since I set up this new server. Why are these records not purged on a regular basis? Or is this also a symptom of the hanging and overlapping cron.php jobs?

Rosario Carcò added a comment -

I wonder whether it can be solved inside cron.php. I rather think that all the different cron jobs (e.g. the backup job, the sending of forum posts and digests, the messaging system, the database management and purging, etc.) should be scheduled one by one, so that an unexpected hang of one cron job does not interfere with the others.
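
One way to picture this separation is a small runner that starts each job as its own process with a time limit. This is only a sketch under assumed names (`run_jobs`, the job names, the commands); separate crontab entries per job would be another way to get the same isolation.

```python
import subprocess
import sys


def run_jobs(jobs, timeout_seconds):
    """Run each job as its own process; a hung job is killed and the rest still run."""
    results = {}
    for name, command in jobs.items():
        try:
            completed = subprocess.run(command, timeout=timeout_seconds)
            results[name] = "ok" if completed.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising TimeoutExpired.
            results[name] = "timed out"
    return results


if __name__ == "__main__":
    # Stand-in commands; real jobs would be separate scripts.
    demo = {
        "backup": [sys.executable, "-c", "import time; time.sleep(30)"],
        "forum": [sys.executable, "-c", "pass"],
    }
    print(run_jobs(demo, timeout_seconds=3))
```

With this arrangement a backup that hangs costs at most its own time limit, and the forum mail job still runs afterwards.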

Rosario Carcò added a comment -

My last cron.php has now been running for more than 48 hours. In the log I produce I can see the last course cron wants to back up. The backup zip file of this course is still the previous one, which means that cron stalled somewhere between creating the zip file and removing the previous one, or in other words, during the phase where it gathers the data. I checked the course with Firefox and cannot see anything special. I can even back up the course successfully using the Admin block GUI.

So first question: does PHP not allow cron.php to run for such a long time? If so, how should I raise the limit in order to back up more than 4'700 courses with 225 GB of data?

moodle:/var/log # ps ax |grep php5
24458 ? R 996:06 php5 -f /.../moodle/htdocs/admin/cron.php

Second question: I consider it a MAJOR bug if cron.php hangs as it has for the past 9 months on my production server, whereas it worked fine from 2004 on, from Moodle 1.5x to 1.9.12. So could you raise it to major??

Rosario Carcò added a comment - edited

I just saw that I received the cron backup e-mail last night at 4 o'clock, i.e. after 38 hours, even though cron.php is still hanging now:

From: Admin Moodle Rosario Carcò moodlehelpATourDomain
Sent: Thursday, 27 October 2011 04:57
To: _mb_moodlehelp
Subject: FHNW: [ERROR] Scheduled backup status
Importance: High

Summary/Overview
==================================================
Courses: 4772
OK: 4012
Skipped: 727
Error: 0
Unfinished: 33

Some of your courses were not backed up!!

Please check the backup log:
https://moodle.fhnw.ch/admin/report/backups/index.php

Yesterday, after 24 hours of running, I searched for the previous backup zip files that had not yet been deleted and replaced, and I found roughly 1'000 files. I just repeated the search with find type f -name backup-20111024.zip and now get 11 files/courses. This would mean that in the past 24 hours cron.php continued to execute backups and processed nearly all courses, except 11.

Rosario Carcò added a comment -

Dear all, I tried to solve the situation with a complete reboot of the SUSE SLES 11 SP1 server. And indeed, since then I get better results in terms of speed: the backup goes through in 5 hours and cron jobs do not overlap each other. I still have to investigate the 2 courses with errors and the 32 courses that are unfinished/hanging:

From: Admin Moodle Rosario Carcò moodlehelpAtOurDomain
Sent: Tuesday, 8 November 2011 05:09
To: _mb_moodlehelp
Subject: FHNW: [ERROR] Scheduled backup status
Importance: High

Summary/Overview
==================================================
Courses: 4785
OK: 4005
Skipped: 746
Error: 2
Unfinished: 32

Some of your courses were not backed up!!

Please check the backup log:
https://moodle.fhnw.ch/admin/report/backups/index.php

Rosario Carcò added a comment -

Unfortunately the situation is already getting worse. In the previous comment you can see that the backup took 5 hours. Since last Friday morning it has taken about 7 hours, and today the cron jobs, mainly the backup, started rerunning at 18:15 because the run that started at 00:10 this morning had not finished.

So there MUST BE some sort of Bug or MEMORY LEAK in the code.

From: Admin Moodle Rosario Carcò moodlehelpAtOurDomain
Sent: Saturday, 12 November 2011 06:59
To: _mb_moodlehelp
Subject: FHNW: [ERROR] Scheduled backup status
Importance: High

Summary/Overview
==================================================
Courses: 4791
OK: 4002
Skipped: 752
Error: 2
Unfinished: 35

Rosario Carcò added a comment -

And this is what I can read in the cron.log:

"cron.moodle.20111114-181501.log" 34504L, 2999878C 79,1 0%
Server Time: Mon, 14 Nov 2011 18:15:01 +0100

Starting activity modules
Processing module function assignment_cron ...... used 2 dbqueries
... used 0.03216814994812 seconds
done.
Processing module function forum_cron ...Processing user 17402
:
:
Starting processing the event queue...
done.
Running backups if required...
Checking backup status...RUNNING
No activity in last 30 minutes. Unlocking status
Getting admin info
Deleting old data
Checking courses
Skipping deleted courses
0 courses
Moodle FHNW
Next execution: Wednesday, 16. November 2011, 00:10
Betrieb&Support
SKIPPING - hidden+unmodified
Next execution: Wednesday, 16. November 2011, 00:10
Suche & kaufe
Next execution: Wednesday, 16. November 2011, 00:10
>>>

So this is repeating what the backup job should already have done this morning at 00:10. The next scheduled run is Wednesday morning (my schedules are Monday, Wednesday and Friday mornings).

Daniel Neis added a comment -

Hello,

IMHO you should give up using Moodle's automated backup. It is almost impossible to zip more than a hundred gigabytes in an acceptable time. You may try other kinds of backup, like snapshots and/or rsync to a dedicated backup server.

HTH,
Daniel

Rosario Carcò added a comment -

Thanks Daniel. This is really a point to think about. As I said, I have more than 450 GB in more than 4'790 courses. And I DO run a mysqlhotcopy at noon and a MySQL dump in the evening, just before the nightly directory backup to tape starts. The directory backup is good for total crashes, where you have to roll back everything to the last known good state of the system. But for the simple, daily problems our teachers encounter, like "I lost my file x" or "I just messed up my course with Microsoft Word HTML code I copied and pasted", I was very happy with the backup zip files, because you can even restore a whole course and then import little chunks into the original course, or copy a single file from one directory to another, and so on. But you are right, Moodle was intended for small schools and it was a bit of a mistake to use it at as large a scale as a university.
Nonetheless, if there is a bug or a memory leak in the code, it should be fixed, either in Moodle 1.9.x or 2.x. What else is programming for than creating bugs and correcting them?

Daniel Neis added a comment -

Hello, Rosario

At my university we do an rsync to a remote server and snapshot the moodledata and the MySQL db. When a teacher wants to restore a course or a file, we copy the snapshots to a development machine with Moodle installed, do a Moodle backup and give the teacher the zip, or just the file needed. It has been working for a year or so, and the slowest part is restoring the entire database from mysqldump.

HTH,
Daniel

Rosario Carcò added a comment -

Good idea. That's basically what I did when migrating the whole Moodle server from one hardware platform to another, several times from 2004 until now. And it worked fine. I also recovered some missed courses and files from the old hardware, which I left running for emergency situations. But running two clones of the same server hardware is really another approach to think about. Two years ago I migrated onto VMware vSphere virtual machines, but the performance of MySQL was a pain and so I migrated back onto a real physical server. I could use the VM, which is still in place, as an rsync clone. But I am also preparing the migration to Moodle 2.x, and so I would need another pair of clone hardware servers... And in Moodle 2.x the whole file system changed for the worse, so that as Admin you have to consult the tables with the metadata and hash codes for files and directories before you can recover them from tape. You will find my complaints in the Moodle forums.

Andrew Nicols added a comment -

FYI, Mahara already has a system with locks in place which Moodle could borrow from. Although the way cron is defined in Mahara is slightly different, the cron locking system should be compatible. If one job is running and takes longer than anticipated, the other cron components will be carried out as normal on the next call of cron, and the running job will be skipped.

This is managed through use of an entry into the lock table - see http://gitorious.org/mahara/mahara/blobs/master/htdocs/lib/cron.php#line469 for details on the lock implementation

Rosario Carcò added a comment -

Andrew, thanks for this too. So we are facing TWO problems. One is the locks to avoid overlapping cron jobs, like the backup starting over when the previous one hangs or has died for an unknown reason. Two is the backup code itself, which seems to cause the hang or memory leak or whatever it is. Could it be a time-out when a course contains a lot of uploaded files, let's say 1 GB or more, that take a rather long time to zip?? But yesterday I checked the last course mentioned in the backup log and it was a very ordinary course, nothing special. The last backup file was from September, which means the course had been skipped. So I deleted it and ran a manual backup, which went through smoothly without any errors. And all the courses I have checked in recent years could be backed up manually too. So my current guess is that the problem is not the courses themselves but something in the code that hangs or dies. This last ordinary course has only 42 MB of data zipped in the backup file. I will check tomorrow whether the automated backup renews this backup zip file.

Rosario Carcò added a comment -

As you can see, the backup of Monday/yesterday morning lasted until now, about 29 hours:

From: Admin Moodle Rosario Carcò moodlehelpAtOurDomain
Sent: Tuesday, 15 November 2011 09:38
To: _mb_moodlehelp
Subject: FHNW: [ERROR] Scheduled backup status
Importance: High

Summary/Overview
==================================================
Courses: 4790
OK: 3996
Skipped: 754
Error: 0
Unfinished: 40

Rosario Carcò added a comment - edited

After a couple of weeks the backups again take more than 24 hours. Today I noticed that the swap file has grown to over 10 GB, despite the server having 8 GB RAM, so the backup cron job really has or causes a memory leak which slows everything down. I have nothing else running on my SLES 11 server except what is needed by Moodle, and note that it is really IDLE, but very slow to respond. E.g. I am testing my uploadusersandcourses.php script, which took 5 seconds to finish what normally takes not even a second:

>>>
top - 15:52:00 up 27 days, 12:14, 2 users, load average: 1.23, 1.20, 1.03
Tasks: 214 total, 2 running, 212 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.3%us, 0.7%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 14.2%us, 18.8%sy, 0.0%ni, 13.5%id, 53.1%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 7.6%us, 3.2%sy, 0.0%ni, 86.9%id, 2.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 2.0%us, 0.7%sy, 0.0%ni, 97.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8100052k total, 4396156k used, 3703896k free, 43940k buffers
Swap: 10482404k total, 41424k used, 10440980k free, 3351568k cached
>>>

Rosario Carcò added a comment - edited

Oh, I beg your pardon. I only noticed today, when analyzing again, that I had mistaken the TOTAL of 10 GB, which is what I allocated, for the actual size. The actual usage seems to be fine, with 41 megabytes USED!
And I noticed that neither restarting mysql or apache nor running swapoff -a -v followed by swapon -a -v solves the problem. Normally I have to reboot the server. But I am going to check the tables and repair them if necessary. Some years ago I had similar performance problems, which I resolved by repairing the tables and indexes with phpMyAdmin and myisamchk -a -S /pathtomysql/data/moodledir/*.MYI (the parameters are in the Moodle forums or docs), and by disabling stats and retaining logs for only 3 months or less.

