Moodle Community Sites

Database Error Messages seem to be getting more common

Details

  • Type: Task
  • Status: Closed
  • Priority: Major
  • Resolution: Cannot Reproduce
  • Component/s: moodle.org
  • Labels:
    None

Description

Errors seem to be getting more commonplace on moodle.org.
Moodle.org is non-responsive at this time. There is no way to know what the source of the problem is, so one is left to report all instances, as one does not know whether this is the start of some major disaster or just an anomaly.

Error: Database connection failed.
It is possible that the database is overloaded or otherwise not running properly.
The site administrator should also check that the database details have been correctly specified in config.php

Activity

Marc Grober added a comment -

Wanted to note that it has been 30 minutes and moodle.org still is not responding.
Tracker and Docs appear to be fine.

Marc Grober added a comment -

One hour now and site just came back up this minute......

Marc Grober added a comment -

Someone suggested that this was a scheduled db upgrade....
I don't know if I am mistaken, but I thought that there was some discussion during the great fire that it would be handy to implement notice and failover, so that should a major moodle site go down or become inaccessible, the user would not get a moodle error or a not found error but would get notice of what was happening. Even if the db was taken down, one would think that would not affect the ability to provide notice.

Helen Foster added a comment -

Marc, thanks for reporting this issue and for your comments. As I understand things, to resolve the problem, moodle.org really needs more hardware.

Reassigning to Martin for further comments.

Martin Dougiamas added a comment -

Hmm, it looks like MySQL was actually running but non-responsive during that time. I didn't get any notifications about MySQL from any of the server monitoring we have set up, but I think Eloy was working on this (and fixing the monitors this morning too).

It's not actually an overload situation this time, though we do occasionally get those too (a LOT of robots and casual browsers hit moodle.org). Additional servers are planned soon to cope with this. Can't wait for that dedicated sys admin!

Marc Grober added a comment -

One of the simplest monitors is a trivial web client that simply hits the front page every X minutes and then does the alarm thing if the page does not come up. At some point I had a class build something simple based on Wong's "Web Client Programming" (and I think I got them to actually marry the webping tcl/tk gui to it).

Benefits are that it tests the site remotely as a user would see the site, so it will trap situations where the server thinks all is fine (see my tale of woe below), and you can always use it to chart lag/latency. I used a more sophisticated version to SMS sysadmins based on a duty roster. Additionally, something like that could be used to trigger error pages advising folks to stop hitting the site until service is restored, etc., if the site does not respond for X minutes or what have you.
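
For illustration only, the kind of front-page monitor described above might look like the following minimal Python sketch. It is not the tool mentioned in this comment and not anything moodle.org runs. The names `check_site` and `monitor`, the five-minute interval, and the two-failure threshold are all assumptions made for the example. The check also treats a page containing the "Database connection failed" text from this report as a failure, on the assumption that such an error page may still be delivered as an ordinary successful response.

```python
import time
import urllib.error
import urllib.request


def check_site(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch the page once, as a remote user would.

    Returns (ok, latency_seconds, detail). `opener` is injectable so the
    check can be exercised without network access.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
            body = response.read()
    except (urllib.error.URLError, OSError) as exc:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False, time.monotonic() - start, str(exc)
    latency = time.monotonic() - start
    if status != 200:
        return False, latency, f"HTTP {status}"
    # Assumption: the error text quoted in this report appears in the page body.
    if b"Database connection failed" in body:
        return False, latency, "database error page"
    return True, latency, "ok"


def monitor(url, interval=300, failures_before_alarm=2, alarm=print,
            checks=None, sleep=time.sleep, check=check_site):
    """Poll `url` every `interval` seconds and raise the alarm once per outage.

    `alarm` is a placeholder callback; a real setup might send mail or an SMS.
    `checks` limits the number of polls (None means run until interrupted).
    """
    consecutive_failures = 0
    done = 0
    while checks is None or done < checks:
        ok, latency, detail = check(url)
        consecutive_failures = 0 if ok else consecutive_failures + 1
        # Fire only when the threshold is first reached, not on every poll.
        if consecutive_failures == failures_before_alarm:
            alarm(f"{url} appears down: {detail}")
        done += 1
        if checks is None or done < checks:
            sleep(interval)
```

Calling `monitor("https://moodle.org/")` from a machine outside the hosting network would poll every five minutes and print a message after two consecutive failed checks. The latency value returned by `check_site` could be logged to chart lag over time.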

Now for the tale of woe: I had a "tech" at DH tell me "all was fine" with mysql because their "monitors" showed nothing while one of the users on the shared host had a runaway sql process that ate the hard drive... not a true race condition but produced intriguing results, all is well/nothing works - LOL...

