Moodle

Chat daemon stops working after a few hours

Details

  • Type: Sub-task
  • Status: Open
  • Priority: Minor
  • Resolution: Unresolved
  • Affects Version/s: 1.7.1, 2.1.1
  • Fix Version/s: DEV backlog
  • Component/s: Chat
  • Labels:
    None
  • Environment:
    Moodle 1.7.1+ (2006101010) on a SuSE 9 server, running Apache with MySQL 5.0.27, with PHP 5.2.0.

Description

The chat daemon works fine for a few hours, then dies for no apparent reason. Restarting the script makes it work again, but I can't monitor it 24/7.

Our server guy said that the chat daemon that came with the 1.7.1 build was version 1.0, and he has upgraded it to v1.3, if that makes any difference.

Activity

stan hoeppner added a comment -

Server: SuSE Linux Enterprise Server 10
Apache/2.2.3
mysql Ver 14.12 Distrib 5.0.18, for suse-linux (i686) using readline 5.1
PHP 5.1.2 (cli) (built: Nov 7 2006 14:30:25) Copyright (c) 1997-2006 The PHP Group Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies
Moodle chat daemon v1.0 on PHP 5.1.2 ($Id: chatd.php,v 1.31 2007/01/03 14:44:45 moodler Exp $)
Moodle 1.7.x

client: SuSE Linux Enterprise Desktop
Firefox 1.5.0.9

Server setup:

crontab: @reboot /srv/www/htdocs/moodle/mod/chat/mdl-chat-daemon.sh

mdl-chat-daemon.sh:

    #! /bin/bash
    cd /srv/www/htdocs/moodle/mod/chat/
    php5 chatd.php --start >/var/log/moodlechat.log &

I am experiencing two loss-of-function scenarios:

1. The process "php5 chatd.php" exits without logging an error to stdout. As you can see from my (admittedly very basic) start script, I'm redirecting stdout to a file instead of using chatd.php's own logging option.

2. The process "php5 chatd.php --start", as listed by ps -ef, is still running, and I can establish a telnet connection to port 9111, but the chat window in the remote Firefox browser has stopped functioning. More often than not when this occurs, Firefox hard-crashes and exits. I have never seen Firefox crash before. If no one else is experiencing the Firefox crashes, or the chat window failing while the server daemon is still accepting incoming connections, then this may be due to the XGL 3D desktop. Look at the window.open() call for the chat pop-up window. I'm not certain, but the Firefox crash usually seems to occur when I rotate the desktop, and it only occurs when I have Moodle chat open. I have not tested this scenario with other browsers, or with Firefox on my system with XGL disabled, so I don't know whether this issue is client-side or server-side. I will do further client-side testing and report back.

I may file a separate bug report on #2, depending on the results of my testing.
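
The start script above redirects only stdout, so anything the daemon writes to stderr is lost, and nothing restarts the daemon once it exits. A minimal sketch of a supervising wrapper follows. The function name `supervise` and its arguments are hypothetical, and it assumes that `chatd.php --start` stays in the foreground (the report lists it in ps -ef while running). It does not address scenario 2, where the process is alive but unresponsive.

```shell
#!/bin/bash
# Hypothetical supervisor sketch: rerun a command whenever it exits and
# record stdout, stderr and the exit status of every run in one log file.

# supervise LOGFILE MAX_RUNS DELAY_SECONDS COMMAND [ARGS...]
# MAX_RUNS=0 means "restart forever".
supervise() {
    local logfile="$1" max_runs="$2" delay="$3"
    shift 3
    local runs=0 status
    while [ "$max_runs" -eq 0 ] || [ "$runs" -lt "$max_runs" ]; do
        runs=$((runs + 1))
        echo "$(date '+%Y-%m-%d %H:%M:%S') starting run $runs: $*" >>"$logfile"
        # 2>&1 sends stderr to the same log as stdout.
        "$@" >>"$logfile" 2>&1
        status=$?
        echo "$(date '+%Y-%m-%d %H:%M:%S') run $runs exited with status $status" >>"$logfile"
        # Pause before restarting so a crash loop does not spin the CPU.
        sleep "$delay"
    done
}

# Example call, using the paths from the report:
#   cd /srv/www/htdocs/moodle/mod/chat/ &&
#     supervise /var/log/moodlechat.log 0 5 php5 chatd.php --start &
```

The logged exit status should at least show whether the daemon ended on its own or was killed by a signal (a status above 128).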

stan hoeppner added a comment -

Please contact me if there is any more information I could provide, or if you would like me to do further/additional testing. I want this issue resolved.

stan hoeppner added a comment -

OK, I've confirmed that the chatd.php daemon stops responding to requests while it is still in the process list. I tested it from home last night with Firefox 2.0.0.2 on Windows. My testing up to this point had only been with Firefox 1.5.0.9 on Linux. The next time I find chat not responding, I'll test with Internet Explorer as well.

I'm pretty sure the Firefox crashing issue is strictly related to the XGL Linux interface. It only crashes when I have rotated to another cube face, and then back to the cube face that the Firefox moodle chat window is on.

Anton Pienaar added a comment -

(1.7.2) Please help with this. Our chat daemon currently stops after 20 minutes, and we then have to switch over manually to the normal method. A chat with 18 users also seems to collapse. We are under pressure from management to get to a problem-free chat setup.

Will upgrading to 1.8.x help? pienaara.rd@mail.uovs.ac.za

Anton Pienaar added a comment -

We seem to have isolated the chat problem, Martin. See below:

Every 2.0s: netstat -tap                                    Tue Jun 12 12:45:59 2007

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address              Foreign Address         State       PID/Program name
tcp        0      0 *:32769                    *:*                     LISTEN      2346/rpc.statd
tcp        0      0 *:mysql                    *:*                     LISTEN      2632/mysqld
tcp        0      0 *:sunrpc                   *:*                     LISTEN      2326/portmap
tcp        0      0 moodledev.uovs.ac.za:9111  *:*                     LISTEN      347/php
tcp        0      0 localhost.localdomain:ipp  *:*                     LISTEN      20485/cupsd
tcp        0      0 localhost.localdomain:smtp *:*                     LISTEN      2665/sendmail: acce
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1979 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1969 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1968 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1970 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1973 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1941 ESTABLISHED 347/php
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1972 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1975 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1974 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1963 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1965 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1967 TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111  rspc111.uovs.ac.za:1966 TIME_WAIT   -
tcp        0      0 *:http                     *:*                     LISTEN      32192/httpd
tcp        0      0 *:ssh                      *:*                     LISTEN      2522/sshd
tcp        0      0 moodledev.uovs.ac.za:http  rspc111.uovs.ac.za:1980 ESTABLISHED 32197/httpd
tcp        0      0 moodledev.uovs.ac.za:ssh   rspc111.uovs.ac.za:1748 ESTABLISHED 32345/2
tcp        0      0 moodledev.uovs.ac.za:ssh   rspc36.uovs.ac.za:1444  ESTABLISHED 32707/1
#######################################################################

For each post to the chat, it opens a port and never closes that port again. This will obviously cause high utilisation and also cause the chat to slow down. This is a programmatic problem.
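
To put numbers on a listing like the one above, the connections on the daemon port can be tallied per TCP state. This is a minimal sketch; the function name `count_states` is hypothetical, and it assumes the usual netstat column order (local address in column 4, state in column 6). The tally helps distinguish sockets that are still open (ESTABLISHED) from ones the kernel is only holding in TIME_WAIT after a close.

```shell
#!/bin/bash
# count_states PORT
# Reads `netstat -ta`-style lines on stdin and prints "<state> <count>"
# for every TCP connection whose local address ends in :PORT.
count_states() {
    awk -v port="$1" '
        # $4 is the local address, $6 the connection state.
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Example call against the live system:
#   netstat -tan | count_states 9111
```

Run against the listing in this comment, the tally for port 9111 would be 1 ESTABLISHED, 1 LISTEN and 12 TIME_WAIT.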

We have also tested the chat_revamped version, but found it extremely buggy.

Martin Dougiamas added a comment -

Dongsheng did you have any progress on this?

Dongsheng Cai added a comment -

Not yet, I will research this in the next few days.

israel added a comment -

My environment:
Moodle 1.7.1+, Apache, MySQL, PHP 5.1.6.

My chat daemon stops working after a few hours, and we found the error: we were losing the database connection.

These are the lines we added:

while(true) {
    $active = array();

    //----------------------------------
    //@MODIFICATION: make the daemon survive MySQL outages
    //-------- ADDED:
    $mypingnow->value = time();
    if (!update_record('config',$mypingnow)) {
    //if (!get_record ('config','name','chatd_ping')) {
        $postlog[] = 'Connection error';
        $intents = 0;
        $maxintents = 600;
        $dbconnected = false;
        $db->Disconnect();

        // implement the timeout
        while (!$dbconnected && $intents<$maxintents) {
            // look up the wait time for this attempt
            $waiting = ($intents<count($timeouts))?$timeouts[$intents]:$timeouts[count($timeouts)-1];
            echo 'waiting '.$waiting.' seconds...';
            $postlog[] = "Retry #$intents, waiting $waiting seconds";
            sleep($waiting);
            echo "trying again!\n";

            // See MDL-6760 for why this is necessary. In Moodle 1.8, once we start using NULLs properly,
            // we probably want to change this value to ''.
            $db->null2null = 'A long random string that will never, ever match something we want to insert into the database, I hope. \'';

            error_reporting(0); // Hide errors

            if (!isset($CFG->dbpersist) or !empty($CFG->dbpersist)) { // Use persistent connection (default)
                $dbconnected = $db->PConnect($CFG->dbhost,$CFG->dbuser,$CFG->dbpass,$CFG->dbname,true);
            } else { // Use single connection
                $dbconnected = $db->Connect($CFG->dbhost,$CFG->dbuser,$CFG->dbpass,$CFG->dbname,true);
            }
            $intents++;
        }
        if (! $dbconnected) {
            //@socket_shutdown($DAEMON->listen_socket, 0);
            $postlog[] = "$intents attempts, unable to connect";
            foreach ($postlog as $plog) { echo "$plog\n"; }
            die("Cannot connect to the DB");
        } else {
            foreach ($postlog as $plog) { add_to_log(1, 'chat', 'chatd', "index.php?id=1", $plog); }
            add_to_log(1, 'chat', 'chatd', "index.php?id=1", "Connection re-established after $intents attempts");
        }
        $postlog = array();

        /// Forcing ASSOC mode for ADOdb (some DBs default to FETCH_BOTH)
        $db->SetFetchMode(ADODB_FETCH_ASSOC);
    }
    //-------- END (last)

    // First of all, let's see if any of our UFOs has identified itself
    if($DAEMON->conn_activity_ufo($active)) {
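
The retry logic in the patch above walks through a list of wait times and reuses the last one once the list runs out. The same schedule can be sketched in shell, for example to wait for the database before starting the daemon. This is only an illustration: the function names and the delay values are hypothetical, since the patch does not show how `$timeouts` is defined.

```shell
#!/bin/bash
# next_delay ATTEMPT DELAY...
# Prints the wait for a zero-based attempt number, reusing the last
# delay once the list is exhausted (same idea as the $timeouts lookup).
next_delay() {
    local attempt="$1" i=0 d last=""
    shift
    for d in "$@"; do
        last="$d"
        if [ "$i" -eq "$attempt" ]; then
            echo "$d"
            return 0
        fi
        i=$((i + 1))
    done
    echo "$last"
}

# retry_with_schedule MAX_ATTEMPTS "DELAYS" COMMAND [ARGS...]
# Sleeps, then runs COMMAND; stops at the first success.
# Returns 0 on success, 1 if every attempt failed.
retry_with_schedule() {
    local max_attempts="$1" delays="$2" attempt=0 wait_for
    shift 2
    while [ "$attempt" -lt "$max_attempts" ]; do
        # $delays is left unquoted on purpose so it splits into words.
        wait_for=$(next_delay "$attempt" $delays)
        sleep "$wait_for"
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example call (delay values are made up):
#   retry_with_schedule 600 "1 2 5 10 30" mysqladmin ping
```

As in the patch, the wait happens before each attempt, so the first retry is also delayed.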

Dongsheng Cai added a comment -

That script needs a rewrite; low priority.

