Key: MDL-8465
Type: Sub-task
Status: In Progress
Priority: Major
Assignee: Dongsheng Cai
Reporter: Roger Emery
Votes: 3
Watchers: 5
Project: Moodle
Parent issue: MDL-8224

Chat daemon stops working after a few hours

Created: 08/Feb/07 09:40 PM   Updated: 21/Oct/09 04:17 PM
Component/s: Chat
Affects Version/s: 1.7.1
Fix Version/s: 1.9.7

Environment: Moodle 1.7.1+ (2006101010) on a SuSE 9 server, running Apache with MySQL 5.0.27, with PHP 5.2.0.

Database: MySQL
URL: http://mycourse.solent.ac.uk
Participants: Anton Pienaar, Dongsheng Cai, israel, Martin Dougiamas, Roger Emery and stan hoeppner
Security Level: None
Affected Branches: MOODLE_17_STABLE
Fixed Branches: MOODLE_19_STABLE


Description
The chat daemon works well for a few hours, then dies for no apparent reason. Restarting the script makes it work again, but I can't monitor it 24/7.

Our server administrator said that the chat daemon that came with the 1.7.1 build was version 1.0, and he has upgraded it to v1.3, if that makes any difference.

Comments
stan hoeppner added a comment - 28/Feb/07 01:39 AM
Server: SuSE Linux Enterprise Server 10
Apache/2.2.3
mysql Ver 14.12 Distrib 5.0.18, for suse-linux (i686) using readline 5.1
PHP 5.1.2 (cli) (built: Nov 7 2006 14:30:25) Copyright (c) 1997-2006 The PHP Group Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies
Moodle chat daemon v1.0 on PHP 5.1.2 ($Id: chatd.php,v 1.31 2007/01/03 14:44:45 moodler Exp $)
Moodle 1.7.x

client: SuSE Linux Enterprise Desktop
Firefox 1.5.0.9

Server setup:

crontab: @reboot /srv/www/htdocs/moodle/mod/chat/mdl-chat-daemon.sh

  1. mdl-chat-daemon.sh
    #! /bin/bash
    cd /srv/www/htdocs/moodle/mod/chat/
    php5 chatd.php --start >/var/log/moodlechat.log &

I am experiencing two loss of function scenarios:

1. The process "php5 chatd.php" exits without logging an error to stdout. As you can see from my (admittedly very basic) start script, I'm redirecting stdout to a file instead of using the logging option of the chatd.php start script.

2. The process "php5 chatd.php --start", as listed by ps -ef, is still running, and I can establish a telnet connection to port 9111, but the chat window in the remote Firefox browser has stopped functioning. More often than not when this occurs, Firefox hard crashes and exits. I've never seen Firefox crash before, ever. If no one else is experiencing the Firefox crashes, or the chat window failing while the server daemon is still accepting incoming connections, then this may be due to the XGL 3D desktop. Look at the window.open() call for the chat pop-up window. I'm not certain, but the Firefox crash usually seems to occur when I rotate the desktop, and it only occurs when I have Moodle chat open. I have not tested this scenario with other browsers, or with Firefox on my system with XGL disabled, so I don't know whether this issue is on the client or the server. I will do further client-side testing and report back.

I may file a separate bug report on #2 depending on the results of my testing.
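
Scenario 1 is hard to diagnose partly because the start script above captures only stdout, so anything the PHP CLI writes to stderr is lost. The following is a minimal sketch of a wrapper that logs both streams, records each exit status, and restarts the command. It assumes bash and assumes that "php5 chatd.php --start" stays in the foreground (as the trailing & in the start script suggests); the function name supervise and its arguments are hypothetical and not part of Moodle.

```shell
#!/bin/bash
# Hypothetical supervisor sketch (not part of Moodle).
# Usage: supervise LOGFILE MAX_RESTARTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND in the foreground, appends stdout AND stderr to LOGFILE,
# logs every exit status, and restarts COMMAND up to MAX_RESTARTS times.
supervise() {
    local logfile=$1 max_restarts=$2 delay=$3
    shift 3
    local restarts=0 status
    while :; do
        "$@" >>"$logfile" 2>&1
        status=$?
        echo "$(date '+%Y-%m-%d %H:%M:%S') '$1' exited with status $status" >>"$logfile"
        # Stop once the restart budget is used up.
        [ "$restarts" -ge "$max_restarts" ] && break
        restarts=$((restarts + 1))
        sleep "$delay"
    done
    return "$status"
}

# Example call, using the paths from the start script above (assumed):
# cd /srv/www/htdocs/moodle/mod/chat/ &&
#     supervise /var/log/moodlechat.log 100 5 php5 chatd.php --start &
```

The logged exit status at least distinguishes a crash from a clean exit. This wrapper does not help with scenario 2, where the process is still alive but no longer responds.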


stan hoeppner added a comment - 28/Feb/07 01:50 AM
Please contact me if there is any more information I could provide, or if you would like me to do further/additional testing. I want this issue resolved.

stan hoeppner added a comment - 01/Mar/07 11:33 PM
OK, I've confirmed that the chatd.php daemon stops responding to requests while it is still in the process list. I tested it from home last night with Firefox 2.0.0.2 on Windows. My testing up to this point was only with Firefox 1.5.0.9 on Linux. The next time I find chat not responding, I'll test with Internet Explorer as well.

I'm pretty sure the Firefox crashing issue is strictly related to the XGL Linux interface. It only crashes when I have rotated to another cube face, and then back to the cube face that the Firefox moodle chat window is on.


Anton Pienaar added a comment - 12/Jun/07 03:17 AM
(1.7.2) Please, please help with this. Our chat daemon currently stops after 20 minutes, and we then have to switch over manually to the normal method. A chat with 18 users seems to collapse as well. We are under pressure from management to get to a problem-free chat setup.

Will upgrading to 1.8.x help? pienaara.rd@mail.uovs.ac.za


Anton Pienaar added a comment - 12/Jun/07 07:39 PM
We seem to have isolated the chat problem, Martin. See below:

Every 2.0s: netstat -tap Tue Jun 12 12:45:59 2007

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address          State       PID/Program name
tcp        0      0 *:32769                     *:*                      LISTEN      2346/rpc.statd
tcp        0      0 *:mysql                     *:*                      LISTEN      2632/mysqld
tcp        0      0 *:sunrpc                    *:*                      LISTEN      2326/portmap
tcp        0      0 moodledev.uovs.ac.za:9111   *:*                      LISTEN      347/php
tcp        0      0 localhost.localdomain:ipp   *:*                      LISTEN      20485/cupsd
tcp        0      0 localhost.localdomain:smtp  *:*                      LISTEN      2665/sendmail: acce
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1979  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1969  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1968  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1970  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1973  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1941  ESTABLISHED 347/php
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1972  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1975  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1974  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1963  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1965  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1967  TIME_WAIT   -
tcp        0      0 moodledev.uovs.ac.za:9111   rspc111.uovs.ac.za:1966  TIME_WAIT   -
tcp        0      0 *:http                      *:*                      LISTEN      32192/httpd
tcp        0      0 *:ssh                       *:*                      LISTEN      2522/sshd
tcp        0      0 moodledev.uovs.ac.za:http   rspc111.uovs.ac.za:1980  ESTABLISHED 32197/httpd
tcp        0      0 moodledev.uovs.ac.za:ssh    rspc111.uovs.ac.za:1748  ESTABLISHED 32345/2
tcp        0      0 moodledev.uovs.ac.za:ssh    rspc36.uovs.ac.za:1444   ESTABLISHED 32707/1
#######################################################################

For each post to the chat, the daemon opens a port and never closes that port again. This will obviously cause high utilisation and also cause the chat to slow down. This is a programming problem.
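
One way to check whether sockets really pile up per post is to tally the TCP states on the daemon port over time rather than reading the raw listing. Note that TIME_WAIT entries are connections that have already been closed and are only waiting out the kernel's timeout, so a steadily growing ESTABLISHED or CLOSE_WAIT count would be a stronger sign of a leak. Below is a minimal sketch, assuming netstat-style columns (local address in column 4, state in column 6); the function name count_states is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read netstat-style lines on stdin and print how many
# connections with local port PORT are in each TCP state.
# Assumes column 4 is the local address and column 6 is the state.
count_states() {
    local port=$1
    awk -v port="$port" '
        # Keep TCP lines whose local address ends in ":PORT".
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Example call against the live system (port 9111 as in the output above):
# netstat -tan | count_states 9111
```

Running the example call before and after posting a few chat messages shows which state, if any, keeps growing.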

We have also tested the chat_revamped version, but found it extremely buggy.


Martin Dougiamas added a comment - 06/Oct/08 02:11 PM
Dongsheng, have you made any progress on this?

Dongsheng Cai added a comment - 06/Oct/08 03:13 PM
Not yet, I will research this in the next few days.

israel added a comment - 13/Nov/08 01:47 AM
My Environment:
Moodle 1.7.1+ Apache with MySQL with PHP 5.1.6.

My chat daemon stops working after a few hours, and we found the error: we lost the database connection.

These are the lines we added:

while(true) {
    $active = array();

    //----------------------------------
    //@MODIFICATION: make the daemon survive MySQL outages
    // Assumes $mypingnow (a 'config' table record), $timeouts (array of wait
    // times in seconds) and $postlog (array) are set up before this loop.
    //-------- ADDED:
    $mypingnow->value = time();
    if (!update_record('config',$mypingnow)) {
        //if (!get_record ('config','name','chatd_ping')) {
        $postlog[] = 'Connection error';
        $intents = 0;
        $maxintents = 600;
        $dbconnected = false;
        $db->Disconnect();

        // implement the timeout
        while (!$dbconnected && $intents<$maxintents) {
            // look up the wait time (the last entry is reused once the list runs out)
            $waiting = ($intents<count($timeouts))?$timeouts[$intents]:$timeouts[count($timeouts)-1];
            echo 'waiting '.$waiting.' seconds...';
            $postlog[] = "Retry #$intents, waiting $waiting seconds";
            sleep($waiting);
            echo "trying again!\n";

            // See MDL-6760 for why this is necessary. In Moodle 1.8, once we start using NULLs properly,
            // we probably want to change this value to ''.
            $db->null2null = 'A long random string that will never, ever match something we want to insert into the database, I hope. \'';

            error_reporting(0); // Hide errors

            if (!isset($CFG->dbpersist) or !empty($CFG->dbpersist)) { // Use persistent connection (default)
                $dbconnected = $db->PConnect($CFG->dbhost,$CFG->dbuser,$CFG->dbpass,$CFG->dbname,true);
            } else { // Use single connection
                $dbconnected = $db->Connect($CFG->dbhost,$CFG->dbuser,$CFG->dbpass,$CFG->dbname,true);
            }
            $intents++;
        }
        if (! $dbconnected) {
            //@socket_shutdown($DAEMON->listen_socket, 0);
            $postlog[] = "$intents attempts, unable to connect";
            foreach ($postlog as $plog) { echo "$plog\n"; }
            die("Cannot connect to the DB");
        } else {
            foreach ($postlog as $plog) { add_to_log(1, 'chat', 'chatd', "index.php?id=1", $plog); }
            add_to_log(1, 'chat', 'chatd', "index.php?id=1", "Connection re-established after $intents attempts");
        }
        $postlog = array();

        /// Forcing ASSOC mode for ADOdb (some DBs default to FETCH_BOTH)
        $db->SetFetchMode(ADODB_FETCH_ASSOC);
    }
    //-------- END (last)

    // First of all, let's see if any of our UFOs has identified itself
    if($DAEMON->conn_activity_ufo($active)) {