
Key: MDL-11597
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Critical
Assignee: Eloy Lafuente (stronk7)
Reporter: Eloy Lafuente (stronk7)
Votes: 0
Watchers: 1

Moodle

cron execution dies two minutes after forum cron execution

Created: 04/Oct/07 10:55 PM   Updated: 10/Oct/07 02:47 PM
Component/s: Administration
Affects Version/s: 1.7.2, 1.8.2, 1.9
Fix Version/s: 1.7.3, 1.8.3, 1.9

Database: Any
Participants: Eloy Lafuente (stronk7), Martin Dougiamas and Petr Skoda
Security Level: None
Resolved date: 10/Oct/07
Affected Branches: MOODLE_17_STABLE, MOODLE_18_STABLE, MOODLE_19_STABLE
Fixed Branches: MOODLE_17_STABLE, MOODLE_18_STABLE, MOODLE_19_STABLE


Description
While monitoring cron execution on a large site, I noticed that the run was dying after an unpredictable amount of time, at different steps of the cron execution.

After some research, I think I've traced the cause of the problem. Explanation follows:

1) admin/cron.php sets time_limit to 0 (unlimited)
2) forum_cron is called and time_limit is set to 120
3) cron execution continues and, after 2 minutes, dies!

And this can cause a lot of problems, because some parts of the cron execution never run (clean-up tasks, backup...).

There are two possible solutions:

1) Forbid the use of "custom" time_limits within module/block... cron functions.
2) In admin/cron.php, reset time_limit to 0 after each module/block... cron call.
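
The failure sequence and solution 2 can be illustrated with a toy simulation. This is a sketch in Python, not Moodle's PHP code: `SimulatedTimer`, `run_cron`, and the task names and durations are hypothetical. The timer models only one assumption, namely that setting a non-zero limit restarts the countdown from the current moment and that 0 means unlimited.

```python
class SimulatedTimer:
    """Toy model of a script-wide execution limit (an assumption, not PHP itself)."""

    def __init__(self):
        self.now = 0          # simulated seconds since the script started
        self.deadline = None  # None means unlimited

    def set_time_limit(self, seconds):
        # 0 means unlimited; any other value restarts the countdown from "now".
        self.deadline = None if seconds == 0 else self.now + seconds

    def work(self, seconds):
        # Advance the simulated clock; fail if the deadline would be crossed.
        end = self.now + seconds
        if self.deadline is not None and end > self.deadline:
            self.now = self.deadline
            raise TimeoutError("time limit exceeded")
        self.now = end


def run_cron(tasks, reset_after_each):
    """Run (name, own_limit, duration) tasks; return the names that completed."""
    timer = SimulatedTimer()
    timer.set_time_limit(0)  # step 1: the main cron script starts unlimited
    completed = []
    for name, own_limit, duration in tasks:
        if own_limit is not None:
            timer.set_time_limit(own_limit)  # step 2: a task sets its own limit
        try:
            timer.work(duration)
        except TimeoutError:
            break  # step 3: the whole script dies, later tasks never run
        completed.append(name)
        if reset_after_each:
            timer.set_time_limit(0)  # solution 2: back to unlimited after each task
    return completed


# Hypothetical workload: only the first task sets its own 120-second limit.
TASKS = [
    ("forum_cron", 120, 30),
    ("cleanup", None, 60),
    ("backup", None, 300),
]

print(run_cron(TASKS, reset_after_each=False))
print(run_cron(TASKS, reset_after_each=True))
```

Without the reset, the limit left behind by the first task cuts off the long-running last task; with the reset, every task completes.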

Note that this problem has existed since at least Moodle 1.7, and it seems important!

Ciao :-)

Petr Skoda added a comment - 04/Oct/07 11:18 PM
I do not like setting it to 0: if something goes wrong, the script may stay there for a long time. Each part could IMHO reset the timer, or we could calculate the time of the next cron run and set the limit to expire just before it...
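
The second idea could look roughly like the following Python sketch. Everything here is an assumption for illustration: the function name, the notion of a fixed `cron_interval` in seconds, and the `safety_margin` are hypothetical, and the thread does not say how the next run time would actually be known.

```python
def limit_until_next_run(run_started_at, cron_interval, now, safety_margin=5):
    """Return a time limit (in seconds) that expires just before the next cron run.

    All arguments are plain numbers of seconds (for example, epoch timestamps).
    """
    next_run = run_started_at + cron_interval
    remaining = next_run - now - safety_margin
    # Never return 0, because 0 is assumed to mean "unlimited".
    return max(1, int(remaining))


# A run that started at t=1000 with a 300-second interval, checked 30 seconds in.
print(limit_until_next_run(run_started_at=1000, cron_interval=300, now=1030))
```

Clamping to a minimum of 1 matters in this sketch: a run that has already overshot the next scheduled start would otherwise compute a zero or negative limit.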

I guess it was me who added the short timeout there because we had some trouble with forum mailing at that time, hmmm


Eloy Lafuente (stronk7) added a comment - 05/Oct/07 01:08 AM
Hi, I agree that 0 is not the best solution. But I really think it isn't easy to guess one perfect timeout (mainly because different cron executions can have different parts executed: clean-up tasks, backup...)

So I would propose to:

1) NOW, and for 1.7, 1.8 and HEAD, reset time_limit to 0 inside the module loop.
2) Investigate if we can introduce a better solution after release.

And I'm going to implement 1) now. Mainly because current behaviour is, IMO, worse than having 0 (for big sites it causes big problems). Let's do better than 0 later.

Oki? We need cron running completely to be able to continue testing the whole cron execution.

Thanks and ciao


Eloy Lafuente (stronk7) added a comment - 05/Oct/07 01:44 AM
Committed to HEAD. Going to perform some tests before backporting to 1.7 and 1.8...

Eloy Lafuente (stronk7) added a comment - 05/Oct/07 01:56 AM
Also to HEAD, I've added the same reset of time_limit=0 to blocks cron.

Petr Skoda added a comment - 05/Oct/07 02:02 AM
thanks Eloy, it is funny that we often have the same ideas about how to solve problems

thanks


Eloy Lafuente (stronk7) added a comment - 05/Oct/07 06:34 AM
Backported to both 17_STABLE and 18_STABLE.

I think this may be causing real problems for a bunch of sites (with some important tasks like users/logs/cache maintenance never being executed).

So, I would review 17_STABLE and 18_STABLE status and release 1.7.3 and 1.8.3 ASAP. +1 for that.

I'm still reviewing cron... One note from HQ chat that perhaps should become a new bug in the tracker, for your consideration:

"I think we could detect unfinished cron and mail admins. Yep should introduce some "intelligence" to the cron, like scheduled backups, both preventing multiple executions and informing admin if something is unfinished. yep. (agree). oki. 100%"
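
One possible shape for that idea, sketched in Python rather than Moodle's PHP: a marker file records when a run started and is removed when the run finishes. The names `begin_cron` and `end_cron`, the marker file, and the `stale_after` threshold are all hypothetical, and taking over a stale marker is not atomic here, so this is only an outline of the logic.

```python
import os
import tempfile
import time


def begin_cron(marker_path, stale_after, now=None):
    """Return 'started', 'running' (skip this run), or 'unfinished' (warn the admin)."""
    now = time.time() if now is None else now
    try:
        # O_EXCL makes creation fail if another run already left a marker.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        with open(marker_path) as fh:
            text = fh.read().strip()
        started = float(text) if text else now
        if now - started < stale_after:
            return "running"
        # The marker is old: the earlier run never reached end_cron().
        with open(marker_path, "w") as fh:
            fh.write(str(now))
        return "unfinished"
    with os.fdopen(fd, "w") as fh:
        fh.write(str(now))
    return "started"


def end_cron(marker_path):
    """Remove the marker to record that the run finished."""
    os.remove(marker_path)


marker = os.path.join(tempfile.mkdtemp(), "cron.marker")
print(begin_cron(marker, stale_after=3600, now=0))     # first run
print(begin_cron(marker, stale_after=3600, now=60))    # overlapping run
print(begin_cron(marker, stale_after=3600, now=7200))  # earlier run never finished
end_cron(marker)
```

A caller would skip the run on "running" and mail the administrator on "unfinished" before carrying on.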


Martin Dougiamas added a comment - 10/Oct/07 02:47 PM
I guess this can be marked resolved now.