Moodle

cron execution dies two minutes after forum cron execution

Details

  • Type: Bug
  • Status: Closed
  • Priority: Critical
  • Resolution: Fixed
  • Affects Version/s: 1.7.2, 1.8.2, 1.9
  • Fix Version/s: 1.7.3, 1.8.3, 1.9
  • Component/s: Administration
  • Labels:
    None
  • Database:
    Any
  • Affected Branches:
    MOODLE_17_STABLE, MOODLE_18_STABLE, MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_17_STABLE, MOODLE_18_STABLE, MOODLE_19_STABLE

Description

While monitoring cron execution on a large site, I noticed that the run was dying after an unpredictable amount of time, at a different step of the cron execution each time.

After some research, I think I've traced the cause of the problem. Explanation follows:

1) admin/cron.php sets time_limit to 0 (unlimited)
2) forum_cron is called and time_limit is set to 120
3) cron execution continues and, after 2 minutes, dies!

This can cause a lot of problems, because some parts of the cron execution (clean-up tasks, backup...) never run.

There are two possible solutions:

1) Forbid the use of "custom" time_limits within module/block... cron functions.
2) In admin/cron.php, set time_limit to 0 after each module/block... cron call.
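
The failure sequence and option 2 can be modelled with a small simulation. This is a Python toy model, not Moodle code: `ScriptTimer`, `run_cron`, the three task functions and their durations are all invented for illustration. The only behaviour it assumes is PHP's documented one, that every `set_time_limit()` call restarts the timeout counter from zero and that a limit of 0 means no limit.

```python
class ScriptTimer:
    """Toy model of PHP's execution timer (simulated clock, no real waiting)."""

    def __init__(self):
        self.limit = 0      # 0 means "no limit", as in PHP
        self.elapsed = 0    # seconds counted since the last set_time_limit()

    def set_time_limit(self, seconds):
        # PHP restarts the timeout counter from zero on every call.
        self.limit = seconds
        self.elapsed = 0

    def work(self, seconds):
        # Advance the simulated clock; abort once the active limit is exceeded.
        self.elapsed += seconds
        if self.limit and self.elapsed > self.limit:
            raise TimeoutError(f"limit of {self.limit}s exceeded")


def run_cron(timer, reset_after_each_task):
    """Return the names of the tasks that completed before any timeout."""

    def forum_cron():
        timer.set_time_limit(120)   # step 2: the module installs its own limit
        timer.work(30)

    def backup_cron():
        timer.work(300)             # long task that sets no limit of its own

    def cleanup_cron():
        timer.work(60)

    completed = []
    timer.set_time_limit(0)         # step 1: cron starts with no limit
    try:
        for task in (forum_cron, backup_cron, cleanup_cron):
            task()
            completed.append(task.__name__)
            if reset_after_each_task:
                timer.set_time_limit(0)   # option 2: undo any module limit
    except TimeoutError:
        pass                        # step 3: the real script would die here
    return completed
```

In this model, `run_cron(ScriptTimer(), False)` completes only `forum_cron`, because the 120-second limit left behind by the forum task kills the later backup task. `run_cron(ScriptTimer(), True)` completes all three tasks.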

Note that this has been a problem since at least Moodle 1.7, and it seems important!

Ciao

Activity

Petr Škoda (skodak) added a comment -

I do not like setting it to 0: if something goes wrong, the script may stay there for a long time. Each part could, IMHO, reset the timer; or we could calculate the time of the next cron run and set the limit to expire just before it...

I guess it was me who added the short timeout there because we had some trouble with forum mailing at that time, hmmm
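
One possible reading of the second idea, as a Python sketch rather than Moodle code: compute how many seconds remain before the next cron run is due and use that as the limit. The function name, its parameters and the safety margin are all hypothetical, and the sketch assumes cron is scheduled at a fixed, known interval.

```python
def limit_until_next_run(now, run_started, interval, margin=10):
    """Seconds left before the next scheduled cron start, minus a margin.

    All arguments are in seconds; every name here is hypothetical.
    """
    next_start = run_started + interval
    remaining = next_start - now - margin
    # Never return 0 or less: PHP treats a limit of 0 as "no limit".
    return max(remaining, 1)
```

Each part of the cron run would pass this value to the timer before it starts, so a stuck part is killed shortly before the next run is due instead of running indefinitely.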

Eloy Lafuente (stronk7) added a comment -

Hi, I agree that 0 is not the best solution. But I really think it isn't easy to guess one perfect timeout (mainly because different cron executions can have different parts executed: clean-up tasks, backup...).

So I would propose to:

1) NOW, and for 1.7, 1.8 and HEAD, reset time_limit to 0 inside the module loop.
2) Investigate if we can introduce a better solution after release.

I'm going to implement 1) now, mainly because the current behaviour is, IMO, worse than having 0 (on big sites it causes big problems). Let's do better than 0 later.

Oki? We need cron running completely to be able to continue testing the whole cron execution.

Thanks and ciao

Eloy Lafuente (stronk7) added a comment -

Committed to HEAD. Going to perform some tests before backporting to 1.7 and 1.8...

Eloy Lafuente (stronk7) added a comment -

Also in HEAD, I've added the same reset of time_limit to 0 to the blocks cron.

Petr Škoda (skodak) added a comment -

thanks Eloy, it is funny we often have the same ideas how to solve problems

thanks

Eloy Lafuente (stronk7) added a comment -

Backported to both 17_STABLE and 18_STABLE.

I think this may be causing real problems for a bunch of sites (with some important tasks like users/logs/cache maintenance never being executed).

So, I would review 17_STABLE and 18_STABLE status and release 1.7.3 and 1.8.3 ASAP. +1 for that.

I'll continue reviewing cron... Here is one note from the HQ chat that perhaps should become a new bug in the tracker, for your consideration:

"I think we could detect unfinished cron and mail admins. Yep should introduce some "intelligence" to the cron, like scheduled backups, both preventing multiple executions and informing admin if something is unfinished. yep. (agree). oki. 100%"

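The chat note could be approached with a marker file that exists only while cron is running. The following is a minimal Python sketch of that idea, not anything from Moodle: `start_run`, `finish_run` and the marker path are hypothetical names. It relies on `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists.

```python
import os


def start_run(marker_path):
    """Create the marker atomically; return False if one already exists."""
    try:
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Either another run is still active or the previous run died
        # before finishing; the caller could notify the admin here.
        return False
    with os.fdopen(fd, "w") as fh:
        fh.write(str(os.getpid()))   # record who owns the marker
    return True


def finish_run(marker_path):
    """Remove the marker once every cron step has completed."""
    os.remove(marker_path)
```

A run that finds the marker already present knows that the previous run either is still going or never reached `finish_run`, which covers both "preventing multiple executions" and "informing admin if something is unfinished". Telling those two cases apart, for example by checking the recorded process id, is left out of this sketch.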
Martin Dougiamas added a comment -

I guess this can be marked resolved now.

