Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-21296

convert_urls_into_links() causing slowdown/timeouts

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 1.9.7
    • Fix Version/s: None
    • Component/s: General
    • Labels:
      None
    • Affected Branches:
      MOODLE_19_STABLE

      Description

      Some users are experiencing slow downs/timeouts caused by convert_urls_into_links() See MDL-21168 for previous discussion.

      A machine running php 5.2.1, pcre 6.6 and missing the 2 pcre configuration directives is not working.

      A machine with php 5.2.4 and pcre 7.4 with the directives is working. I've got php 5.2.10 and pcre 7.8 with the directives and its working.

      Theoretically pcre 6.6 or the missing configuration directives could be the source of the problem.

      http://www.phptutorial.info/?pcre.configuration
      Those directives were introduced in php 5.2.0 so they should be there on the php 5.2.1 machine. That said, if they're missing I would have assumed sensible defaults would have been used.

      The workaround described here (http://tracker.moodle.org/browse/MDL-21168?focusedCommentId=79154&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_79154) makes this work but the non-unicode modifier regexes fail the 2 unit tests to do with detecting urls containing non-ascii characters ie utf 8 characters.

        Gliffy Diagrams

          Issue Links

            Activity

            Hide
            andyjdavis Andrew Davis added a comment -

            I've split this issue out of MDL-21168

            Show
            andyjdavis Andrew Davis added a comment - I've split this issue out of MDL-21168
            Hide
            andyjdavis Andrew Davis added a comment -

            I'm attaching a patch that avoids this issue. It just forces the use of the regular expressions that don't use unicode modifiers.

            The only situation in which these regexs perform worse than those with unicode modifiers is in identifying urls that contain non-ascii characters. The characters around the url don't matter. This difference only becomes apparent if the non-ascii characters are within the url itself.

            That situation is sufficiently unlikely at present that this is probably a workable bandaid until we have discovered the real source of the problem. Eloy, are you happy for me to commit this?

            Show
            andyjdavis Andrew Davis added a comment - I'm attaching a patch that avoids this issue. It just forces the use of the regular expressions that don't use unicode modifiers. The only situation in which these regexs perform worse than those with unicode modifiers is in identifying urls that contain non-ascii characters. The characters around the url don't matter. This difference only becomes apparent if the non-ascii characters are within the url itself. That situation is sufficiently unlikely at present that this is probably a workable bandaid until we have discovered the real source of the problem. Eloy, are you happy for me to commit this?
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Hi Andrew,

            I think it's ok to force the non-unicode version of the URLs as a quick solution. So feel free to do so (in 1.9 stable).

            Anyway, we need to, programmatically, be able to detect that situations. In fact, in my TODO, I've one task about to create one environmental test detecting (on install & upgrade) the unicode support in regular expressions in order to inform the user about it and encourage him to fix it if necessary. The key here is that everyday, there are more and more parts being dependent of unicode support in regular expressions so we need to, programmatically, detect if they are really supported on each server.

            I thought this was enough (in fact I've seen it in various Zend libraries and backup & restore is also using it since some time ago):

            $unicoderegexp = @preg_match('/\pL/u', 'a'); // This will fail silenty, returning false,

            But it seems we need to go one step ahead and check more things, PCRE version, configuration settings... whatever. But need to known it properly ASAP. I guess it's a matter of the PCRE lib used... perhaps we could go backwards.... to see when it started to work, and then base our test in that PCRE_VERSION (or whatever the constant is named). Please try to catch it, TIA!

            Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Hi Andrew, I think it's ok to force the non-unicode version of the URLs as a quick solution. So feel free to do so (in 1.9 stable). Anyway, we need to, programmatically , be able to detect that situations. In fact, in my TODO, I've one task about to create one environmental test detecting (on install & upgrade) the unicode support in regular expressions in order to inform the user about it and encourage him to fix it if necessary. The key here is that everyday, there are more and more parts being dependent of unicode support in regular expressions so we need to, programmatically, detect if they are really supported on each server. I thought this was enough (in fact I've seen it in various Zend libraries and backup & restore is also using it since some time ago): $unicoderegexp = @preg_match('/\pL/u', 'a'); // This will fail silenty, returning false, But it seems we need to go one step ahead and check more things, PCRE version, configuration settings... whatever. But need to known it properly ASAP. I guess it's a matter of the PCRE lib used... perhaps we could go backwards.... to see when it started to work, and then base our test in that PCRE_VERSION (or whatever the constant is named). Please try to catch it, TIA! Ciao
            Hide
            andyjdavis Andrew Davis added a comment -

            I have committed the temporary work around into 1.9. I've also committed a comment in 2 in that spot.

            Show
            andyjdavis Andrew Davis added a comment - I have committed the temporary work around into 1.9. I've also committed a comment in 2 in that spot.
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Uhm, according with this blog, the version isn't enough to know if the PCRE lib is supporting or no unicode properly. The same version (6.6) can be compiled with and without unicode support: http://gaarai.com/2009/01/31/unicode-support-on-centos-52-with-php-and-pcre/

            In any case, I think that the problem isn't unicode support, that is checked properly with the preg_match() test above, but to know which libraries are buggy (cause slowdown) and which ones aren't.

            Perhaps it's a matter of running PHP with its own PCRE library instead of running it with the OS library, or something like that. I guess that the own one is always utf-8 enabled, but the OS one isn't.

            Who knows! But we need to be able to address this, calculating VALID $unicoderegexp values.

            Ciao

            PS: About the comment in 2.0, Andrew... can you use to add always the // TODO: keyword (for easier searching) in the future. It helps a lot. No big problem, just a trick.

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Uhm, according with this blog, the version isn't enough to know if the PCRE lib is supporting or no unicode properly. The same version (6.6) can be compiled with and without unicode support: http://gaarai.com/2009/01/31/unicode-support-on-centos-52-with-php-and-pcre/ In any case, I think that the problem isn't unicode support, that is checked properly with the preg_match() test above, but to know which libraries are buggy (cause slowdown) and which ones aren't. Perhaps it's a matter of running PHP with its own PCRE library instead of running it with the OS library, or something like that. I guess that the own one is always utf-8 enabled, but the OS one isn't. Who knows! But we need to be able to address this, calculating VALID $unicoderegexp values. Ciao PS: About the comment in 2.0, Andrew... can you use to add always the // TODO: keyword (for easier searching) in the future. It helps a lot. No big problem, just a trick.
            Hide
            timhunt Tim Hunt added a comment -

            I think the temporary fix to this issue is causing two unit test failures in 1.9.8. As a nicety, should we comment out those two unit tests until this bug is properly resolved?

            Show
            timhunt Tim Hunt added a comment - I think the temporary fix to this issue is causing two unit test failures in 1.9.8. As a nicety, should we comment out those two unit tests until this bug is properly resolved?
            Hide
            salvetore Michael de Raadt added a comment -

            Thanks for reporting this issue.

            We have detected that this issue has been inactive for over a year has been recorded as affecting versions that are no longer supported.

            If you believe that this issue is still relevant to current versions (2.1 and beyond), please comment on the issue. Issues left inactive for a further month will be closed.

            Michael d;

            lqjjLKA0p6

            Show
            salvetore Michael de Raadt added a comment - Thanks for reporting this issue. We have detected that this issue has been inactive for over a year has been recorded as affecting versions that are no longer supported. If you believe that this issue is still relevant to current versions (2.1 and beyond), please comment on the issue. Issues left inactive for a further month will be closed. Michael d; lqjjLKA0p6
            Hide
            salvetore Michael de Raadt added a comment -

            I'm closing this issue as it appears to have become inactive and is probably not relevant to a current supported version. If you are encountering this problem or one similar, please launch a new issue.

            Show
            salvetore Michael de Raadt added a comment - I'm closing this issue as it appears to have become inactive and is probably not relevant to a current supported version. If you are encountering this problem or one similar, please launch a new issue.
            Hide
            poltawski Dan Poltawski added a comment -

            Note that a fix was applied here, but it hasn't solved the issue MDL-38466 .

            Show
            poltawski Dan Poltawski added a comment - Note that a fix was applied here, but it hasn't solved the issue MDL-38466 .

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: