Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-21168

convert_urls_into_links() is converting inside image tags

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.9.7
    • Fix Version/s: 1.9.8
    • Component/s: Usability
    • Labels:
      None
    • Database:
      Any
    • Affected Branches:
      MOODLE_19_STABLE
    • Fixed Branches:
      MOODLE_19_STABLE

      Description

      It appears that the convert_urls_into_links() function is converting URLs that are inside image tags. For instance, it converts this:

      <img src="http://moodle.org/logo/logo-240x60.gif" />

      into this:

      <img src="<a href="http://moodle.org/logo/logo-240x60.gif" target="_blank">http://moodle.org/logo/logo-240x60.gif</a>" />

      I can confirm that it is the convert_urls_into_links() function that's causing this by commenting out line ~2246 of lib/weblib.php that reads:

      convert_urls_into_links($text);

      With that line commented out, the images display as expected with the correct HTML.

      P.S. I'm assuming this has to do with the changes made in MDL-20826, but I wasn't sure whether or not I should create a new issue since that one is "Resolved."

        Gliffy Diagrams

        1. convert_urls_into_links.patch
          3 kB
          Jonathan Robson
        2. patch.txt
          3 kB
          Matthew Davidson
        3. patched.php
          254 kB
          Matthew Davidson
        4. patched2.php
          254 kB
          Matthew Davidson
        5. quiz_buggy_description.txt
          0.6 kB
          Patrick Pollet
        6. testweblib.patch
          1 kB
          Andrew Davis
        7. weblib.patch
          0.9 kB
          Andrew Davis
        1. Broken links.jpg
          65 kB

          Issue Links

            Activity

            Hide
            tsala Helen Foster added a comment -

            Jonathan, thanks for your report. Assigning to Andrew, as he fixed MDL-20826.

            Show
            tsala Helen Foster added a comment - Jonathan, thanks for your report. Assigning to Andrew, as he fixed MDL-20826 .
            Hide
            jnrbsn Jonathan Robson added a comment -

            Thanks, Helen.

            Andrew,

            I've been able to fix this by altering the regular expressions to exclude URLs enclosed in <>. I've attached a patch with the changes that I've tested and are working for me.

            • Jonathan
            Show
            jnrbsn Jonathan Robson added a comment - Thanks, Helen. Andrew, I've been able to fix this by altering the regular expressions to exclude URLs enclosed in <>. I've attached a patch with the changes that I've tested and are working for me. Jonathan
            Hide
            jnrbsn Jonathan Robson added a comment -

            Alters regular expressions to exclude URLs enclosed in <>.

            Show
            jnrbsn Jonathan Robson added a comment - Alters regular expressions to exclude URLs enclosed in <>.
            Hide
            jnrbsn Jonathan Robson added a comment -

            Oops. Sorry for the extra comment. Here's another!

            Show
            jnrbsn Jonathan Robson added a comment - Oops. Sorry for the extra comment. Here's another!
            Hide
            andyjdavis Andrew Davis added a comment -

            Hi Jonathon. Thanks for the patch although that makes text like <p>http://moodle.com</p> pass through without getting converted into a url.

            I've attached patches to weblib.php and testweblib.php

            The change to testweblib.php is an addition to the unit tests containing an img tag.

            The other is a change that prevents the problem. Shortly before the regular expressions is some code that pulls out tags that shouldn't be touched by the substitution. Look for the call to filter_save_ignore_tags(). I've added img tags to that list. Its not fantastic having to specifically list tags that shouldn't be altered but I'm not sure whether trying to create a bullet proof regular expression will be possible.

            Show
            andyjdavis Andrew Davis added a comment - Hi Jonathon. Thanks for the patch although that makes text like <p> http://moodle.com </p> pass through without getting converted into a url. I've attached patches to weblib.php and testweblib.php The change to testweblib.php is an addition to the unit tests containing an img tag. The other is a change that prevents the problem. Shortly before the regular expressions is some code that pulls out tags that shouldn't be touched by the substitution. Look for the call to filter_save_ignore_tags(). I've added img tags to that list. Its not fantastic having to specifically list tags that shouldn't be altered but I'm not sure whether trying to create a bullet proof regular expression will be possible.
            Hide
            jnrbsn Jonathan Robson added a comment -

            Actually, it handles text like <p>http://moodle.com</p> just fine. Did you even test it? That was the whole point of the [^>]* and [^<]* parts of the regular expression. For instance, the regular expression (?Unable to render embedded object: File (>]*)...(?) not found.[^<]*>) will match the "..." in "<p>...</p>" but not in "<img src='...' />" So it will exclude URLs that are enclosed in <> but NOT IF there is a > between the < and the start of the URL or a < between the end of the URL and the >. This seems a lot better than listing tags to be excluded. I can't think of any reason why anyone would want to convert a URL enclosed in <>. Especially because it will break the tag every time since something like <...<...>...> is basically a syntax error in HTML.

            Show
            jnrbsn Jonathan Robson added a comment - Actually, it handles text like <p> http://moodle.com </p> just fine. Did you even test it? That was the whole point of the [^>] * and [^<] * parts of the regular expression. For instance, the regular expression (? Unable to render embedded object: File (>]*)...(?) not found. [^<] *>) will match the "..." in "<p>...</p>" but not in "<img src='...' />" So it will exclude URLs that are enclosed in <> but NOT IF there is a > between the < and the start of the URL or a < between the end of the URL and the >. This seems a lot better than listing tags to be excluded. I can't think of any reason why anyone would want to convert a URL enclosed in <>. Especially because it will break the tag every time since something like <...<...>...> is basically a syntax error in HTML.
            Hide
            andyjdavis Andrew Davis added a comment - - edited

            Sorry, I didnt give enough information. It's a variation on <p>url</p> thats causing a unit test to fail with the modified regular expressions. Spaces added to stop character substitution being done by the browser.
            <p>text www.moodle.org& l t ; /p>
            The unit tests for convert_urls_into_links are in lib/simpletest/testweblib.php There is now quite a comprehensive range of tests that Im slowly building up.

            As soon as you introduce escaped characters its pretty easy right now to construct more bits of text that won't be handled in a desirable way.
            & l t ;img src="http://moodle.org/logo/logo-240x60.gif" /& g t ;

            Show
            andyjdavis Andrew Davis added a comment - - edited Sorry, I didnt give enough information. It's a variation on <p>url</p> thats causing a unit test to fail with the modified regular expressions. Spaces added to stop character substitution being done by the browser. <p>text www.moodle.org& l t ; /p> The unit tests for convert_urls_into_links are in lib/simpletest/testweblib.php There is now quite a comprehensive range of tests that Im slowly building up. As soon as you introduce escaped characters its pretty easy right now to construct more bits of text that won't be handled in a desirable way. & l t ;img src="http://moodle.org/logo/logo-240x60.gif" /& g t ;
            Hide
            timhunt Tim Hunt added a comment -

            For the filter that converts MDL-123 strings in Moodle forum posts, I wrote quite a lot of tests. See http://moodle.org/filter/moodlelinks/test.php.

            I don't really want to look at your code, but if your approach is to write lots of unit tests, and make them pass, then that sounds great to me.

            Show
            timhunt Tim Hunt added a comment - For the filter that converts MDL-123 strings in Moodle forum posts, I wrote quite a lot of tests. See http://moodle.org/filter/moodlelinks/test.php . I don't really want to look at your code, but if your approach is to write lots of unit tests, and make them pass, then that sounds great to me.
            Hide
            andyjdavis Andrew Davis added a comment - - edited

            Marking this issue resolved as I've committed a workable solution. I've added some new unit tests. 3 of them currently fail due to us not handling escaped img and anchor tags. I have opened another issue to deal with that specifically. MDL-21183

            Show
            andyjdavis Andrew Davis added a comment - - edited Marking this issue resolved as I've committed a workable solution. I've added some new unit tests. 3 of them currently fail due to us not handling escaped img and anchor tags. I have opened another issue to deal with that specifically. MDL-21183
            Hide
            ppollet Patrick Pollet added a comment -

            Hello,

            Sorry to destroy your optimism but I am getting since last Friday, many php timeout errors with le latest build in this function

            [Sat Jan 02 13:13:19 2010] [error] [client 207.46.199.191] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/html/moodle.195/lib/weblib.php on line 2329 that read

            $text = preg_replace('#((www\.([\pLl0-9]([\pLl0-9]|)[\pLl0-9]|[\pLl0-9])\.)+([\pLl]([\pLl0-9]|)[\pLl0-9]|[\pLl])(:[\pL0-9]*)?(/([\pLl0-9\.!$&\'\(\)*+,;=_~:@-]|%[a-fA-F0-9]

            {2}

            ))(?[\pLl0-9\.!$&\'\(\)*+,;=_~:@/?-]*)?(#[\pLl0-9\.Unable to render embedded object: File (@/?-]*)?(?<) not found.[,.;]))#i',
            '<a href="http://
            1" target="_blank">
            1</a>', $text);

            so I had to comment out the call at line 2334 of function text_to_html to return to normal operations.

            Cheers.

            Show
            ppollet Patrick Pollet added a comment - Hello, Sorry to destroy your optimism but I am getting since last Friday, many php timeout errors with le latest build in this function [Sat Jan 02 13:13:19 2010] [error] [client 207.46.199.191] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/html/moodle.195/lib/weblib.php on line 2329 that read $text = preg_replace('#((www\.( [\pLl0-9] ( [\pLl0-9] | ) [\pLl0-9] | [\pLl0-9] )\.)+( [\pLl] ( [\pLl0-9] | ) [\pLl0-9] | [\pLl] )(: [\pL0-9] *)?(/( [\pLl0-9\.!$&\'\(\)*+,;=_~:@-] |% [a-fA-F0-9] {2} ) ) (? [\pLl0-9\.!$&\'\(\)*+,;=_~:@/?-] *)?(#[\pLl0-9\. Unable to render embedded object: File (@/?-]*)?(?<) not found. [,.;] ))#i', '<a href="http:// 1" target="_blank"> 1</a>', $text); so I had to comment out the call at line 2334 of function text_to_html to return to normal operations. Cheers.
            Hide
            andyjdavis Andrew Davis added a comment -

            That's not good. Reopening this issue although it may be Monday before I have time to look into it more.

            Show
            andyjdavis Andrew Davis added a comment - That's not good. Reopening this issue although it may be Monday before I have time to look into it more.
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Hi Patrick,

            and in what page are you getting that error? Every page?

            Are you able to debug the texts being processed in order to check if it always halt in the same one?

            Also... can you post here your OS and PHP version? Surely it will help to check if there is some known bug causing those problems in your site.

            TIA and ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Hi Patrick, and in what page are you getting that error? Every page? Are you able to debug the texts being processed in order to check if it always halt in the same one? Also... can you post here your OS and PHP version? Surely it will help to check if there is some known bug causing those problems in your site. TIA and ciao
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Also... in the phpinfo() page... can you look for - the "pcre" section and post here all the information? (PCRE Library Version, pcre.backtrack_limit and pcre.recursion_limit). Re-ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Also... in the phpinfo() page... can you look for - the "pcre" section and post here all the information? (PCRE Library Version, pcre.backtrack_limit and pcre.recursion_limit). Re-ciao
            Hide
            korpelainen Mauno Korpelainen added a comment -

            Just a very wild guess but supposing that some sites may have for example Google Analytics code does this current code in weblib.php break source on pages having analytics code like

            <script type="text/javascript">
            var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
            document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
            </script>

            Show
            korpelainen Mauno Korpelainen added a comment - Just a very wild guess but supposing that some sites may have for example Google Analytics code does this current code in weblib.php break source on pages having analytics code like <script type="text/javascript"> var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); </script>
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Hi Mauno,

            I don't think that will be transformed at all as far as "http://www." isn't a well-formed url and that's one of the things checked by the regexp (that tries to follow the corresponding URL/URI RFC as possible).

            Anyway things like "http://www.app." ARE correct and they will be transformed. I can imagine javascripts having things like that.

            In any case, note they only will be converted if are printed using format_text() and friends (Moodle contents only). Javascripts included in theme / programming aren't processed by that function so they should be free from any conversion.

            Uhm... just guessing if simply preventing URLs preceded by double (and perhaps simple too) quotes isn't a good method to take out all those URLs happening inside images, flashes, javascripts... in a simple way, instead of taking them out "by hand". Just a possible general solution to be considered.

            Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Hi Mauno, I don't think that will be transformed at all as far as "http://www." isn't a well-formed url and that's one of the things checked by the regexp (that tries to follow the corresponding URL/URI RFC as possible). Anyway things like "http://www.app." ARE correct and they will be transformed. I can imagine javascripts having things like that. In any case, note they only will be converted if are printed using format_text() and friends (Moodle contents only). Javascripts included in theme / programming aren't processed by that function so they should be free from any conversion. Uhm... just guessing if simply preventing URLs preceded by double (and perhaps simple too) quotes isn't a good method to take out all those URLs happening inside images, flashes, javascripts... in a simple way, instead of taking them out "by hand". Just a possible general solution to be considered. Ciao
            Hide
            ppollet Patrick Pollet added a comment -

            Hi Eloy,

            No its does not die on every page but on some. For example a quiz information page with a very simple three lines description and some colors and underlines , but no URL to convert to links inside it ...
            (I am going to attach it). If I remove the description, the page loads entirely). If I put it back it dies before emitting the footer div ...

            I did try to remove the google analytics code as suggested above : no change...

            I am running php PHP 5.2.1 (cli) (built: Mar 31 2007 11:39:16)
            Copyright (c) 1997-2007 The PHP Group
            Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

            on a Fedora Core Relase 6 (production server, so not too often upgraded).

            and the latest Moodle weekly1.9.7+ (Build: 20091230)

            phpinfo() gives for PCRE

            pcre
            PCRE (Perl Compatible Regular Expressions) Support enabled
            PCRE Library Version 6.6 06-Feb-2006y

            with no information about pcre.backtrack_limit and pcre.recursion_limit

            The very same description on my laptop running php PHP Version 5.2.4-2ubuntu5.9 and same Moodle build (CVS update) does
            not gives the error ???
            in that case phpinfo says pcre
            PCRE (Perl Compatible Regular Expressions) Support enabled
            PCRE Library Version 7.4 2007-09-21

            Directive Local Value Master Value
            pcre.backtrack_limit 20971520 100000
            pcre.recursion_limit 100000 100000

            So we will say that my prod server should be ugraded to a more recent php/pcre despite the affirmation that php 4.3
            is enough for Moodle 1.9 as per http://docs.moodle.org/en/PHP_settings_by_Moodle_version

            Show
            ppollet Patrick Pollet added a comment - Hi Eloy, No its does not die on every page but on some. For example a quiz information page with a very simple three lines description and some colors and underlines , but no URL to convert to links inside it ... (I am going to attach it). If I remove the description, the page loads entirely). If I put it back it dies before emitting the footer div ... I did try to remove the google analytics code as suggested above : no change... I am running php PHP 5.2.1 (cli) (built: Mar 31 2007 11:39:16) Copyright (c) 1997-2007 The PHP Group Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies on a Fedora Core Relase 6 (production server, so not too often upgraded). and the latest Moodle weekly1.9.7+ (Build: 20091230) phpinfo() gives for PCRE pcre PCRE (Perl Compatible Regular Expressions) Support enabled PCRE Library Version 6.6 06-Feb-2006y with no information about pcre.backtrack_limit and pcre.recursion_limit The very same description on my laptop running php PHP Version 5.2.4-2ubuntu5.9 and same Moodle build (CVS update) does not gives the error ??? in that case phpinfo says pcre PCRE (Perl Compatible Regular Expressions) Support enabled PCRE Library Version 7.4 2007-09-21 Directive Local Value Master Value pcre.backtrack_limit 20971520 100000 pcre.recursion_limit 100000 100000 So we will say that my prod server should be ugraded to a more recent php/pcre despite the affirmation that php 4.3 is enough for Moodle 1.9 as per http://docs.moodle.org/en/PHP_settings_by_Moodle_version
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            About preg_replace() and unicode modifiers I've found this PHP bug: http://bugs.php.net/bug.php?id=44336

            It' seems that was fixed for PHP 5.2.9 (based on dates in the bug)... not sure if that can be the cause of your problems, Patrick.

            I think you can test if the unicode thing is your problem by setting:

            $unicoderegexp = false;

            exactly before this line:

            if ($unicoderegexp) { //We can use unicode modifiers

            That way your Moodle will use the non-unicode regular expressions and you'll be able to test and compare executions times and so on.

            If finally the PHP version is the problem (bug above)... we can enforce non-unicode regexp for PHP < 5.2.9 but we need to be sure it's the cause before anything else. So any feedback will be welcome Patrick.

            Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - About preg_replace() and unicode modifiers I've found this PHP bug: http://bugs.php.net/bug.php?id=44336 It' seems that was fixed for PHP 5.2.9 (based on dates in the bug)... not sure if that can be the cause of your problems, Patrick. I think you can test if the unicode thing is your problem by setting: $unicoderegexp = false; exactly before this line: if ($unicoderegexp) { //We can use unicode modifiers That way your Moodle will use the non-unicode regular expressions and you'll be able to test and compare executions times and so on. If finally the PHP version is the problem (bug above)... we can enforce non-unicode regexp for PHP < 5.2.9 but we need to be sure it's the cause before anything else. So any feedback will be welcome Patrick. Ciao
            Hide
            rezeau Joseph Rézeau added a comment -

            Hi all,
            I do hope a solution will be found very soon. On our university site, after the recent upgrade to 1.9.7, this bug prevents images from being displayed in the Quiz and Lesson modules. However, as far as I can see, images are correctly displayed in most other places on our site (forum discussions, composed web pages, etc.), probably because those other modules do not call the the convert_urls_into_links() function ?
            Joseph

            Show
            rezeau Joseph Rézeau added a comment - Hi all, I do hope a solution will be found very soon. On our university site, after the recent upgrade to 1.9.7, this bug prevents images from being displayed in the Quiz and Lesson modules. However, as far as I can see, images are correctly displayed in most other places on our site (forum discussions, composed web pages, etc.), probably because those other modules do not call the the convert_urls_into_links() function ? Joseph
            Hide
            ppollet Patrick Pollet added a comment -

            the 3 lines quiz description that breaks (???) convert_urls_into_links

            Show
            ppollet Patrick Pollet added a comment - the 3 lines quiz description that breaks (???) convert_urls_into_links
            Hide
            ppollet Patrick Pollet added a comment -

            $unicoderegexp = false; // suggestion Eloy MDL-21168
            if ($unicoderegexp) { //We can use unicode modifiers

            Does seems to fix it for my two "test pages"... I shall keep an eye on apache logs to see if the max execution time error happens again, but not before Monday ; students are still in vacations...

            Cheers.

            Show
            ppollet Patrick Pollet added a comment - $unicoderegexp = false; // suggestion Eloy MDL-21168 if ($unicoderegexp) { //We can use unicode modifiers Does seems to fix it for my two "test pages"... I shall keep an eye on apache logs to see if the max execution time error happens again, but not before Monday ; students are still in vacations... Cheers.
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Hi Joseph,

            if I'm not wrong, the original bug (converting within image tags) is already fixed in CVS since some days ago.

            I just discovered this some hours ago when saw Patrick's comment about having "php timeout errors" with the new code and we are trying to find the ultimate reason (php/pcre versions) for that in order to fix it.

            So, images should be ok right now if I'm not wrong.

            Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Hi Joseph, if I'm not wrong, the original bug (converting within image tags) is already fixed in CVS since some days ago. I just discovered this some hours ago when saw Patrick's comment about having "php timeout errors" with the new code and we are trying to find the ultimate reason (php/pcre versions) for that in order to fix it. So, images should be ok right now if I'm not wrong. Ciao
            Hide
            rezeau Joseph Rézeau added a comment -

            Hi Eloy,

            You are probably right. I'll have to wait until our computer people are back from the Xmas holidays so they can apply the fix from CVS to our site.

            Happy New Year to you & Feliz año nuevo tambien!
            Joseph

            Show
            rezeau Joseph Rézeau added a comment - Hi Eloy, You are probably right. I'll have to wait until our computer people are back from the Xmas holidays so they can apply the fix from CVS to our site. Happy New Year to you & Feliz año nuevo tambien! Joseph
            Hide
            ppollet Patrick Pollet added a comment -

            Eloy said,

            >If finally the PHP version is the problem (bug above)... we can enforce non-unicode regexp for PHP < 5.2.9 but we need to be sure it's the >cause before anything else. So any feedback will be welcome Patrick.

            According to the php bug page,
            [13 Jan 2009 7:23pm UTC] andrei@php.net
            Fixed in PHP_5_2 now as well.

            I guess they forgot php 5.2.1, but I can confirm that I have no trouble (without adding $unicoderegexp = false in two Moodle instances
            running under PHP 5.2.4. So may be enforcing non-unicode regexp for PHP <5.2.4 would be enough.

            As my good friend Joseph said : Feliz año nuevo tambien !

            Show
            ppollet Patrick Pollet added a comment - Eloy said, >If finally the PHP version is the problem (bug above)... we can enforce non-unicode regexp for PHP < 5.2.9 but we need to be sure it's the >cause before anything else. So any feedback will be welcome Patrick. According to the php bug page, [13 Jan 2009 7:23pm UTC] andrei@php.net Fixed in PHP_5_2 now as well. I guess they forgot php 5.2.1, but I can confirm that I have no trouble (without adding $unicoderegexp = false in two Moodle instances running under PHP 5.2.4. So may be enforcing non-unicode regexp for PHP <5.2.4 would be enough. As my good friend Joseph said : Feliz año nuevo tambien !
            Hide
            andyjdavis Andrew Davis added a comment -

            Ok, here is the situation as I understand it.

            First, the initial bug (conversion of urls inside img tags) had a fix committed on the 24th of december 2009.

            The recent rewriting of convert_urls_into_links() (see MDL-20826) has caused us to encounter what may well be a bug in php itself. This bug appears to be present in php 5.2.1 but not in 5.2.4.

            There was a php bug that looks extremely similar to what we are seeing. http://bugs.php.net/bug.php?id=44336 However at the top of that page the version is listed as 5.2.6RC1. I read that to mean this was fixed in 5.2.6 so why does 5.2.4 work? The comment saying it was fixed in the 5.2 branch was in January 2009. The next release after that date was 5.2.9 (http://php.net/releases/index.php) That's all a bit odd.

            There is another bug in php 5.2.1 that looks somewhat similar to what we're seeing. http://bugs.php.net/bug.php?id=40630
            It was fixed in snapshot php5.2-200702261730 in late February (26/02/2007) which would go into 5.2.2 which explains why 5.2.4 works.

            Patrick, when a page times out do you know whether the server's cpu is being hogged by apache/php as described in php bug 40630?

            Forcing the use of the non-unicode regular expressions makes the function work correctly on affected systems. At the least we can construct some criteria and force some systems to use the non-unicode regular expressions.

            Show
            andyjdavis Andrew Davis added a comment - Ok, here is the situation as I understand it. First, the initial bug (conversion of urls inside img tags) had a fix committed on the 24th of december 2009. The recent rewriting of convert_urls_into_links() (see MDL-20826 ) has caused us to encounter what may well be a bug in php itself. This bug appears to be present in php 5.2.1 but not in 5.2.4. There was a php bug that looks extremely similar to what we are seeing. http://bugs.php.net/bug.php?id=44336 However at the top of that page the version is listed as 5.2.6RC1. I read that to mean this was fixed in 5.2.6 so why does 5.2.4 work? The comment saying it was fixed in the 5.2 branch was in January 2009. The next release after that date was 5.2.9 ( http://php.net/releases/index.php ) That's all a bit odd. There is another bug in php 5.2.1 that looks somewhat similar to what we're seeing. http://bugs.php.net/bug.php?id=40630 It was fixed in snapshot php5.2-200702261730 in late February (26/02/2007) which would go into 5.2.2 which explains why 5.2.4 works. Patrick, when a page times out do you know whether the server's cpu is being hogged by apache/php as described in php bug 40630? Forcing the use of the non-unicode regular expressions makes the function work correctly on affected systems. At the least we can construct some criteria and force some systems to use the non-unicode regular expressions.
            Hide
            carco Rosario Carcò added a comment -

            Joseph, Eloy, thanks a lot:

            >>I do hope a solution will be found very soon. On our university site, after the recent upgrade to 1.9.7, this bug prevents images from being displayed in the Quiz and Lesson modules.

            I did the upgrade on beginning of December and resumed work today and teachers and students already started to complain, so I just upgraded from CVS again and it seems to work again. Unfortunately I saw a quiz where the urls got messed up to "> and the like which means some extra work for the teachers to clean up their image tags.
            Rosario

            Show
            carco Rosario Carcò added a comment - Joseph, Eloy, thanks a lot: >>I do hope a solution will be found very soon. On our university site, after the recent upgrade to 1.9.7, this bug prevents images from being displayed in the Quiz and Lesson modules. I did the upgrade on beginning of December and resumed work today and teachers and students already started to complain, so I just upgraded from CVS again and it seems to work again. Unfortunately I saw a quiz where the urls got messed up to "> and the like which means some extra work for the teachers to clean up their image tags. Rosario
            Hide
            rezeau Joseph Rézeau added a comment -

            Rosario,
            I do not understand. If you fixed the bug by upgrading from CVS, things should work normally without the teachers having to change anything in their quiz image links ???

            Show
            rezeau Joseph Rézeau added a comment - Rosario, I do not understand. If you fixed the bug by upgrading from CVS, things should work normally without the teachers having to change anything in their quiz image links ???
            Hide
            carco Rosario Carcò added a comment -

            >>I saw a quiz where the urls got messed up to "> and the like

            Sorry, the html-entities were converted. It should read & quot; & gt; and the like...

            Show
            carco Rosario Carcò added a comment - >>I saw a quiz where the urls got messed up to "> and the like Sorry, the html-entities were converted. It should read & quot; & gt; and the like...
            Hide
            keoghs Sean Keogh added a comment -

            I'm seeing this huge load on several cpus on one of our servers that has several sites that were recently upgraded to 1.9.7. The php error log is continually showing php fatal errors with the maximum execution time exceeded. The file affected always seems to be weblib.php, no metter where it is called from - /course/view.php, /my/index.php, /user/index.php.

            We have the same problem as reported above, in that attempting to view user profiles just seems to hang and eventually times out, sometimes after an image has been deisplayed, sometimes not.

            This server has 4 xeon cpus, 4GB RAM, and us running Centos 5.2, with Apache 2.2.3 and php 5.2.12

            Help!

            Sean K

            Show
            keoghs Sean Keogh added a comment - I'm seeing this huge load on several cpus on one of our servers that has several sites that were recently upgraded to 1.9.7. The php error log is continually showing php fatal errors with the maximum execution time exceeded. The file affected always seems to be weblib.php, no metter where it is called from - /course/view.php, /my/index.php, /user/index.php. We have the same problem as reported above, in that attempting to view user profiles just seems to hang and eventually times out, sometimes after an image has been deisplayed, sometimes not. This server has 4 xeon cpus, 4GB RAM, and us running Centos 5.2, with Apache 2.2.3 and php 5.2.12 Help! Sean K
            Hide
            andyjdavis Andrew Davis added a comment - - edited

            Ok, well 5.2.12 is the very latest version of php so this is almost certainly not being caused by the php bugs Eloy and I found. Something else is going on.

            Eloy, apart from readability is there another reason we have two pairs of regexs? One pair containing unicode modifiers and one pair without them. For Patrick anyway, forcing the use of the regexs without unicode modifiers resolved the issue so maybe we can do that for everyone.

            Sean, would you mind seeing if the same fix resolves the issue for you?
            http://tracker.moodle.org/browse/MDL-21168?focusedCommentId=79154&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_79154

            I've done some testing and there seems to be no difference between the execution times of the two sets of regular expressions although the pair without unicode modifiers do not correctly handle utf 8 characters so there's work to do there if we do want to use that pair all the time.

            Show
            andyjdavis Andrew Davis added a comment - - edited Ok, well 5.2.12 is the very latest version of php so this is almost certainly not being caused by the php bugs Eloy and I found. Something else is going on. Eloy, apart from readability is there another reason we have two pairs of regexs? One pair containing unicode modifiers and one pair without them. For Patrick anyway, forcing the use of the regexs without unicode modifiers resolved the issue so maybe we can do that for everyone. Sean, would you mind seeing if the same fix resolves the issue for you? http://tracker.moodle.org/browse/MDL-21168?focusedCommentId=79154&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_79154 I've done some testing and there seems to be no difference between the execution times of the two sets of regular expressions although the pair without unicode modifiers do not correctly handle utf 8 characters so there's work to do there if we do want to use that pair all the time.
            Hide
            andyjdavis Andrew Davis added a comment - - edited

            Looking back at the PCRE info From Patrick he reported a machine running php 5.2.1, pcre 6.6 and missing the 2 configuration directives is not working. A machine with php 5.2.4 and pcre 7.4 with the directives is working. I've got php 5.2.10 and pcre 7.8 with the directives and its working. Theoretically pcre 6.6 or the missing configuration directives could be the source of the problem.

            http://www.phptutorial.info/?pcre.configuration
            Those directives were introduced in php 5.2.0 so they should be there on the php 5.2.1 machine. That said, if they're missing I would have assumed sensible defaults would have been used.

            Sean, would you mind having a look at the PCRE section of the php info page (Site Administration -> Server -> Php Info) and letting us know what version of pcre you have and whether or not the 2 directives are set and what their values are? (pcre.backtrack_limit and pcre.recursion_limit)

            Show
            andyjdavis Andrew Davis added a comment - - edited Looking back at the PCRE info From Patrick he reported a machine running php 5.2.1, pcre 6.6 and missing the 2 configuration directives is not working. A machine with php 5.2.4 and pcre 7.4 with the directives is working. I've got php 5.2.10 and pcre 7.8 with the directives and its working. Theoretically pcre 6.6 or the missing configuration directives could be the source of the problem. http://www.phptutorial.info/?pcre.configuration Those directives were introduced in php 5.2.0 so they should be there on the php 5.2.1 machine. That said, if they're missing I would have assumed sensible defaults would have been used. Sean, would you mind having a look at the PCRE section of the php info page (Site Administration -> Server -> Php Info) and letting us know what version of pcre you have and whether or not the 2 directives are set and what their values are? (pcre.backtrack_limit and pcre.recursion_limit)
            Hide
            syxton Matthew Davidson added a comment -

            patch.txt is the difference report
            patched.php is the patched version of weblib.

            Just fixed the regex to exclude urls that are inside double and single quotes or have an = sign in front of them

            Show
            syxton Matthew Davidson added a comment - patch.txt is the difference report patched.php is the patched version of weblib. Just fixed the regex to exclude urls that are inside double and single quotes or have an = sign in front of them
            Hide
            andyjdavis Andrew Davis added a comment -

            Hi Matthew. That looks like it could be a step in the right direction but it still needs some tweaking. I'm seeing 39 unit tests failing with those regexs. Its getting towards the end of the day so I won't have time to look at this until tomorrow morning.

            In case you want to take a look the failures are mostly fairly small things like instead of producing this
            <br />This is some text. <a href="http://www.moodle.com" target="_blank">www.moodle.com</a> then some more text<br />
            It's coming out as
            <br />This is some text.<a href="http:// www.moodle.com" target="_blank"> www.moodle.com</a>then some more text<br />
            Note that the space before the anchor tag is now appearing within the tag after http:// and before www.moodle.com

            For anyone working on these regexs you can go Site Administration -> Reports -> Unit Tests and run the tests in lib/simpletest/testweblib.php (rather than the full suite of unit tests). We've got a pretty comprehensive set of tests that include all sorts of common and not so common cases

            Show
            andyjdavis Andrew Davis added a comment - Hi Matthew. That looks like it could be a step in the right direction but it still needs some tweaking. I'm seeing 39 unit tests failing with those regexs. Its getting towards the end of the day so I won't have time to look at this until tomorrow morning. In case you want to take a look the failures are mostly fairly small things like instead of producing this <br />This is some text. <a href="http://www.moodle.com" target="_blank">www.moodle.com</a> then some more text<br /> It's coming out as <br />This is some text.<a href="http:// www.moodle.com" target="_blank"> www.moodle.com</a>then some more text<br /> Note that the space before the anchor tag is now appearing within the tag after http:// and before www.moodle.com For anyone working on these regexs you can go Site Administration -> Reports -> Unit Tests and run the tests in lib/simpletest/testweblib.php (rather than the full suite of unit tests). We've got a pretty comprehensive set of tests that include all sorts of common and not so common cases
            Hide
            syxton Matthew Davidson added a comment -

            I have improved it some..I'm still trying to figure out the test failures. Some of the URL's that it tests come back fine if I try them in a Moodle course but the test fails. Down to 20 fails and the space is gone.

            Show
            syxton Matthew Davidson added a comment - I have improved it some..I'm still trying to figure out the test failures. Some of the URL's that it tests come back fine if I try them in a Moodle course but the test fails. Down to 20 fails and the space is gone.
            Hide
            andreabix Andrea Bicciolo added a comment -

            I can confirm the problem is still present on many sites even upgrading to the latest weekly build 1.9.7+ (Build: 20100106). We are running PHP 5.2.11. On the editor parsing is corrct, while when displaying some source code is still visible. Sample sourcce code from the editor:

            <table width="100%" cellspacing="0" cellpadding="0" border="0"><tbody>
            <tr>
            <td><img width="176" hspace="0" height="30" border="0" src="http://mysite.tld/file.php/6/img_prodotti/tab-uno.gif" alt="Prodotti" title="Prodotti company" />
            </td>
            </tr>
            <tr>
            <td background="http://mysite.tld/file.php/6/img_prodotti/bg_flex_prodotti.gif"><br />
            </td>
            </tr></tbody>
            </table>

            Yields the first image displayed but not the second background image which is shown as URL.

            Show
            andreabix Andrea Bicciolo added a comment - I can confirm the problem is still present on many sites even upgrading to the latest weekly build 1.9.7+ (Build: 20100106). We are running PHP 5.2.11. On the editor parsing is corrct, while when displaying some source code is still visible. Sample sourcce code from the editor: <table width="100%" cellspacing="0" cellpadding="0" border="0"><tbody> <tr> <td><img width="176" hspace="0" height="30" border="0" src="http://mysite.tld/file.php/6/img_prodotti/tab-uno.gif" alt="Prodotti" title="Prodotti company" /> </td> </tr> <tr> <td background="http://mysite.tld/file.php/6/img_prodotti/bg_flex_prodotti.gif"><br /> </td> </tr></tbody> </table> Yields the first image displayed but not the second background image which is shown as URL.
            Hide
            keoghs Sean Keogh added a comment -

            Hi Andrew,

            I put in the fix that you linked to, and I can now view user profiles on the site without any problem, and the cpu usage seems to stay normal, rather than shooting up to 100%.

            I have asked the site owner to give it a good test, and I'll report back.

            Sean K

            Show
            keoghs Sean Keogh added a comment - Hi Andrew, I put in the fix that you linked to, and I can now view user profiles on the site without any problem, and the cpu usage seems to stay normal, rather than shooting up to 100%. I have asked the site owner to give it a good test, and I'll report back. Sean K
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Apart from the PHP/PCRE problems (that I think we should move to another bug), I've performed some changes to improve detection of links within images, backgrounds, javascripts.... so since some minutes ago any URL preceded by =" (equal and quote) won't be auto-linked any more.

            Also I've performed some small changes to the tests, Andrew, taking out all that utf_decode() and urldecode() stuff. Right now all the tests (but the 2 encoded entity ones ) are passing ok under unicode. If unicode isn't available 2 more tests will fail.

            In fact, Andrew... I'm not sure if the encoded entity test are correct. Really we want those URLs not being converted... uhm... not sure.

            So... plz:

            1) Andrea, plz test the fix solves your problem.
            2) Andrew, can you add some more tests, processing examples like the one from Andrea above, some JS having urls... whatever you can imagine.
            3) Andrew (and the rest of people having problems with PHP/PCRE)... could we move that problem to another bug? Just to keep things separated or this is going to become crazily mixed.

            That's all. Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Apart from the PHP/PCRE problems (that I think we should move to another bug), I've performed some changes to improve detection of links within images, backgrounds, javascripts.... so since some minutes ago any URL preceded by =" (equal and quote) won't be auto-linked any more. Also I've performed some small changes to the tests, Andrew, taking out all that utf_decode() and urldecode() stuff. Right now all the tests (but the 2 encoded entity ones ) are passing ok under unicode. If unicode isn't available 2 more tests will fail. In fact, Andrew... I'm not sure if the encoded entity test are correct. Really we want those URLs not being converted... uhm... not sure. So... plz: 1) Andrea, plz test the fix solves your problem. 2) Andrew, can you add some more tests, processing examples like the one from Andrea above, some JS having urls... whatever you can imagine. 3) Andrew (and the rest of people having problems with PHP/PCRE)... could we move that problem to another bug? Just to keep things separated or this is going to become crazily mixed. That's all. Ciao
            Hide
            syxton Matthew Davidson added a comment -

            I just tested the change. You might try my patched2.php version. It is still ripping out URL's that are inside <td background=""></td> tags

            Show
            syxton Matthew Davidson added a comment - I just tested the change. You might try my patched2.php version. It is still ripping out URL's that are inside <td background=""></td> tags
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Hi Matthew,

            <td background="http://some.url"></td> seems to be working ok here, as far as it doesn't get converted.

            Anyway, I've detected that when one url has this form:

            http://www.some.url

            Then it gets converted twice, one by the "http" regexp and another by the "www" regexp. Obviously, between both regexps... we should save <a href> tags again... but instead I've added one shortcut in the "www" regexp so it is ignored if preceded by // (two slashes).

            I've added some more tests that are passing ok.

            Looking at your patch2... it seems that I've used a 99% equivalent solution, looking for 2 chars instead of 1, but basically the same. Thanks!

            If you find other examples exhibiting problems, please, post them here. We can add them to the tests and find one solution.

            Right now, the only situation were I can imagine it failing is something like:

            <img src=http://moodle.org />

            i.e. tags where the quotes are missing, but that is incorrect html IMO and I think we shouldn't support it at all.

            Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Hi Matthew, <td background="http://some.url"></td> seems to be working ok here, as far as it doesn't get converted. Anyway, I've detected that when one url has this form: http://www.some.url Then it gets converted twice, one by the "http" regexp and another by the "www" regexp. Obviously, between both regexps... we should save <a href> tags again... but instead I've added one shortcut in the "www" regexp so it is ignored if preceded by // (two slashes). I've added some more tests that are passing ok. Looking at your patch2... it seems that I've used a 99% equivalent solution, looking for 2 chars instead of 1, but basically the same. Thanks! If you find other examples exhibiting problems, please, post them here. We can add them to the tests and find one solution. Right now, the only situation were I can imagine it failing is something like: <img src= http://moodle.org /> i.e. tags where the quotes are missing, but that is incorrect html IMO and I think we shouldn't support it at all. Ciao
            Hide
            andyjdavis Andrew Davis added a comment -

            I have created MDL-21296 for the slowdown/timeout issue. Anymore information/comments related to that should be made there

            Show
            andyjdavis Andrew Davis added a comment - I have created MDL-21296 for the slowdown/timeout issue. Anymore information/comments related to that should be made there
            Hide
            andyjdavis Andrew Davis added a comment -

            I have committed a few more tests.

            Javascript like
            var url = "http://moodle.org";
            is currently being turned into
            var url = "<a href="http://moodle.org" target="_blank">http://moodle.org</a>";

            I'm not entirely sure anymore whether we care about processing escaped text in this way. I have however created MDL-21183 to look into this specifically so feel free to comment over there if you have an opinion either way.

            Show
            andyjdavis Andrew Davis added a comment - I have committed a few more tests. Javascript like var url = "http://moodle.org"; is currently being turned into var url = "<a href="http://moodle.org" target="_blank"> http://moodle.org </a>"; I'm not entirely sure anymore whether we care about processing escaped text in this way. I have however created MDL-21183 to look into this specifically so feel free to comment over there if you have an opinion either way.
            Hide
            korpelainen Mauno Korpelainen added a comment -

            A very silly question:

            WHY do we need function convert_urls_into_links in moodle ?

            ( = Is there some good reason for converting existing urls to links? )

            Show
            korpelainen Mauno Korpelainen added a comment - A very silly question: WHY do we need function convert_urls_into_links in moodle ? ( = Is there some good reason for converting existing urls to links? )
            Hide
            rezeau Joseph Rézeau added a comment -

            Hi Mauno,
            That's a very good question, not a silly one.
            Why on earth do we need this, considering all the havoc it's causing? What are the benefits?
            Joseph

            Show
            rezeau Joseph Rézeau added a comment - Hi Mauno, That's a very good question, not a silly one. Why on earth do we need this, considering all the havoc it's causing? What are the benefits? Joseph
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Hi Mauno, Joseph,

            the reason to have that function is simple: FORMAT_MOODLE uses it. Just that.

            The key here is that, under Moodle 1.9 and before, a lot of text fields are missing their corresponding "format" field, so FORMAT_MOODLE is applied to them (no matter if have been created with the html editor. So we need to have it working ok.

            Under Moodle 2.0, each text field will have its associated "format" field, so html fields won't be processed by this function any more and only real FORMAT_MOODLE fields will continue using it.

            That's all. Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Hi Mauno, Joseph, the reason to have that function is simple: FORMAT_MOODLE uses it. Just that. The key here is that, under Moodle 1.9 and before, a lot of text fields are missing their corresponding "format" field, so FORMAT_MOODLE is applied to them (no matter if have been created with the html editor. So we need to have it working ok. Under Moodle 2.0, each text field will have its associated "format" field, so html fields won't be processed by this function any more and only real FORMAT_MOODLE fields will continue using it. That's all. Ciao
            Hide
            korpelainen Mauno Korpelainen added a comment -

            Thank You for the explanation, Eloy!

            I have never had problems with profile images or timeout and I have not intensively used moodle auto-format but as it seems to be used in many places as default format I tested some examples - composed a text page and saved it in Moodle auto-format.

            If slash arguments are disabled (Administration > Server > HTTP) and you copy address of some image from image plugin of editor or for example view course files and copy the address from address bar and paste them to document using moodle auto-format links are not always complete:

            http://127.0.0.1/m2010/file.php?file=/2/albert.gif

            would work but this is not what we get for address - we get

            http://127.0.0.1/m2010/file.php?file=%2F2%2Falbert.gif (slashes / are replaced with %2F before converting to moodle auto format)

            and link is cut after "...file=" - or let's say before %2F.

            The same way if I paste for example a Coogle Chart API tex image address like

            http://chart.apis.google.com/chart?cht=tx&chco=000000&chf=bg,s,FFFFFF00&chl=%5Cint_%7B1%7D%5E%7B2%7Dx%5C%2Cdx

            or the same example without pre-urlencoded %5C ( \ ), %7B (

            { ) and %7D ( }

            )

            http://chart.apis.google.com/chart?cht=tx&chco=000000&chf=bg,s,FFFFFF00&chl=\int_

            {1}

            ^

            {2}

            x%2Cdx

            they both get cut after "chl="

            If I try a Google Chart example:

            http://chart.apis.google.com/chart?cht=p3&chd=t:60,40&chs=250x100&chl=Hello|World

            the link is cut before "|World"

            Show
            korpelainen Mauno Korpelainen added a comment - Thank You for the explanation, Eloy! I have never had problems with profile images or timeout and I have not intensively used moodle auto-format but as it seems to be used in many places as default format I tested some examples - composed a text page and saved it in Moodle auto-format. If slash arguments are disabled (Administration > Server > HTTP) and you copy address of some image from image plugin of editor or for example view course files and copy the address from address bar and paste them to document using moodle auto-format links are not always complete: http://127.0.0.1/m2010/file.php?file=/2/albert.gif would work but this is not what we get for address - we get http://127.0.0.1/m2010/file.php?file=%2F2%2Falbert.gif (slashes / are replaced with %2F before converting to moodle auto format) and link is cut after "...file=" - or let's say before %2F. The same way if I paste for example a Coogle Chart API tex image address like http://chart.apis.google.com/chart?cht=tx&chco=000000&chf=bg,s,FFFFFF00&chl=%5Cint_%7B1%7D%5E%7B2%7Dx%5C%2Cdx or the same example without pre-urlencoded %5C ( \ ), %7B ( { ) and %7D ( } ) http://chart.apis.google.com/chart?cht=tx&chco=000000&chf=bg,s,FFFFFF00&chl=\int_ {1} ^ {2} x%2Cdx they both get cut after "chl=" If I try a Google Chart example: http://chart.apis.google.com/chart?cht=p3&chd=t:60,40&chs=250x100&chl=Hello |World the link is cut before "|World"
            Hide
            korpelainen Mauno Korpelainen added a comment -

            And you can see the same behaviour here in tracker in those last two examples...the first cut link is here broken before "^

            {2}

            " (pre-urlencoded version "%5E%7B2%7D" works ok )

            That Google Chart API example is taken directly from google chart api docs.

            Show
            korpelainen Mauno Korpelainen added a comment - And you can see the same behaviour here in tracker in those last two examples...the first cut link is here broken before "^ {2} " (pre-urlencoded version "%5E%7B2%7D" works ok ) That Google Chart API example is taken directly from google chart api docs.
            Hide
            korpelainen Mauno Korpelainen added a comment -

            Result is shown in attached image Broken_links.jpg

            Show
            korpelainen Mauno Korpelainen added a comment - Result is shown in attached image Broken_links.jpg
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Ah, crap, well spotted!

            We were already supporting urlencoded chars is the path (and had a bunch of tests for it), like:

            http://some.domain.com/%28path%29/script?param=value

            but weren't supporting them in the query part:

            http://some.domain.com/path/to/script?param=%28value%29

            Easy to fix... adding your examples as tests and fixing it... ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Ah, crap, well spotted! We were already supporting urlencoded chars is the path (and had a bunch of tests for it), like: http://some.domain.com/%28path%29/script?param=value but weren't supporting them in the query part: http://some.domain.com/path/to/script?param=%28value%29 Easy to fix... adding your examples as tests and fixing it... ciao
            Hide
            stronk7 Eloy Lafuente (stronk7) added a comment -

            Fixed. Now the query part of the URL supports encoded chars.

            Note that both "backslashes" and "pipes" aren't supported and they always should be present it their "encoded" way (following the RFCs). So the user should be introducing it properly (html editor encodes and copying URLS should be encoded too).

            I know this breaks some URLs, because some browsers display them urlencoded and so on, but it's impossible to know if the url being pasted is already encoded or no, and that would introduce one more step (at least) to the processor. So, trying to be sticky as possible with the RFCs only encoded versions of "reserved" and "dangerous" chars are allowed.

            I think I'll be closing this soon as far as it already supports far more URLs than the old version. For remaining problems, we should continue at:

            MDL-21296 : Slowdown problem with some OS/PHP/PCRE versions.
            MDL-21183 : Processing of javascript and/or htmlentities links

            Ciao

            Show
            stronk7 Eloy Lafuente (stronk7) added a comment - Fixed. Now the query part of the URL supports encoded chars. Note that both "backslashes" and "pipes" aren't supported and they always should be present it their "encoded" way (following the RFCs). So the user should be introducing it properly (html editor encodes and copying URLS should be encoded too). I know this breaks some URLs, because some browsers display them urlencoded and so on, but it's impossible to know if the url being pasted is already encoded or no, and that would introduce one more step (at least) to the processor. So, trying to be sticky as possible with the RFCs only encoded versions of "reserved" and "dangerous" chars are allowed. I think I'll be closing this soon as far as it already supports far more URLs than the old version. For remaining problems, we should continue at: MDL-21296 : Slowdown problem with some OS/PHP/PCRE versions. MDL-21183 : Processing of javascript and/or htmlentities links Ciao
            Hide
            andyjdavis Andrew Davis added a comment -

            Thanks for those fixes Eloy and everyone else. I'm going to go ahead and call this resolved. We're certainly now handling image tags which was the initial issue. Plus it's getting tough to keep the various issues reported here straight in my head.

            For other issues either go to MDL-21168, MDL-21296 or open a new issue

            Show
            andyjdavis Andrew Davis added a comment - Thanks for those fixes Eloy and everyone else. I'm going to go ahead and call this resolved. We're certainly now handling image tags which was the initial issue. Plus it's getting tough to keep the various issues reported here straight in my head. For other issues either go to MDL-21168 , MDL-21296 or open a new issue
            Hide
            dougiamas Martin Dougiamas added a comment - - edited

            I am just tagging 1.9.8 but we have two unit test fails, both in this new work.

            Fail: lib/simpletest/testweblib.php / ? web_test / ? test_convert_urls_into_links / ?
            Testing text: http://Iñtërnâtiônà lizætiøn.com?ô=nëø: Equal expectation fails at character 0 with http://Iñtërnâtiônà lizætiøn.com?ô=nëø and [<a href="http://Iñtërnâtiônà lizætiøn.com?ô=nëø" target="_blank">http://Iñtërnâtiônà lizætiøn.com?ô=nëø</a>] at [/web/19/lib/simpletest/testweblib.php line 212]

            Fail: lib/simpletest/testweblib.php / ? web_test / ? test_convert_urls_into_links / ?
            Testing text: www.Iñtërnâtiônà lizætiøn.com?ô=nëø: Equal expectation fails at character 0 with [www.Iñtërnâtiônà lizætiøn.com?ô=nëø] and [<a href="http://www.Iñtërnâtiônà lizætiøn.com?ô=nëø" target="_blank">www.Iñtërnâtiônà lizætiøn.com?ô=nëø</a>] at [/web/19/lib/simpletest/testweblib.php line 212]

            I can't see what the problem is, it might just be my PHP install. I'll ignore them for now.

            Show
            dougiamas Martin Dougiamas added a comment - - edited I am just tagging 1.9.8 but we have two unit test fails, both in this new work. Fail: lib/simpletest/testweblib.php / ? web_test / ? test_convert_urls_into_links / ? Testing text: http://I ñtërnâtiônà lizætiøn.com?ô=nëø: Equal expectation fails at character 0 with http://Iñtërnâtiônà lizætiøn.com?ô=nëø and [<a href="http://Iñtërnâtiônà lizætiøn.com?ô=nëø" target="_blank">http://Iñtërnâtiônà lizætiøn.com?ô=nëø</a>] at [/web/19/lib/simpletest/testweblib.php line 212] Fail: lib/simpletest/testweblib.php / ? web_test / ? test_convert_urls_into_links / ? Testing text: www.Iñtërnâtiônà lizætiøn.com?ô=nëø: Equal expectation fails at character 0 with [www.Iñtërnâtiônà lizætiøn.com?ô=nëø] and [<a href="http://www.Iñtërnâtiônà lizætiøn.com?ô=nëø" target="_blank">www.Iñtërnâtiônà lizætiøn.com?ô=nëø</a>] at [/web/19/lib/simpletest/testweblib.php line 212] I can't see what the problem is, it might just be my PHP install. I'll ignore them for now.

              People

              • Votes:
                8 Vote for this issue
                Watchers:
                15 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:
                  Fix Release Date:
                  25/Mar/10