  1. Moodle
  2. MDL-54853

Messages are broken for non-latin letters

Details

    • MOODLE_31_STABLE
    • MOODLE_30_STABLE, MOODLE_31_STABLE
    • MDL-54853-master
    • Use:

      $text = mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8');

      before:

      $domdoc = new DOMDocument();

    • It would be prudent to run these tests on Windows as well, since libxml may behave differently there.

      Follow the testing instructions from MDL-37138

      Test sending messages between two users, and put funny characters in the message. You can pick some from here

      Also test the chat module.

    • 3.2 Sprint 1

    Description

      After the last weekly update, messages are unusable: non-Latin letters are corrupted during the filter operation in the "format_text" function of "/lib/weblib.php". The cause is this part of the code:

      if ($options['blanktarget']) {
          $domdoc = new DOMDocument();
          $domdoc->loadHTML($text);
          foreach ($domdoc->getElementsByTagName('a') as $link) {
              if ($link->hasAttribute('target') && strpos($link->getAttribute('target'), '_blank') === false) {
                  continue;
              }
              $link->setAttribute('target', '_blank');
              if (strpos($link->getAttribute('rel'), 'noreferrer') === false) {
                  $link->setAttribute('rel', trim($link->getAttribute('rel') . ' noreferrer'));
              }
          }

          // This regex is nasty and I don't like it. The correct way to solve this is by loading the HTML like so:
          // $domdoc->loadHTML($text, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD); however it seems like the libxml
          // version that travis uses doesn't work properly and ends up leaving <html><body>, so I'm forced to use
          // this regex to remove those tags.
          $text = trim(preg_replace('~<(?:!DOCTYPE|/?(?:html|body))[^>]*>\s*~i', '', $domdoc->saveHTML()));
      }
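
A minimal sketch of the quoted branch with an entity-encoding step added before the DOM is built, in the spirit of the workaround listed in the details. The function name blanktarget_utf8_safe is hypothetical and not part of Moodle. One deliberate swap: the workaround's mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8') call is deprecated as of PHP 8.2, so the sketch uses mb_encode_numericentity() instead, on the assumption that numeric entities serve the same purpose here. This is not necessarily the fix that was integrated.

```php
<?php
// Hypothetical helper (not Moodle code): the 'blanktarget' logic quoted above,
// with non-ASCII characters entity-encoded before loadHTML() sees them.
function blanktarget_utf8_safe(string $text): string {
    // Without an encoding hint, loadHTML() does not treat the input as UTF-8.
    // Encoding every character >= 0x80 as a numeric entity leaves pure ASCII.
    $text = mb_encode_numericentity($text, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');

    $domdoc = new DOMDocument();
    $domdoc->loadHTML($text);
    foreach ($domdoc->getElementsByTagName('a') as $link) {
        // Leave links alone if they name a target other than _blank.
        if ($link->hasAttribute('target') && strpos($link->getAttribute('target'), '_blank') === false) {
            continue;
        }
        $link->setAttribute('target', '_blank');
        if (strpos($link->getAttribute('rel'), 'noreferrer') === false) {
            $link->setAttribute('rel', trim($link->getAttribute('rel') . ' noreferrer'));
        }
    }

    // Same cleanup as the quoted code: strip the doctype, <html> and <body> wrappers.
    return trim(preg_replace('~<(?:!DOCTYPE|/?(?:html|body))[^>]*>\s*~i', '', $domdoc->saveHTML()));
}

// Usage: the result may carry the Cyrillic text as entities rather than raw
// UTF-8, which a browser renders identically.
echo blanktarget_utf8_safe('Привет <a href="http://example.com">мир</a>'), PHP_EOL;
```

Because the serializer may emit entities instead of raw characters, comparisons in tests are safer after html_entity_decode() than against the raw string.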
      

      Attachments

        1. actual.png (7 kB)
        2. expected.png (7 kB)

            People

              cameron1729
              evsoldatkin Evgeny Soldatkin
              Marina Glancy
              Andrew Lyons
              Andrew Lyons
              Amaia Anabitarte, Carlos Escobedo, Ferran Recio, Ilya Tregubov, Laurent David, Raquel Ortega, Sara Arjona (@sarjona)
              Votes: 1
              Watchers: 8

              Dates

                Created:
                Updated:
                Resolved:
                11/Jul/16