Moodle / MDL-8357

allow <span>...</span> in multilang block

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.8
    • Fix Version/s: DEV backlog
    • Component/s: Filters
    • Labels:
    • Affected Branches:
      MOODLE_18_STABLE
    • Rank:
      507

      Description

      Hi filter fans,

      I would like to suggest an improvement to the multilang filter which would allow <span>...</span> WITHIN the start and finish tags for the language blocks.

      Currently, it is not possible to include <span>...</span> tags within a language block, because the RegEx to match the language blocks will terminate the block on the first matching </span> tag.
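The early termination described above can be reproduced with a minimal sketch. The pattern below is a simplified single-block version written for illustration (an assumption, not the exact line from filter.php), and the sample text is made up.

```php
<?php
// Simplified single-block version of the "new syntax" pattern (illustrative only).
$pattern = '/<span lang="([a-zA-Z0-9_-]+)" class="multilang">(.*?)<\/span>/is';

// A language block that contains an ordinary formatting <span>.
$text = '<span lang="en" class="multilang">Some <span style="font-weight:bold">bold</span> text</span>';

preg_match($pattern, $text, $matches);

// The lazy ".*?" stops at the first "</span>", which belongs to the inner tag,
// so the captured content is cut short.
$captured = $matches[2];
echo $captured, "\n";
```

The captured content ends inside the inner span, so the text after it is left outside the language block.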

      The solution is short and does not require changing a lot of lines of code.

      Looking solely at the RegEx for the "new syntax" in "filter/multilang/filter.php", I suggest changing the following line (around line 46).

      Change this:

      ============
      $search = '/(<span lang="[a-zA-Z0-9_-]+" class="multilang">.*?<\/span>)(\s*<span lang="[a-zA-Z0-9_-]+" class="multilang">.*?<\/span>)+/is';
      ============

      to this:

      ============
      $search = '/(<(div|span) lang="[a-zA-Z0-9_-]+" class="multilang">.*?<\/\2>)(\s*<\2 lang="[a-zA-Z0-9_-]+" class="multilang">.*?<\/\2>)+/is';
      ============

      Notice the use of "\2". Instead of looking for the closing </span> tag in ALL cases, the RegEx will look for the closing tag of whatever type of tag opened the language block.
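To illustrate the back-reference, here is a minimal sketch using a single-block pattern with the tag list hard-coded as div|span (an assumption for illustration). Because this simplified pattern has no outer group, the tag name is group 1 and the back-reference is "\1" rather than "\2".

```php
<?php
// "\1" refers back to whichever tag name (div or span) opened the block.
$pattern = '/<(div|span) lang="([a-zA-Z0-9_-]+)" class="multilang">(.*?)<\/\1>/is';

// A <div> language block containing an ordinary formatting <span>.
$text = '<div lang="en" class="multilang">Some <span style="font-weight:bold">bold</span> text</div>';

preg_match($pattern, $text, $matches);

$tag     = $matches[1]; // tag that opened the block
$lang    = $matches[2]; // language code
$content = $matches[3]; // the inner </span> no longer ends the block
```

Note that this only helps when the outer and inner tags differ: a div nested inside a div language block would still end the block at the first closing div.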

      You could go further and make the type of tag a config variable, $CFG->filter_multilang_tagtypes, that could be modified via "filter/multilang/filterconfig.html". Then the RegEx would become:

      ============
      $search = '/(<('.$CFG->filter_multilang_tagtypes.') lang="[a-zA-Z0-9_-]+" class="multilang">.*?<\/\2>)(\s*<\2 lang="[a-zA-Z0-9_-]+" class="multilang">.*?<\/\2>)+/is';
      ============

      A corresponding change would also be needed in the "multilang_filter_impl" function (around line 72).

      Change this:

      ============
      $searchtosplit = '/<span lang="([a-zA-Z0-9_-]+)" class="multilang">(.*?)<\/span>/is';
      ============

      To this:

      ============
      $searchtosplit = '/<('.$CFG->filter_multilang_tagtypes.') lang="([a-zA-Z0-9_-]+)" class="multilang">(.*?)<\/\1>/is';
      ============

      The above change would in turn require incrementing the indexes on the "$rawlanglist" array (around lines 81-84).

      Change this:

      ============
      foreach ($rawlanglist[1] as $index=>$lang) {
          $lang = str_replace('utf8', '', str_replace('-','',strtolower($lang))); // normalize languages
          $langlist[$lang] = $rawlanglist[2][$index];
      }
      ============

      to this:

      ============
      foreach ($rawlanglist[2] as $index=>$lang) {
          $lang = str_replace('utf8', '', str_replace('-','',strtolower($lang))); // normalize languages
          $langlist[$lang] = $rawlanglist[3][$index];
      }
      ============
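The index shift can be seen in a small standalone sketch. The tag list is hard-coded as div|span in place of the $CFG setting, and the sample text is invented; both are assumptions for illustration.

```php
<?php
$text = '<div lang="en" class="multilang">Hello</div>'
      . '<div lang="ja-utf8" class="multilang">Konnichiwa</div>';

// The new first group captures the tag name, pushing the other groups up by one.
$searchtosplit = '/<(div|span) lang="([a-zA-Z0-9_-]+)" class="multilang">(.*?)<\/\1>/is';
preg_match_all($searchtosplit, $text, $rawlanglist);

// $rawlanglist[0] = full matches, [1] = tag names,
// [2] = language codes,           [3] = block contents
$langlist = array();
foreach ($rawlanglist[2] as $index => $lang) {
    $lang = str_replace('utf8', '', str_replace('-', '', strtolower($lang))); // normalize languages
    $langlist[$lang] = $rawlanglist[3][$index];
}
```

With the extra group in place, reading index 1 would return the tag names instead of the language codes, which is why both indexes move up by one.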

      As I said, these are small changes, with no hit on performance, no ill-effects on current sites, but a big increase in flexibility :-D

      best regards
      Gordon

      P.S. Similar changes could be made to "old syntax" RegEx statements, to allow the same flexibility to Moodle 1.7 and earlier

        Activity

        Martin Dougiamas added a comment -

        Fantastic! +1

        Petr, can you please confirm and check this in?

        Petr Škoda added a comment -

        $CFG->filter_multilang_tagtypes would require some changes in handling of PARAM_RAW

        If I understand it correctly you just want some tag that is not used so that there can be anything in multilang span, I guess you would want to configure $CFG->filter_multilang_tagtypes to contain <lang?

        Why not fix the code so that it deals with nested tags properly?

        Petr Škoda added a comment -

        Thinking more about this I miss the old <lang tag

        Hmm, fixing the nested span tags would not help much the html validation of raw text because span can not contain div tag anyway - which is what you might want it for - you want to enclose large blocks of html in multilang blocks, right?

        What was the exact reason why we replaced <lang with <span ? Was it because we wanted correct HTML when the filter was turned off? If that was the case we could only strip the lang tag in format_string and format_text if multilang filter was not present, couldn't we? Or was it because of the htmlarea editor?

        Anyway my +1 for the changes above without the filter_multilang_tagtypes. I will commit it tomorrow if agreed.

        Gordon Bateson added a comment -

        > Why not fix the code so that it deals with nested tags properly?

        I thought that might require a complicated RegEx and a lot more work for PHP, with a resulting performance hit, so I wasn't confident enough to suggest it, but it would be a very elegant solution, if it didn't require the PHP engine going crazy trying to build a mini DOM!
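For reference, handling nesting does not necessarily need a DOM. One possible approach is to scan the span tags and count depth, as in the rough sketch below. The function name find_matching_span_close is hypothetical, nothing here is taken from the actual filter code, and no claim is made about its performance on real content.

```php
<?php
/**
 * Hypothetical helper: given the offset just after an opening <span ...> tag,
 * return the offset of the </span> that closes it, counting nested spans.
 * Returns false if no matching closing tag is found.
 */
function find_matching_span_close($text, $start) {
    $depth = 1;
    $offset = $start;
    // Each iteration finds the next opening <span ...> or closing </span>.
    while (preg_match('/<(\/?)span\b[^>]*>/i', $text, $m, PREG_OFFSET_CAPTURE, $offset)) {
        $tagpos = $m[0][1];
        $isclose = ($m[1][0] === '/');
        $depth += $isclose ? -1 : 1;
        if ($depth === 0) {
            return $tagpos; // this </span> balances the opening tag
        }
        $offset = $tagpos + strlen($m[0][0]);
    }
    return false;
}

$open = '<span lang="en" class="multilang">';
$text = $open . 'Some <span style="font-weight:bold">bold</span> text</span> tail';

$start = strlen($open);
$close = find_matching_span_close($text, $start);
$content = substr($text, $start, $close - $start);
```

Here $content holds the whole language block, including the inner formatting span.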

        > you want to enclose large blocks of html in multilang blocks, right?

        Yes, that's exactly what I would like to do. To be precise it was my desire to include a <TABLE> with some text in it marked as <span style="font-weight:bold">BOLD TEXT</span>, which got me started on this line of thought.

        I suppose the $CFG->filter_multilang_tagtypes would not contain actual tags, just a "|" delimited list of tag names such as "span|div", so it could be validated as PARAM_CLEAN, which would be a small improvement on PARAM_RAW, but I see the problem with the cleaning of the input. There is no suitable PARAM_xxx at the moment.

        Anyway, even if the tag names are hard coded into a variable in "filter/multilang/filter.php", or a $CFG property in config.php, it would help.

        many thanks for considering this idea and responding so quickly!

        cheers
        Gordon

        Gordon Bateson added a comment -

        I have discovered that "admin/filter.php" allows for individual filters to have their own function to check and save parameters, as long as there is a function called "multilang_process_config" in a script called "filterconfig.php" in the filter's subdirectory.

        The following amendments would allow the setting and maintenance of $CFG->filter_multilang_tagnames.

        =====
        add to:
        filter/multilang/defaultsettings.php
        ===========================
        if (!isset($CFG->filter_multilang_tagnames) or $forcereset) {
            set_config('filter_multilang_tagnames', 'div|span');
        }

        =====
        add to:
        lang/en_utf8/admin.php
        ===================
        $string['multilangtagnames'] = 'List of allowable tag names, separated by a space or any non-alphabetic character';

        =====
        add to:
        filter/multilang/filterconfig.html
        ========================
        <tr valign="top">
        <td align="right"><?php print_string('multilangtagnames', 'admin'); ?></td>
        <td><input type="text" name="filter_multilang_tagnames" size="50"
        value="<?php p($CFG->filter_multilang_tagnames); ?>" /></td>
        </tr>

        =====
        create:
        filter/multilang/filterconfig.php
        ========================
        <?php
        // check the default settings
        require_once "defaultsettings.php";

        function multilang_process_config($config) {
            $name = 'filter_multilang_force_old';
            set_config($name, clean_param($config->$name, PARAM_BOOL));

            $name = 'filter_multilang_tagnames';
            if (preg_match_all('/[a-z]+/i', $config->$name, $values)) {
                // standard Moodle param cleaning is not really necessary
                //$value[0] = clean_param($values[0], PARAM_ALPHA);
                $value = implode('|', $values[0]);
            } else {
                $value = '';
            }
            set_config($name, $value);
        }
        ?>

        Gordon Bateson added a comment -

        to ensure lower case tag names:

        if (preg_match_all('/[a-z]+/', strtolower($config->$name), $values)) {
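The cleaning step can be tried outside Moodle with a small sketch. The function name multilang_clean_tagnames is hypothetical and set_config() is left out so that the snippet runs on its own.

```php
<?php
// Hypothetical standalone version of the cleaning step (no Moodle calls).
function multilang_clean_tagnames($raw) {
    // Lower-case the input, keep only runs of letters, and join them with "|".
    if (preg_match_all('/[a-z]+/', strtolower($raw), $values)) {
        return implode('|', $values[0]);
    }
    return '';
}

$cleaned = multilang_clean_tagnames('DIV, span;Lang');
```

One thing to be aware of: because only letters are kept, a tag name containing a digit is truncated, so "h1" would be stored as "h".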

        Petr Škoda added a comment -

        thinking a bit more about it
        1/ how would we deal with this in backup/restore/export/import? Different settings would break multilang text.
        2/ there are internal regex limitations too - if you have many nested tags in multilang block the regex engine panics and returns NULL - see MDL-10155

        The second problem is IMHO a show stopper.
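For context, PHP's preg_replace() is documented to return NULL when the regex engine hits an error such as its backtrack limit, and preg_last_error() reports the cause. A defensive wrapper might look like the sketch below. The function name multilang_safe_replace is hypothetical, and this is not taken from the filter code or from MDL-10155; it only avoids losing the text, it does not make the filter work on such input.

```php
<?php
// Hypothetical wrapper: fall back to the unfiltered text if the regex engine gives up.
function multilang_safe_replace($pattern, $replacement, $text) {
    $result = preg_replace($pattern, $replacement, $text);
    if ($result === null) {
        // preg_last_error() gives the reason, e.g. PREG_BACKTRACK_LIMIT_ERROR.
        error_log('multilang regex failed, PCRE error code ' . preg_last_error());
        return $text; // show the raw text rather than nothing
    }
    return $result;
}

$filtered = multilang_safe_replace('/<b>(.*?)<\/b>/is', '$1', 'a <b>c</b>');
```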

        My +1 to return to the good old <lang lang="en"> tag and add stripping of <lang> into format_text/string if multilang is switched off.
        If I remember it correctly the main reason for <span> was the xhtml validity, but it would not be valid anyway due to the nesting rules in strict - there are places where only block (div) elements are allowed and others where only inline (span) elements are allowed.

        Hans de Zwart added a comment -

        I now have a non-technical client that needs to have a completely bi-lingual site. I am starting to realise that this is very difficult for her. Currently the HTML-editor messes up the nesting of the <span> tags, which causes the filter to fail and it does not even use the correct syntax (see MDL-10760).

        The fact that it uses <span> means that each block-level element has to get its own <span> tags, which is very inconvenient. Currently I just cannot recommend her using it, which is a problem as it is truly (by law) required functionality.

        Therefore I vote with Petr and give my +1 to the old <lang lang="en"> tag.

        Thanks!

        Hans de Zwart added a comment -

        I have found out that the old <lang> syntax is still available as a setting of the multilang filter. Please do not take this out!

        Petr Škoda added a comment -

        There is no way we could make this work properly with the new "span" multilang syntax; we will have to revert it in 2.0 back to <lang> and fix the editor to support it.

        Carlos García added a comment -

        Hi everyone,

        I also needed to create a bi-lingual site, and I didn't find a solution here. So I made a very simple change in /moodle/filter/multilang/filter.php, and it seems to work fine.

        I don't know if it's correct or not, because I'm a PHP beginner and the solution seems too simple, but in my case it is working properly, and I have not found any problems using it.

        I have just changed the six occurrences of the string "(?:lang|span)" in filter.php to "(?:lang)". I suppose there is a simpler way to write it in PHP, but as I don't have much PHP knowledge and it works fine for me, I have changed only that string. Obviously, you can no longer use the span syntax, but it works fine using <lang lang="lang_code"> and </lang>.
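As a rough illustration of why a dedicated <lang> tag sidesteps the original problem, the sketch below splits two <lang> blocks that contain ordinary HTML, including a span. The pattern is written for this example only and is an assumption, not the pattern used in filter.php.

```php
<?php
// Illustrative pattern for <lang> blocks only (not the actual filter.php pattern).
$pattern = '/<lang lang="([a-zA-Z0-9_-]+)">(.*?)<\/lang>/is';

$text = '<lang lang="en"><table><tr><td><span style="font-weight:bold">BOLD TEXT</span></td></tr></table></lang>'
      . '<lang lang="es"><p>TEXTO</p></lang>';

preg_match_all($pattern, $text, $blocks);

// Map each language code to the content of its block.
$bylang = array_combine($blocks[1], $blocks[2]);
```

Since </lang> is never used for formatting, the inner </span> cannot end the block early. Nested <lang> blocks would still not be handled by a pattern like this.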

        The only problem I have is that sometimes Moodle inserts an extra </lang> when you're editing some texts, but after reviewing the text and deleting the extra </lang>, it works properly. I have found that sometimes Moodle opens a <div> but doesn't close it, and when this happens it adds an extra </lang> earlier than you want it, but if you ensure all the tags are properly opened and closed, it works fine, even in course and category names.

        If, as I think, the old syntax will be available in Moodle 2.0, you can make this change in Moodle 1.9.9, write your texts using the old syntax, and it should work properly in Moodle 2.0.

        I would appreciate it if somebody could confirm this can be used as a provisional solution. At the moment, I'm using it and I have not had any problems with it.

        Thanks!


          People

          • Votes: 13
          • Watchers: 8
