Moodle

p() and s() should not be used to print lang strings

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Fixed
  • Affects Version/s: 1.9.3
  • Fix Version/s: 2.0
  • Component/s: Language, Libraries
  • Labels:
    None
  • Affected Branches:
    MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_20_STABLE

Description

HTML entities like   or   don't get parsed by function p() and print litteraly, e.g. $string['usergroupmembership'] in group/members.php, line 281.

Issue Links

Activity

Hide
Nicolas Martignoni added a comment -

After more inspection, it turns out that the only entities that are parsed are in the form &#ddd;, where the d's are digits. The & is not parsed, but printed instead.

Show
Nicolas Martignoni added a comment - After more inspection, it turns out that the only entities that are parsed are in the form &#ddd;, where the d's are digits. The & is not parsed, but printed instead.
Hide
Petr Škoda (skodak) added a comment -

the p() and s() are not designed to print html with tags & etc., it is for plaintext and form values mostly, looking at this now...

Show
Petr Škoda (skodak) added a comment - the p() and s() are not designed to print html with tags & etc., it is for plaintext and form values mostly, looking at this now...
Hide
Petr Škoda (skodak) added a comment -

fixed the members.php - uses echo() now
please reopen if there are more similar problems,
thanks for the report!

Show
Petr Škoda (skodak) added a comment - fixed the members.php - uses echo() now please reopen if there are more similar problems, thanks for the report!
Hide
Nicolas Martignoni added a comment -

Thanks for the quick fix. Verified. I don't close this as I'm pretty sure that there are other instances of this issue.

NB Should the developer docs be updated with this info about p() and s()?

Show
Nicolas Martignoni added a comment - Thanks for the quick fix. Verified. I don't close this as I'm pretty sure that there are other instances of this issue. NB Should the developer docs be updated with this info about p() and s()?
Hide
Petr Škoda (skodak) added a comment -

Yes, please fix the docs if you can

Show
Petr Škoda (skodak) added a comment - Yes, please fix the docs if you can
Hide
Nicolas Martignoni added a comment -
Show
Nicolas Martignoni added a comment - Docs updated here: http://docs.moodle.org/en/Development:Coding#General_rules
Hide
Dongsheng Cai added a comment -

Thanks, Petr

Show
Dongsheng Cai added a comment - Thanks, Petr
Hide
Nicolas Martignoni added a comment -

Reopening as s() is used in admin/uploaduser.php to print status messages, see e.g. function track() of class uu_progress_tracker.

Show
Nicolas Martignoni added a comment - Reopening as s() is used in admin/uploaduser.php to print status messages, see e.g. function track() of class uu_progress_tracker.
Hide
Tim Hunt added a comment -

I think if we are being strict, in XHTML, in browsers that treat the content as XML and which do not parse the DTD, then   will not work, so it is always safer to use numeric entities like  . Or since all lang files are UTF-8 anyway, just put the characters in them directly.

Anyway, I suppose the issue is: what do lang strings contain? Is it plain text, or HTML source? I guess it is HTML source, in which case the rule should be numeric entities only (plus the three that are included in XML: <, >, &, ', ").

However, when lang strings are output as attribute values in the HTML, it is essential that ' and " are output as ' and ", and that is certainly not the case in any language pack I have seen. Therefore the strings need to be fed through a function to escape them, and s() and p() are Moodle's standard function for doing this, and we seem to have come full circle.

Please someone explain to me!

Show
Tim Hunt added a comment - I think if we are being strict, in XHTML, in browsers that treat the content as XML and which do not parse the DTD, then   will not work, so it is always safer to use numeric entities like  . Or since all lang files are UTF-8 anyway, just put the characters in them directly. Anyway, I suppose the issue is: what do lang strings contain? Is it plain text, or HTML source? I guess it is HTML source, in which case the rule should be numeric entities only (plus the three that are included in XML: <, >, &, ', "). However, when lang strings are output as attribute values in the HTML, it is essential that ' and " are output as ' and ", and that is certainly not the case in any language pack I have seen. Therefore the strings need to be fed through a function to escape them, and s() and p() are Moodle's standard function for doing this, and we seem to have come full circle. Please someone explain to me!
Hide
Nicolas Martignoni added a comment -

Hi Tim!

I too would prefer UTF-8 chars directly in lang files, but for some reason (sprintf lacking support of multibyte chars) the UTF-8 non-breaking space chars breaks lang strings. See MDL-6688 (still reproducible as of 2.0dev) that was only worked around preserving the   entities in lang files.

Otherwise, we should instruct lang pack maintainers, including myself , to always use numerical entities.

Show
Nicolas Martignoni added a comment - Hi Tim! I too would prefer UTF-8 chars directly in lang files, but for some reason (sprintf lacking support of multibyte chars) the UTF-8 non-breaking space chars breaks lang strings. See MDL-6688 (still reproducible as of 2.0dev) that was only worked around preserving the   entities in lang files. Otherwise, we should instruct lang pack maintainers, including myself , to always use numerical entities.
Hide
Nicolas Martignoni added a comment -

The numeric entities are too not parsed by p() and s().

Show
Nicolas Martignoni added a comment - The numeric entities are too not parsed by p() and s().
Hide
Nicolas Martignoni added a comment -

Now that MDL-17358 is closed (and we now why), couldn't we close this as "not a bug" (see Tim comment) and instruct translators to use exclusively UTF-8 chars in the strings of the lang packs?

Show
Nicolas Martignoni added a comment - Now that MDL-17358 is closed (and we now why), couldn't we close this as "not a bug" (see Tim comment) and instruct translators to use exclusively UTF-8 chars in the strings of the lang packs?
Hide
Nicolas Martignoni added a comment -

Adding David as a watcher, re chat discussion about entities.

Show
Nicolas Martignoni added a comment - Adding David as a watcher, re chat discussion about entities.
Hide
Petr Škoda (skodak) added a comment -

This should be finally fixed now, the numeric entities are not double encoded any more in s() and p(), most of the credit goes to Tim.

Thanks for the report!

Show
Petr Škoda (skodak) added a comment - This should be finally fixed now, the numeric entities are not double encoded any more in s() and p(), most of the credit goes to Tim. Thanks for the report!

People

Dates

  • Created:
    Updated:
    Resolved: