|
[
Permalink
| « Hide
]
Helen Foster added a comment - 03/Dec/08 06:54 PM
VIP watchers added.
See http://www.php.net/manual/en/function.sprintf.php#55837
Further reading (sorry for the spam):
http://www.phpwact.org/php/i18n/utf-8 Hi Nicolas... can you execute the attached test in your server to see what it returns? It's a simple string, containing one utf-8 no-break space (C2A0) and here, being processed by sprintf() doesn't cause apparently any problem.
Ciao
Eloy Lafuente (stronk7) made changes - 26/Jan/09 04:02 AM
Hi Eloy. Your test.php passes at my web server as well. However there are some more examples at the pages referenced by Nicolas where this issue should appear. If I understood it correctly (haven't spent much time on it), the problem is with the percentage sign contained in multibyte strings code.
Hi Eloy and thanks for looking at this. Just like David, your test file runs correctly on my servers.
However the problem occurs with embedded vars ($a) in strings. Please have a look at the test2.php file I attach here now. It describes the problem I experience. Thanks in advance for your feedback.
Nicolas Martignoni made changes - 26/Jan/09 06:33 AM
Bah!
Forgot my last comment!!! It's sprintf(). Bloody sprintf(). Deleting previous comment!
Eloy Lafuente (stronk7) made changes - 26/Jan/09 07:25 AM
aha!
I think I've found a workaround: This is current string to evaluate, that is broken: $resultstring= sprintf("Connecté sous le nom « $a »"); But this seems to be properly evaluated: $resultstring= sprintf("Connecté sous le nom « ". $a. " »"); Silly trick but seems that we could, simply, replace all the $a (or $a->whatever) by " . $a . ", and that way, forcing the concatenation... results seem to be properly processed by sprintf/eval. Stupid PHP after all. My only concern is about the cost of applying that replace (regular expression) to all the strings all the time. Could be too much. Hehe, another workaround that is properly processed too:
$resultstring= sprintf("Connecté sous le nom « {$a} »"); Idiot PHP. :-D Attached the test3.php example that uses the {$a} trick to allow sprintf()/eval() to work properly in the example given.
Ciao
Eloy Lafuente (stronk7) made changes - 26/Jan/09 07:54 AM
Just for reference, some execution times, to compare:
Difference: 0.007 seconds (21% slower)
Difference: 0.749 seconds (22% slower) Note that this is a comparison between the execution of test2.php and test3.php above, processing the same "in-memory" string 1000 and 100000 times. I guess differences in real world (where strings are "in-file", hence slower access) will be really smaller. More yet, 0.007 seconds for 1000 strings seems to be a viable solution if finally implemented. Ciao Or, we can make sure that $a parameters are properly quoted by the curly brackets at the time of writing the string via the translation tool. Let us check this condition into the syntax checker, recently published by Petr (was it he?). No need to to search and replace on the fly.
Translators not-using the translation UI should run the checker before they commit anyway. After running test3.php, I confirm that the solution with {$a} works correctly for me too
The grep solution would work out of the box for a small performance cost. With the checker solcution, there's a risk for the translator to forget the check before committing (I know myself In any case, both solution will be better than actual situation. For the grep solution, I'll suggest to take potential {} in the lang strings:
return preg_replace('/{?(\$a(->\w+)?)}?/', '{\\1}', $temp); Petr's script is in
Please include it in moodle/admin with all the other nice testscripts (might be the time to create a folder for them Hi,
I think that we shouldn't make translators life harder here, by forcing them to know about those {} internals at all. IMO, if the times are aceptable, we should grep in code and leave them free from complexities. Only if times aren't aceptable I'd convert/produce that in the editor. Just my opinion. What about a pre-commit CVS hook script to search and replace all language files before they are actually saved into the repository. All I am leading towards is to do this thing once instead of doing it everytime we load the strings.
But that doesn't work for local changes and so on, plus the idea of modifying files in the commit process, wow. -10
IMO it should be handled from core. The only thing is to measure carefully the cost before doing that. Hopefully, some day, PHP won't bug there, so the workaround will be erased safely. Ciao Yeah, make sense... So what about sanity this by cron before generating the lang pack zip archives? Local modifications should be OK because master pack will already contain {$a} form so admin will preserve such format. Just a final idea...
Yes Eloy, I agree with you. +1 from me (if it counts) to handle this in core, if the cost is not too high.
And re PHP bug: it seems that php6 will be UTF-8 compatible, so there's some hope here. Ciao Eloy, do you have an estimate date for pushing your fix into CVS, and if it's a candidate for backporting in 1.8.x and 1.9.x?
Ciao! Well, the commit is a matter of a few seconds... but it seems that there are two alternatives:
1)- apply the patch and leave moodle to process it, leaving languages unmodified. My +1 is about to implement 1), apply the patch and forget modifications in langs. But I'd like to hear some more votes here before doing that. Ciao Adding Martin and Tim here.
Martin, after some HQ chat about this, Tim detected that that sprintf() has been there since the beginning. See: http://cvs.moodle.org/moodle/lib/moodlelib.php?view=diff&pathrev=MOODLE_19_STABLE&r1=1.25&r2=1.26 Do you remember any case where it is necessary? It seems that directly eval() the string is enough to perform all the $a(whatever) parsing. Also, Tim comments about to use one manual solution for translators having the problem as far as it doesn't affect too many strings, instead of preg_replace always the evaled sentence. Added test 4 that:
I any case, things to decide: 1) Erase sprintf() use if we don't need it. HEAD only for now + unit tests. Backported later. Ciao
Eloy Lafuente (stronk7) made changes - 10/Feb/09 08:01 PM
New file test5.php shows that the problem is not sprintf() function, but eval(). Silly PHP
{$a} workaround still applies. Is there any alternative to eval() for detecting parse errors?
Nicolas Martignoni made changes - 10/Apr/09 02:21 AM
I'll look into this, since I have been messing with get_string.
If eval is really not Unicode-safe, then we may have to resort to writing our own parser to replace $a and $a->xxx in lang strings. Actually, that should be an easy preg_replace. The only thing to be careful about is to know whether we are relying on eval to do anything else for us. I guess it also replaces \" and \$ with " and $ respectively.
Tim Hunt made changes - 10/Apr/09 09:46 AM
Tim Hunt made changes - 10/Apr/09 12:37 PM
OK, you have to RTFM (http://au2.php.net/manual/en/language.variables.basics.php
Non-breaking space (character \xa0) is a legal character in a variable name, so, if you are insane you could write code like $user name = "Nicolas Martignoni"; Anyway, that explains why '$a »' does not work as expected. It is looking for a variable called '$a ' (a followed by non-breaking-space) which obviously does not exist. The correct work-around is to define $string['loggedinas'] = 'Connecté sous le nom « {$a} »'; That is, use curly braces to separate the variable from the following character, just like you would say $string['distancerun'] = 'You have run {$a}km.'; to avoid referring to the variable $akm. So, this is resolve not-a-bug, but translators need to be aware. I'll post in the Languages forum.
Tim Hunt made changes - 10/Apr/09 12:37 PM
Thanks Tim, I was getting mad, as I couldn't understand what was going here. What a stupidity...
Indeed not a bug, just a silly PHP convention.
Nicolas Martignoni made changes - 10/Apr/09 04:19 PM
Wow, that is ridiculous - is it worth greping through all the language packs and ensuring that we use braces everywhere in the lang packs?
Is it possible to add to the language pack editor to add curly brackets arround $a if they are not already there? That way, opening and saving a language file would fix all $a strings in one go. New errors are not possible anymore, old ones, if they exist, are easy to fix and will be fixed over time automatically.
Tim Hunt made changes - 12/Apr/09 09:08 PM
martignoni committed 53 files to 'Lang CVS' - 14/Apr/09 04:41 PM
martignoni committed 5 files to 'Lang CVS' - 14/Apr/09 06:25 PM
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||