Moodle: MDL-14149

day names, month names and am/pm appear garbled in Chinese and Japanese on Windows servers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 1.7.4, 1.8.4, 1.9
    • Fix Version/s: None
    • Component/s: Language, Unicode
    • Labels:
      None
    • Environment:
      Windows 2003 server (English OS and Japanese OS both exhibit same behavior)
      Apache 2.2
      PHP 5.2.4 (mbstring and iconv enabled with default settings)
      database is not relevant
      browser is not relevant
    • Database:
      Any
    • Affected Branches:
      MOODLE_17_STABLE, MOODLE_18_STABLE, MOODLE_19_STABLE
    • Rank:
      4250

      Description

      Dates in Japanese and Chinese appear garbled on Moodle sites running on a Windows server. By "garbled" I mean that the date and the text around it on the page in the browser are turned into unrelated characters and question marks (see screenshot).

      Apparently PHP's strftime() function, which is called by Moodle's userdate() function (in lib/moodlelib.php), is where the text gets mangled.

      In Japanese, date strings containing %A, %a, %B, %b, %p are consistently mangled by strftime().

      In Chinese, date strings ending in a multibyte character cause the mangling:
      $string['strftimedate'] = '%%Y年%%m月%%d日';
      $string['strftimedateshort'] = '%%m月%%d日';
      $string['strftimemonthyear'] = '%%Y年%%m月';

      I experimented with many values for ...
      (1) the Moodle site locale (Language -> Language settings -> Sitewide locale)
      (2) the 'locale' string in the language pack
      (3) the 'localewin' string in the language pack
      (4) the 'localewincharset' string in the language pack

      but I could not find values for the above settings to fix the multibyte dates.

      In the end I modified the userdate() function to fix %A, %a, %B, %b and %p in a similar way to how %d is fixed. First, replace them with "AA", "aa", "BB", "bb" or "pp"; then format the date; then replace "AA" with the %A value that has been correctly converted to UTF-8, "aa" with the %a value that has been converted to UTF-8, and so on.
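The placeholder swap described above might look roughly like the following. This is only a sketch, not the code from userdate.txt: the function name format_with_placeholders(), the $formatter callback (standing in for strftime()) and the $utf8names array are all hypothetical, and the names are assumed to have been converted to UTF-8 already. Note that a token such as "AA" could collide with literal text in a format string, which this sketch does not guard against.

```php
<?php
// Hypothetical helper illustrating the placeholder technique; not Moodle code.
// $formatter stands in for strftime(); $utf8names maps a directive such as
// '%a' to its day/month/am-pm text, already converted to UTF-8.
function format_with_placeholders(string $format, callable $formatter, array $utf8names): string
{
    $tokens = ['%A' => 'AA', '%a' => 'aa', '%B' => 'BB', '%b' => 'bb', '%p' => 'pp'];

    $hide = [];   // directive => token, applied before formatting
    $reveal = []; // token => UTF-8 text, applied after formatting
    foreach ($tokens as $directive => $token) {
        if (isset($utf8names[$directive])) {
            $hide[$directive] = $token;
            $reveal[$token] = $utf8names[$directive];
        }
    }

    // 1. Hide the locale-dependent directives from the formatter.
    $safeformat = strtr($format, $hide);
    // 2. Let the formatter handle the remaining (numeric) directives.
    $datestring = $formatter($safeformat);
    // 3. Put the UTF-8 names where the tokens are.
    return strtr($datestring, $reveal);
}

// Usage with a fake formatter that only knows three numeric directives.
$fake = function (string $f): string {
    return strtr($f, ['%Y' => '2008', '%m' => '01', '%d' => '05']);
};
$example = format_with_placeholders('%Y/%m/%d (%a)', $fake, ['%a' => '土']);
// $example is '2008/01/05 (土)'
```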

      As well as replacing the userdate function in moodlelib.php with the function in the attached file, "userdate.txt", it is also necessary to modify "get_string()" as follows:

      (1) locate the following line in the "get_string()" function (in "lib/moodlelib.php"):
      'localewin', 'localewincharset', 'oldcharset',

      (2) replace the above line with the following line:
      'localewin', 'localewincharset', 'localewincharsetdaymonth', 'oldcharset',

      The fix is only triggered by the presence of a new string in the language pack: 'localewincharsetdaymonth'. This string needs to be added as an empty string in the English language pack and as a non-empty value in the relevant language packs on sites where the fix is required.

      During testing over the last year on a live university Moodle site in Japan, I have found that with the following settings in the language pack, together with the modified userdate() and get_string() functions, dates in Japanese display as required:
      $string['localewincharset'] = '';
      $string['localewincharsetdaymonth'] = 'CP932';

      Similarly, for Chinese dates, I use:
      $string['localewincharset'] = '';
      $string['localewincharsetdaymonth'] = 'CP936';
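To illustrate how an empty or non-empty 'localewincharsetdaymonth' string could switch the conversion on and off, here is a minimal sketch. The function daymonth_to_utf8() and the plain arrays standing in for language packs are assumptions made for illustration; the attached userdate() may do this differently. It relies on PHP's iconv() recognising the charset name 'CP932'.

```php
<?php
// Hypothetical helper; $langstrings stands in for the language pack strings.
function daymonth_to_utf8(string $raw, array $langstrings): string
{
    $charset = $langstrings['localewincharsetdaymonth'] ?? '';
    if ($charset === '') {
        // Empty string (e.g. the English pack): leave the text untouched.
        return $raw;
    }
    $converted = iconv($charset, 'UTF-8', $raw);
    // Fall back to the raw text if the conversion fails.
    return $converted === false ? $raw : $converted;
}

// Simulate the bytes a Japanese Windows locale might return for %A.
$cp932 = iconv('UTF-8', 'CP932', '月曜日');

$japanese = ['localewincharset' => '', 'localewincharsetdaymonth' => 'CP932'];
$english  = ['localewincharset' => '', 'localewincharsetdaymonth' => ''];

$fixed = daymonth_to_utf8($cp932, $japanese);     // converted to UTF-8
$untouched = daymonth_to_utf8('Monday', $english); // returned as-is
```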

      The same function and fix was recently confirmed for a Chinese site in Hong Kong:
      http://moodle.org/mod/forum/discuss.php?d=93697#p414588

      many thanks
      Gordon

      1. language.php
        16 kB
        Anthony Borrow
      2. MDL-14149-p.diff
        0.8 kB
        Anthony Borrow
      3. userdate.txt
        5 kB
        Gordon Bateson
      4. userdate.txt
        5 kB
        Gordon Bateson
      1. dates.chinese.gif
        13 kB
      2. dates.english.gif
        6 kB
      3. dates.japanese.gif
        13 kB
      4. screenshot-1.jpg
        125 kB
      5. strftime_p_en.png
        134 kB
      6. strftime_p_es.png
        126 kB

          Activity

          Gordon Bateson added a comment - - edited

          added screensnips showing the calendar and recent activity dates as they appear in English, Japanese and Chinese

          Gordon Bateson added a comment - - edited

          You may notice that the attached userdate() function also has two other fixes.

          First is an improvement for the removal of the leading "0" from %d (two-digit day of month). Currently the leading zero is removed, but it is replaced by a space.

          The result is a date string such as the following (yyyy.m.d) :
          2008.01. 1
          The rightmost "1" seems to be detached from the rest of the date because of the extra space.

          Second, I would like to suggest that the leading zeros on %m (two-digit month number) should also be removed. If you are deleting zeros from the day numbers, you certainly want to delete zeros from the month numbers.

          With these two additional fixes, the above date becomes ...
          2008.1.1

          ... which is much more the sort of thing a real person would write.
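A small sketch of the zero-stripping idea follows. It is not the attached userdate() code: format_compact_date() is a hypothetical name, it handles only the %Y, %m and %d directives, and it uses gmdate() (UTC) instead of strftime() so the result does not depend on the server's locale or timezone.

```php
<?php
// Hypothetical helper: formats %Y, %m and %d only, with no leading zero
// (and no padding space) on the month and day numbers.
function format_compact_date(string $format, int $timestamp): string
{
    $values = [
        '%Y' => gmdate('Y', $timestamp), // four-digit year
        '%m' => gmdate('n', $timestamp), // month number without leading zero
        '%d' => gmdate('j', $timestamp), // day of month without leading zero
    ];
    return strtr($format, $values);
}

$newyear = format_compact_date('%Y.%m.%d', gmmktime(0, 0, 0, 1, 1, 2008));
// $newyear is '2008.1.1'
```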

          Gao Han added a comment -

          This fix worked perfectly for me as well. My specifics:

          Moodle version 1.9+, build 20080331
          Windows XP, Chinese "international version" (i.e. english GUI, but frequently still recognized as being chinese)
          Apache 2.2.8 + openSSL
          MySQL 5.0.51a
          PHP 5.2.5

          I have added a screenshot to show that it can get even worse (before the fix), which is why I am so happy with this fix. My site is in Simplified Chinese (Shenyang, not Hong Kong...).

          Gao Han added a comment -

          Garbled screen in detailed calendar view.

          Petr Škoda added a comment -

          1/ the static $localewin = null; will not work in cron, each user may have different locale
          2/ I do not think we should be doing this in all locales
          3/ I do not understand why you need new localewincharsetdaymonth

          Gordon Bateson added a comment -

          > 1/ the static $localewin = null; will not work in cron,
          > each user may have different locale

          Ah yes, good point. We probably have to set it up for each call to userdate() then

          > 2/ I do not think we should be
          > doing this in all locales

          How else could we do it?

          > 3/ I do not understand why you need
          > new localewincharsetdaymonth

          localewincharsetdaymonth is necessary because localewincharset influences the entire date string, but on servers suffering from this "garbled dates" issue, only parts of the date string need special treatment.

          Also, localewincharset has been around for some time and I presume it is working well for some people. If we tried to change the way it works, then there is a great risk of breaking dates on systems where localewincharset is sufficient.

          I can give you access to my windows server to investigate if that would be helpful.

          Gordon Bateson added a comment - - edited

          Revised the userdate() function in the attached file, userdate.txt, so that it maintains a cache of strings used for each language pack.

          This allows the function to be efficient both when being called by a single user with a single language pack, and when being called by multiple users with possibly multiple language packs, for example while the Moodle cron.php is being run.
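The per-language cache could be sketched as below. This is an assumption about the general shape only, not the revised userdate.txt: locale_strings_for() and the $loader callback (standing in for the get_string() lookups) are hypothetical names. The point is that the static variable holds one entry per language rather than a single value, so a cron run serving users with different languages still gets the right strings.

```php
<?php
// Hypothetical helper: caches locale strings per language pack.
function locale_strings_for(string $lang, callable $loader): array
{
    static $cache = []; // one entry per language, filled on first use
    if (!isset($cache[$lang])) {
        $cache[$lang] = $loader($lang);
    }
    return $cache[$lang];
}

// Usage: a loader that counts how often it is really called.
$calls = 0;
$loader = function (string $lang) use (&$calls): array {
    $calls++;
    return ['localewincharsetdaymonth' => $lang === 'ja_utf8' ? 'CP932' : ''];
};

locale_strings_for('ja_utf8', $loader);
locale_strings_for('en_utf8', $loader);
$again = locale_strings_for('ja_utf8', $loader); // served from the cache
// $calls is 2, one per distinct language
```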

          Anthony Borrow added a comment -

          In working with the MRBS block, I've been attempting to move the language handling from the MRBS way of doing things to a more Moodle-like way. I came across an issue with strftime() using %p under certain locales. I initially thought I would just use userdate() to replace it; however, it appears that userdate() does not handle the %p format correctly.

          I'm attaching the language.php file from MRBS, which has a function utf8_strftime() that checks whether the locale actually returns the proper am/pm. If not, it adds it in. This seems related to the issue here. What I plan to do for the MRBS block is to use userdate() and hope that it gets patched so that it works nicely with the %p format in all languages.

          I'm also attaching a couple of screenshots of the MRBS block, one in English and one in Spanish, both using userdate() with the format %I:%M%p. I'm not 100% sure what is happening with the locales. I'll try to take a closer look at the userdate() function to explain the differences in behavior. For example, when viewing the Moodle logs in English I get am and pm displayed; however, in Spanish it converts to a 24-hour format. Curiously, in the MRBS block using the same function I get different behavior, with just the am/pm being dropped in Spanish.

          Bottom line, I'm still trying to make sense of what is happening, but I thought sharing this mess might help with this issue. Peace - Anthony

          – from the MRBS changelog:

          • Removed all use of the utf8_date() function, strftime()
            is used instead. strftime() now works around a bug in
            strftime() in the handling of %p, using date() if your locale
            happens to return empty strings for the %p substitution.
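The %p workaround mentioned in the changelog can be sketched as follows. This is an illustration of the idea only, not the MRBS or Moodle code: ampm_or_fallback() is a hypothetical name, $fromlocale stands for whatever strftime('%p') returned, and gmdate() is used in place of date() so the example does not depend on the server timezone.

```php
<?php
// Hypothetical helper: keep the locale's am/pm text when there is one,
// otherwise fall back to the English 'am'/'pm' computed from the timestamp.
function ampm_or_fallback(string $fromlocale, int $timestamp): string
{
    if (trim($fromlocale) !== '') {
        return $fromlocale;
    }
    return gmdate('a', $timestamp); // 'am' or 'pm'
}

$afternoon = gmmktime(15, 30, 0, 1, 5, 2009);
$kept = ampm_or_fallback('PM', $afternoon); // locale supplied a value
$filled = ampm_or_fallback('', $afternoon); // locale returned an empty string
// $kept is 'PM', $filled is 'pm'
```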
          Anthony Borrow added a comment -

          Here is a patch file that I did some testing with using the format '%I:%M%p' which should return am/pm. In English, the am/pm show just fine; however, in other locales such as Spanish (es_utf8) it does not show up. I pulled in the fix that was used in the MRBS project (as shown in the language.php already attached) and incorporated the additional check into Moodle's userdate function (in /lib/moodlelib.php) and it seems to work. The logic seems pretty simple. I'm hoping that perhaps the patch will also help to partially resolve the issue that Gordon reported.

          To ensure that was not some strange MRBS thing, I tested using the '%I:%M%p' format in the /course/report/log/live.php file and received similar results (i.e. the am/pm showed in English but not in Spanish). With this patch it shows for English and Spanish.

          In the MRBS block, I'm going to replace all the strftime() calls with Moodle's userdate(), so it would help if this or a similar patch were introduced to provide for the proper formatting of the time. I suspect that it's a PHP issue with the strftime() function, although I did not find it documented as being problematic elsewhere. I'll do some continued testing and, if needed, file an issue with php.net.

          Also, for the record I am using PHP Version 5.2.4 and tested using Ubuntu, FF3. Peace - Anthony

          Frank Ralf added a comment -

          I found that disabling the following if statement of the userdate() function in moodlelib.php makes Japanese dates show correctly, without any further modifications. So I suppose the error lies with the $textlib->convert() function.

          /// If we are running under Windows convert from windows encoding to UTF-8
          /// (because it's impossible to specify UTF-8 to fetch locale info in Win32)

          if ($CFG->ostype == 'WINDOWS') {
              if ($localewincharset = get_string('localewincharset')) {
                  $textlib = textlib_get_instance();
                  $datestring = $textlib->convert($datestring, $localewincharset, 'utf-8');
              }
          }

          Hope that helps solve this issue.

          Regards,
          Frank

          Frank Ralf added a comment -

          The $textlib->convert() function of \lib\textlib.class.php (line 111) is basically only a wrapper for the Typo3 conv() function which does all the work. It is found in \lib\typo3\class.t3lib_cs.php (line 590).

          Frank Ralf added a comment -

          I did some more poking around.

          The culprit seems to be PHP's iconv() function used in class.t3lib_cs.php (line 611).

          When using the following test script, I get the same mangled output as in Moodle.

          <?php
          echo iconv("shift_jis", "UTF-8", "2009?01?05? 20?52?");
          ?>

          Output of script: 2009??01?
          Date in Moodle: 2009?? 01?

          Using EUC-JP encoding instead of Shift-JIS results in this output: 2009?

          <?php
          echo iconv("euc-jp", "UTF-8", "2009?01?05? 20?52?");
          ?>

          Hopefully, a further step to a solution.

          Frank

          Frank Ralf added a comment -

          Only found issue MDL-13389 after having worked on MDL-14149 for some time.

          1) My findings confirm the problem description there.
          2) I linked both issues.

          Frank

          Anthony Borrow added a comment -

          Thanks Frank for poking and reporting what you found. I am just returning from Nepal but it sounds like your comments may help us work toward a solution. Peace - Anthony

          Frank Ralf added a comment -

          I did some thinking after the poking and might have jumped to a conclusion.

          After all, my test script (the PHP file) itself isn't encoded in Shift JIS but in UTF-8. So I am feeding the iconv() function a UTF-8 string while telling it that it is Shift JIS encoded. That causes the wrong output.

          So there seems to be some implicit conversion to UTF-8 going on further upstream, and the iconv() function is innocent after all.
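The mislabelling effect can be reproduced with a short script like the one below. It is a sketch that assumes the file is saved as UTF-8 and that the local iconv build accepts the name 'SHIFT_JIS'; the variable names are made up for illustration. The error-suppressed call may return either garbage or false depending on the iconv implementation, but in neither case does it return the original text.

```php
<?php
$utf8 = '年'; // this source file is assumed to be saved as UTF-8

// Correctly labelled: real Shift JIS bytes, declared as Shift JIS.
$sjis = iconv('UTF-8', 'SHIFT_JIS', $utf8);
$roundtrip = iconv('SHIFT_JIS', 'UTF-8', $sjis);

// Mislabelled: UTF-8 bytes declared as Shift JIS, as in the test script.
// "@" hides the notice iconv may raise for bytes it cannot interpret.
$mislabelled = @iconv('SHIFT_JIS', 'UTF-8', $utf8);

$roundtripok = ($roundtrip === $utf8);     // the honest conversion survives
$mislabelledok = ($mislabelled === $utf8); // the mislabelled one does not
```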

          Frank

          Frank Ralf added a comment -

          http://moodle.org/mod/forum/discuss.php?d=118294 might provide some related information (Problems with date display on Windows machine with non English locale)

          Frank

          Michael de Raadt added a comment -

          This issue has been duplicated a number of times.

          I'm closing this issue as a more recent duplicate seems to have a more specific fix for this issue.


            People

            • Votes: 5
            • Watchers: 7