Moodle
MDL-28730

Upgrade from 1.9.13 to 2.0.4 aborts due to problem with invalid byte sequences within a wiki

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 1.9.13, 2.0.4, 2.2.9
    • Fix Version/s: FRONTEND
    • Component/s: Wiki (2.x)
    • Environment:
      Centos 5.6 x86_64, apache 2.2.3, php 5.3.3, postgresql 8.4 server
    • Database:
      PostgreSQL
    • Testing Instructions:

      If needed we will try to provide a course backup file (v 1.9.13) containing a minimal wiki.

    • Workaround:
      Rolling back the half-upgraded moodle instance to 1.9.13 (using a database and userdata backup) and removing the contents of the offending wiki seems to be sufficient to work around the problem. Since the upgrade process is halted by an uncaught exception, we assume that N wiki activities with such problematic content would require N rollbacks to bypass the problem. This may turn out to be rather inconvenient, especially for large installations. Luckily, so far this problem has affected only one wiki instance per moodle server.
    • Affected Branches:
      MOODLE_19_STABLE, MOODLE_20_STABLE, MOODLE_22_STABLE
    • Rank:
      18400

      Description

      System Information:
      moodle 1.9.13 (the wiki activities in question were created using earlier versions of moodle 1.9.x)
      database postgres 8.4
      Centos 5.6 x86_64
      apache 2.2.3
      php 5.3.3

      Steps to reproduce:
      We tried to upgrade our existing moodle 1.9.13 installation to moodle version 2.0.4 (in order to upgrade to version 2.1.1 down the road). While converting the wiki activity, the upgrade process stalled with an error message. The upgrade process was stuck at this stage until we rolled back to our backup of 1.9.13, deleted the contents in question and repeated the whole upgrade procedure.

      Error message / Output from the upgrade process:
      {{[Thu Aug 11 13:55:54 2011] [error] [client xxx.xxx.xxx.xxx] Default exception handler: Error reading from database Debug:
      ERROR: invalid byte sequence for encoding "UTF8": 0xe2\nHINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".\n
      SELECT * FROM mdl_wiki_pages_old WHERE ((pagename= $1) OR (pagename= $2) OR (pagename= $3) OR (pagename= $4) OR (pagename= $5) OR (pagename= $6) OR (pagename= $7)) AND wiki= $8 \n
      [array (\n 0 => 'Importfunktionalit\xc3\xa4ten',\n 1 => 'Abh\xc3\xa4ngigkeit',\n 2 => 'MyDAMS\xe2',\n 3 => 'BenutzerInnen',\n 4 => 'BearbeiterInnen',\n 5 => 'Universit\xc3\xa4tsverlages',\n 6 => 'Regelm\xc3\xa4',\n 7 => '5',\n)]\n
      line 391 of /lib/dml/moodle_database.php: dml_read_exception thrown\n
      line 232 of /lib/dml/pgsql_native_moodle_database.php: call to moodle_database->query_end()\n
      line 678 of /lib/dml/pgsql_native_moodle_database.php: call to pgsql_native_moodle_database->query_end()\n
      line 1097 of /lib/dml/moodle_database.php: call to pgsql_native_moodle_database->get_records_sql()\n
      line 166 of /mod/wiki/db/migration/wiki/ewikimoodlelib.php: call to moodle_database->get_records_select()\n
      line 3327 of /mod/wiki/db/migration/wiki/ewiki/ewiki.php: call to ewiki_database_moodle()\n
      line 2234 of /mod/wiki/db/migration/wiki/ewiki/ewiki.php: call to ewiki_database()\n
      line 2119 of /mod/wiki/db/migration/wiki/ewiki/ewiki.php: call to ewiki_scan_wikiwords()\n
      line 57 of /mod/wiki/db/migration/lib.php: call to ewiki_format()\n
      line 309 of /mod/wiki/db/upgradelib.php: call to wiki_ewiki_2_html()\n
      line 175 of /mod/wiki/db/upgrade.php: call to wiki_upgrade_migrate_versions()\n
      line 526 of /lib/upgradelib.php: call to xmldb_wiki_upgrade()\n
      line 265 of /lib/upgradelib.php: call to upgrade_plugins_modules()\n
      line 1421 of /lib/upgradelib.php: call to upgrade_plugins()\n
      line 311 of /admin/index.php: call to upgrade_noncore()\n, referer: http://hostname/cache/admin/index.php?confirmupgrade=1&confirmrelease=1}}

      Actual Result:
      Upgrade process stuck at error message

      Expected Result:
      It would be very convenient if the upgrade script could

      • notify the user which wiki (including course id/name information) contains the problematic content
      • fail gracefully (e.g. notify the user that the wiki in question could not be converted / upgraded) thus allowing to fix or work around the problem with the problematic wiki after the upgrade is finished
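One way to get both behaviours is an error boundary around each wiki. The sketch below is in Python rather than Moodle's PHP and is not Moodle code: `migrate_all`, `WikiFailure`, `demo_migrate`, and the dictionary keys `id`, `course`, and `content` are hypothetical names chosen for illustration. It converts each wiki in turn, records the wiki and course ids of any that fail, and carries on with the rest.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class WikiFailure:
    """Enough information to tell the admin which wiki needs attention."""
    wiki_id: int
    course_id: int
    reason: str


def migrate_all(wikis: Iterable[dict],
                migrate_one: Callable[[dict], None]) -> List[WikiFailure]:
    """Run migrate_one on every wiki; collect failures instead of aborting."""
    failures = []
    for wiki in wikis:
        try:
            migrate_one(wiki)
        except Exception as exc:  # real code would catch the specific DB error
            failures.append(WikiFailure(wiki["id"], wiki["course"], str(exc)))
    return failures


def demo_migrate(wiki: dict) -> None:
    # Stand-in for the real conversion: fails on invalid UTF-8, as the
    # database does in the log above.
    wiki["content"].decode("utf-8")


demo_wikis = [
    {"id": 1, "course": 10, "content": "Abhängigkeit".encode("utf-8")},
    {"id": 2, "course": 11, "content": b"MyDAMS\xe2"},
    {"id": 3, "course": 12, "content": b"BenutzerInnen"},
]
demo_failures = migrate_all(demo_wikis, demo_migrate)
for failure in demo_failures:
    print(f"wiki {failure.wiki_id} in course {failure.course_id}: {failure.reason}")
```

With a report like this, the problematic wikis could be fixed or re-converted after the upgrade instead of forcing a rollback for each one.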

      Remarks:

      • Most content in our instance uses the German localisation
      • Our postgres database for this moodle instance was created with the WITH ENCODING 'unicode' option
      • Our current explanation for the cause of this problem (i.e. how the incompatible byte sequence ended up in the database in the first place) is a copy&paste operation by a wiki user that contained latin1 encoded special characters. If this turns out to be correct, I would be very grateful for any ideas on how to prevent such things in the future.
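If Latin-1 input really were the cause, one preventive measure would be to validate text before it is stored. The following is a minimal Python sketch of that idea, not Moodle code; `to_valid_utf8` and its `fallback` parameter are hypothetical. It keeps bytes that already decode as UTF-8 and re-encodes everything else on the assumption that it is Latin-1.

```python
def to_valid_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else re-encode it.

    Assumption: anything that is not valid UTF-8 was typed or pasted in
    the fallback encoding. That guess can be wrong.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


pasted = "Universität".encode("latin-1")   # b'Universit\xe4t', invalid as UTF-8
print(to_valid_utf8(pasted))
```

The fallback is only a guess: a string that is invalid because a UTF-8 sequence was cut short, such as the parameter `MyDAMS\xe2` in the log above, would be turned into valid but wrong text rather than repaired.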

      Best regards & many thanks in advance


          Activity

          Michael de Raadt added a comment -

          Thanks for reporting this.

          Brian King added a comment -

          I'm also seeing this issue, with Solaris 10, PostgreSQL 9.1, php 5.3.10, using cli upgrade script. Content is mostly German. Upgrading from latest 1.9 to 2.2.1+ (Build: 20120217).

          Brian King added a comment -

          More info: the wiki causing the problem contains the text (the numbers are octal escapes of the UTF-8 bytes, as printed by php):

          ... auftretender (\342\200\236Zahnradph\303\244nomen\342\200\234) oder ...

          or, as plain text instead of escaped bytes:

          ... auftretender („Zahnradphänomen“) oder ...

          The function ewiki_scan_wikiwords extracts part of that using preg_match_all. The extracted word ends in an incomplete UTF-8 sequence:

          Zahnradph\303\244nomen\342
          

          This is then used in a database query, which makes postgresql complain.
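The truncation can be reproduced outside Moodle. The sketch below uses Python instead of PHP, and the byte classes `UPPER` and `LOWER` are assumptions: they are Latin-1 style ranges chosen so that the pattern reproduces the fragment shown above, not a copy of the regular expression in ewiki. Matching on bytes lets the pattern stop in the middle of a multi-byte character, while matching on decoded text cannot.

```python
import re

# Assumption: Latin-1 style "uppercase" and "lowercase" byte ranges.
# Illustrative only; not taken from ewiki.
UPPER = rb"A-Z\xc0-\xde"
LOWER = rb"a-z\xa4\xdf-\xff"
CAMEL_BYTES = re.compile(
    rb"[" + UPPER + rb"][" + LOWER + rb"]+(?:[" + UPPER + rb"][" + LOWER + rb"]+)+"
)

# The same idea expressed on characters instead of bytes.
CAMEL_TEXT = re.compile(r"[A-ZÄÖÜ][a-zäöüß]+(?:[A-ZÄÖÜ][a-zäöüß]+)+")


def is_valid_utf8(raw: bytes) -> bool:
    """True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "... auftretender („Zahnradphänomen“) oder ..."

# Byte-level scan: 0xc3 looks like an uppercase letter and 0xe2 like a
# lowercase one, so the match ends after the first byte of the closing quote.
words = CAMEL_BYTES.findall(text.encode("utf-8"))
print(words)
print([is_valid_utf8(w) for w in words])

# Character-level scan: "ä" is one lowercase letter, so there is no
# CamelCase word here at all.
print(CAMEL_TEXT.findall(text))
```

Under these assumed classes the byte-level scan returns the single fragment `Zahnradph\xc3\xa4nomen\xe2`, which is not valid UTF-8, and the character-level scan returns nothing.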

          Brian King added a comment -

          This same upgrade (same db content, same moodle 2 version) was also made on a debian / postgresql system. There, the upgrade did not fail, but this wiki's content is missing after the upgrade.

          Brian King added a comment -

          I've seen this problem when upgrading from 1.9 to 2.2 on two large Moodle installations, both of which have primarily German content.

          I think it's related to MDL-6425. I noticed that if I ran some sql:

          update mdl_wiki set disablecamelcase=1;
          

          The upgrade was able to complete successfully; without this it failed with the dreaded 'invalid byte sequence for encoding "UTF8"' error.

          The people using wikis on this Moodle don't seem to be actually using the CamelCase linking feature, but there is lots of accidental linkage, as per MDL-6425 (for example, a word like Zahnradphänomen is interpreted like a CamelCase word).

          Andrew Nicols added a comment -

          We've just hit the same issue but on a wiki which does use camelcase.
          I haven't tracked down the true root cause of the issue, but can allow the upgrade to continue by changing:

          mod/wiki/db/migration/wiki/ewikimoodlelib.php line 160:

          $params[$pname] = fix_utf8($id);
          

          The issue we're seeing is that ewiki_scan_wikiwords is picking up words in the content which contain non-utf8 characters and it's trying to autolink them. Postgres is fussy when you start trying to search for non-utf8 characters and throws the error.

          I believe that the worst case with this workaround is that some pages may not get autolinked, but they were probably broken anyway.

          Longer term solution - convert to mod_ouwiki in 2.4.
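The effect of cleaning the parameters before the query can be sketched as follows. This is Python, not the PHP change above, and `drop_invalid_utf8` is a hypothetical stand-in written on the assumption that the sanitiser simply discards bytes that are not part of a valid UTF-8 sequence. The page names are taken from the parameter list in the error log.

```python
def drop_invalid_utf8(raw: bytes) -> bytes:
    """Remove every byte that is not part of a valid UTF-8 sequence."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


pagenames = [b"Abh\xc3\xa4ngigkeit", b"MyDAMS\xe2", b"Regelm\xc3\xa4"]
cleaned = [drop_invalid_utf8(name) for name in pagenames]
print(cleaned)
```

Only the truncated name changes: `MyDAMS\xe2` becomes `MyDAMS`. The query then runs, but it looks up a page name that may not exist, which is why some pages may end up without autolinks.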

          Andrew Nicols added a comment -

          One to watch out for with the above workaround: MDL-33007 - Some versions of php ship with broken iconv - see the php bug https://bugs.php.net/bug.php?id=61484

          Didier Raboud added a comment -

          I just hit the same problem on a Moodle 1.9.19+ to 2.2.4+ migration and the workaround mentioned by Andrew two comments above seems to work fine.

          Andrew Davis added a comment -

          I'm turning up the priority on this as well as adding the patch label as there appears to be a working fix provided. Thanks Andrew.

          Tim Lock added a comment -

          Thanks Andrew Nicols for the patch!

          Marina Glancy added a comment -

          Andrew, do you want to submit your fix as a patch? It has to go in Moodle 2.2 [only], we still accept patches there for migration issues.


            People

            • Votes: 2
            • Watchers: 7