Non-core contributed modules

Paste from Word causes 'Data Too Long' error in IE6

Details

  • Type: Bug
  • Status: Resolved
  • Priority: Minor
  • Resolution: Cannot Reproduce
  • Affects Version/s: 1.9.2
  • Fix Version/s: None
  • Component/s: Module: OU wiki
  • Labels:
    None
  • Environment:
    mysql 5.0.22 php 5.1.4 Windows server OUWiki 2008100600 downloaded 28/1/2009
  • Database:
    MySQL
  • Affected Branches:
    MOODLE_19_STABLE

Description

If you copy and paste entries from Word into the OUWiki, you get an error:

Database problem: Data too long for column 'xhtml' at row 1 (code OUWIKI-E:\Websites\...oursite...\root\moodle183\mod\ouwiki\ouwiki.php/1614)

It appears to be generated because of the additional markup from Word text, as the actual entries are not excessively long. The workaround we are currently using is to ask students to use the Word Cleanup button or use text mode for editing, but this seems beyond the capability of some of our students.

Activity

Sam Marshall added a comment -

(The error occurs at the point where it inserts a record for the new version, obviously.)

This is very odd. I checked in my MySQL test installation and the xhtml field has type 'longtext'. According to the MySQL manual, this type should be able to store up to 4 gigabytes:

http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html

This would also be limited by the max_allowed_packet value, but this defaults to 16MB:

http://dev.mysql.com/doc/refman/5.0/en/mysql-command-options.html#option_mysql_max_allowed_packet

Word stores enough crap in the file that it is typically multiplied by a factor of five or ten over the plain HTML equivalent (which by the way does cause reduced performance), but for any reasonable document, that would still be much less than 16MB.

The only debugging step I can think of: if you can reproduce the problem, add some code just before the line 1614 mentioned in the error, such as

print_object(strlen($version->xhtml)); exit;

so you can see how many bytes it's trying to insert. If the number of bytes is some reasonable number, like less than 1000000, then there's a database/configuration problem. If it's an unreasonable number, then maybe there is a weird problem in ouwiki.
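As a rough aid for interpreting that number, here is a minimal sketch that compares a byte count against the maximum sizes of MySQL's text column types as documented in the MySQL manual. The function names are hypothetical and are not part of ouwiki or Moodle; the sketch only assumes that `strlen()` counts bytes, which is its normal behaviour.

```php
<?php
// Maximum byte lengths of MySQL's text column types (per the MySQL manual).
function mysql_text_limits() {
    return array(
        'tinytext'   => 255,
        'text'       => 65535,
        'mediumtext' => 16777215,
        'longtext'   => 4294967295,
    );
}

// Return the smallest MySQL text type that can hold $bytes bytes,
// or null if even longtext would be too small.
function smallest_text_type_for($bytes) {
    foreach (mysql_text_limits() as $type => $limit) {
        if ($bytes <= $limit) {
            return $type;
        }
    }
    return null;
}

// Hypothetical helper: summarise the content that is about to be inserted.
function describe_content_size($xhtml) {
    $bytes = strlen($xhtml); // strlen() counts bytes, not characters
    return array(
        'bytes'         => $bytes,
        'smallest_type' => smallest_text_type_for($bytes),
    );
}
```

For example, `print_object(describe_content_size($version->xhtml)); exit;` could replace the one-liner above. If the reported type turns out to be larger than the type the xhtml column actually has on the affected installation, that would be one possible explanation for the error, but this is a guess rather than a confirmed diagnosis.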

If it's a database issue, unfortunately I don't know anything much about MySQL - maybe you can find an expert (or ask in forums etc) who'd know why this error would occur even when trying to put a relatively small bit of data into the field.

You might like to attach a Word document that causes this problem when you do Ctrl-A to select all in Word, Copy, and Paste into ouwiki's html editor, to see if it can be reproduced elsewhere.

Sam Marshall added a comment -

One more thought. If you can reproduce the problem, maybe you could try posting the same content into a forum post on your site to see if that also gives the error or not.

David Brighton added a comment - edited

The issue seems to occur with Word documents saved in Word 97-2003 format pasted into a new page in Internet Explorer 6 (and possibly other versions). I tried it in the forum and there was no problem. You can also trigger a similar error message by placing square brackets around a long section of wiki text (which is what I originally thought they were doing) which makes me suspect that the hidden code in the word document matches something that triggers the error rather than the length itself.

Sam Marshall added a comment -

This does not occur on my MySQL test system when using either Internet Explorer 7 browser or Firefox 3 browser. (Unfortunately, I don't have access to IE6.) My test procedure was:

1) Open attached document in Word (running Word 2003 here).

2) Press Ctrl-A, then Ctrl-C, to copy entire document.

3) In an OU wiki, go to a new page.

4) Click 'Create page'

5) Click into the HTML area and press Ctrl-V to paste entire document

6) Click Save changes

This works correctly on my system, so I'm currently unable to reproduce the bug. Any suggestions? Does it still break for you when testing with one or other of those browsers?

David Brighton added a comment -

It works for me with Firefox. We do not have IE7 here, so it is likely an IE6 issue. Thanks for that info; I will pass it on that IE7 is 'better'. I will try to discourage copying and pasting out of Word for the moment. I am surprised this has not come up as an issue before, as it is not that unusual a way for students to work. I will keep my eye on this in the hope that magic happens or that our organisation moves along (or even away) from IE6! Out of interest, were the 'hidden codes' visible on the wiki page after you saved it? When it works here, all the document formatting codes are visible (unless you use the clean Word button).

Sam Marshall added a comment -

OK. At some point I will try to get hold of an IE6 machine for testing, as we do still support it (student use has dropped rapidly, but the last time I checked, six months back, it was still around 25%).

My test earlier was on standard Moodle, not the OU version. But in the OU's version we do have some custom server code which very aggressively removes styling and tags that MS Word creates. [There are various disadvantages to doing this, for instance it is also likely to strip many styles from manually-created HTML too.] This may be why we don't see the problem for our students.

Although this problem appears to manifest itself only with certain browser versions, it may well be an indicator of another server-side bug.
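To illustrate the kind of server-side cleanup described above, here is a minimal sketch that strips the most common Word-specific markup from pasted HTML. This is not the OU's actual filter, whose code is not shown in this issue; the function name `strip_word_markup` and the choice of patterns are assumptions made for illustration. Like the filter described, it is deliberately aggressive and will also remove inline styles from hand-written HTML.

```php
<?php
// Illustrative only: remove typical Microsoft Word markup from pasted HTML.
function strip_word_markup($html) {
    // HTML comments, including conditional ones such as <!--[if gte mso 9]>...<![endif]-->
    $html = preg_replace('/<!--.*?-->/s', '', $html);
    // Embedded <style> and <xml> blocks, together with their contents
    $html = preg_replace('/<style\b[^>]*>.*?<\/style>/is', '', $html);
    $html = preg_replace('/<xml\b[^>]*>.*?<\/xml>/is', '', $html);
    // Conditional markers outside comments, e.g. <![if !supportLists]> and <![endif]>
    $html = preg_replace('/<!\[(?:if|endif)[^>]*>/i', '', $html);
    // Namespaced Office tags such as <o:p> or <st1:place> (their inner text is kept)
    $html = preg_replace('/<\/?[a-z][a-z0-9]*:[^>]*>/i', '', $html);
    // class="Mso..." attributes
    $html = preg_replace('/\s+class\s*=\s*("Mso[^"]*"|\'Mso[^\']*\'|Mso[^\s>]*)/i', '', $html);
    // All quoted inline style attributes
    $html = preg_replace('/\s+style\s*=\s*("[^"]*"|\'[^\']*\')/i', '', $html);
    return $html;
}
```

Regular expressions are a blunt tool for HTML, so a sketch like this can misfire on unusual input (for example, text that merely looks like a style attribute); it is meant to show the idea, not to replace a proper HTML cleaner.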

David Brighton added a comment -

That aggressive style & tag remover sounds like it would do the job. At the moment I am advising people to use the strip Word button in the edit box, which seems to work whilst retaining the basic text formatting.

Sam Marshall added a comment -

Alan Carter checked this for us. We determined that there was unusual behaviour in IE6, but this appears to be created by the HTML editor (before the $_POST variable is received by PHP), presumably as a consequence of a bug in IE6. The same document pastes correctly when using IE7.

See comment below (sorry I didn't bother to transfer the attachments mentioned from our internal bug tracking system, where this is part of bug 7797).

Attachments created for results of copy and paste tests into IE6 and IE7 from a Word document. In the case of IE6, the text 'Part two - Assessment of competence' was moved to the bottom of the resulting file (data.txt) and lost its bold Word formatting. When originally placed into the HTML editor it kept its formatting, but after clicking the 'save changes' or 'preview' buttons, the $_POST['content'] passed into ouwiki/edit.php had changed.

data1.txt is the result of the copy and paste into IE7 of the same Word document; formatting etc. remains as it was copied into the HTML editor.
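For anyone wanting to repeat this comparison, a capture along these lines could be sketched as follows. This is not the code that was used to produce data.txt and data1.txt; the helper name `dump_posted_content`, the file naming scheme, and the use of an MD5 checksum are all assumptions made for illustration.

```php
<?php
// Hypothetical debugging helper: save the submitted editor content to a file
// so that the results from different browsers can be compared with a diff tool.
// Returns a summary array on success, or false if the file could not be written.
function dump_posted_content($content, $directory, $label) {
    // Keep the label safe for use in a file name.
    $safelabel = preg_replace('/[^A-Za-z0-9_-]/', '', $label);
    $path = rtrim($directory, '/\\') . DIRECTORY_SEPARATOR . 'data-' . $safelabel . '.txt';
    if (file_put_contents($path, $content) === false) {
        return false;
    }
    return array(
        'path'  => $path,
        'bytes' => strlen($content), // byte count of what the browser submitted
        'md5'   => md5($content),    // quick check for whether two captures differ
    );
}
```

A temporary call such as `dump_posted_content($_POST['content'], sys_get_temp_dir(), 'ie6');` near the top of the edit script, repeated with a different label for each browser, would give files that can be compared directly.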

Sam Marshall added a comment -

Grrr, forgot to explain - this testing was with core Moodle 1.9 (and IE6/IE7). So it doesn't include the special OU filter that I mentioned earlier.
