MDL-47003: Atto HTML cleaning is overly aggressive and nonsensical

    Details

    • Testing Instructions:

      Fully exercising this code takes quite a bit of work, mostly because you need numerous operating systems, browsers, and office suites. We are working on testing most or all of the combinations and will post results in the ticket, so the HQ testers will need to decide how thoroughly they want to test this.

      The browsers that are officially supported:

      • IE 9+
      • Safari 6+
      • Chrome 30+
      • Firefox 25+

      OSs that should probably be tried

      • Windows Vista, 7, 8, 8.1
      • Mac 10.7.x, 10.8.x, 10.9.x, 10.10.x
      • Ubuntu

      Office suites

      • MS Office 2007, 2010, 2013
      • MS Office Mac 2011, 2016 (Preview)
      • LibreOffice
      • OpenOffice
      • iWork (Pages/Keynote/Numbers)

      Things to watch for

      For all testing, keep the JavaScript console open and watch for warnings and errors.

      For each Browser/OS/Office Suite combination

      • After pasting rich content, confirm the result looks approximately as expected.
        Note: you may need to compare the results with the same actions in the current version of Moodle, as some quirky behaviors still exist after this work.
      • Enter HTML view and confirm that the source seems mostly cleaned of extra markup
      • Switch back to the rich view and confirm that it still looks as expected
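The "mostly cleaned of extra markup" check in HTML view can be partly mechanized. The sketch below is only a tester's aid and is not part of Atto: `findResidue` and the pattern list are hypothetical names, and the patterns are assumptions about what typical office-suite leftovers look like (`mso-` style properties, `Mso*` class names, conditional comments, Office namespace tags, `<style>` blocks). It counts suspicious fragments in a string copied from the HTML view.

```typescript
// Hypothetical checker for the "HTML view" step; the patterns are
// illustrative assumptions, not an exhaustive or official list.
const residuePatterns: Array<[string, RegExp]> = [
  ["mso- style property", /mso-[a-z-]+\s*:/gi],
  ["Mso class name", /class\s*=\s*["']?Mso\w*/gi],
  ["conditional comment", /<!--\[if[\s\S]*?<!\[endif\]-->/gi],
  ["Office namespace tag", /<\/?[ovwm]:\w+[^>]*>/gi],
  ["style block", /<style\b[^>]*>[\s\S]*?<\/style\s*>/gi],
];

// Returns a map from residue label to the number of matches found.
// Labels with no matches are omitted, so an empty object means
// nothing suspicious was detected by these patterns.
function findResidue(html: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const [label, pattern] of residuePatterns) {
    const matches = html.match(pattern);
    if (matches !== null) {
      counts[label] = matches.length;
    }
  }
  return counts;
}

// Example: a paragraph as Word might paste it.
const pastedSample =
  '<p class="MsoNormal" style="mso-margin-top-alt: auto;">Hi<o:p></o:p></p>';
const residueReport = findResidue(pastedSample);
const cleanReport = findResidue("<p>Hi</p>");
```

An empty result only means none of these particular patterns matched; the visual comparison in the steps above is still needed.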
    • Affected Branches:
      MOODLE_27_STABLE, MOODLE_28_STABLE
    • Fixed Branches:
      MOODLE_27_STABLE, MOODLE_28_STABLE
    • Pull Master Branch:
      MDL-47003-master

      Description

      The Atto cleanHTML function is a bit heavy-handed. It removes characters it shouldn't (see MDL-46746), and it also removes all comments (see https://moodle.org/mod/forum/discuss.php?d=264482).

      It also removes <style> tags, but not in a logical way. It removes the opening and closing tags, but leaves the style definitions there - meaning that the style definitions become visible text. If we don't allow 'style' blocks, then we should remove the entire block, otherwise, leave them alone.
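To make the difference concrete, here is a minimal sketch of the two behaviors. It is not the Atto cleanHTML code: both function names are hypothetical, and the regular expressions are simplified assumptions chosen only to reproduce the symptom and one possible alternative.

```typescript
// Hypothetical helpers; these are NOT the actual Atto cleanHTML rules.

// Strips only the opening and closing <style> tags. The CSS between
// them survives and ends up as visible text in the editor.
function stripStyleTagsOnly(html: string): string {
  return html.replace(/<\/?style\b[^>]*>/gi, "");
}

// Removes each <style> element together with its contents.
function removeStyleBlocks(html: string): string {
  return html.replace(/<style\b[^>]*>[\s\S]*?<\/style\s*>/gi, "");
}

const pasted = "<style>p.MsoNormal { margin: 0; }</style><p>Hello</p>";

const leaky = stripStyleTagsOnly(pasted);
// leaky === "p.MsoNormal { margin: 0; }<p>Hello</p>"

const clean = removeStyleBlocks(pasted);
// clean === "<p>Hello</p>"
```

Regex-based HTML handling is fragile, so a real fix might prefer to work on the parsed DOM; the point here is only that the block has to be handled as a whole, either removed entirely or left untouched.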

      Overall, it seems that a conglomeration of MS Word stripping research went into MDL-43857, but not all of the ramifications were considered for each item.

      People

      • Votes:
        8
      • Watchers:
        16

      Dates

      • Fix Release Date:
        11/May/15