Moodle
  1. Moodle
  2. MDL-37847

Plain text essays mess up HTML special characters.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.3.4, 2.4.1, 2.5
    • Fix Version/s: 2.3.5, 2.4.2
    • Component/s: Questions
    • Labels:
    • Testing Instructions:
      Hide

      1. Create an essay question that uses Plain text Response format.
      2. Preview the question, and input a response that includes HTML special characters. (See, for example https://moodle.org/mod/forum/discuss.php?d=220480#p963039.)
      3. Sumbit all and finish. Verify that the response has been saved and re-displayed correctly.

      4. Ideally verify the other three response format also still work.

      Show
      1. Create an essay question that uses Plain text Response format. 2. Preview the question, and input a response that includes HTML special characters. (See, for example https://moodle.org/mod/forum/discuss.php?d=220480#p963039 .) 3. Sumbit all and finish. Verify that the response has been saved and re-displayed correctly. 4. Ideally verify the other three response format also still work.
    • Affected Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE, MOODLE_25_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE
    • Pull from Repository:
    • Pull Master Branch:

      Description

      Steps to reproduce:

      1. Create an essay question where the response format is 'Plain text', not 'HTML editor'.

      2. Preview the question using deferred feedback, and enter a response like:

      Here <is> some junk</is>:

      x < 1

      3. Submit all and finish.

      Expected result:

      The question is shown in review mode, with the student's answer clearly visible.

      Actual result:

      The review shows

      Here some junk:

      x < 1

        Gliffy Diagrams

          Issue Links

            Activity

            Hide
            Tim Hunt added a comment -

            Fix done. Submitting for integration now.

            How come we have not spotted this before. It has been broken since 2.1!

            Show
            Tim Hunt added a comment - Fix done. Submitting for integration now. How come we have not spotted this before. It has been broken since 2.1!
            Hide
            Eloy Lafuente (stronk7) added a comment -

            Integrated (23, 24 & master), thanks!

            Show
            Eloy Lafuente (stronk7) added a comment - Integrated (23, 24 & master), thanks!
            Hide
            Frédéric Massart added a comment -

            The plain text format does not strip the HTML any more, but failing for discussion. I found another issue but I'm not sure it's related to this one. If not, then passing this test and raising another issue is probably the thing to do.

            Replication steps

            1. Edit your profile and set When editing text to Use standard web forms
            2. Preview a Plain Text question (response format: HTML editor)
            3. Enter some HTML (see below) selecting one of the following format:
              • Moodle
              • Plain text
              • Markdown

            Actual

            • The HTML has been stripped out

            Expected

            • The content remains the same

            Sample text used:

            <p><b>To: John Doe</b> <johndoe@acme.com>
            <br/>From: Jane Doe <doe@xyz.com>
            <p>
            

            Show
            Frédéric Massart added a comment - The plain text format does not strip the HTML any more, but failing for discussion. I found another issue but I'm not sure it's related to this one. If not, then passing this test and raising another issue is probably the thing to do. Replication steps Edit your profile and set When editing text to Use standard web forms Preview a Plain Text question (response format: HTML editor) Enter some HTML (see below) selecting one of the following format: Moodle Plain text Markdown Actual The HTML has been stripped out Expected The content remains the same Sample text used: <p><b>To: John Doe</b> <johndoe@acme.com> <br/>From: Jane Doe <doe@xyz.com> <p>
            Hide
            Tim Hunt added a comment -

            Fred, if you have found that, then it is a different issue to this bug.

            I am not sure what you have found is a bug. If you are typing HTML input, it should be

            <p><b>To: John Doe</b> &lt;johndoe@acme.com&gt;
            <br/>From: Jane Doe &lt;doe@xyz.com&gt;
            <p>
            

            If you are typing plain text input, then it should be

            To: John Doe <johndoe@acme.com>
             
            From: Jane Doe <doe@xyz.com>
            

            If you are using Moodle auto-format input, then I don't know what is correct. Should you escape the < > or not? I guess not, but hopefully this is documented somewhere.

            Anyway, the real problem is that the way optional_praram('name', PARAM_CLEANHTML); works is not compatible with how the editor system works. (The input may actually be in any number of different formats.) That is quite a fundamental problem! --> New issue.

            Show
            Tim Hunt added a comment - Fred, if you have found that, then it is a different issue to this bug. I am not sure what you have found is a bug. If you are typing HTML input, it should be <p><b>To: John Doe</b> &lt;johndoe@acme.com&gt; <br/>From: Jane Doe &lt;doe@xyz.com&gt; <p> If you are typing plain text input, then it should be To: John Doe <johndoe@acme.com>   From: Jane Doe <doe@xyz.com> If you are using Moodle auto-format input, then I don't know what is correct. Should you escape the < > or not? I guess not, but hopefully this is documented somewhere. Anyway, the real problem is that the way optional_praram('name', PARAM_CLEANHTML); works is not compatible with how the editor system works. (The input may actually be in any number of different formats.) That is quite a fundamental problem! --> New issue.
            Hide
            Tim Hunt added a comment -

            Hmm. For comparison, I just looked at mod/forum/post_form.php. That uses PARAM_RAW for the input, even though this is untrusted student input. Perhaps PARAM_CLEANHTML should not be used. Its PHPdoc comment does not make it at all clear when it is appropriate.

            Show
            Tim Hunt added a comment - Hmm. For comparison, I just looked at mod/forum/post_form.php. That uses PARAM_RAW for the input, even though this is untrusted student input. Perhaps PARAM_CLEANHTML should not be used. Its PHPdoc comment does not make it at all clear when it is appropriate.
            Hide
            Eloy Lafuente (stronk7) added a comment -

            Shouldn't we be applying different param cleanings based on the formats, more yet, shouldn't that be automatically provided by the editors stuff?

            Allowing the developer to decide there is somehow, stupid (see the PARAM_RAW), if we can provide that from the editor. Or at least have some helper function like:

            giveme_cleaner_for_format($format)

            And apply it everywhere. Else discrepancies will continue happening always.

            But surely that's another issue, if I've not understood this wrongly.

            Ciao

            Show
            Eloy Lafuente (stronk7) added a comment - Shouldn't we be applying different param cleanings based on the formats, more yet, shouldn't that be automatically provided by the editors stuff? Allowing the developer to decide there is somehow, stupid (see the PARAM_RAW), if we can provide that from the editor. Or at least have some helper function like: giveme_cleaner_for_format($format) And apply it everywhere. Else discrepancies will continue happening always. But surely that's another issue, if I've not understood this wrongly. Ciao
            Hide
            Tim Hunt added a comment - - edited

            The problem with that, Eloy, is that it is a completely different API from optional_param.

            My code is effectively

            $expectedparams = get_expected_params();
            $values = array();
            foreach ($expectedparams as $name => $type) {
                $values[$name] = optional_param($name, null, $type);
            }
            

            If the HTML editor requires something completely different, then it is a mess.

            Perhaps we need a PARAM_EDITOR_INPUT - except that we seem to use PARAM_RAW for that.

            Anyway, this is a different issue.

            Show
            Tim Hunt added a comment - - edited The problem with that, Eloy, is that it is a completely different API from optional_param. My code is effectively $expectedparams = get_expected_params(); $values = array(); foreach ($expectedparams as $name => $type) { $values[$name] = optional_param($name, null, $type); } If the HTML editor requires something completely different, then it is a mess. Perhaps we need a PARAM_EDITOR_INPUT - except that we seem to use PARAM_RAW for that. Anyway, this is a different issue.
            Hide
            Petr Skoda added a comment -

            We clean texts BEFORE output as part of the format_text()...

            Show
            Petr Skoda added a comment - We clean texts BEFORE output as part of the format_text()...
            Hide
            Tim Hunt added a comment -

            Of course we do Petr. That is not the question.

            Questions are:

            1. When is it correct to use PARAM_CLEANHTML - grep finds 48 references in master. Are they all correct?

            2. Speaking of which, why do we have this code duplicated in almost every mod_form?

            if (!empty($CFG->formatstringstriptags)) {
                $mform->setType('name', PARAM_TEXT);
            } else {
                $mform->setType('name', PARAM_CLEANHTML);
            }
            

            I can't understand this right now. Anwyay, if this is necessary, wouldn't it be better to introduce PARAM_ACTIVITYNAME, and hide that if inside clean_param?

            3. When you are using the HTML editor for student-like input (strick cleaning) what is the correct PARAM type?

            I think I have worked out the answer to 3. It is PARAM_RAW, so I need to fix the HTML-editor case of essay, but that is a different issue.

            Show
            Tim Hunt added a comment - Of course we do Petr. That is not the question. Questions are: 1. When is it correct to use PARAM_CLEANHTML - grep finds 48 references in master. Are they all correct? 2. Speaking of which, why do we have this code duplicated in almost every mod_form? if (!empty($CFG->formatstringstriptags)) { $mform->setType('name', PARAM_TEXT); } else { $mform->setType('name', PARAM_CLEANHTML); } I can't understand this right now. Anwyay, if this is necessary, wouldn't it be better to introduce PARAM_ACTIVITYNAME, and hide that if inside clean_param? 3. When you are using the HTML editor for student-like input (strick cleaning) what is the correct PARAM type? I think I have worked out the answer to 3. It is PARAM_RAW, so I need to fix the HTML-editor case of essay, but that is a different issue.
            Hide
            Eloy Lafuente (stronk7) added a comment - - edited

            So, new issues for 1. 2. & 3. above + resetting this and pass is the way?

            Edited: grrr, sorry.

            Show
            Eloy Lafuente (stronk7) added a comment - - edited So, new issues for 1. 2. & 3. above + resetting this and pass is the way? Edited: grrr, sorry.
            Hide
            Petr Skoda added a comment -

            1. most probably never, the cleanhtml was introduced to pass around html fragments without storage into database, that is why it did not have magic quotes originally

            2. because PARAM_TEXT is confusing name, it was supposed to mean plaintext with multilang compatible syntax only and sometimes also html entities

            3. anything from general text editor is PARAM_RAW, format_text() does the cleaning later and also if you want somebody else to edit the text (do you remember htmlarea executing any embedded JS?)

            Show
            Petr Skoda added a comment - 1. most probably never, the cleanhtml was introduced to pass around html fragments without storage into database, that is why it did not have magic quotes originally 2. because PARAM_TEXT is confusing name, it was supposed to mean plaintext with multilang compatible syntax only and sometimes also html entities 3. anything from general text editor is PARAM_RAW, format_text() does the cleaning later and also if you want somebody else to edit the text (do you remember htmlarea executing any embedded JS?)
            Hide
            Tim Hunt added a comment -

            Eloy, correct.

            Show
            Tim Hunt added a comment - Eloy, correct.
            Hide
            Tim Hunt added a comment -

            I created MDL-37847 for the qtype_essay issue (3).

            Show
            Tim Hunt added a comment - I created MDL-37847 for the qtype_essay issue (3).
            Hide
            Eloy Lafuente (stronk7) added a comment -

            I've create MDL-37908 about to clarify and review PARAM_CLEANHTML uses (1)

            And MDL-37909 about the repeated code block (2).

            So sending back to testing and putting it in Fred hands, thanks!

            Show
            Eloy Lafuente (stronk7) added a comment - I've create MDL-37908 about to clarify and review PARAM_CLEANHTML uses (1) And MDL-37909 about the repeated code block (2). So sending back to testing and putting it in Fred hands, thanks!
            Hide
            Frédéric Massart added a comment -

            Passing as behaviour was expected. Cheers!

            Show
            Frédéric Massart added a comment - Passing as behaviour was expected. Cheers!
            Hide
            Damyon Wiese added a comment -

            Congratulations! This issue has been resolved. Thanks for helping to make Moodle better for everyone!

            Regards, Damyon

            Show
            Damyon Wiese added a comment - Congratulations! This issue has been resolved. Thanks for helping to make Moodle better for everyone! Regards, Damyon
            Hide
            Justin Hunt added a comment -

            FWIW I think this change to questionattempt.php:

            • const PARAM_CLEANHTML_FILES = 'paramcleanhtmlfiles';
              + const PARAM_RAW_FILES = 'paramrawfiles';

            Should have been left till Moodle 2.5 . Because it has broken my question , the poodllrecording question. And possibly other 3rd party question types. If you change an API that is used in 3rd party types, in the minor releases it causes us a real headache.
            i) because we don't really find out about the change till our questions break. So its not just a bug fix, its a mail to all users, a flurry of email support blah blah.
            ii) and because when users update their Moodle versions, the questionattempt,php and the also changed essay question type will come down. So Moodle HQ is safe. But our question types won't. So we have to implement spaghetti code to handle the different host versions of Moodle our questions might be running in.

            Or did I miss something?

            Show
            Justin Hunt added a comment - FWIW I think this change to questionattempt.php: const PARAM_CLEANHTML_FILES = 'paramcleanhtmlfiles'; + const PARAM_RAW_FILES = 'paramrawfiles'; Should have been left till Moodle 2.5 . Because it has broken my question , the poodllrecording question. And possibly other 3rd party question types. If you change an API that is used in 3rd party types, in the minor releases it causes us a real headache. i) because we don't really find out about the change till our questions break. So its not just a bug fix, its a mail to all users, a flurry of email support blah blah. ii) and because when users update their Moodle versions, the questionattempt,php and the also changed essay question type will come down. So Moodle HQ is safe. But our question types won't. So we have to implement spaghetti code to handle the different host versions of Moodle our questions might be running in. Or did I miss something?
            Hide
            Tim Hunt added a comment -

            No. I missed something. Sorry, I did not realise this would actually break any third-party qtypes. Hopefully it will not take you too long to sort this out.

            This most certainly is a necessary bug fix. See https://moodle.org/mod/forum/discuss.php?d=220480#p963039. (And Moodle HQ had nothing to do with this, other than integrating it.)

            And it will not really require spaghetti code for you to fix this, just if (isset(question_attempt::PARAM_CLEANHTML_FILES)) in one place.

            So, as I say, sorry we screwed up slightly when fixing this nasty little bug, but please don't overreact.

            Show
            Tim Hunt added a comment - No. I missed something. Sorry, I did not realise this would actually break any third-party qtypes. Hopefully it will not take you too long to sort this out. This most certainly is a necessary bug fix. See https://moodle.org/mod/forum/discuss.php?d=220480#p963039 . (And Moodle HQ had nothing to do with this, other than integrating it.) And it will not really require spaghetti code for you to fix this, just if (isset(question_attempt::PARAM_CLEANHTML_FILES)) in one place. So, as I say, sorry we screwed up slightly when fixing this nasty little bug, but please don't overreact.
            Hide
            Justin Hunt added a comment -

            True, I overreacted a bit. Sorry. We are all just trying to get the software right.

            Show
            Justin Hunt added a comment - True, I overreacted a bit. Sorry. We are all just trying to get the software right.
            Hide
            Tim Hunt added a comment -

            No worries. I am sure I would have 'overreacted' in the same way, or worse, if it had been the other way around. I'm really sorry we caused you that hassle. I must remember Poodll when I touch this code again, and be warned, MDL-34640 will happen some time. You may want to add yourself as a watcher to that issue. (Hopefully those changes will help you in the long term, and won't inconvenience you too much in the short-term.)

            Show
            Tim Hunt added a comment - No worries. I am sure I would have 'overreacted' in the same way, or worse, if it had been the other way around. I'm really sorry we caused you that hassle. I must remember Poodll when I touch this code again, and be warned, MDL-34640 will happen some time. You may want to add yourself as a watcher to that issue. (Hopefully those changes will help you in the long term, and won't inconvenience you too much in the short-term.)
            Hide
            Eloy Lafuente (stronk7) added a comment - - edited

            Yeah, surely Tim would have been worse, lol.

            Apologies Justin, we really need to minimize these cases, grrr.

            Thanks both, btw... both.... "Hunts" ? heh!

            PS: Tim, I don't remember if it was done but perhaps it would be worth commenting the change in upgrade.txts or wherever to allow everybody with interests to know?

            Show
            Eloy Lafuente (stronk7) added a comment - - edited Yeah, surely Tim would have been worse, lol. Apologies Justin, we really need to minimize these cases, grrr. Thanks both, btw... both.... "Hunts" ? heh! PS: Tim, I don't remember if it was done but perhaps it would be worth commenting the change in upgrade.txts or wherever to allow everybody with interests to know?
            Hide
            Tim Hunt added a comment -

            I created MDL-38119 for adding this to the upgrade.txt files.

            Show
            Tim Hunt added a comment - I created MDL-38119 for adding this to the upgrade.txt files.

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: