Moodle
  1. Moodle
  2. MDL-14442

Improvements importing into database activity when using CSV files produced by export

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.9, 1.9.1
    • Fix Version/s: 1.9.2
    • Labels:
      None
    • Database:
      Any
    • Affected Branches:
      MOODLE_19_STABLE
    • Fixed Branches:
      MOODLE_19_STABLE
    • Rank:
      31011

      Description

      People have been asking for exporting all possible records of a database activity as a CSV file since one year. "All possible records" are all records from field types other than "file" or "picture". Dan Poltawski finally submitted this patch yesterday at MDL-8407. I have re-submitted it including some bugfixes recently, as "MDL-8407-20080419.patch", which works fine against the latest Moodle 1.9+.

      The patch attached here allows you to use the exported CSV file to be used to import all the records. The current implementation failed to import some fields, such as multimenu and latlong.

      In addition, the import process should be less resource eating, since it avoids to read files that are not necessary and does not create mega-objects such like $currentfield. During the import process, there is output written to the browser what has actually been done.

      1. MDL-14442_addslashes_2.patch
        0.9 kB
        Robert Allerstorfer
      2. MDL-14442_error-checking.patch
        2 kB
        Robert Allerstorfer
      3. MDL-14442-20080421.patch
        3 kB
        Robert Allerstorfer
      4. MDL-14442-clean_param.patch
        2 kB
        Robert Allerstorfer

        Issue Links

          Activity

          Hide
          Robert Allerstorfer added a comment -

          New version of the patch from today ("MDL-14442-20080420.patch") with additional fixes:

          + When "url" type fields also have a name, it also gets imported. A complete url field within the CSV file should look like: url name

          + First round of inserting new data_content fields ensures all contents to be NULL, the actual values get filled in at the second round.

          Tested successfully with these field types:
          text, textarea, menu, multimenu, url, latlong

          Show
          Robert Allerstorfer added a comment - New version of the patch from today (" MDL-14442 -20080420.patch") with additional fixes: + When "url" type fields also have a name, it also gets imported. A complete url field within the CSV file should look like: url name + First round of inserting new data_content fields ensures all contents to be NULL, the actual values get filled in at the second round. Tested successfully with these field types: text, textarea, menu, multimenu, url, latlong
          Hide
          Robert Allerstorfer added a comment -

          Next version of patch takes care when the content1 value (the name of an url field) contains spaces.

          Show
          Robert Allerstorfer added a comment - Next version of patch takes care when the content1 value (the name of an url field) contains spaces.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Assigning to Dongsheng... can you please, test latest patch (MDL-14442-20080421.patch (3 kb) and apply it (with MD confirmation).

          Thanks a lot by your continued work, Robert! B-)

          Show
          Eloy Lafuente (stronk7) added a comment - Assigning to Dongsheng... can you please, test latest patch ( MDL-14442 -20080421.patch (3 kb) and apply it (with MD confirmation). Thanks a lot by your continued work, Robert! B-)
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Addressing this for Moodle 2.0 (mainly to be paired with MDL-8407 ). Martin, can you confirm fixfor version? Sounds like an interesting option to be present ASAP but... FYC.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Addressing this for Moodle 2.0 (mainly to be paired with MDL-8407 ). Martin, can you confirm fixfor version? Sounds like an interesting option to be present ASAP but... FYC. Ciao
          Hide
          Robert Allerstorfer added a comment -

          Moodle 1.9.1 may be released next Wednesday, so it would be nice to see a progress of this issue before.

          Thanks,
          Robert

          Show
          Robert Allerstorfer added a comment - Moodle 1.9.1 may be released next Wednesday, so it would be nice to see a progress of this issue before. Thanks, Robert
          Hide
          Dongsheng Cai added a comment -

          Hi, Rebert, sorry for delay(I didn't work last Friday), I will confirm this issue with Martin next Monday, and merge your patch to 1.9.1 before Wednesday, so no worries.

          Show
          Dongsheng Cai added a comment - Hi, Rebert, sorry for delay(I didn't work last Friday), I will confirm this issue with Martin next Monday, and merge your patch to 1.9.1 before Wednesday, so no worries.
          Hide
          Robert Allerstorfer added a comment -

          Thanks Martin for giving green light for inclusion in 1.9.1. I have committed the patch to MOODLE_19_STABLE now.

          Best regards,
          Robert

          Show
          Robert Allerstorfer added a comment - Thanks Martin for giving green light for inclusion in 1.9.1. I have committed the patch to MOODLE_19_STABLE now. Best regards, Robert
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Hi Robert,

          some comments about your commit to 19_STABLE:

          • You need to commit it too to HEAD (any commit performed in branches must go to HEAD), marking the 19_STABLE files as merged once performed the commit.
          • If I'm not wrong this was paired with MDL-8407 (the option to export). So both should be there, if I'm not wrong.
          • 100% np... but we use to comment here before performing commits assigned to anybody else (Dongsheng in this case). I insist, 100% np... only trying to comment "courtesy" norms. I re-insist, 100% np! :-D

          Thanks and ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Hi Robert, some comments about your commit to 19_STABLE: You need to commit it too to HEAD (any commit performed in branches must go to HEAD), marking the 19_STABLE files as merged once performed the commit. If I'm not wrong this was paired with MDL-8407 (the option to export). So both should be there, if I'm not wrong. 100% np... but we use to comment here before performing commits assigned to anybody else (Dongsheng in this case). I insist, 100% np... only trying to comment "courtesy" norms. I re-insist, 100% np! :-D Thanks and ciao
          Hide
          Robert Allerstorfer added a comment -

          Dear Eloy,

          sorry for not having being compliant to the usual way - this actually was my first commit to the Moodle code so I'm still a newbie on these norms .

          Allowing to import a CSV file properly also makes sense without an option to export, so it does not harm yet not having the export option, too. When the export ability will be committed, the existing import ability already works well with it. There are still some things to do on the export side. I will post an updated patch at MDL-8407 soon.

          Dongsheng, would you like to do what I also should have done (committing to HEAD)? And please don't hesitate to make changes to my commit if you feel there should be done something.

          Show
          Robert Allerstorfer added a comment - Dear Eloy, sorry for not having being compliant to the usual way - this actually was my first commit to the Moodle code so I'm still a newbie on these norms . Allowing to import a CSV file properly also makes sense without an option to export, so it does not harm yet not having the export option, too. When the export ability will be committed, the existing import ability already works well with it. There are still some things to do on the export side. I will post an updated patch at MDL-8407 soon. Dongsheng, would you like to do what I also should have done (committing to HEAD)? And please don't hesitate to make changes to my commit if you feel there should be done something.
          Hide
          Dongsheng Cai added a comment -

          Hi, Robert, thanks for you committing

          Did you check out the HEAD? Go to the HEAD version, then merge the changes from MOODLE_19_STABLE, in this case, use this command:

          cvs update -kk -j MOODLE_19_MERGED -j MOODLE_19_STABLE import.php

          This command will merge your change to the local copy, you should commit this change to HEAD.
          Update the floating merge tag (MOODLE_19_MERGED) for the affected files so that this process can be repeated next time.

          See more: http://docs.moodle.org/en/CVS_for_Developers

          Show
          Dongsheng Cai added a comment - Hi, Robert, thanks for you committing Did you check out the HEAD? Go to the HEAD version, then merge the changes from MOODLE_19_STABLE, in this case, use this command: cvs update -kk -j MOODLE_19_MERGED -j MOODLE_19_STABLE import.php This command will merge your change to the local copy, you should commit this change to HEAD. Update the floating merge tag (MOODLE_19_MERGED) for the affected files so that this process can be repeated next time. See more: http://docs.moodle.org/en/CVS_for_Developers
          Hide
          Dongsheng Cai added a comment -

          I guess you are not online, so I merged the changes to HEAD for you, I hope you don't mind, thanks

          Show
          Dongsheng Cai added a comment - I guess you are not online, so I merged the changes to HEAD for you, I hope you don't mind, thanks
          Hide
          Martin Dougiamas added a comment -

          Can you please add some error checking on the return values for update_record and insert_record calls?

          Show
          Martin Dougiamas added a comment - Can you please add some error checking on the return values for update_record and insert_record calls?
          Hide
          Robert Allerstorfer added a comment -

          Something like this (MDL-14442_error-checking.patch)?

          Show
          Robert Allerstorfer added a comment - Something like this ( MDL-14442 _error-checking.patch)?
          Hide
          Martin Dougiamas added a comment -

          Yep, thanks!

          Show
          Martin Dougiamas added a comment - Yep, thanks!
          Hide
          Robert Allerstorfer added a comment -

          OK, committed to MOODLE_19_STABLE. Dongsheng, I think merging it into HEAD cannot be automatically, since I think the function error() must be replaced by print_error(). Would you like to do this?

          Thanks,
          Robert

          Show
          Robert Allerstorfer added a comment - OK, committed to MOODLE_19_STABLE. Dongsheng, I think merging it into HEAD cannot be automatically, since I think the function error() must be replaced by print_error(). Would you like to do this? Thanks, Robert
          Hide
          Dongsheng Cai added a comment -

          I have replaced the error(), thanks Robert.

          Show
          Dongsheng Cai added a comment - I have replaced the error(), thanks Robert.
          Hide
          Dongsheng Cai added a comment -

          Fixed and merged into HEAD, please review, feel free to reopen if you find any problem.

          Show
          Dongsheng Cai added a comment - Fixed and merged into HEAD, please review, feel free to reopen if you find any problem.
          Hide
          Robert Allerstorfer added a comment -

          This patch (MDL-14442_addslashes.patch) should also be applied - otherwise the import process fails at the first occurence of a field name or value containing quotes.

          Show
          Robert Allerstorfer added a comment - This patch ( MDL-14442 _addslashes.patch) should also be applied - otherwise the import process fails at the first occurence of a field name or value containing quotes.
          Hide
          Robert Allerstorfer added a comment -

          The addslashes patch should only add backslashes to $value (not to $name). Attached is the corrected patch (MDL-14442_addslashes_2.patch). tested successfully.

          Show
          Robert Allerstorfer added a comment - The addslashes patch should only add backslashes to $value (not to $name). Attached is the corrected patch ( MDL-14442 _addslashes_2.patch). tested successfully.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Hi Robert, well spotted!

          Are you going to commit this? Else I'll do in 19_STABLE and HEAD (1.9.1 is really near and I think it's ok and safe to send the fix before release).

          (another tip: if one resolved/closed bug needs further action - like this one - feel free to reopen it. That way, it will ping on each developer list of pending bugs, else it can be forgotten easily. Or alternatively, create a new bug, as you want. Personally I prefer to reopen if it's directly related and has been resolved/closed 1-2 days ago).

          Reopening this... thanks!

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Hi Robert, well spotted! Are you going to commit this? Else I'll do in 19_STABLE and HEAD (1.9.1 is really near and I think it's ok and safe to send the fix before release). (another tip: if one resolved/closed bug needs further action - like this one - feel free to reopen it. That way, it will ping on each developer list of pending bugs, else it can be forgotten easily. Or alternatively, create a new bug, as you want. Personally I prefer to reopen if it's directly related and has been resolved/closed 1-2 days ago). Reopening this... thanks! Ciao
          Hide
          Robert Allerstorfer added a comment -

          Thanks Eloy for all your tips

          I have committed the fix to 19_STABLE and HEAD. Let's hope 1.9.1 will get it.

          Show
          Robert Allerstorfer added a comment - Thanks Eloy for all your tips I have committed the fix to 19_STABLE and HEAD. Let's hope 1.9.1 will get it.
          Hide
          Robert Allerstorfer added a comment -

          Eloy, just FYI, I cannot change the status of this issue.

          Show
          Robert Allerstorfer added a comment - Eloy, just FYI, I cannot change the status of this issue.
          Hide
          Dongsheng Cai added a comment -

          Thanks Robert, feel free to reopen

          Show
          Dongsheng Cai added a comment - Thanks Robert, feel free to reopen
          Hide
          Martin Dougiamas added a comment -

          I've made you a developer in the tracker now, Robert, so you can assign things.

          Show
          Martin Dougiamas added a comment - I've made you a developer in the tracker now, Robert, so you can assign things.
          Hide
          Robert Allerstorfer added a comment -

          Thanks Martin and congratulations for the 1.9.1 release

          Show
          Robert Allerstorfer added a comment - Thanks Martin and congratulations for the 1.9.1 release
          Hide
          Robert Allerstorfer added a comment -

          addslashes() is not enough. htmlspecialchars() must also be applied to each field value before importing it. Otherwise, clean_param($value, PARAM_NOTAGS) would cut off everything beginning from a < sign (but not from a > sign).

          Having a field "sample description" for example "fraction > 200µm" would currently work, but "fraction < 200µm" would be cut off to "fraction ". With the attached patch (MDL-14442_htmlspecialchars.patch), all variants will work. clean_param($value, PARAM_NOTAGS) would no more be needed because after a htmlspecialchars() no literal < or > can exist any longer, however, I have kept it to be double-safe.

          Tested successfully. If I get green light, I will commit it into MOODLE_19_STABLE and HEAD.

          Show
          Robert Allerstorfer added a comment - addslashes() is not enough. htmlspecialchars() must also be applied to each field value before importing it. Otherwise, clean_param($value, PARAM_NOTAGS) would cut off everything beginning from a < sign (but not from a > sign). Having a field "sample description" for example "fraction > 200µm" would currently work, but "fraction < 200µm" would be cut off to "fraction ". With the attached patch ( MDL-14442 _htmlspecialchars.patch), all variants will work. clean_param($value, PARAM_NOTAGS) would no more be needed because after a htmlspecialchars() no literal < or > can exist any longer, however, I have kept it to be double-safe. Tested successfully. If I get green light, I will commit it into MOODLE_19_STABLE and HEAD.
          Hide
          Robert Allerstorfer added a comment -

          new patch posted above needs to be applied

          Show
          Robert Allerstorfer added a comment - new patch posted above needs to be applied
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Uhm... I'm not really sure if that's the best solution. Some fields must be able to handle proper HTML and using htmlspecialchars() will kill content, for sure.

          Instead, the followed approach should be to "relax" the validation of the field to PARAM_RAW (looking for analogies with the cleaning performed in forum->posts or glossary->entries) and applying trustext functions to it. Or use, simply, PARAM_CLEAN, that performs basic protection.

          Petr, any insight ?

          Show
          Eloy Lafuente (stronk7) added a comment - Uhm... I'm not really sure if that's the best solution. Some fields must be able to handle proper HTML and using htmlspecialchars() will kill content, for sure. Instead, the followed approach should be to "relax" the validation of the field to PARAM_RAW (looking for analogies with the cleaning performed in forum->posts or glossary->entries) and applying trustext functions to it. Or use, simply, PARAM_CLEAN, that performs basic protection. Petr, any insight ?
          Hide
          Petr Škoda added a comment -

          Trusttext is very tricky my -100 for that here.
          The fields are responsible for the XSS protection, in general the Data mod allows teacher to use JS in templates, but it must prevent JS in data that was received from students. Normally we do the cleaning before the display, which means there should not be issues during import of actual entries. In any case only the field implementation knows what to do, we should let it handle this if needed.

          Show
          Petr Škoda added a comment - Trusttext is very tricky my -100 for that here. The fields are responsible for the XSS protection, in general the Data mod allows teacher to use JS in templates, but it must prevent JS in data that was received from students. Normally we do the cleaning before the display, which means there should not be issues during import of actual entries. In any case only the field implementation knows what to do, we should let it handle this if needed.
          Hide
          Robert Allerstorfer added a comment -

          Eloy,

          PARAM_RAW does not do any cleaning/processing at all, so its using it would be completely useless.

          Regarding PARAM_CLEAN, a comment in 'moodlelib.php' reads as follows:
          /**

          • PARAM_CLEAN - obsoleted, please try to use more specific type of parameter.
          • It was one of the first types, that is why it is abused so much
            */

          The only field type that may contain HTML tags is 'textarea'. Thus, htmlspecialchars() could safely be applied at least to all other field types. How should we proceed?

          Show
          Robert Allerstorfer added a comment - Eloy, PARAM_RAW does not do any cleaning/processing at all, so its using it would be completely useless. Regarding PARAM_CLEAN, a comment in 'moodlelib.php' reads as follows: /** PARAM_CLEAN - obsoleted, please try to use more specific type of parameter. It was one of the first types, that is why it is abused so much */ The only field type that may contain HTML tags is 'textarea'. Thus, htmlspecialchars() could safely be applied at least to all other field types. How should we proceed?
          Hide
          Petr Škoda added a comment -

          You do not know what some 3rd party fields do with the data, hardcoded htmlspecialchars() here is imho a bad solution. I really think that by default it should do nothing and if needed let fields decide what to do.

          Show
          Petr Škoda added a comment - You do not know what some 3rd party fields do with the data, hardcoded htmlspecialchars() here is imho a bad solution. I really think that by default it should do nothing and if needed let fields decide what to do.
          Hide
          Robert Allerstorfer added a comment -

          Petr,

          so how would your proposed patch look like? Thew current implementation (clean_param($value, PARAM_NOTAGS)) is not good.

          Show
          Robert Allerstorfer added a comment - Petr, so how would your proposed patch look like? Thew current implementation (clean_param($value, PARAM_NOTAGS)) is not good.
          Hide
          Petr Škoda added a comment -

          Unfortunately I can not help you now, will be back working on moodle code on Thursday

          Show
          Petr Škoda added a comment - Unfortunately I can not help you now, will be back working on moodle code on Thursday
          Hide
          Robert Allerstorfer added a comment -

          The new attached patch (MDL-14442-clean_param.patch) tries to solve this problem better. I have completely removed clean_param($value, PARAM_NOTAGS) since it neither allows HTML nor literal < or > symbols. As mentioned above, the only field type where HTML is possible is 'textarea', thus I now treat this type differently than all the others:

          To keep HTML in 'textarea' fields, clean_param($value, PARAM_CLEANHTML) cleaning will be performed. This is a compromise since it still strips something like "fraction < 200µm" to "fraction " (but keeps "fraction > 200µm" as is). The workaround is to replace those symbols in the CSV file prior to use it for importing.

          In all other fields, literal < and > symbols will be replaced by their HTML entities (but not & - so htmlspecialchars() would in fact be too much). I have entensively tested this patch together with MDL-8407_20080518.patch and both work fine for me.

          Eloy, what do you think?

          Show
          Robert Allerstorfer added a comment - The new attached patch ( MDL-14442 -clean_param.patch) tries to solve this problem better. I have completely removed clean_param($value, PARAM_NOTAGS) since it neither allows HTML nor literal < or > symbols. As mentioned above, the only field type where HTML is possible is 'textarea', thus I now treat this type differently than all the others: To keep HTML in 'textarea' fields, clean_param($value, PARAM_CLEANHTML) cleaning will be performed. This is a compromise since it still strips something like "fraction < 200µm" to "fraction " (but keeps "fraction > 200µm" as is). The workaround is to replace those symbols in the CSV file prior to use it for importing. In all other fields, literal < and > symbols will be replaced by their HTML entities (but not & - so htmlspecialchars() would in fact be too much). I have entensively tested this patch together with MDL-8407 _20080518.patch and both work fine for me. Eloy, what do you think?
          Hide
          Robert Allerstorfer added a comment -

          I am planning to commit MDL-14442-clean_param.patch to MOODLE_19_STABLE and HEAD in a few days, since there has yet not been any response against it. I have tested it on importing large sets of records, with all possible field types, and it works fine (in contrast to the currently used PARAM_NOTAGS on all field types). I don't see a reason to keep known buggy implementations for a long time when there are tested working solutions available. If someone is against the commit, please give feedback.

          Show
          Robert Allerstorfer added a comment - I am planning to commit MDL-14442 -clean_param.patch to MOODLE_19_STABLE and HEAD in a few days, since there has yet not been any response against it. I have tested it on importing large sets of records, with all possible field types, and it works fine (in contrast to the currently used PARAM_NOTAGS on all field types). I don't see a reason to keep known buggy implementations for a long time when there are tested working solutions available. If someone is against the commit, please give feedback.
          Hide
          Robert Allerstorfer added a comment -

          Dongsheng, since you have been assigned as QA assignee, could you pls verify my last commit, so we could reclose this bug?

          Show
          Robert Allerstorfer added a comment - Dongsheng, since you have been assigned as QA assignee, could you pls verify my last commit, so we could reclose this bug?
          Hide
          Dongsheng Cai added a comment -

          Reclose, Thanks

          Show
          Dongsheng Cai added a comment - Reclose, Thanks

            People

            • Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: