Moodle

Improvements importing into database activity when using CSV files produced by export

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 1.9, 1.9.1
  • Fix Version/s: 1.9.2
  • Labels:
    None
  • Database:
    Any
  • Affected Branches:
    MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_19_STABLE

Description

People have been asking for exporting all possible records of a database activity as a CSV file since one year. "All possible records" are all records from field types other than "file" or "picture". Dan Poltawski finally submitted this patch yesterday at MDL-8407. I have re-submitted it including some bugfixes recently, as "MDL-8407-20080419.patch", which works fine against the latest Moodle 1.9+.

The patch attached here allows you to use the exported CSV file to be used to import all the records. The current implementation failed to import some fields, such as multimenu and latlong.

In addition, the import process should be less resource eating, since it avoids to read files that are not necessary and does not create mega-objects such like $currentfield. During the import process, there is output written to the browser what has actually been done.

  1. MDL-14442_addslashes_2.patch
    14/May/08 9:03 PM
    0.9 kB
    Robert Allerstorfer
  2. MDL-14442_error-checking.patch
    13/May/08 5:47 PM
    2 kB
    Robert Allerstorfer
  3. MDL-14442-20080421.patch
    21/Apr/08 1:24 AM
    3 kB
    Robert Allerstorfer
  4. MDL-14442-clean_param.patch
    19/May/08 4:53 AM
    2 kB
    Robert Allerstorfer

Issue Links

Activity

Hide
Robert Allerstorfer added a comment -

New version of the patch from today ("MDL-14442-20080420.patch") with additional fixes:

+ When "url" type fields also have a name, it also gets imported. A complete url field within the CSV file should look like: url name

+ First round of inserting new data_content fields ensures all contents to be NULL, the actual values get filled in at the second round.

Tested successfully with these field types:
text, textarea, menu, multimenu, url, latlong

Show
Robert Allerstorfer added a comment - New version of the patch from today ("MDL-14442-20080420.patch") with additional fixes: + When "url" type fields also have a name, it also gets imported. A complete url field within the CSV file should look like: url name + First round of inserting new data_content fields ensures all contents to be NULL, the actual values get filled in at the second round. Tested successfully with these field types: text, textarea, menu, multimenu, url, latlong
Hide
Robert Allerstorfer added a comment -

Next version of patch takes care when the content1 value (the name of an url field) contains spaces.

Show
Robert Allerstorfer added a comment - Next version of patch takes care when the content1 value (the name of an url field) contains spaces.
Hide
Eloy Lafuente (stronk7) added a comment -

Assigning to Dongsheng... can you please, test latest patch (MDL-14442-20080421.patch (3 kb) and apply it (with MD confirmation).

Thanks a lot by your continued work, Robert! B-)

Show
Eloy Lafuente (stronk7) added a comment - Assigning to Dongsheng... can you please, test latest patch (MDL-14442-20080421.patch (3 kb) and apply it (with MD confirmation). Thanks a lot by your continued work, Robert! B-)
Hide
Eloy Lafuente (stronk7) added a comment -

Addressing this for Moodle 2.0 (mainly to be paired with MDL-8407 ). Martin, can you confirm fixfor version? Sounds like an interesting option to be present ASAP but... FYC.

Ciao

Show
Eloy Lafuente (stronk7) added a comment - Addressing this for Moodle 2.0 (mainly to be paired with MDL-8407 ). Martin, can you confirm fixfor version? Sounds like an interesting option to be present ASAP but... FYC. Ciao
Hide
Robert Allerstorfer added a comment -

Moodle 1.9.1 may be released next Wednesday, so it would be nice to see a progress of this issue before.

Thanks,
Robert

Show
Robert Allerstorfer added a comment - Moodle 1.9.1 may be released next Wednesday, so it would be nice to see a progress of this issue before. Thanks, Robert
Hide
Dongsheng Cai added a comment -

Hi, Rebert, sorry for delay(I didn't work last Friday), I will confirm this issue with Martin next Monday, and merge your patch to 1.9.1 before Wednesday, so no worries.

Show
Dongsheng Cai added a comment - Hi, Rebert, sorry for delay(I didn't work last Friday), I will confirm this issue with Martin next Monday, and merge your patch to 1.9.1 before Wednesday, so no worries.
Hide
Robert Allerstorfer added a comment -

Thanks Martin for giving green light for inclusion in 1.9.1. I have committed the patch to MOODLE_19_STABLE now.

Best regards,
Robert

Show
Robert Allerstorfer added a comment - Thanks Martin for giving green light for inclusion in 1.9.1. I have committed the patch to MOODLE_19_STABLE now. Best regards, Robert
Hide
Eloy Lafuente (stronk7) added a comment -

Hi Robert,

some comments about your commit to 19_STABLE:

  • You need to commit it too to HEAD (any commit performed in branches must go to HEAD), marking the 19_STABLE files as merged once performed the commit.
  • If I'm not wrong this was paired with MDL-8407 (the option to export). So both should be there, if I'm not wrong.
  • 100% np... but we use to comment here before performing commits assigned to anybody else (Dongsheng in this case). I insist, 100% np... only trying to comment "courtesy" norms. I re-insist, 100% np! :-D

Thanks and ciao

Show
Eloy Lafuente (stronk7) added a comment - Hi Robert, some comments about your commit to 19_STABLE:
  • You need to commit it too to HEAD (any commit performed in branches must go to HEAD), marking the 19_STABLE files as merged once performed the commit.
  • If I'm not wrong this was paired with MDL-8407 (the option to export). So both should be there, if I'm not wrong.
  • 100% np... but we use to comment here before performing commits assigned to anybody else (Dongsheng in this case). I insist, 100% np... only trying to comment "courtesy" norms. I re-insist, 100% np! :-D
Thanks and ciao
Hide
Robert Allerstorfer added a comment -

Dear Eloy,

sorry for not having being compliant to the usual way - this actually was my first commit to the Moodle code so I'm still a newbie on these norms .

Allowing to import a CSV file properly also makes sense without an option to export, so it does not harm yet not having the export option, too. When the export ability will be committed, the existing import ability already works well with it. There are still some things to do on the export side. I will post an updated patch at MDL-8407 soon.

Dongsheng, would you like to do what I also should have done (committing to HEAD)? And please don't hesitate to make changes to my commit if you feel there should be done something.

Show
Robert Allerstorfer added a comment - Dear Eloy, sorry for not having being compliant to the usual way - this actually was my first commit to the Moodle code so I'm still a newbie on these norms . Allowing to import a CSV file properly also makes sense without an option to export, so it does not harm yet not having the export option, too. When the export ability will be committed, the existing import ability already works well with it. There are still some things to do on the export side. I will post an updated patch at MDL-8407 soon. Dongsheng, would you like to do what I also should have done (committing to HEAD)? And please don't hesitate to make changes to my commit if you feel there should be done something.
Hide
Dongsheng Cai added a comment -

Hi, Robert, thanks for you committing

Did you check out the HEAD? Go to the HEAD version, then merge the changes from MOODLE_19_STABLE, in this case, use this command:

cvs update -kk -j MOODLE_19_MERGED -j MOODLE_19_STABLE import.php

This command will merge your change to the local copy, you should commit this change to HEAD.
Update the floating merge tag (MOODLE_19_MERGED) for the affected files so that this process can be repeated next time.

See more: http://docs.moodle.org/en/CVS_for_Developers

Show
Dongsheng Cai added a comment - Hi, Robert, thanks for you committing Did you check out the HEAD? Go to the HEAD version, then merge the changes from MOODLE_19_STABLE, in this case, use this command: cvs update -kk -j MOODLE_19_MERGED -j MOODLE_19_STABLE import.php This command will merge your change to the local copy, you should commit this change to HEAD. Update the floating merge tag (MOODLE_19_MERGED) for the affected files so that this process can be repeated next time. See more: http://docs.moodle.org/en/CVS_for_Developers
Hide
Dongsheng Cai added a comment -

I guess you are not online, so I merged the changes to HEAD for you, I hope you don't mind, thanks

Show
Dongsheng Cai added a comment - I guess you are not online, so I merged the changes to HEAD for you, I hope you don't mind, thanks
Hide
Martin Dougiamas added a comment -

Can you please add some error checking on the return values for update_record and insert_record calls?

Show
Martin Dougiamas added a comment - Can you please add some error checking on the return values for update_record and insert_record calls?
Hide
Robert Allerstorfer added a comment -

Something like this (MDL-14442_error-checking.patch)?

Show
Robert Allerstorfer added a comment - Something like this (MDL-14442_error-checking.patch)?
Hide
Martin Dougiamas added a comment -

Yep, thanks!

Show
Martin Dougiamas added a comment - Yep, thanks!
Hide
Robert Allerstorfer added a comment -

OK, committed to MOODLE_19_STABLE. Dongsheng, I think merging it into HEAD cannot be automatically, since I think the function error() must be replaced by print_error(). Would you like to do this?

Thanks,
Robert

Show
Robert Allerstorfer added a comment - OK, committed to MOODLE_19_STABLE. Dongsheng, I think merging it into HEAD cannot be automatically, since I think the function error() must be replaced by print_error(). Would you like to do this? Thanks, Robert
Hide
Dongsheng Cai added a comment -

I have replaced the error(), thanks Robert.

Show
Dongsheng Cai added a comment - I have replaced the error(), thanks Robert.
Hide
Dongsheng Cai added a comment -

Fixed and merged into HEAD, please review, feel free to reopen if you find any problem.

Show
Dongsheng Cai added a comment - Fixed and merged into HEAD, please review, feel free to reopen if you find any problem.
Hide
Robert Allerstorfer added a comment -

This patch (MDL-14442_addslashes.patch) should also be applied - otherwise the import process fails at the first occurence of a field name or value containing quotes.

Show
Robert Allerstorfer added a comment - This patch (MDL-14442_addslashes.patch) should also be applied - otherwise the import process fails at the first occurence of a field name or value containing quotes.
Hide
Robert Allerstorfer added a comment -

The addslashes patch should only add backslashes to $value (not to $name). Attached is the corrected patch (MDL-14442_addslashes_2.patch). tested successfully.

Show
Robert Allerstorfer added a comment - The addslashes patch should only add backslashes to $value (not to $name). Attached is the corrected patch (MDL-14442_addslashes_2.patch). tested successfully.
Hide
Eloy Lafuente (stronk7) added a comment -

Hi Robert, well spotted!

Are you going to commit this? Else I'll do in 19_STABLE and HEAD (1.9.1 is really near and I think it's ok and safe to send the fix before release).

(another tip: if one resolved/closed bug needs further action - like this one - feel free to reopen it. That way, it will ping on each developer list of pending bugs, else it can be forgotten easily. Or alternatively, create a new bug, as you want. Personally I prefer to reopen if it's directly related and has been resolved/closed 1-2 days ago).

Reopening this... thanks!

Ciao

Show
Eloy Lafuente (stronk7) added a comment - Hi Robert, well spotted! Are you going to commit this? Else I'll do in 19_STABLE and HEAD (1.9.1 is really near and I think it's ok and safe to send the fix before release). (another tip: if one resolved/closed bug needs further action - like this one - feel free to reopen it. That way, it will ping on each developer list of pending bugs, else it can be forgotten easily. Or alternatively, create a new bug, as you want. Personally I prefer to reopen if it's directly related and has been resolved/closed 1-2 days ago). Reopening this... thanks! Ciao
Hide
Robert Allerstorfer added a comment -

Thanks Eloy for all your tips

I have committed the fix to 19_STABLE and HEAD. Let's hope 1.9.1 will get it.

Show
Robert Allerstorfer added a comment - Thanks Eloy for all your tips I have committed the fix to 19_STABLE and HEAD. Let's hope 1.9.1 will get it.
Hide
Robert Allerstorfer added a comment -

Eloy, just FYI, I cannot change the status of this issue.

Show
Robert Allerstorfer added a comment - Eloy, just FYI, I cannot change the status of this issue.
Hide
Dongsheng Cai added a comment -

Thanks Robert, feel free to reopen

Show
Dongsheng Cai added a comment - Thanks Robert, feel free to reopen
Hide
Martin Dougiamas added a comment -

I've made you a developer in the tracker now, Robert, so you can assign things.

Show
Martin Dougiamas added a comment - I've made you a developer in the tracker now, Robert, so you can assign things.
Hide
Robert Allerstorfer added a comment -

Thanks Martin and congratulations for the 1.9.1 release

Show
Robert Allerstorfer added a comment - Thanks Martin and congratulations for the 1.9.1 release
Hide
Robert Allerstorfer added a comment -

addslashes() is not enough. htmlspecialchars() must also be applied to each field value before importing it. Otherwise, clean_param($value, PARAM_NOTAGS) would cut off everything beginning from a < sign (but not from a > sign).

Having a field "sample description" for example "fraction > 200µm" would currently work, but "fraction < 200µm" would be cut off to "fraction ". With the attached patch (MDL-14442_htmlspecialchars.patch), all variants will work. clean_param($value, PARAM_NOTAGS) would no more be needed because after a htmlspecialchars() no literal < or > can exist any longer, however, I have kept it to be double-safe.

Tested successfully. If I get green light, I will commit it into MOODLE_19_STABLE and HEAD.

Show
Robert Allerstorfer added a comment - addslashes() is not enough. htmlspecialchars() must also be applied to each field value before importing it. Otherwise, clean_param($value, PARAM_NOTAGS) would cut off everything beginning from a < sign (but not from a > sign). Having a field "sample description" for example "fraction > 200µm" would currently work, but "fraction < 200µm" would be cut off to "fraction ". With the attached patch (MDL-14442_htmlspecialchars.patch), all variants will work. clean_param($value, PARAM_NOTAGS) would no more be needed because after a htmlspecialchars() no literal < or > can exist any longer, however, I have kept it to be double-safe. Tested successfully. If I get green light, I will commit it into MOODLE_19_STABLE and HEAD.
Hide
Robert Allerstorfer added a comment -

new patch posted above needs to be applied

Show
Robert Allerstorfer added a comment - new patch posted above needs to be applied
Hide
Eloy Lafuente (stronk7) added a comment -

Uhm... I'm not really sure if that's the best solution. Some fields must be able to handle proper HTML and using htmlspecialchars() will kill content, for sure.

Instead, the followed approach should be to "relax" the validation of the field to PARAM_RAW (looking for analogies with the cleaning performed in forum->posts or glossary->entries) and applying trustext functions to it. Or use, simply, PARAM_CLEAN, that performs basic protection.

Petr, any insight ?

Show
Eloy Lafuente (stronk7) added a comment - Uhm... I'm not really sure if that's the best solution. Some fields must be able to handle proper HTML and using htmlspecialchars() will kill content, for sure. Instead, the followed approach should be to "relax" the validation of the field to PARAM_RAW (looking for analogies with the cleaning performed in forum->posts or glossary->entries) and applying trustext functions to it. Or use, simply, PARAM_CLEAN, that performs basic protection. Petr, any insight ?
Hide
Petr Škoda (skodak) added a comment -

Trusttext is very tricky my -100 for that here.
The fields are responsible for the XSS protection, in general the Data mod allows teacher to use JS in templates, but it must prevent JS in data that was received from students. Normally we do the cleaning before the display, which means there should not be issues during import of actual entries. In any case only the field implementation knows what to do, we should let it handle this if needed.

Show
Petr Škoda (skodak) added a comment - Trusttext is very tricky my -100 for that here. The fields are responsible for the XSS protection, in general the Data mod allows teacher to use JS in templates, but it must prevent JS in data that was received from students. Normally we do the cleaning before the display, which means there should not be issues during import of actual entries. In any case only the field implementation knows what to do, we should let it handle this if needed.
Hide
Robert Allerstorfer added a comment -

Eloy,

PARAM_RAW does not do any cleaning/processing at all, so its using it would be completely useless.

Regarding PARAM_CLEAN, a comment in 'moodlelib.php' reads as follows:
/**

  • PARAM_CLEAN - obsoleted, please try to use more specific type of parameter.
  • It was one of the first types, that is why it is abused so much
    */

The only field type that may contain HTML tags is 'textarea'. Thus, htmlspecialchars() could safely be applied at least to all other field types. How should we proceed?

Show
Robert Allerstorfer added a comment - Eloy, PARAM_RAW does not do any cleaning/processing at all, so its using it would be completely useless. Regarding PARAM_CLEAN, a comment in 'moodlelib.php' reads as follows: /**
  • PARAM_CLEAN - obsoleted, please try to use more specific type of parameter.
  • It was one of the first types, that is why it is abused so much */
The only field type that may contain HTML tags is 'textarea'. Thus, htmlspecialchars() could safely be applied at least to all other field types. How should we proceed?
Hide
Petr Škoda (skodak) added a comment -

You do not know what some 3rd party fields do with the data, hardcoded htmlspecialchars() here is imho a bad solution. I really think that by default it should do nothing and if needed let fields decide what to do.

Show
Petr Škoda (skodak) added a comment - You do not know what some 3rd party fields do with the data, hardcoded htmlspecialchars() here is imho a bad solution. I really think that by default it should do nothing and if needed let fields decide what to do.
Hide
Robert Allerstorfer added a comment -

Petr,

so how would your proposed patch look like? Thew current implementation (clean_param($value, PARAM_NOTAGS)) is not good.

Show
Robert Allerstorfer added a comment - Petr, so how would your proposed patch look like? Thew current implementation (clean_param($value, PARAM_NOTAGS)) is not good.
Hide
Petr Škoda (skodak) added a comment -

Unfortunately I can not help you now, will be back working on moodle code on Thursday

Show
Petr Škoda (skodak) added a comment - Unfortunately I can not help you now, will be back working on moodle code on Thursday
Hide
Robert Allerstorfer added a comment -

The new attached patch (MDL-14442-clean_param.patch) tries to solve this problem better. I have completely removed clean_param($value, PARAM_NOTAGS) since it neither allows HTML nor literal < or > symbols. As mentioned above, the only field type where HTML is possible is 'textarea', thus I now treat this type differently than all the others:

To keep HTML in 'textarea' fields, clean_param($value, PARAM_CLEANHTML) cleaning will be performed. This is a compromise since it still strips something like "fraction < 200µm" to "fraction " (but keeps "fraction > 200µm" as is). The workaround is to replace those symbols in the CSV file prior to use it for importing.

In all other fields, literal < and > symbols will be replaced by their HTML entities (but not & - so htmlspecialchars() would in fact be too much). I have entensively tested this patch together with MDL-8407_20080518.patch and both work fine for me.

Eloy, what do you think?

Show
Robert Allerstorfer added a comment - The new attached patch (MDL-14442-clean_param.patch) tries to solve this problem better. I have completely removed clean_param($value, PARAM_NOTAGS) since it neither allows HTML nor literal < or > symbols. As mentioned above, the only field type where HTML is possible is 'textarea', thus I now treat this type differently than all the others: To keep HTML in 'textarea' fields, clean_param($value, PARAM_CLEANHTML) cleaning will be performed. This is a compromise since it still strips something like "fraction < 200µm" to "fraction " (but keeps "fraction > 200µm" as is). The workaround is to replace those symbols in the CSV file prior to use it for importing. In all other fields, literal < and > symbols will be replaced by their HTML entities (but not & - so htmlspecialchars() would in fact be too much). I have entensively tested this patch together with MDL-8407_20080518.patch and both work fine for me. Eloy, what do you think?
Hide
Robert Allerstorfer added a comment -

I am planning to commit MDL-14442-clean_param.patch to MOODLE_19_STABLE and HEAD in a few days, since there has yet not been any response against it. I have tested it on importing large sets of records, with all possible field types, and it works fine (in contrast to the currently used PARAM_NOTAGS on all field types). I don't see a reason to keep known buggy implementations for a long time when there are tested working solutions available. If someone is against the commit, please give feedback.

Show
Robert Allerstorfer added a comment - I am planning to commit MDL-14442-clean_param.patch to MOODLE_19_STABLE and HEAD in a few days, since there has yet not been any response against it. I have tested it on importing large sets of records, with all possible field types, and it works fine (in contrast to the currently used PARAM_NOTAGS on all field types). I don't see a reason to keep known buggy implementations for a long time when there are tested working solutions available. If someone is against the commit, please give feedback.
Hide
Robert Allerstorfer added a comment -

Dongsheng, since you have been assigned as QA assignee, could you pls verify my last commit, so we could reclose this bug?

Show
Robert Allerstorfer added a comment - Dongsheng, since you have been assigned as QA assignee, could you pls verify my last commit, so we could reclose this bug?
Hide
Dongsheng Cai added a comment -

Reclose, Thanks

Show
Dongsheng Cai added a comment - Reclose, Thanks

Dates

  • Created:
    Updated:
    Resolved: