Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-33337

META: RFC-4180 CSV file compliance for greater portablility of exported data.

    Details

    • Workaround:
      Hide

      The workaround to this is to use ODS (OpenOffice) format when opening in Excel. The data for text and textarea is preserved in a sinbl cell.

      Show
      The workaround to this is to use ODS (OpenOffice) format when opening in Excel. The data for text and textarea is preserved in a sinbl cell.
    • Affected Branches:
      MOODLE_22_STABLE, MOODLE_23_STABLE

      Description

      Old title: Database Activity Export for CSV and Excel does not wrap text or textarea fields in quotes

      RFC 4180 is an effort to formalize CSV. It defines the MIME type "text/csv", and CSV files that follow its rules should be very widely portable. Among its requirements:

      • DOS-style lines that end with (CRLF) characters
      • An optional header record (there is no sure way to detect whether it is present, so care is required when importing).
      • Each record "should" contain the same number of comma-separated fields.
      • Any field may be quoted (with double quotes).
      • Fields containing a line-break, double-quote, and/or commas should be quoted. (If they are not, the file will likely be impossible to process correctly, so this should is better taken as must).
      • A (double) quote character in a field must be represented by two double quote characters.

        Gliffy Diagrams

          Issue Links

            Activity

            Hide
            salvetore Michael de Raadt added a comment -

            Thanks for reporting this.

            I've put that on the backlog.

            In the meantime feel free to help us work on this issue. If you are able to provide a patch, please add a patch label so we will spot it.

            Show
            salvetore Michael de Raadt added a comment - Thanks for reporting this. I've put that on the backlog. In the meantime feel free to help us work on this issue. If you are able to provide a patch, please add a patch label so we will spot it.
            Hide
            benteo Bente Olsen added a comment -

            I have changed the priority to Major, there is no easily workaround here. You can export to ods format without problems, but you can not import the ods file.

            Show
            benteo Bente Olsen added a comment - I have changed the priority to Major, there is no easily workaround here. You can export to ods format without problems, but you can not import the ods file.
            Hide
            abgreeve Adrian Greeve added a comment -

            fix estimate: 5 hours.

            Show
            abgreeve Adrian Greeve added a comment - fix estimate: 5 hours.
            Hide
            timhunt Tim Hunt added a comment -

            Adrian, PHP has fgetcsv and fputcsv. Not sure if they are full RFC compliant, but I guess if you are writing unit tests, you will find out.

            Anyway, if those are usable, it would be good not to re-invent the wheel.

            You certainly need to look in lib/tablelib.php, because that is a place that does CSV output that is used in a number of other places.

            Show
            timhunt Tim Hunt added a comment - Adrian, PHP has fgetcsv and fputcsv. Not sure if they are full RFC compliant, but I guess if you are writing unit tests, you will find out. Anyway, if those are usable, it would be good not to re-invent the wheel. You certainly need to look in lib/tablelib.php, because that is a place that does CSV output that is used in a number of other places.
            Hide
            skodak Petr Skoda added a comment -

            Hmm, forget standards and instead make sure the export/import works with MS/open/libre office for ENglish, but also other languages that do not use "," as separator. It has to deal with legacy encodings too.

            The trickiest part is how to detect/select encoding+separator+enclosure when picking the file. User import does some nasty form resubmission to preview the file structure, I believe that the best solution would be to implement the CSV file import preview/selection directly in new forms element "csvfilepicker". It would return array of draftitemid,encoding,separator,enclosure and optionally mapping of columns.

            The column is another problem we are facing here, you should not expect everybody will use English for column names, there should be a way to map the actual CSV columns to moodle columns. Outlook had a nice UI for that when importing contacts.

            Show
            skodak Petr Skoda added a comment - Hmm, forget standards and instead make sure the export/import works with MS/open/libre office for ENglish, but also other languages that do not use "," as separator. It has to deal with legacy encodings too. The trickiest part is how to detect/select encoding+separator+enclosure when picking the file. User import does some nasty form resubmission to preview the file structure, I believe that the best solution would be to implement the CSV file import preview/selection directly in new forms element "csvfilepicker". It would return array of draftitemid,encoding,separator,enclosure and optionally mapping of columns. The column is another problem we are facing here, you should not expect everybody will use English for column names, there should be a way to map the actual CSV columns to moodle columns. Outlook had a nice UI for that when importing contacts.
            Hide
            abgreeve Adrian Greeve added a comment -

            Thanks Tim for pointing me towards fgetscsv and fputcsv. As far as I can see they are RFC compliant.

            Petr, I really like the idea of having a preview/selection form for the file picker, but that might have to be something done outside of this issue. I agree that detecting the encoding, separator and enclosure will be difficult with many different formats floating around.

            My thoughts at the moment is to create a class for csv export similar to the import class that we currently have. This way the code becomes more centralised and hopefully easier to manage in the future.

            Show
            abgreeve Adrian Greeve added a comment - Thanks Tim for pointing me towards fgetscsv and fputcsv. As far as I can see they are RFC compliant. Petr, I really like the idea of having a preview/selection form for the file picker, but that might have to be something done outside of this issue. I agree that detecting the encoding, separator and enclosure will be difficult with many different formats floating around. My thoughts at the moment is to create a class for csv export similar to the import class that we currently have. This way the code becomes more centralised and hopefully easier to manage in the future.
            Hide
            abgreeve Adrian Greeve added a comment -

            Issue size: XL

            Show
            abgreeve Adrian Greeve added a comment - Issue size: XL
            Hide
            salvetore Michael de Raadt added a comment -

            Carrying over to the next sprint.

            Show
            salvetore Michael de Raadt added a comment - Carrying over to the next sprint.
            Hide
            abgreeve Adrian Greeve added a comment -

            All sub-tasks have been completed.
            Closing this issue.

            Show
            abgreeve Adrian Greeve added a comment - All sub-tasks have been completed. Closing this issue.

              People

              • Votes:
                2 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: