Moodle
  1. Moodle
  2. MDL-33337

META: RFC-4180 CSV file compliance for greater portablility of exported data.

    Details

    • Workaround:
      Hide

      The workaround to this is to use ODS (OpenOffice) format when opening in Excel. The data for text and textarea is preserved in a sinbl cell.

      Show
      The workaround to this is to use ODS (OpenOffice) format when opening in Excel. The data for text and textarea is preserved in a sinbl cell.
    • Affected Branches:
      MOODLE_22_STABLE, MOODLE_23_STABLE
    • Rank:
      41189

      Description

      Old title: Database Activity Export for CSV and Excel does not wrap text or textarea fields in quotes

      RFC 4180 is an effort to formalize CSV. It defines the MIME type "text/csv", and CSV files that follow its rules should be very widely portable. Among its requirements:

      • DOS-style lines that end with (CRLF) characters
      • An optional header record (there is no sure way to detect whether it is present, so care is required when importing).
      • Each record "should" contain the same number of comma-separated fields.
      • Any field may be quoted (with double quotes).
      • Fields containing a line-break, double-quote, and/or commas should be quoted. (If they are not, the file will likely be impossible to process correctly, so this should is better taken as must).
      • A (double) quote character in a field must be represented by two double quote characters.

        Issue Links

          Activity

          Hide
          Michael de Raadt added a comment -

          Thanks for reporting this.

          I've put that on the backlog.

          In the meantime feel free to help us work on this issue. If you are able to provide a patch, please add a patch label so we will spot it.

          Show
          Michael de Raadt added a comment - Thanks for reporting this. I've put that on the backlog. In the meantime feel free to help us work on this issue. If you are able to provide a patch, please add a patch label so we will spot it.
          Hide
          Bente Olsen added a comment -

          I have changed the priority to Major, there is no easily workaround here. You can export to ods format without problems, but you can not import the ods file.

          Show
          Bente Olsen added a comment - I have changed the priority to Major, there is no easily workaround here. You can export to ods format without problems, but you can not import the ods file.
          Hide
          Adrian Greeve added a comment -

          fix estimate: 5 hours.

          Show
          Adrian Greeve added a comment - fix estimate: 5 hours.
          Hide
          Tim Hunt added a comment -

          Adrian, PHP has fgetcsv and fputcsv. Not sure if they are full RFC compliant, but I guess if you are writing unit tests, you will find out.

          Anyway, if those are usable, it would be good not to re-invent the wheel.

          You certainly need to look in lib/tablelib.php, because that is a place that does CSV output that is used in a number of other places.

          Show
          Tim Hunt added a comment - Adrian, PHP has fgetcsv and fputcsv. Not sure if they are full RFC compliant, but I guess if you are writing unit tests, you will find out. Anyway, if those are usable, it would be good not to re-invent the wheel. You certainly need to look in lib/tablelib.php, because that is a place that does CSV output that is used in a number of other places.
          Hide
          Petr Škoda added a comment -

          Hmm, forget standards and instead make sure the export/import works with MS/open/libre office for ENglish, but also other languages that do not use "," as separator. It has to deal with legacy encodings too.

          The trickiest part is how to detect/select encoding+separator+enclosure when picking the file. User import does some nasty form resubmission to preview the file structure, I believe that the best solution would be to implement the CSV file import preview/selection directly in new forms element "csvfilepicker". It would return array of draftitemid,encoding,separator,enclosure and optionally mapping of columns.

          The column is another problem we are facing here, you should not expect everybody will use English for column names, there should be a way to map the actual CSV columns to moodle columns. Outlook had a nice UI for that when importing contacts.

          Show
          Petr Škoda added a comment - Hmm, forget standards and instead make sure the export/import works with MS/open/libre office for ENglish, but also other languages that do not use "," as separator. It has to deal with legacy encodings too. The trickiest part is how to detect/select encoding+separator+enclosure when picking the file. User import does some nasty form resubmission to preview the file structure, I believe that the best solution would be to implement the CSV file import preview/selection directly in new forms element "csvfilepicker". It would return array of draftitemid,encoding,separator,enclosure and optionally mapping of columns. The column is another problem we are facing here, you should not expect everybody will use English for column names, there should be a way to map the actual CSV columns to moodle columns. Outlook had a nice UI for that when importing contacts.
          Hide
          Adrian Greeve added a comment -

          Thanks Tim for pointing me towards fgetscsv and fputcsv. As far as I can see they are RFC compliant.

          Petr, I really like the idea of having a preview/selection form for the file picker, but that might have to be something done outside of this issue. I agree that detecting the encoding, separator and enclosure will be difficult with many different formats floating around.

          My thoughts at the moment is to create a class for csv export similar to the import class that we currently have. This way the code becomes more centralised and hopefully easier to manage in the future.

          Show
          Adrian Greeve added a comment - Thanks Tim for pointing me towards fgetscsv and fputcsv. As far as I can see they are RFC compliant. Petr, I really like the idea of having a preview/selection form for the file picker, but that might have to be something done outside of this issue. I agree that detecting the encoding, separator and enclosure will be difficult with many different formats floating around. My thoughts at the moment is to create a class for csv export similar to the import class that we currently have. This way the code becomes more centralised and hopefully easier to manage in the future.
          Hide
          Adrian Greeve added a comment -

          Issue size: XL

          Show
          Adrian Greeve added a comment - Issue size: XL
          Hide
          Michael de Raadt added a comment -

          Carrying over to the next sprint.

          Show
          Michael de Raadt added a comment - Carrying over to the next sprint.
          Hide
          Adrian Greeve added a comment -

          All sub-tasks have been completed.
          Closing this issue.

          Show
          Adrian Greeve added a comment - All sub-tasks have been completed. Closing this issue.

            People

            • Votes:
              2 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: