Details

    • Testing Instructions:
      Hide

      Multiple areas of moodle export csv files and need to be tested.

      data module

      1. Create a database module.
      2. Add a few fields, make sure that one is text and the other is a textarea.
      3. Add multiple lines to the entries and include quote marks(") and commas(,).
      4. Click on the export tab.
      5. Make sure that the CSV radio button is selected.
      6. Export the entries.

      [Test]

      • Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes.
      • Any word that has double quotes has another set of double quotes e.g. ""This"".

      Grade export

      1. Create or go to a course that has assessments that have been completed by students.
      2. Go to [Settings->Grades].
      3. From the drop down list in the top left corner select "Export Plain text file".
      4. check "include feedback in export" and then submit.

      [Test]

      • Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes.
      • Any word that has double quotes has another set of double quotes e.g. ""This"".

      Bulk user export

      To really test this out, create a user profile field:

      1. Go to [Settings->Site administration->Users->Accounts->User profile fields]
      2. Select a text area
      3. Fill in the form to create a custom field for the user.
      4. Edit your profile settings and fill in the new field that you have created, make sure to add multiple lines and put something in double quotes.
      5. Go to [Settings->Site administration->Users->Accounts->Bulk user actions]
      6. Select a group of users including the one that you added the information to.
      7. Choose "Download" from the "With selected users..." section.
      8. Click the "Download in text format" link.

      [Test]

      • Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes.
      • Any word that has double quotes has another set of double quotes e.g. ""This"".

      Quiz results export

      1. Create a quiz and have multiple students complete it.
      2. Browse back to the quiz and view the attempts made.
      3. Click "Download table data as" and Make sure that the box next to it reads "a comma separated values text file".

      [Test]

      • Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes.
      • Any word that has double quotes has another set of double quotes e.g. ""This"".
      Show
      Multiple areas of moodle export csv files and need to be tested. data module Create a database module. Add a few fields, make sure that one is text and the other is a textarea. Add multiple lines to the entries and include quote marks(") and commas(,). Click on the export tab. Make sure that the CSV radio button is selected. Export the entries. [Test] Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes. Any word that has double quotes has another set of double quotes e.g. ""This"". Grade export Create or go to a course that has assessments that have been completed by students. Go to [Settings->Grades] . From the drop down list in the top left corner select " Export Plain text file". check "include feedback in export" and then submit. [Test] Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes. Any word that has double quotes has another set of double quotes e.g. ""This"". Bulk user export To really test this out, create a user profile field: Go to [Settings->Site administration->Users->Accounts->User profile fields] Select a text area Fill in the form to create a custom field for the user. Edit your profile settings and fill in the new field that you have created, make sure to add multiple lines and put something in double quotes. Go to [Settings->Site administration->Users->Accounts->Bulk user actions] Select a group of users including the one that you added the information to. Choose "Download" from the "With selected users..." section. Click the "Download in text format" link. [Test] Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes. Any word that has double quotes has another set of double quotes e.g. ""This"". Quiz results export Create a quiz and have multiple students complete it. Browse back to the quiz and view the attempts made. Click "Download table data as" and Make sure that the box next to it reads "a comma separated values text file". [Test] Open with a text editor and check that any field that has spaces or line breaks, or double quotes is encapsulated in double quotes. Any word that has double quotes has another set of double quotes e.g. ""This"".
    • Affected Branches:
      MOODLE_23_STABLE
    • Fixed Branches:
      MOODLE_24_STABLE
    • Pull from Repository:
    • Pull Master Branch:
      wip-MDL-34074-master
    • Rank:
      42379

      Description

      At the moment moodle csv file don't follow any particular standards which makes exporting into other applications difficult.

      For example:
      When exporting a database activity in either CSV or Excel format, the text and textarea fields with are not being quoted. Upon opening the CSV or Excel export in Excel, the results span multiple lines when there are line breaks in the data.

      The solution to this is to wrap these fields in quotes.

      Standards to check:

      • DOS-style lines that end with (CRLF) characters
      • An optional header record (there is no sure way to detect whether it is present, so care is required when importing).
      • Each record "should" contain the same number of comma-separated fields.
      • Any field may be quoted (with double quotes).
      • Fields containing a line-break, double-quote, and/or commas should be quoted. (If they are not, the file will likely be impossible to process correctly, so this should is better taken as must).
      • A (double) quote character in a field must be represented by two double quote characters.

      for full details of the RFC 4180 standards:
      http://tools.ietf.org/html/rfc4180

        Issue Links

          Activity

          Hide
          Adrian Greeve added a comment -

          Additional places that upload csv files:

          • admin/user/user_bulk_download.php (From [Settings->Users->Accounts->Bulk user actions])
          • Quiz results download - mod/quiz/report.php?xxxxxx
          Show
          Adrian Greeve added a comment - Additional places that upload csv files: admin/user/user_bulk_download.php (From [Settings->Users->Accounts->Bulk user actions] ) Quiz results download - mod/quiz/report.php?xxxxxx
          Hide
          Petr Škoda added a comment -

          Hi, could we have some unit tests for this too? I suppose it should be relatively simple to test all the features.

          Show
          Petr Škoda added a comment - Hi, could we have some unit tests for this too? I suppose it should be relatively simple to test all the features.
          Hide
          Adrian Greeve added a comment -

          Issue size: XL

          Show
          Adrian Greeve added a comment - Issue size: XL
          Hide
          Sam Hemelryk added a comment -

          Hi Adrian,

          Its looking really good thanks.
          I've just been reading through it now and have noted the following.

          This is mostly focusing on csv_export_writer.

          1. Petr is right this needs to have some unit tests written for csv_export_writer. It shouldn't be too hard at all. This is something we are really pushing for new code where practical.
          2. @package core, @category csv
          3. Should probably have a __destruct method to close the file resource. Just protects in the case of exceptions or unexpected exit/die.
          4. within send_header the Expires header is common to both https and http so could be moved out of the if.
          5. The way in which $this->filename is set and used is pretty all over the place presently. Its set by set_filename, gets opened in the same method, and then can be defaulted in send_header. Either the defaulting should be removed, or should be moved to the constructor.
          6. I also think that the file should be opened the first time that data is being written, not within set_filename.
          7. download_file should exit when complete to ensure 100% that NOTHING else gets output.
          8. consider making send_header a protected method and only called from within download_file. Better for security and results in a simpler API.
          9. Just as a friendly initiative within the constructor you could check the global renderer to make sure output has not already started and thrown an exception if it has ($OUTPUT->has_started).
          10. The set_filename method shouldn't take the count arg. Calling code should work out all of that crap into a single filename to pass to this method. Having it there doesn't make sense to the API. I also wonder about the $extension arg, under what circumstance do we want to masquerade a file extension, but then perhaps there is a need, I think that could stay.
          11. Also just as an idea the following methods may be cool to think about.
            • public static function download_array($filename, array $rows, $delimiter = 'comma', $enclosure = "'"); This would be a straight up method to produce a download for an array of data. Convenience ++, would be immediately usable within data_export_csv.
            • public static function print_array(array $rows, $delimiter = 'comma', $enclosure = "'"); The equivalent to above but print rather than download.
          12. Actually looking at those two functions $delimiter and $enclosure should really be turned into either constants or defines. And delimiter called delimiter rather than delimiter name. Randomness.

          Anyway a bit to look at there. All in all I think it is going very well, nothing to serious there.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Hi Adrian, Its looking really good thanks. I've just been reading through it now and have noted the following. This is mostly focusing on csv_export_writer. Petr is right this needs to have some unit tests written for csv_export_writer. It shouldn't be too hard at all. This is something we are really pushing for new code where practical. @package core, @category csv Should probably have a __destruct method to close the file resource. Just protects in the case of exceptions or unexpected exit/die. within send_header the Expires header is common to both https and http so could be moved out of the if. The way in which $this->filename is set and used is pretty all over the place presently. Its set by set_filename, gets opened in the same method, and then can be defaulted in send_header. Either the defaulting should be removed, or should be moved to the constructor. I also think that the file should be opened the first time that data is being written, not within set_filename. download_file should exit when complete to ensure 100% that NOTHING else gets output. consider making send_header a protected method and only called from within download_file. Better for security and results in a simpler API. Just as a friendly initiative within the constructor you could check the global renderer to make sure output has not already started and thrown an exception if it has ($OUTPUT->has_started). The set_filename method shouldn't take the count arg. Calling code should work out all of that crap into a single filename to pass to this method. Having it there doesn't make sense to the API. I also wonder about the $extension arg, under what circumstance do we want to masquerade a file extension, but then perhaps there is a need, I think that could stay. Also just as an idea the following methods may be cool to think about. public static function download_array($filename, array $rows, $delimiter = 'comma', $enclosure = "'"); This would be a straight up method to produce a download for an array of data. Convenience ++, would be immediately usable within data_export_csv. public static function print_array(array $rows, $delimiter = 'comma', $enclosure = "'"); The equivalent to above but print rather than download. Actually looking at those two functions $delimiter and $enclosure should really be turned into either constants or defines. And delimiter called delimiter rather than delimiter name. Randomness. Anyway a bit to look at there. All in all I think it is going very well, nothing to serious there. Cheers Sam
          Hide
          Adrian Greeve added a comment -

          Hi Sam,

          Thanks for the comments.

          1. I have now included a unit test. I think that it might be a little simple, so I would appreciate some ideas as to how I could test this further.
          2. @package core and @category csv has been added
          3. A destruct method is now included
          4. The expires header has been moved so that it's only written once.
          5. $this->file name has been tided up and the default name has been moved to the constructor.
          6. The file is now opened the first time that data is written to the file.
          7. send_header is now a protected function.
          8. I had a look around and I couldn't find $OUTPUT->has_started. If you could direct me to where this function is and possibly an example of where it's being used then I might have a better idea of how to use it.
          9. set_filename no longer has the count argument.
          10. I have now included two convenience methods. These probably need to be looked at carefully.
          11. delimitername has been renamed to delimiter.
          12. Looking at the code for csv_import_reader::get_delimiter the line:
            case 'cfg':       if (isset($CFG->CSV_DELIMITER)) { return $CFG->CSV_DELIMITER; } // no break; fall back to comma
            

            makes me hesitant to make delimiter a constant.

          13. I'm not sure what else might be used as an enclosure and this is why this hasn't been made into a constant as well.
          Show
          Adrian Greeve added a comment - Hi Sam, Thanks for the comments. I have now included a unit test. I think that it might be a little simple, so I would appreciate some ideas as to how I could test this further. @package core and @category csv has been added A destruct method is now included The expires header has been moved so that it's only written once. $this->file name has been tided up and the default name has been moved to the constructor. The file is now opened the first time that data is written to the file. send_header is now a protected function. I had a look around and I couldn't find $OUTPUT->has_started. If you could direct me to where this function is and possibly an example of where it's being used then I might have a better idea of how to use it. set_filename no longer has the count argument. I have now included two convenience methods. These probably need to be looked at carefully. delimitername has been renamed to delimiter. Looking at the code for csv_import_reader::get_delimiter the line: case 'cfg': if (isset($CFG->CSV_DELIMITER)) { return $CFG->CSV_DELIMITER; } // no break ; fall back to comma makes me hesitant to make delimiter a constant. I'm not sure what else might be used as an enclosure and this is why this hasn't been made into a constant as well.
          Hide
          Sam Hemelryk added a comment -

          Howdy,

          I am full tack on MUC at the moment and wont' have time to have a good think about this today sorry, feel free to get someone else to peer-review, or I'll take a look later this week as soon as I have time.

          In regards to your question points:

          8. $OUTPUT->has_started I wouldn't worry about it any more, its sig is core_renderer::has_started, grep will show you where it is being used. It allows you to check if things have already been output. If they have then send_header...header calls will throw warnings and then will fail. Looking at the code now its probably not required, but if you wanted to you could put it in send_header and trigger an error if output has started. But again not necessary.

          10. Looking at the convenience methods I had envisaged them to use the csv_export_writer class rather than replicating what each method of the class does as is needed.
          That way when you make changes/fix bugs in csv_export_writer you don't need to remember to update those methods and you reduce the amount of duplicate code.

          12. Aha, given its done like that in csv_export_reader then it is fine in csv_export_writer as its consistent.

          You asked me whether it was safe for people to start basing their work on the API this code provides.
          You can't be sure of anything until it gets integrated, until its in its still possible that it will change and it having already been used by upcoming code is not a good reason not to make required changes.
          That being said I think that the methods that make up this API are pretty solid, I don't think they will need to change and it there are changes still required I imagine you could hide them internally if you really wanted.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Howdy, I am full tack on MUC at the moment and wont' have time to have a good think about this today sorry, feel free to get someone else to peer-review, or I'll take a look later this week as soon as I have time. In regards to your question points: 8. $OUTPUT->has_started I wouldn't worry about it any more, its sig is core_renderer::has_started, grep will show you where it is being used. It allows you to check if things have already been output. If they have then send_header...header calls will throw warnings and then will fail. Looking at the code now its probably not required, but if you wanted to you could put it in send_header and trigger an error if output has started. But again not necessary. 10. Looking at the convenience methods I had envisaged them to use the csv_export_writer class rather than replicating what each method of the class does as is needed. That way when you make changes/fix bugs in csv_export_writer you don't need to remember to update those methods and you reduce the amount of duplicate code. 12. Aha, given its done like that in csv_export_reader then it is fine in csv_export_writer as its consistent. You asked me whether it was safe for people to start basing their work on the API this code provides. You can't be sure of anything until it gets integrated, until its in its still possible that it will change and it having already been used by upcoming code is not a good reason not to make required changes. That being said I think that the methods that make up this API are pretty solid, I don't think they will need to change and it there are changes still required I imagine you could hide them internally if you really wanted. Cheers Sam
          Hide
          Adrian Greeve added a comment -

          Thanks Sam for your help, especially considering how busy you are.

          • I've re-written the convenience methods to utilize the class.
          • I've had a talk with Ankit and thankfully any changes that he has to make to his issues will be minor regardless of what happens here in this issue. Obviously his patches won't get implemented until this one does.
          Show
          Adrian Greeve added a comment - Thanks Sam for your help, especially considering how busy you are. I've re-written the convenience methods to utilize the class. I've had a talk with Ankit and thankfully any changes that he has to make to his issues will be minor regardless of what happens here in this issue. Obviously his patches won't get implemented until this one does.
          Hide
          Ankit Agarwal added a comment - - edited

          Adrian asked me to have a look at this and give feedback. Here is a few things I noticed:-

          1. @var string $delimitername The name of the delimiter (comma, tab, etc.)
            That should be changed to $delimiter and we should disclose all valid $delimiters in the brackets that we support(colon,semicolon,cfg,tab,comma), just for reference of external developers
          2. @param string $delimitername The name of the character used to separate fields.
            That should be changed to $delimiter again.
          3. Returns csv data line by line for displaying. (print_csv_data())
            The php doc is misleading as it prints the data instead of returning it
          4. IMO it will make sense to have an explicit method to destroy the temp file which can also be called inside download_file() method. The reason being the destructor is not always called, specially incase of fatal errors. Here is a small code sample from php.net to display that concept.
            Unable to find source-code formatter for language: php. Available languages are: actionscript, html, java, javascript, none, sql, xhtml, xml
            As of PHP 5.3.10 destructors are not run on shutdown caused by fatal errors.
            
            For example:
            <?php
            class Logger
            {
                protected $rows = array();
            
                public function __destruct()
                {
                    $this->save();
                }
            
                public function log($row)
                {
                    $this->rows[] = $row;
                }
            
                public function save()
                {
                    echo '<ul>';
                    foreach ($this->rows as $row)
                    {
                        echo '<li>', $row, '</li>';
                    }
                    echo '</ul>';
                }
            }
            
            $logger = new Logger;
            $logger->log('Before');
            
            $nonset->foo();
            
            $logger->log('After');
            ?>
            

            So say I initialize the script and add in some data and than, there is some other fatal error, causing the script to terminate. The tmp file will lie there as orphan.
            Other solution can be using register-shutdown-function . But am not sure if we are allowed to use that with in Moodle, ideally it would be much better to have an explicit function to destroy stuff and call it explicitly.

          5. don't we need to validate $delimiter and $encloser before using them? or atleast csv_import_reader::get_delimiter($this->delimiter) should have a default clause in the switch.
          6. table_text_export_format_parent::output_headers($headers) is supposed to dump table headers I guess (not php headers) so a call to add_data() would make sense
          7. in mod/data/lib.php
            Unable to find source-code formatter for language: php. Available languages are: actionscript, html, java, javascript, none, sql, xhtml, xml
            if ($return) {
            csv_export_writer::print_array($export, $delimiter_name);
            

            Is changing the logic of the original code. Previously it was doing a return or a download..now it is doing a print or a download.

          8. it might be worthwhile to look into adding a $return to print_array() and print_csv_data() methods
          9. using the first entry in the array, when the user profile field lists data as an array, doesnt make any sense to me. May be am missing something here , but am not sure this is the best way to do it.
            Unable to find source-code formatter for language: php. Available languages are: actionscript, html, java, javascript, none, sql, xhtml, xml
                       // Custom user profile fields sometimes come in an array and need to be checked.
                        if (is_array($user->$field)) {
                            $userprofiledata[] = array_shift(&$user->$field);
                        } else {
                            $userprofiledata[] = $user->$field;
                        }
            

          Rest, this looks solid. It is nice to have a single class to export csv and remove all duplicate code all over the place.
          Thanks

          Show
          Ankit Agarwal added a comment - - edited Adrian asked me to have a look at this and give feedback. Here is a few things I noticed:- @var string $delimitername The name of the delimiter (comma, tab, etc.) That should be changed to $delimiter and we should disclose all valid $delimiters in the brackets that we support(colon,semicolon,cfg,tab,comma), just for reference of external developers @param string $delimitername The name of the character used to separate fields. That should be changed to $delimiter again. Returns csv data line by line for displaying. (print_csv_data()) The php doc is misleading as it prints the data instead of returning it IMO it will make sense to have an explicit method to destroy the temp file which can also be called inside download_file() method. The reason being the destructor is not always called, specially incase of fatal errors. Here is a small code sample from php.net to display that concept. Unable to find source-code formatter for language: php. Available languages are: actionscript, html, java, javascript, none, sql, xhtml, xml As of PHP 5.3.10 destructors are not run on shutdown caused by fatal errors. For example: <?php class Logger { protected $rows = array(); public function __destruct() { $ this ->save(); } public function log($row) { $ this ->rows[] = $row; } public function save() { echo '<ul>'; foreach ($ this ->rows as $row) { echo '<li>', $row, '</li>'; } echo '</ul>'; } } $logger = new Logger; $logger->log('Before'); $nonset->foo(); $logger->log('After'); ?> So say I initialize the script and add in some data and than, there is some other fatal error, causing the script to terminate. The tmp file will lie there as orphan. Other solution can be using register-shutdown-function . But am not sure if we are allowed to use that with in Moodle, ideally it would be much better to have an explicit function to destroy stuff and call it explicitly. don't we need to validate $delimiter and $encloser before using them? or atleast csv_import_reader::get_delimiter($this->delimiter) should have a default clause in the switch. table_text_export_format_parent::output_headers($headers) is supposed to dump table headers I guess (not php headers) so a call to add_data() would make sense in mod/data/lib.php Unable to find source-code formatter for language: php. Available languages are: actionscript, html, java, javascript, none, sql, xhtml, xml if ($ return ) { csv_export_writer::print_array($export, $delimiter_name); Is changing the logic of the original code. Previously it was doing a return or a download..now it is doing a print or a download. it might be worthwhile to look into adding a $return to print_array() and print_csv_data() methods using the first entry in the array, when the user profile field lists data as an array, doesnt make any sense to me. May be am missing something here , but am not sure this is the best way to do it. Unable to find source-code formatter for language: php. Available languages are: actionscript, html, java, javascript, none, sql, xhtml, xml // Custom user profile fields sometimes come in an array and need to be checked. if (is_array($user->$field)) { $userprofiledata[] = array_shift(&$user->$field); } else { $userprofiledata[] = $user->$field; } Rest, this looks solid. It is nice to have a single class to export csv and remove all duplicate code all over the place. Thanks
          Hide
          Adrian Greeve added a comment -

          Thanks Ankit for taking a look at all of the code.

          1. doc block changed.
          2. doc block changed.
          3. doc block changed.
          4. Ankit and I had a in-depth discussion about this and finally consulted with Dan. We'll leave the destructor as it is.
          5. $delimiter and $enclosure now have some checks put in place.
          6. table_text_export_format_parent::output_headers has been modified so that the title of each row is now returned in the csv file.
          7. mod/data/lib.php has been modified to follow the correct logic that existed before.
          8. print_array() and print_csv_data() now have an optional parameter to allow for a string to be returned.
          9. Custom profile textarea fields return an array. The first field is the text and the second is the format value. These lines of code check to see if an array exists and then takes in the first record. Unfortunately formatting will be lost when exporting to a .csv file.

          Thanks again.

          Show
          Adrian Greeve added a comment - Thanks Ankit for taking a look at all of the code. doc block changed. doc block changed. doc block changed. Ankit and I had a in-depth discussion about this and finally consulted with Dan. We'll leave the destructor as it is. $delimiter and $enclosure now have some checks put in place. table_text_export_format_parent::output_headers has been modified so that the title of each row is now returned in the csv file. mod/data/lib.php has been modified to follow the correct logic that existed before. print_array() and print_csv_data() now have an optional parameter to allow for a string to be returned. Custom profile textarea fields return an array. The first field is the text and the second is the format value. These lines of code check to see if an array exists and then takes in the first record. Unfortunately formatting will be lost when exporting to a .csv file. Thanks again.
          Hide
          Ankit Agarwal added a comment - - edited

          This is looking good now Adrian.Thanks for making all the changes! Feel free to submit for integration.
          Thanks

          Show
          Ankit Agarwal added a comment - - edited This is looking good now Adrian.Thanks for making all the changes! Feel free to submit for integration. Thanks
          Hide
          Sam Hemelryk added a comment -

          Thanks guys, this looked much better this time round and has been integrated now.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Thanks guys, this looked much better this time round and has been integrated now. Cheers Sam
          Hide
          Rossiani Wijaya added a comment -

          This looks good.

          Test passed.

          Show
          Rossiani Wijaya added a comment - This looks good. Test passed.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Fixed STOP Closed STOP Thanks STOP

          Yay, imagination! Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Fixed STOP Closed STOP Thanks STOP Yay, imagination! Ciao

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: