Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-72342

Group import from CSV is broken by BOM

XMLWordPrintable

    • MOODLE_310_STABLE, MOODLE_311_STABLE, MOODLE_39_STABLE, MOODLE_400_STABLE
    • MOODLE_310_STABLE, MOODLE_311_STABLE
    • MDL-72432-master
    • Hide

      Remove BOM from the input CSV file is a workaround, but correctly processing BOM is a better option.

      Show
      Remove BOM from the input CSV file is a workaround, but correctly processing BOM is a better option.
    • Easy
    • Hide

      The test is done manually. It aims to check that the CSV file in UTF-8 encoding is correctly processed. In particular, the Byte Order Mark, which is a 3-byte header is appended to the name of the first column causing column name mismatch and importing errors. For instance, if the first column is expected to be groupname the header string will be \EF\BB\BFgroupname, which is not right.

      The following steps need to be performed to test the patch:

      1. Create a test course (if the attached csv is used the course name should be "test course 01")
      2. Go to Course -> Users -> Groups
      3. Click Import Groups
      4. Choose grouptest.csv file (file is attached to the ticket) in the file picker. Alternatively, another csv file can be picked, but to test the correct BOM processing the csv file needs BOM header bytes.
      5. CSV delimiter should be comma (,) and Encoding should be UTF-8 (which is default)
      6. Click "Import"
      7. Verify that no errors are displayed and the data is imported correctly

       

      Show
      The test is done manually. It aims to check that the CSV file in UTF-8 encoding is correctly processed. In particular, the Byte Order Mark, which is a 3-byte header is appended to the name of the first column causing column name mismatch and importing errors. For instance, if the first column is expected to be  groupname the header string will be  \EF\BB\BFgroupname , which is not right. The following steps need to be performed to test the patch: Create a test course (if the attached csv is used the course name should be " test course 01 ") Go to Course -> Users -> Groups Click Import Groups Choose grouptest.csv file (file is attached to the ticket) in the file picker. Alternatively, another csv file can be picked, but to test the correct BOM processing the csv file needs BOM header bytes. CSV delimiter should be comma ( , ) and Encoding should be UTF-8 (which is default) Click " Import " Verify that no errors are displayed and the data is imported correctly  

      Group import from CSV can fail when the CSV file is in UTF-8 and is containing the Byte Order Mark (EF BB BF) at the beginning of the file.

      Steps to reproduce:

      1. Create a test course (if the attached csv is used the course name should be "test course 01")
      2. Go to Course -> Users -> Groups
      3. Click Import Groups
      4. Choose grouptest.csv file (file is attached to the ticket) in the file picker. Alternatively, another csv file can be picked, but to test the correct BOM processing the csv file needs BOM header bytes.
      5. CSV delimiter should be comma (,) and Encoding should be UTF-8 (which is default)
      6. Click "Import"
      7. The error is as follows:

      "groupname" is not a valid field name
      Debug info:
      Error code: invalidfieldname
      Stack trace: * line 498 of /lib/setuplib.php: moodle_exception thrown

      • line 103 of /group/import.php: call to print_error()
        The logic of import.php trims spaces and quotes, but it does not take into account the BOM, which is neither space nor quote sequence. It just uses the variable $rawlines, which starts as "\EF\BB\BFgroupname,coursename":

       

      $header = explode($csvimport::get_delimiter($delimiter), array_shift($rawlines));

       

      Hence $header[0] is "\EF\BB\BFgroupname", which is incorrect and it does not pass the check causing the error.

      The correct way of obtaining the header information would be to use the method get_columns() from csv_import_reader class. The method load_csv_content()  in __ csv_import_reader  class __ trims BOM in the first line allowing to process the headers in the right way.

      The fix is very simple:

      ------------------------------- group/import.php -------------------------------index 58f405bd0cd..b50f880d738 100644
      @@ -61,8 +61,6 @@ if ($importform->is_cancelled()) {
           $text = $importform->get_file_content('userfile');
           $text = preg_replace('!\r\n?!', "\n", $text);
       
      -    $rawlines = explode("\n", $text);
      -
           require_once($CFG->libdir . '/csvlib.class.php');
           $importid = csv_import_reader::get_new_iid('groupimport');
           $csvimport = new csv_import_reader($importid, 'groupimport');
      @@ -95,7 +93,9 @@ if ($importform->is_cancelled()) {
               );
       
           // --- get header (field names) ---
      -    $header = explode($csvimport::get_delimiter($delimiter), array_shift($rawlines));
      +    // Using get_columns() ensures the Byte Order Mark is removed.
      +    $header = $csvimport->get_columns();
      +
           // check for valid field names
           foreach ($header as $i => $h) {
               $h = trim($h); $header[$i] = $h; // remove whitespace
      
      

        1. group_import_error.png
          73 kB
          Kateryna Degtyariova
        2. grouptest.csv
          0.0 kB
          Kateryna Degtyariova
        3. MDL-72342_Test Passed.PNG
          45 kB
          Gladys Basiana

            katerynadegtyariova Kateryna Degtyariova
            katerynadegtyariova Kateryna Degtyariova
            Scott Verbeek Scott Verbeek
            Eloy Lafuente (stronk7) Eloy Lafuente (stronk7)
            Gladys Basiana Gladys Basiana
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved:

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0 minutes
                0m
                Logged:
                Time Spent - 40 minutes
                40m

                  Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.