Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-66765

Grade Import CSV - Inefficient Query Can Causes Timeouts

    XMLWordPrintable

Details

    • Improvement
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 3.5.6, 3.6.4, 3.7
    • None
    • Gradebook, Performance
    • None
    • MOODLE_35_STABLE, MOODLE_36_STABLE, MOODLE_37_STABLE

    Description

      Grade Import CSV does not include mnethostid as part of the query and was further changed to allow case insensitive comparisons in MDL-29317: https://github.com/moodle/moodle/commit/27913342518bcfbe782c1cb6e723150f78645f6c

      First, without mnethostid, the username lookup needs to perform a full table scan (which can take a couple hundred milliseconds when you have over 100,000 users). For a course with 1000 participants, this can mean a 200 second wait time for a single grade import just for user lookups (which is well over the default Apache/php-fpm proxy timeout of 30 seconds).

      Second, the set_sql() comparisons under MySQL add two elements which make the email index unusable: a LOWER() and an email-side collation adjustment.

      SELECT * FROM mdl_user WHERE LOWER(email) COLLATE utf8mb4_bin = LOWER('user@example.com')

       

      The default collation for Moodle is already case-insensitive, so email = :email would be a case-insensitive comparison. If the concern is about accent-insensitive collations like the MySQL 8 default (utf8mb4_0900_ai_ci), then I could see that you may want an accent-sensitive comparison to preserve RFC 6530 compatibility. The current code already checks the collation in use, but the MySQL version doesn't perform more efficient logic to level the existing indexes. Instead, it builds a collation string based on the first chunk of the collation split on underscore. However, it should be noted that there is technically a difference between utf8mb4_0900_bin and utf8mb4_bin (see: Sushi = Beer ?!).

      Ideally, we could use something collation independent like (but it does require that we're lowercasing email addresses as they come in):

      SELECT * FROM mdl_user WHERE email = BINARY LOWER('user@example.com');

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              jrchamp Jonathan Champ
              Adrian Greeve, Ilya Tregubov, Kevin Percy, Mathew May, Mihail Geshoski, Shamim Rezaie, Matteo Scaramuccia, David Woloszyn, Huong Nguyen, Jake Dallimore, Meirza, Michael Hawkins, Raquel Ortega, Safat Shahin, Stevani Andolo
              Votes:
              3 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: