Uploaded image for project: 'Moodle'
  1. Moodle
  2. MDL-81882

upgrade_block_delete_instances is extremely slow for large sets of blocks and preferences

XMLWordPrintable

    • MOODLE_404_STABLE

      upgrade_block_delete_instances (which was used in an upgrade step) from lib/db/upgradelib.php generates SQL that looks something like this:

      DELETE FROM mdl_user_preferences WHERE id IN (SELECT p.id AS pid
        FROM mdl_user_preferences p
        JOIN mdl_block_instances bi ON p.name IN ( '' || 'block' || bi.id || 'hidden' ,  '' || 'docked_block_instance_' || bi.id )
       WHERE      bi.id IN (
          SELECT * FROM (
              SELECT bi.id
                FROM mdl_my_pages mp
                JOIN mdl_block_instances bi
                      ON bi.blockname = 'myoverview'
                     AND bi.subpagepattern IS NOT NULL AND  ( NOT  (bi.subpagepattern = '') )
                     AND bi.pagetypepattern = 'my-index'
                     AND  CAST(bi.subpagepattern AS INT)  = mp.id
               WHERE mp.private = 1
                 AND mp.name = '__default'
          ) bid
       ))
      

      The part that generates it looks like:

      $blockhidden = $DB->sql_concat("'block'", 'bi.id', "'hidden'");
      $blockdocked = $DB->sql_concat("'docked_block_instance_'", 'bi.id');
      $deletesql = <<<EOF
          SELECT p.id AS pid
            FROM {user_preferences} p
            JOIN {block_instances} bi ON p.name IN ({$blockhidden}, {$blockdocked})
           WHERE {$instanceselect}
      EOF;
      $DB->delete_records_subquery('user_preferences', 'id', 'pid', $deletesql, $instanceparams);
      

      For sites with a large number of rows in the relevant tables, this SQL can run for days. When running it on a DB with the following row counts:

      Block instances 1,148,414
      User preferences: 228,906

      It did not complete in over 24 hours. Even worse, the SQL runs even when there are no preferences that it actually needs to delete (which was the case with the table I tested on - it did not contain any preferences matching the pattern block%hidden or docked_block_instance%).

      I put together a proof of concept that splits this in to two queries (one to get the preferences and one to get the block instance IDs) and then computes which preferences need to be deleted (in PHP) which was much faster. On a DB with the following row counts:

      Block instances: 2.7 million
      User preferences: 9.7million (500k of which match the pattern the SQL looks for)
      My pages: 600k

      It was able to delete the relevant preferences (of which there were about 63k) in a few seconds.

      The POC looks like:

      <?php
       
      define('CLI_SCRIPT', true);
      require_once('config.php');
       
      $dockedlike = $DB->sql_like('name', ':docked', false);
      $hiddenlike = $DB->sql_like('name', ':hidden', false);
       
      $prefs = $DB->get_records_sql(
          'SELECT * FROM {user_preferences} WHERE ' . $dockedlike . ' OR ' . $hiddenlike,
          ['docked' => 'docked_block_instance_%', 'hidden' => 'block%hidden']
      );
       
       
      $subpagepattern = $DB->sql_cast_char2int('bi.subpagepattern');
      $subpageempty = $DB->sql_isnotempty('block_instances', 'bi.subpagepattern', true, false);
       
      $instanceselect = <<<EOF
          SELECT * FROM (
              SELECT bi.id
                FROM {my_pages} mp
                JOIN {block_instances} bi
                     ON bi.blockname = :blockname
                     AND bi.subpagepattern IS NOT NULL AND {$subpageempty}
                     AND bi.pagetypepattern = :pagetypepattern
                     AND {$subpagepattern} = mp.id
               WHERE mp.private = :private
                 AND mp.name = :pagename
          ) bid
      EOF;
       
      $blockinstances = array_column(
          $DB->get_records_sql(
              $instanceselect,
              [
                  'blockname' => 'myoverview',
                  'pagetypepattern' => 'my-index',
                  'pagename' => '__default',
                  'private' => 1,
              ]
          ),
          'id',
      );
       
      $prefinstanceids  = array_map(fn(stdClass $n): int => (int)str_replace(['docked_block_instance_', 'block', 'hidden'], '', $n->name), $prefs);
      $prefstodelete = array_keys(
          array_intersect(
              array_map(fn(stdClass $n): int => (int)str_replace(['docked_block_instance_', 'block', 'hidden'], '', $n->name), $prefs),
              $blockinstances
          )
      );
       
      foreach (array_chunk($prefstodelete, 65535) as $chunk) {
          [$insql, $params] = $DB->get_in_or_equal($chunk);
          $DB->delete_records_select('user_preferences', "id $insql", $params);
      }
      

      I was able to get some confidence that the PHP implementation is equivalent by culling the DB to a point where the original SQL could complete in a reasonable amount of time (the tables ended up having about 100,000 or so rows) and comparing the IDs generated by the SQL and the IDs generated by the script.

            Unassigned Unassigned
            cameron1729 cameron1729
            Votes:
            1 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:

                Error rendering 'clockify-timesheets-time-tracking-reports:timer-sidebar'. Please contact your Jira administrators.