Moodle
  1. Moodle
  2. MDL-27071

Oracle cannot cope withan IN operator with more than 1000 conditions in it

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.1, 2.3.6, 2.4.3, 2.5
    • Fix Version/s: 2.3.7, 2.4.4
    • Component/s: Database SQL/XMLDB
    • Labels:
    • Environment:
      CentOS 5.5, Oracle 10g, PHP 5.3.5 with oci8 library
    • Database:
      Oracle
    • Testing Instructions:
      Hide

      In all the branches where this patch is integrated....

      1) Run lib/dml/tests/dml_test.php under all DBs. Getting them passing (see note below) means we can now use bigger (up to 10000) IN() clauses in a cross-db way.

      Note that only master in able to pass the DML tests 100% under all databases, so the stables will show some failures. In any case, the tests verifying this issue are:

      • test_get_in_or_equal()
      • test_get_records_sql_complicated()

      No error related to them should happen at all.

      Show
      In all the branches where this patch is integrated.... 1) Run lib/dml/tests/dml_test.php under all DBs. Getting them passing (see note below) means we can now use bigger (up to 10000) IN() clauses in a cross-db way. Note that only master in able to pass the DML tests 100% under all databases, so the stables will show some failures. In any case, the tests verifying this issue are: test_get_in_or_equal() test_get_records_sql_complicated() No error related to them should happen at all.
    • Workaround:
      Hide

      Fix the code to avoid using such big IN() clauses.

      Show
      Fix the code to avoid using such big IN() clauses.
    • Difficulty:
      Easy
    • Affected Branches:
      MOODLE_20_STABLE, MOODLE_23_STABLE, MOODLE_24_STABLE, MOODLE_25_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE, MOODLE_24_STABLE
    • Pull from Repository:
    • Pull 2.4 Branch:
    • Pull Master Branch:
    • Rank:
      16709

      Description

      Problem with query with In operator with more than 1000 condition in it (error ORA-01795: maximum number of expressions in a list is 1000)

      Proposed solution:
      an hack in the lib/dml/oci_nativa_moodle_database.php
      Due to the fact we are working at the library level, it's not possible to use a subquery instead of the "long" list of parameters. We adopted the other suggested approach: to parse the sql code for the IN operator with more than 1000 conditions and to rewrite it as a series of IN operators connected through the OR condition (suggestion from http://forums.oracle.com/forums/thread.jspa?threadID=233143)

      The hack is the following:
      <

      {moodle}/lib/dml/oci_native_moodle_database.php> : 981 //added some lines to use the defined method "ORA01795_hack" on the SQL query
      //LkM79 for ISP, 2010-04-03
      //var_dump(method_exists($this,'ORA01795_hack'));
      $sql=$this->ORA01795_hack($sql);

      <{moodle}

      /lib/dml/oci_native_moodle_database.php> : 1040 //added some lines to use the defined method "ORA01795_hack" on the SQL query
      //LkM79 for ISP, 2010-04-03
      //var_dump(method_exists($this,'ORA01795_hack'));
      $sql=$this->ORA01795_hack($sql);

      <

      {moodle}

      /lib/dml/oci_native_moodle_database.php> : 981 //added some lines (a function) to split every IN clause with more than 1000 conditions in a OR combination of many shorter IN lists
      /**

      • Driver specific hack for introducing a sort of support for IN list with more than 1000 condition
      • split the IN list with more than 1000 co0nditions in many lists, connected through an OR operator
      • @return string
      • LkM79 for ISP, 2010-04-03
        */
        public function ORA01795_hack($sql){
        //var_dump($sql);
        //var_dump(stripos($sql,"IN (?,"));
        //var_dump(explode("?,",$sql));

      $start=stripos($sql," IN (?,");
      if($start>1) {
      $end=stripos(substr($sql,$start),"?)")+2;
      //var_dump($end);
      $in_list=substr($sql, $start,$end);
      //var_dump($in_list);
      $n=explode("?,",$in_list);
      //var_dump($n);
      $slices=(int)((count($n))/1000)+1;
      //var_dump($slices);
      $step=(int)((count($n))/$slices);
      //var_dump($step);
      if($slices>1){
      $sql_i=substr($sql,0,$start);
      $field=strrchr($sql_i, " ");
      $sql_i=substr($sql_i,0,-strlen($field));
      $sql_e=substr($sql,$start+$end);
      $sql_in=' (';
      for($i=0;$i<$slices;$i++){
      $sql_in.=" $field IN (";
      if($i<($slices -1)){
      for($t=0;$t<($step-1);$t++)

      { $sql_in.='?,'; }
      $sql_in.='?) OR ';
      }else{
      for($t=0;$t<((count($n)-(($slices -1)*$step))-1);$t++){ $sql_in.='?,'; }

      $sql_in.='?))';
      }
      }
      $sql=$sql_i.$sql_in.$sql_e;
      }
      }
      //var_dump($sql);
      return $sql;
      }

      }

      Attached, the modified version of the oci_native_moodle_database.php file

      1. oci_native_moodle_database.php
        71 kB
        Jaya Krishna
      2. oci_native_moodle_database.php
        65 kB
        Luca Mazzola

        Issue Links

          Activity

          Hide
          Tim Hunt added a comment -

          There are really two issues here. First this limit in the Oracle driver, which Eloy should consider. There is also the fact that in 2.0, it should be possible for the quiz reports to do everything in SQL, rather than using a long IN list. I separated that out as MDL-27072.

          Show
          Tim Hunt added a comment - There are really two issues here. First this limit in the Oracle driver, which Eloy should consider. There is also the fact that in 2.0, it should be possible for the quiz reports to do everything in SQL, rather than using a long IN list. I separated that out as MDL-27072 .
          Hide
          Luca Mazzola added a comment -

          Agree. I proposed the solution with lower expected impact on the actual code, but I will like a more structural solution, like using an inner query (also for portability between different databases and for performances reasons). I think the same approach of IN list is also used in other modules (i.e. assignment). Thanks, Luca.

          Show
          Luca Mazzola added a comment - Agree. I proposed the solution with lower expected impact on the actual code, but I will like a more structural solution, like using an inner query (also for portability between different databases and for performances reasons). I think the same approach of IN list is also used in other modules (i.e. assignment). Thanks, Luca.
          Hide
          Henrik Thorn added a comment -

          I've just applied the attached hack, which did not work during the Moodle upgrade process from Moodle 1.97 -> 2.03.

          The hack alters the SQL, but not the select-statement used by the upgrade script. I've copied the debugging error (without all the ids):

          Debug info: ORA-01795: maximum number of expressions in a list is 1000
          SELECT id, contextlevel FROM m_context WHERE contextlevel = 80 AND id IN (2,1,235,238,241,244,247,250,253,256,259,......)
          [array (
          )]
          Stack trace:
          line 391 of /lib/dml/moodle_database.php: dml_read_exception thrown
          line 268 of /lib/dml/oci_native_moodle_database.php: call to moodle_database->query_end()
          line 1015 of /lib/dml/oci_native_moodle_database.php: call to oci_native_moodle_database->query_end()
          line 1003 of /lib/dml/moodle_database.php: call to oci_native_moodle_database->get_recordset_sql()
          line 582 of /lib/db/upgradelib.php: call to moodle_database->get_recordset_select()
          line 2116 of /lib/db/upgrade.php: call to upgrade_cleanup_unwanted_block_contexts()
          line 1382 of /lib/upgradelib.php: call to xmldb_main_upgrade()
          line 273 of /admin/index.php: call to upgrade_core()

          Show
          Henrik Thorn added a comment - I've just applied the attached hack, which did not work during the Moodle upgrade process from Moodle 1.97 -> 2.03. The hack alters the SQL, but not the select-statement used by the upgrade script. I've copied the debugging error (without all the ids): Debug info: ORA-01795: maximum number of expressions in a list is 1000 SELECT id, contextlevel FROM m_context WHERE contextlevel = 80 AND id IN (2,1,235,238,241,244,247,250,253,256,259,......) [array ( )] Stack trace: line 391 of /lib/dml/moodle_database.php: dml_read_exception thrown line 268 of /lib/dml/oci_native_moodle_database.php: call to moodle_database->query_end() line 1015 of /lib/dml/oci_native_moodle_database.php: call to oci_native_moodle_database->query_end() line 1003 of /lib/dml/moodle_database.php: call to oci_native_moodle_database->get_recordset_sql() line 582 of /lib/db/upgradelib.php: call to moodle_database->get_recordset_select() line 2116 of /lib/db/upgrade.php: call to upgrade_cleanup_unwanted_block_contexts() line 1382 of /lib/upgradelib.php: call to xmldb_main_upgrade() line 273 of /admin/index.php: call to upgrade_core()
          Hide
          Luca Mazzola added a comment -

          Dear Henrik,
          the issue is related to the fact that the hack works on the "normal" format of an Oracle query. This means it searches for sequence of question marks that follows an IN operator (then associated during executions phase with the actual parametrs) and not a series of already "expanded" IDS hardcoded in the SQL script.
          In the next days I will try to provide you with a piece of code to overcome this problem (I have already implemented it in some other part of the modules directory, but I need to search for it...

          Hope it makes sense for you, regards,
          Luca.

          Show
          Luca Mazzola added a comment - Dear Henrik, the issue is related to the fact that the hack works on the "normal" format of an Oracle query. This means it searches for sequence of question marks that follows an IN operator (then associated during executions phase with the actual parametrs) and not a series of already "expanded" IDS hardcoded in the SQL script. In the next days I will try to provide you with a piece of code to overcome this problem (I have already implemented it in some other part of the modules directory, but I need to search for it... Hope it makes sense for you, regards, Luca.
          Hide
          Luca Mazzola added a comment - - edited

          A possible hack for solving your proble could be to substitute the two lines in your

          {moodle}

          /lib/db/upgradelib.php at loine 581-583 with the following lines:

          ///----------------------------------------------------------------------------------------------------------------------------------------------------------///
          //HACK for error ORA-01795, make query generic LkM79
          /*//////
          $contextidstring = join(',', $contextidarray);
          $blockcontexts = $DB->get_recordset_select('context', 'contextlevel = '.CONTEXT_BLOCK.' AND id IN ('.$contextidstring.')', array(), '', 'id, contextlevel');
          //*/

          $qry_in='(?';
          for($i=1;$i<count($contextidarray);$i++)

          { $qry_in.=',?'; }

          $qry_in.=')';

          $blockcontexts = $DB->get_recordset_select('context', 'contextlevel = '.CONTEXT_BLOCK.' AND id IN ('.$qry_in.')', $contextidarray, '', 'id, contextlevel');

          unset($qry_in);
          //end of HACK for error ORA-01795, make query generic LkM79
          ///----------------------------------------------------------------------------------------------------------------------------------------------------------///

          Due to the fact that I cannot attach my version of the modified file to this comment, feel free to write me an email and I will send it to you, in case you prefear it (even if I cannot guarantee that mine is precisely the same version as yours).

          Regards, and hope this helps you,
          Luca.

          Show
          Luca Mazzola added a comment - - edited A possible hack for solving your proble could be to substitute the two lines in your {moodle} /lib/db/upgradelib.php at loine 581-583 with the following lines: /// ----------------------------------------------------------------------------------------------------------------------------------------------------------// / //HACK for error ORA-01795, make query generic LkM79 /*////// $contextidstring = join(',', $contextidarray); $blockcontexts = $DB->get_recordset_select('context', 'contextlevel = '.CONTEXT_BLOCK.' AND id IN ('.$contextidstring.')', array(), '', 'id, contextlevel'); //*/ $qry_in='(?'; for($i=1;$i<count($contextidarray);$i++) { $qry_in.=',?'; } $qry_in.=')'; $blockcontexts = $DB->get_recordset_select('context', 'contextlevel = '.CONTEXT_BLOCK.' AND id IN ('.$qry_in.')', $contextidarray, '', 'id, contextlevel'); unset($qry_in); //end of HACK for error ORA-01795, make query generic LkM79 /// ----------------------------------------------------------------------------------------------------------------------------------------------------------// / Due to the fact that I cannot attach my version of the modified file to this comment, feel free to write me an email and I will send it to you, in case you prefear it (even if I cannot guarantee that mine is precisely the same version as yours). Regards, and hope this helps you, Luca.
          Hide
          Jonathon Fowler added a comment -

          This is something I've done to work around this:
          https://github.com/jonof/moodle/commit/7e4a382030eab44e3d193677a9b70e2686df4f95

          It introduces a get_field_in_or_equal() method to the moodle_database class, which is basically just a wrapper around get_in_or_equal() that prepends a field name to the SQL fragment. Next, this function is overriden in oci_native_moodle_database with code to split the list up into chunks of 1000. The instance where I ran into this issue was in lib/accesslib.php, and you can see how the code was changed to implement the call to the function.

          Jonathon

          Show
          Jonathon Fowler added a comment - This is something I've done to work around this: https://github.com/jonof/moodle/commit/7e4a382030eab44e3d193677a9b70e2686df4f95 It introduces a get_field_in_or_equal() method to the moodle_database class, which is basically just a wrapper around get_in_or_equal() that prepends a field name to the SQL fragment. Next, this function is overriden in oci_native_moodle_database with code to split the list up into chunks of 1000. The instance where I ran into this issue was in lib/accesslib.php, and you can see how the code was changed to implement the call to the function. Jonathon
          Hide
          Chris Myers added a comment -

          I did a search in our Moodle instance for a common string in the title of all auto-generated courses, and viewed all 2760 results in one big long list on a single page with the logging turned up to "developer," and it didn't throw the error anymore. So it looks like this is fixed in 2.3.2+ at least.

          Chris

          Show
          Chris Myers added a comment - I did a search in our Moodle instance for a common string in the title of all auto-generated courses, and viewed all 2760 results in one big long list on a single page with the logging turned up to "developer," and it didn't throw the error anymore. So it looks like this is fixed in 2.3.2+ at least. Chris
          Hide
          Jaya Krishna added a comment -

          The problem still exist on version 2.4.3.
          The original function as suggested above (ORA01795_hack) no longer working because the query format for "IN" no longer the same.
          Rather than slicing the param "?", it's now required to slice ":Exxx" when reach 1000.
          I modifed the function and I attach the file: oci_native_moodle_database.php.

          I hope the next version will use sub-query such as "Select .. IN (select ...)" , rather than e lengthy IN operator.

          Show
          Jaya Krishna added a comment - The problem still exist on version 2.4.3. The original function as suggested above (ORA01795_hack) no longer working because the query format for "IN" no longer the same. Rather than slicing the param "?", it's now required to slice ":Exxx" when reach 1000. I modifed the function and I attach the file: oci_native_moodle_database.php. I hope the next version will use sub-query such as "Select .. IN (select ...)" , rather than e lengthy IN operator.
          Hide
          Jaya Krishna added a comment - - edited

          oci_native_moodle_database.php modified for version 2.4.3, file attached above

          Show
          Jaya Krishna added a comment - - edited oci_native_moodle_database.php modified for version 2.4.3, file attached above
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited

          Hi, Jaya (et all), first of all, many thanks for your progresses here!

          Then, I'd want to state clearly that having queries in Moodle doing that sort of constructions is 100% wrong. "IN ()" queries should be exclusively used for (very) reduced sets of values.

          They are vary slow, and they don't scale well under ANY database. It just happens that Oracle is the first one to fail (with 1000 elements), but all them will fail when any other limit is reached (max number of bind elements, max length of sql query, max size of transmission packet... all them!).

          So, the correct way to fix these cases (like we have done in the past in other places using those crazy "IN ()" hyper-large clause is to move them to subqueries. Those subqueries can be against real tables if the data can be queried or against a temp table filled with the target values. 100%. Always. That's the unique solution that will fix it in an scalable way (else it will fail with 2000, or 10000, or... anytime another limit is reached).

          Said that, I think that your fix is a good step to allow "more" values to be allowed in practice under Oracle, although the point I don't like about it is that the "hack" happens outside from get_in_or_equal() and it should be completely within it.

          And I think it's pretty doable by "pivoting" the list of bound params into rows. Basically, it's about to transform any result having more than 1000 elements into chunks of multiple "UNION ALL" expressions.

          I just have tried this, let's imagine that the limit is 2 (instead of 1000), for readability:

          Original query:

          SELECT * FROM M_USER WHERE id IN (100, 101, 102, 103, 104);
          

          Alternative query to avoid the limit=2:

          SELECT * FROM m_user WHERE id IN (
              SELECT DECODE(pivot,
                  1, 100,
                  2, 101)
              FROM dual
              CROSS JOIN (SELECT LEVEL AS pivot FROM dual CONNECT BY LEVEL <= 2)
            UNION ALL
              SELECT DECODE(pivot,
                  1, 102,
                  2, 103)
              FROM dual
              CROSS JOIN (SELECT LEVEL AS pivot FROM dual CONNECT BY LEVEL <= 2)
            UNION ALL
              SELECT DECODE(pivot,
                  1, 104)
              FROM dual
              CROSS JOIN (SELECT LEVEL AS pivot FROM dual CONNECT BY LEVEL <= 1)
          );
          

          Observations:

          • It can be fully implemented within oracle's get_in_or_equal() method. First call the parent and then, based on results... modify the query.
          • DECODE() has a limit of 255 elements so, in the real solution the UNION ALL chunks will have, each, max 255 elements.
          • I've not tested it with placeholders, but I bet it will work without problems.
          • I've quick tested it by building manually and expression with 1200 elements and has worked. Again, I'm sure there will be another limit at some point, haven't tried to find it.

          So, using that strategy I'll try to hack oracle's get_in_or_equal() and put some tests in order to verify that, say, 2000 elements are working ok. If everything seems to work ok, it will be sent to integration.

          But again, again, it's 100% guarantied that we'll raise some other limit on any database. 100%. We should forbid/warn when uses like this are found, really.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - - edited Hi, Jaya (et all), first of all, many thanks for your progresses here! Then, I'd want to state clearly that having queries in Moodle doing that sort of constructions is 100% wrong. "IN ()" queries should be exclusively used for (very) reduced sets of values. They are vary slow, and they don't scale well under ANY database. It just happens that Oracle is the first one to fail (with 1000 elements), but all them will fail when any other limit is reached (max number of bind elements, max length of sql query, max size of transmission packet... all them!). So, the correct way to fix these cases (like we have done in the past in other places using those crazy "IN ()" hyper-large clause is to move them to subqueries . Those subqueries can be against real tables if the data can be queried or against a temp table filled with the target values. 100%. Always. That's the unique solution that will fix it in an scalable way (else it will fail with 2000, or 10000, or... anytime another limit is reached). Said that, I think that your fix is a good step to allow "more" values to be allowed in practice under Oracle, although the point I don't like about it is that the "hack" happens outside from get_in_or_equal() and it should be completely within it. And I think it's pretty doable by "pivoting" the list of bound params into rows. Basically, it's about to transform any result having more than 1000 elements into chunks of multiple "UNION ALL" expressions. I just have tried this, let's imagine that the limit is 2 (instead of 1000), for readability: Original query: SELECT * FROM M_USER WHERE id IN (100, 101, 102, 103, 104); Alternative query to avoid the limit=2: SELECT * FROM m_user WHERE id IN ( SELECT DECODE(pivot, 1, 100, 2, 101) FROM dual CROSS JOIN (SELECT LEVEL AS pivot FROM dual CONNECT BY LEVEL <= 2) UNION ALL SELECT DECODE(pivot, 1, 102, 2, 103) FROM dual CROSS JOIN (SELECT LEVEL AS pivot FROM dual CONNECT BY LEVEL <= 2) UNION ALL SELECT DECODE(pivot, 1, 104) FROM dual CROSS JOIN (SELECT LEVEL AS pivot FROM dual CONNECT BY LEVEL <= 1) ); Observations: It can be fully implemented within oracle's get_in_or_equal() method. First call the parent and then, based on results... modify the query. DECODE() has a limit of 255 elements so, in the real solution the UNION ALL chunks will have, each, max 255 elements. I've not tested it with placeholders, but I bet it will work without problems. I've quick tested it by building manually and expression with 1200 elements and has worked. Again, I'm sure there will be another limit at some point, haven't tried to find it. So, using that strategy I'll try to hack oracle's get_in_or_equal() and put some tests in order to verify that, say, 2000 elements are working ok. If everything seems to work ok, it will be sent to integration. But again, again, it's 100% guarantied that we'll raise some other limit on any database. 100%. We should forbid/warn when uses like this are found, really. Ciao
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited

          PS: Edited. There is another alternative that is to perform the offending actions in chunks done by PHP (not by drivers), in case the subquery alternative is not viable, but that's another story.

          Show
          Eloy Lafuente (stronk7) added a comment - - edited PS: Edited. There is another alternative that is to perform the offending actions in chunks done by PHP (not by drivers), in case the subquery alternative is not viable, but that's another story.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          I've created MDL-39396 about to warn developers when the number of params in IN() clauses could be considered bad practice/abuse.

          Show
          Eloy Lafuente (stronk7) added a comment - I've created MDL-39396 about to warn developers when the number of params in IN() clauses could be considered bad practice/abuse.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Sending to peer-review, patches for 24_STABLE and master. Tests here (master) seem to assert that now we can use up to 10000 elements in a cross-db way.

          Show
          Eloy Lafuente (stronk7) added a comment - Sending to peer-review, patches for 24_STABLE and master. Tests here (master) seem to assert that now we can use up to 10000 elements in a cross-db way.
          Hide
          Petr Škoda added a comment -

          +1, the new tests look good and anything that passes the oracle tests is ok for me, thanks!

          Show
          Petr Škoda added a comment - +1, the new tests look good and anything that passes the oracle tests is ok for me, thanks!
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Thanks, sending to integration.

          Show
          Eloy Lafuente (stronk7) added a comment - Thanks, sending to integration.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          I've added the corresponding patch to 23_STABLE too.

          Show
          Eloy Lafuente (stronk7) added a comment - I've added the corresponding patch to 23_STABLE too.
          Hide
          Dan Poltawski added a comment -

          Integrated to master, thanks Eloy

          Show
          Dan Poltawski added a comment - Integrated to master, thanks Eloy
          Hide
          Dan Poltawski added a comment -

          I meant to say master, 24 and 23!

          Show
          Dan Poltawski added a comment - I meant to say master, 24 and 23!
          Hide
          Dan Poltawski added a comment -

          phpunit passes on all branches (apart from expected nulls failure on <2.5)

          Show
          Dan Poltawski added a comment - phpunit passes on all branches (apart from expected nulls failure on <2.5)
          Hide
          Dan Poltawski added a comment -

          Thanks! You're changes are now spread to the world through this git and our source control repositories.

          No time to rest though, we've got days to make 2.5 the best yet!

          ciao

          Show
          Dan Poltawski added a comment - Thanks! You're changes are now spread to the world through this git and our source control repositories. No time to rest though, we've got days to make 2.5 the best yet! ciao
          Hide
          Joseph Rézeau added a comment -

          Further to this discussion: https://moodle.org/mod/forum/discuss.php?d=238800 and after reading the details of the current issue I have no idea what should be done with the IN() clause in Questionnaire.
          1.- Has the Oracle issue been fixed in moodle versions further than 2.3 or 2.4?
          2.- What precisely should be done to avoid the "huge number of params in IN() clauses could be considered bad practice/abuse"?
          Joseph

          Show
          Joseph Rézeau added a comment - Further to this discussion: https://moodle.org/mod/forum/discuss.php?d=238800 and after reading the details of the current issue I have no idea what should be done with the IN() clause in Questionnaire. 1.- Has the Oracle issue been fixed in moodle versions further than 2.3 or 2.4? 2.- What precisely should be done to avoid the "huge number of params in IN() clauses could be considered bad practice/abuse"? Joseph

            People

            • Votes:
              2 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: