Moodle
  1. Moodle
  2. MDL-26542

get_in_or_equal should support empty array

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.2
    • Fix Version/s: 2.1
    • Component/s: Database SQL/XMLDB
    • Labels:
    • Rank:
      16187

      Description

      The get_in_or_equal function throws coding_exception if you pass it an empty array, this is annoying as in many cases you need to use it where there might potentially be an empty array.

      (note I am going to submit pull request about this in a sec, I have patch)

        Activity

        Hide
        Sam Marshall added a comment -

        Filed as PULL-344

        Note I do not need this to be fixed any particular time as I have done a workaround by making my own utility function to wrap the existing behaviour. Just thought I'd contribute it in order to be nice.

        Show
        Sam Marshall added a comment - Filed as PULL-344 Note I do not need this to be fixed any particular time as I have done a workaround by making my own utility function to wrap the existing behaviour. Just thought I'd contribute it in order to be nice.
        Hide
        David Mudrak added a comment -

        I have always thought this was intentional behaviour. In most cases, if the array is empty, you do not want to execute the query at all. I feel it is the caller's responsibility to check if the array is empty or not and do the best according the specific situation. I would be afraid of a general behaviour here...

        Show
        David Mudrak added a comment - I have always thought this was intentional behaviour. In most cases, if the array is empty, you do not want to execute the query at all. I feel it is the caller's responsibility to check if the array is empty or not and do the best according the specific situation. I would be afraid of a general behaviour here...
        Hide
        Eloy Lafuente (stronk7) added a comment -

        Lol,

        1) What makes you infer that the "default" conditions is NULL? Why not 0, or 'Sam', or PI ?
        2) That's not cross-db at all and also "= NULL" is (in the practice) one inequality, matches nothing, why is that the desired behavior?

        Sorry this cannot land... ciao

        Show
        Eloy Lafuente (stronk7) added a comment - Lol, 1) What makes you infer that the "default" conditions is NULL? Why not 0, or 'Sam', or PI ? 2) That's not cross-db at all and also "= NULL" is (in the practice) one inequality, matches nothing, why is that the desired behavior? Sorry this cannot land... ciao
        Hide
        Sam Marshall added a comment -

        Well, in many of my cases, you certainly do want to execute the query. I typically have used this for exceptions; example, 'list of forums in which you have access-all-groups permission', that kind of thing. It's just one small part of a larger query. I wouldn't usually do a query using JUST one of these.

        If the array is empty it should return 0 results. Isn't that the entirely expected behaviour? Or are you suggesting that foreach($emptyarray as $item) {} should also throw an exception?

        Will respond to other point in other bug.

        Show
        Sam Marshall added a comment - Well, in many of my cases, you certainly do want to execute the query. I typically have used this for exceptions; example, 'list of forums in which you have access-all-groups permission', that kind of thing. It's just one small part of a larger query. I wouldn't usually do a query using JUST one of these. If the array is empty it should return 0 results. Isn't that the entirely expected behaviour? Or are you suggesting that foreach($emptyarray as $item) {} should also throw an exception? Will respond to other point in other bug.
        Hide
        Sam Marshall added a comment -

        Actually makes more sense to reply here...

        About ternary logic: in practice it works because in a WHERE / ON clause, the value 'unknown' is generally equivalent to 'false' (and the same is true of the way the AND/OR operators work, i.e. if you think that a final value of 'unknown' is equivalent to 'false', which for the WHERE operator is the case, then the AND/OR operator is consistent with this)..

        I agree this is slightly dubious but in a practical query it's fairly unlikely that you would specifically depend on the 'unknown' value, unless I'm missing something.

        About database compatibility, this definitely works fine in Postgres (live on our system for ages), in MS SQL (just tested), and I'm fairly sure it also works in MySQL. However I agree it might not work on Oracle, hasn't been tested. Of course there is no chance that it can cause a bug in current system because any code that has an empty array already has a bug in current system.

        Not having this facility is annoying and is a source of unexpected bugs (where the 'normal' case is that there will be some results in the array).

        Perhaps another alternative would be to create a new version of get_in_or_equals that includes the entire SQL, i.e.

        get_in_or_equals_of('x.groupid', $array)

        which could then result in the SQL values for different cases which would always be boolean true/false and never unknown:

        x.groupid IN (3,4,5)
        x.groupid = 4
        0=1

        therefore ensuring no ambiguity. (0=1 is the database-independent version of 'false', afaik...)

        If anybody is actually strongly in favour of this, let me know, I might code it. Otherwise if you want to keep with the current developer-hostile approach, that's fine.

        Show
        Sam Marshall added a comment - Actually makes more sense to reply here... About ternary logic: in practice it works because in a WHERE / ON clause, the value 'unknown' is generally equivalent to 'false' (and the same is true of the way the AND/OR operators work, i.e. if you think that a final value of 'unknown' is equivalent to 'false', which for the WHERE operator is the case, then the AND/OR operator is consistent with this).. I agree this is slightly dubious but in a practical query it's fairly unlikely that you would specifically depend on the 'unknown' value, unless I'm missing something. About database compatibility, this definitely works fine in Postgres (live on our system for ages), in MS SQL (just tested), and I'm fairly sure it also works in MySQL. However I agree it might not work on Oracle, hasn't been tested. Of course there is no chance that it can cause a bug in current system because any code that has an empty array already has a bug in current system. Not having this facility is annoying and is a source of unexpected bugs (where the 'normal' case is that there will be some results in the array). Perhaps another alternative would be to create a new version of get_in_or_equals that includes the entire SQL, i.e. get_in_or_equals_of('x.groupid', $array) which could then result in the SQL values for different cases which would always be boolean true/false and never unknown: x.groupid IN (3,4,5) x.groupid = 4 0=1 therefore ensuring no ambiguity. (0=1 is the database-independent version of 'false', afaik...) If anybody is actually strongly in favour of this, let me know, I might code it. Otherwise if you want to keep with the current developer-hostile approach, that's fine.
        Hide
        Eloy Lafuente (stronk7) added a comment -

        I think, as commented in HQ chat that the main problem for this is why to assume that "unknown" is the correct evaluation. I can see some point about allowing the function to, optionally (via new param) return one known true or false, but I think current behavior should be the default one.

        Ciao

        Show
        Eloy Lafuente (stronk7) added a comment - I think, as commented in HQ chat that the main problem for this is why to assume that "unknown" is the correct evaluation. I can see some point about allowing the function to, optionally (via new param) return one known true or false, but I think current behavior should be the default one. Ciao
        Hide
        Sam Marshall added a comment -

        Agree with conclusion that 'unknown' logic state is inappropriate and therefore previous version was not suitable for use.

        Rather than changing behaviour of existing function, perhaps better to make a new clearly named one. Here's a function that is named to make clear that it behaves like the in_array function (i.e. if you pass an empty array, it will return false).

            /**
             * Similar to core $DB->get_in_or_equal, but permits empty array and requires the value to
             * be passed to this function. The returned SQL fragment is a complete expression that returns
             * boolean TRUE or FALSE.
             * 
             * As an example, if you want to find out if the SQL expression
             * f.id is contained within the array of ids $items:
             * 
             * list($sql, $params) = get_in_array_sql('f.id', $items);
             * $DB->get_records_sql("SELECT * FROM whatever WHERE $sql", $params);
             * @param string $value SQL expression for value to be compared with array items
             * @param array $items Array of IDs (or similar) that the value will be compared with
             * @param int $type bound param type SQL_PARAMS_QM or SQL_PARAMS_NAMED
             * @param string $start named param placeholder start
             * @return array - $sql and $params
             */
            public static function get_in_array_sql($value, $items, $type=SQL_PARAMS_QM, $start='param0000') {
                global $DB;
                if (is_array($items) && empty($items)) {
                    // Empty array returns false. The keyword FALSE is not supported on all databases
                    // but 1=0 probably should be.
                    return array('(1=0)', array());
                }
                list ($sql, $params) = $DB->get_in_or_equal($items, $type, $start); 
                return array($value . ' ' . $sql, $params);
            }
        

        This is tested... a bit

        Show
        Sam Marshall added a comment - Agree with conclusion that 'unknown' logic state is inappropriate and therefore previous version was not suitable for use. Rather than changing behaviour of existing function, perhaps better to make a new clearly named one. Here's a function that is named to make clear that it behaves like the in_array function (i.e. if you pass an empty array, it will return false). /** * Similar to core $DB->get_in_or_equal, but permits empty array and requires the value to * be passed to this function. The returned SQL fragment is a complete expression that returns * boolean TRUE or FALSE. * * As an example, if you want to find out if the SQL expression * f.id is contained within the array of ids $items: * * list($sql, $params) = get_in_array_sql('f.id', $items); * $DB->get_records_sql( "SELECT * FROM whatever WHERE $sql" , $params); * @param string $value SQL expression for value to be compared with array items * @param array $items Array of IDs (or similar) that the value will be compared with * @param int $type bound param type SQL_PARAMS_QM or SQL_PARAMS_NAMED * @param string $start named param placeholder start * @ return array - $sql and $params */ public static function get_in_array_sql($value, $items, $type=SQL_PARAMS_QM, $start='param0000') { global $DB; if (is_array($items) && empty($items)) { // Empty array returns false . The keyword FALSE is not supported on all databases // but 1=0 probably should be. return array('(1=0)', array()); } list ($sql, $params) = $DB->get_in_or_equal($items, $type, $start); return array($value . ' ' . $sql, $params); } This is tested... a bit
        Hide
        Eloy Lafuente (stronk7) added a comment -

        Looks correct IMO. I've added also Petr here to know his opinion too.

        Personally I "feel" that extending the current function could be better, like get_record() and friends have one final param deciding behavior (MUST|ALLOW|WHATEVER), but it's only one opinion.

        Ciao

        Show
        Eloy Lafuente (stronk7) added a comment - Looks correct IMO. I've added also Petr here to know his opinion too. Personally I "feel" that extending the current function could be better, like get_record() and friends have one final param deciding behavior (MUST|ALLOW|WHATEVER), but it's only one opinion. Ciao
        Hide
        Petr Škoda added a comment -

        1/ The example above is incorrect instead of "WHERE $sql" should be "WHERE column $sql".
        2/ The code itself is incorrect, it would probably return something like array(' = (1=0)', array());
        3/ Comparing random stuff to Bool seems like a bad idea away, mysql returns all active users when I tried this "SELECT * FROM

        {user}

        WHERE deleted = (1=0)" because it casts false to 0 automatically. This could be a huge problem for fields that use integer constants.

        My -1, I do not like this because it may hide logic errors in code and I do not see any easy way to implement this using some new parameters. In each situation the empty array may mean something completely different: O, -1, NULL, '', ....

        Show
        Petr Škoda added a comment - 1/ The example above is incorrect instead of "WHERE $sql" should be "WHERE column $sql". 2/ The code itself is incorrect, it would probably return something like array(' = (1=0)', array()); 3/ Comparing random stuff to Bool seems like a bad idea away, mysql returns all active users when I tried this "SELECT * FROM {user} WHERE deleted = (1=0)" because it casts false to 0 automatically. This could be a huge problem for fields that use integer constants. My -1, I do not like this because it may hide logic errors in code and I do not see any easy way to implement this using some new parameters. In each situation the empty array may mean something completely different: O, -1, NULL, '', ....
        Hide
        Sam Marshall added a comment -

        Hi Petr,

        1/ The example code is correct - it must be WHERE $sql. The reason is that this function has a different prototype from get_in_or_equal: it requires the column to be included in a parameter.

        2/ The function is correct, same reason.

        3/ The problem you've described would apply if it were implemented that way, but it does not compare anything to bool. In this case it would not do 'WHERE deleted = (1=0)', it would simply do 'WHERE (1=0)' which would not return any rows even in MySQL (I sincerely hope; note that I actually tested so far in Postgres, SQL Server).

        So in other words this function behaves in a reliable manner and does not rely on any odd behaviour from SQL (which I agree was a serious problem with my original suggestion). It produces queries of the following three forms:

        WHERE id = 37
        WHERE id IN (37, 38)
        WHERE (1=0)

        As you can see there is nothing odd about any of these three forms; the first two are from get_in_or_equal and the last one is straightforward boolean false.

        This example is a silly one; in practice there is no good reason to use this function for such simple queries (although actually, I kind of prefer the syntax that include the sql expression you are comparing inside the function, it always seemed a bit weird to return half an expression - but that's a minor thing).

        The benefit of this function is in a complex query with many clauses (perhaps including two or three of these conditions). For example, one place it's used in ForumNG inside a complex query is to do something like... 'AND ((join against group_members matches) OR (this forum is in a previously-calculated list of forum ids where the current user has access-all-groups))'. Obviously in that case, if the user does not have any access-all-groups forums, you still need to run the query in general, just that small part should always be false.

        Anyway - I have this code in forumng now - so it can get some testing etc, there is no need to do anything about it. Perhaps at some later date, e.g. if anything based on ForumNG actually ever gets included in core, that might be a good time to consider adding this function.

        Show
        Sam Marshall added a comment - Hi Petr, 1/ The example code is correct - it must be WHERE $sql. The reason is that this function has a different prototype from get_in_or_equal: it requires the column to be included in a parameter. 2/ The function is correct, same reason. 3/ The problem you've described would apply if it were implemented that way, but it does not compare anything to bool. In this case it would not do 'WHERE deleted = (1=0)', it would simply do 'WHERE (1=0)' which would not return any rows even in MySQL (I sincerely hope; note that I actually tested so far in Postgres, SQL Server). So in other words this function behaves in a reliable manner and does not rely on any odd behaviour from SQL (which I agree was a serious problem with my original suggestion). It produces queries of the following three forms: WHERE id = 37 WHERE id IN (37, 38) WHERE (1=0) As you can see there is nothing odd about any of these three forms; the first two are from get_in_or_equal and the last one is straightforward boolean false. This example is a silly one; in practice there is no good reason to use this function for such simple queries (although actually, I kind of prefer the syntax that include the sql expression you are comparing inside the function, it always seemed a bit weird to return half an expression - but that's a minor thing). The benefit of this function is in a complex query with many clauses (perhaps including two or three of these conditions). For example, one place it's used in ForumNG inside a complex query is to do something like... 'AND ((join against group_members matches) OR (this forum is in a previously-calculated list of forum ids where the current user has access-all-groups))'. Obviously in that case, if the user does not have any access-all-groups forums, you still need to run the query in general, just that small part should always be false. Anyway - I have this code in forumng now - so it can get some testing etc, there is no need to do anything about it. Perhaps at some later date, e.g. if anything based on ForumNG actually ever gets included in core, that might be a good time to consider adding this function.
        Hide
        Petr Škoda added a comment -

        Oh! right, that is a different function! I was confused by the "get_in_or_equal should support empty array", sorrrrryyyy.

        We were discussion this with Eloy a bit, there could be added a new parameter in get_in_or_equal() with defaulting to PHP false meaning throw exception and accepting any other value to be used when array empty (you could use NULL, -1, 0 or whatever else is guaranteed to not be in the column).

        Show
        Petr Škoda added a comment - Oh! right, that is a different function! I was confused by the "get_in_or_equal should support empty array", sorrrrryyyy. We were discussion this with Eloy a bit, there could be added a new parameter in get_in_or_equal() with defaulting to PHP false meaning throw exception and accepting any other value to be used when array empty (you could use NULL, -1, 0 or whatever else is guaranteed to not be in the column).
        Hide
        Sam Marshall added a comment -

        ah - yep that would work!

        Show
        Sam Marshall added a comment - ah - yep that would work!
        Hide
        Eloy Lafuente (stronk7) added a comment -

        Just for sharing, here it is one patch implementing the new param ($onemptyitems = false):

        https://github.com/stronk7/moodle/compare/master...MDL-26542_empty_array

        Notes:

        • it only applies when $items is empty array.
        • tests added for $onemptyitems = (false, NULL, true, -1 and 'onevalue').
        • also fixed another test that was wrong and added one more about invalid $type.

        Concern:

        • How do drivers behave with the boolean true case? Do we provide automatic true => 1 conversion in all them? It sounds to me, but not 100% sure. If that isn't supported 100% then, perhaps, we should keep the boolean true case out from the allowed values for that param.

        Comment, plz, ciao

        Show
        Eloy Lafuente (stronk7) added a comment - Just for sharing, here it is one patch implementing the new param ($onemptyitems = false): https://github.com/stronk7/moodle/compare/master...MDL-26542_empty_array Notes: it only applies when $items is empty array. tests added for $onemptyitems = (false, NULL, true, -1 and 'onevalue'). also fixed another test that was wrong and added one more about invalid $type. Concern: How do drivers behave with the boolean true case? Do we provide automatic true => 1 conversion in all them? It sounds to me, but not 100% sure. If that isn't supported 100% then, perhaps, we should keep the boolean true case out from the allowed values for that param. Comment, plz, ciao
        Hide
        Eloy Lafuente (stronk7) added a comment -

        Sam, Petr, ping, plz. TIA!

        Show
        Eloy Lafuente (stronk7) added a comment - Sam, Petr, ping, plz. TIA!
        Hide
        Sam Marshall added a comment -

        I read through code, looks great. (Trivial, but I noticed inconsistent spacing around = in function default parameter - however not sure which one is right...)

        Show
        Sam Marshall added a comment - I read through code, looks great. (Trivial, but I noticed inconsistent spacing around = in function default parameter - however not sure which one is right...)
        Hide
        Eloy Lafuente (stronk7) added a comment -

        Fixed the param spacing to 0. Personally I always add space to new methods but follow the rule in existing ones.

        Ciao

        Show
        Eloy Lafuente (stronk7) added a comment - Fixed the param spacing to 0. Personally I always add space to new methods but follow the rule in existing ones. Ciao
        Hide
        Eloy Lafuente (stronk7) added a comment -

        PULL-564 has been created for integration of this in master only.

        I think it's safe to be backported to 20_STABLE but... feel free to discuss.

        Ciao

        Show
        Eloy Lafuente (stronk7) added a comment - PULL-564 has been created for integration of this in master only. I think it's safe to be backported to 20_STABLE but... feel free to discuss. Ciao
        Hide
        David Mudrak added a comment -

        Tests passed for pgsql, mysqli, mssql and sqlsrv. Apu is going to test oracle tomorrow, just in case. Thanks Sam and Eloy for your work.

        Show
        David Mudrak added a comment - Tests passed for pgsql, mysqli, mssql and sqlsrv. Apu is going to test oracle tomorrow, just in case. Thanks Sam and Eloy for your work.
        Hide
        Helen Foster added a comment -

        Thanks everyone!

        Show
        Helen Foster added a comment - Thanks everyone!

          People

          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: