Moodle

get_records_sql() with GROUP BY still returns incomplete result sets

Details

  • Type: Sub-task Sub-task
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: 1.9
  • Fix Version/s: 1.8.3, 1.9
  • Component/s: Database SQL/XMLDB
  • Labels:
    None
  • Environment:
    Windows Vista Business
    Apache 2.2.4
    MySQL 5.0.37
    PHP 5.2.1
  • Database:
    MySQL
  • Affected Branches:
    MOODLE_19_STABLE
  • Fixed Branches:
    MOODLE_18_STABLE, MOODLE_19_STABLE

Description

The problem described in issue MDL-7840 still exists in some special cases.
Resolving MDL-7840 as "no bug" does not solve the problem because in many cases the function recordset_to_array() ist still used with non-unique keys in some files. It is still hard (if not impossible) to avoid using GROUP BY or to introduce unique keys in many cases.

For example, see mod/survey/lib.php:

SELECT MAX(a.time) as time,
u.id, u.firstname, u.lastname, u.picture
FROM {$CFG->prefix}survey_answers a,
{$CFG->prefix}user u $groupsdb
WHERE a.survey = $surveyid
AND a.userid = u.id $groupsql
GROUP BY u.id, u.firstname, u.lastname, u.picture
ORDER BY time ASC

If 2 users posts their answer at the same time (the timestamp is equal) it is still possible to get an equal MAX(a.time) resulting in records lost (only in some cases due to the complex WHERE clause -> unluckily, you it is difficult to reproduce this problem until it appears ).

I found a similar problem today in another file, so I expect that there will be many more files with GROUP-BY clauses using non-unique keys. In my opinion, the recordset_to_array() function SHOULD BE CHANGED to support more than one identical value for the first column. For this reason issue MDL-7840 may be reopened because it still produces errors in some cases (and therefore it is a bug!!!).
As an alternative for changing recordset_to_array(), ALL used GROUP-BYs in the code must be checked if their first column is really a unique key.

The workaround by using get_recordset_sql() instead of get_records_sql() does also not work in all cases (there seems to be some problems with non-MySQL databases, but I cannot reproduce this on my system since i do not have an Oracle installation here).

Activity

Hide
Petr Škoda (skodak) added a comment -

I do not think that we can change the way get_records_sql() works even if we wanted to because many other places would be broken by this change, sorry.

Show
Petr Škoda (skodak) added a comment - I do not think that we can change the way get_records_sql() works even if we wanted to because many other places would be broken by this change, sorry.
Hide
Eloy Lafuente (stronk7) added a comment -

Hi,

IMO that exact query is absolutely wrong (in concept). It should be:

SELECT u.id, u.firstname, u.lastname, u.picture, MAX(a.time) as time,
FROM {$CFG->prefix}survey_answers a,
{$CFG->prefix}user u $groupsdb
WHERE a.survey = $surveyid
AND a.userid = u.id $groupsql
GROUP BY u.id, u.firstname, u.lastname, u.picture
ORDER BY time ASC

to return, per user, his max time.

I've searched across code and that query results are only used in survey_print_all_responses() that doesn't uses the key of the array at all.

So IMO it should be correct to fix the query.

Gert, I really think that there shouldn't be wrong uses like this across Moodle and I believe that they are practically 0 (or we had detected them). And agree with Petr that to change the way isn't the solution. It's a very core function. Instead I would go to fix wrong uses too.

Show
Eloy Lafuente (stronk7) added a comment - Hi, IMO that exact query is absolutely wrong (in concept). It should be: SELECT u.id, u.firstname, u.lastname, u.picture, MAX(a.time) as time, FROM {$CFG->prefix}survey_answers a, {$CFG->prefix}user u $groupsdb WHERE a.survey = $surveyid AND a.userid = u.id $groupsql GROUP BY u.id, u.firstname, u.lastname, u.picture ORDER BY time ASC to return, per user, his max time. I've searched across code and that query results are only used in survey_print_all_responses() that doesn't uses the key of the array at all. So IMO it should be correct to fix the query. Gert, I really think that there shouldn't be wrong uses like this across Moodle and I believe that they are practically 0 (or we had detected them). And agree with Petr that to change the way isn't the solution. It's a very core function. Instead I would go to fix wrong uses too.
Hide
Eloy Lafuente (stronk7) added a comment -

Fixed in CVS, both 18_STABLE and HEAD. Closing

Thanks Gert! And, please, if you find more wrong uses of this type, let us know.

Ciao

Show
Eloy Lafuente (stronk7) added a comment - Fixed in CVS, both 18_STABLE and HEAD. Closing Thanks Gert! And, please, if you find more wrong uses of this type, let us know. Ciao
Hide
Gert Sauerstein added a comment -

As Petr said it will be difficult to change recordset_to_array(), but I think it is recommended to provide a solution for general GROUP BY queries.

So may this will be a solution:
Adding an additional parameter with a default value to the definiton of both recordset_to_array() and get_records_sql() to define the field which will be stored in the array keys. I think that a String field would be good where you can specify the field name (with a default value of "id" maybe) or leave it null to build the array with self-incrementing (and therefore unique) keys.

But maybe this will not work since moodle widely uses other key fields than "id" in queries. This will be a second solution:
Alternatively we can introduce an new (enhanced) function similar to get_records_sql() (maybe named "get_records_sql_by_field()") with such an extra argument as described in solution1. This provides real compatibility with the current code.

Of course, I can write such a function on my own patching all the affected moodle systems. But in my opinion it will be better to fix this in a central way in dmllib.php as i am not the only administrator / extension developer affected by this problem (as you can read in MDL-7840). For users the advantage with this will be that they do not have to patch their moodle systems by hand instead of getting the fix as a "normal moodle upgrade"

So please, tell me what you think of my ideas, this is a serious problem for me (on of my moodle extensions is seriously affected) and i have to fix this in ANY way in the next days ...

Show
Gert Sauerstein added a comment - As Petr said it will be difficult to change recordset_to_array(), but I think it is recommended to provide a solution for general GROUP BY queries. So may this will be a solution: Adding an additional parameter with a default value to the definiton of both recordset_to_array() and get_records_sql() to define the field which will be stored in the array keys. I think that a String field would be good where you can specify the field name (with a default value of "id" maybe) or leave it null to build the array with self-incrementing (and therefore unique) keys. But maybe this will not work since moodle widely uses other key fields than "id" in queries. This will be a second solution: Alternatively we can introduce an new (enhanced) function similar to get_records_sql() (maybe named "get_records_sql_by_field()") with such an extra argument as described in solution1. This provides real compatibility with the current code. Of course, I can write such a function on my own patching all the affected moodle systems. But in my opinion it will be better to fix this in a central way in dmllib.php as i am not the only administrator / extension developer affected by this problem (as you can read in MDL-7840). For users the advantage with this will be that they do not have to patch their moodle systems by hand instead of getting the fix as a "normal moodle upgrade" So please, tell me what you think of my ideas, this is a serious problem for me (on of my moodle extensions is seriously affected) and i have to fix this in ANY way in the next days ...
Hide
Eloy Lafuente (stronk7) added a comment -

Hi Getr, some thoughts...

Right now I'm 95% sure that nothing is going to be modified at hat level not for 1.8 nor 1.9beta (coming in hours). It's really a bad timeframe to modify nothing like that.

Those are the bad news...

But also, I'm 75% confident that next release of Moodle (2.0 and above) will have that limitation out. It will require to change all the code that currently is using the keys of the associative array, creating new funcions and so on but I'm confident it will be improved for that release.

Finally, while agreeing that it can (and will) be improved... can you point to the problem you are finding in your development? Perhaps knowing a bit more about it, we could propose alternate DB calls to execute it properly (without implementing custom functions in your site).

Ciao

Show
Eloy Lafuente (stronk7) added a comment - Hi Getr, some thoughts... Right now I'm 95% sure that nothing is going to be modified at hat level not for 1.8 nor 1.9beta (coming in hours). It's really a bad timeframe to modify nothing like that. Those are the bad news... But also, I'm 75% confident that next release of Moodle (2.0 and above) will have that limitation out. It will require to change all the code that currently is using the keys of the associative array, creating new funcions and so on but I'm confident it will be improved for that release. Finally, while agreeing that it can (and will) be improved... can you point to the problem you are finding in your development? Perhaps knowing a bit more about it, we could propose alternate DB calls to execute it properly (without implementing custom functions in your site). Ciao

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated:
    Resolved: