I am assuming the final patch (third attached file, gradebook.patch) is the official fix for this problem and I think I understand the goal of that fragment, however, I'd like to suggest what I think to be a simpler single query, rather than using two separate queries to the db:
Assume that the $gradebookroles variable contains an appropriate SQL constraint for either a single or multiple key values for the roleid.
select distinct ra.userid
from mdl_role_assignments ra
inner join mdl_context ctx on ra.contextid = ctx.id
left join mdl_grade_grades gg on gg.itemid = $this->id and gg.userid = ra.userid
where ra.roleid $gradebookroles
and ctx.instanceid = $this->courseid
and ctx.contextlevel = CONTEXT_COURSE
and gg.id is null
The first possible difference from the patch's query is that this one does not consider any "parent" contexts as pulled in from the get_related_contexts_string. But, from what I can tell, a course's parentage can only consist of categories on up to the site context (/1). I'm fairly ignorant of moodle still, but I'm reasoning that I don't need to consider a course's parent contexts when determining which student's grades are not present for one particular grade_item.
The other difference is that the population of user id values is coming from a set that has more of an expectation of being complete--enrollments, whereas the patch's approach is to look for user id values in the mdl_grade_grades table corresponding to any possible grade_items other than the ONE in which we're interested. If there happen to be no grades for a user with a different grade_item, then that user's id won't be in the starting population against which we filter using the left join, and so won't be in the result.
I can confirm this issue. On a test site with ~10,000 users. I created a course and added 3 faculty and 2 students. I added 6 assignments, 3 categories and 3 graded items. I entered in three grades total. I entered a calculation in each of the three grade items. The mdl_grade_grades table grew from 1152 records to 2,450 and mdl_grade_grades_history grew from 10,114 records to 12,715. There was no other activity on the test site while I was doing this.
On our production site we have
mdl_grade_grades with 536,018 records and
mdl_grade_grades_history with 2,184,929 records
These large tables are causing our production server to have one type of query go over 5 seconds:
Count: 254 Time=27.43s (6967s) Lock=1.27s (323s) Rows=66.3 (16840),
SELECT DISTINCT go.userid
FROM mdl_grade_grades go
JOIN mdl_grade_items gi
ON gi.id = go.itemid
LEFT OUTER JOIN mdl_grade_grades g
ON (g.userid = go.userid AND g.itemid = N)
WHERE gi.id <> N AND g.id IS NULL