Details

    • Type: Sub-task Sub-task
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.5
    • Component/s: Groups, Performance
    • Labels:
    • Testing Instructions:
      Hide

      This patch has not made any changes to user interface or functionality. It is a performance improvement.
      As such we can really only test that things behave exactly as the did before.

      1. Run unit tests
      2. Create a new course and give it a group mode.
      3. Create a couple of groupings and several groups (assign students at the same time).
      4. Map the groups to groupings.
      5. Browse around the course as a teacher and make sure group things still work as expected.
      6. Browse around as a student in a group and make sure things are fine for them as well.
      Show
      This patch has not made any changes to user interface or functionality. It is a performance improvement. As such we can really only test that things behave exactly as the did before. Run unit tests Create a new course and give it a group mode. Create a couple of groupings and several groups (assign students at the same time). Map the groups to groupings. Browse around the course as a teacher and make sure group things still work as expected. Browse around as a student in a group and make sure things are fine for them as well.
    • Affected Branches:
      MOODLE_23_STABLE
    • Fixed Branches:
      MOODLE_25_STABLE
    • Pull Master Branch:
      wip-MDL-34398-m25-integration
    • Rank:
      42793

      Description

      Describe what $GROUPLIB_CACHE is and how it works.

        Issue Links

          Activity

          Hide
          Sam Hemelryk added a comment -

          OK I believe I have this ready for peer-review now.

          I've implemented caching with the group API for group data based around course.
          It removes the use of $GROUPLIB_CACHE (which was pretty trivial) and involves caching groups, groupings and mappings between the two.
          The end effect isn't huge for students, 1-2 queries per page when in a course with groups enabled and in used.
          For teachers it is a little more I believe.

          I am yet to run proper performance testing on this to measure its real impact. In order to do that I will need to set up a new test site with content (can take ages).

          I'll get a review in the mean time please.

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - OK I believe I have this ready for peer-review now. I've implemented caching with the group API for group data based around course. It removes the use of $GROUPLIB_CACHE (which was pretty trivial) and involves caching groups, groupings and mappings between the two. The end effect isn't huge for students, 1-2 queries per page when in a course with groups enabled and in used. For teachers it is a little more I believe. I am yet to run proper performance testing on this to measure its real impact. In order to do that I will need to set up a new test site with content (can take ages). I'll get a review in the mean time please. Many thanks Sam
          Hide
          Martin Dougiamas added a comment -

          Damyon could you peer review this when you have a chance?

          Show
          Martin Dougiamas added a comment - Damyon could you peer review this when you have a chance?
          Hide
          Damyon Wiese added a comment -

          No problem, I'll take a look now.

          Show
          Damyon Wiese added a comment - No problem, I'll take a look now.
          Hide
          Damyon Wiese added a comment -

          Hi Sam,

          This is code looks great.

          The main thing I spotted is that there are 3 calls to:

          cache_helper::invalidate_by_definition('core', 'coursegroupings', array(), array($courseid));
          

          Which should be:

          cache_helper::invalidate_by_definition('core', 'coursegroupdata', array(), array($course->id));
          

          Other than that, there are a couple of comments missing full stops.

          I was wondering if it would be worth populating some mapping arrays when putting data into the cache to speed up the calls to:

          groups_get_group_by_name, groups_get_group_by_idnumber, groups_get_grouping_by_name, groups_get_grouping_by_idnumber
          

          But I searched the code and these functions get used rarely, so it's probably better not to change this.

          The other thing I found is that in:
          backup/moodle2/restore_stepslib.php and
          enrol/imsenterprise/lib.php

          There is some code that manually inserts records in the groups / groupings tables and doesn't invalidate the cache.

          Also groups_update_group in group/lib.php doesn't invalidate the cache.

          And finally - when also triggering an event, e.g. in groups_create_group (group/lib.php) - I think it would be best to invalidate the cache before triggering the event. Otherwise, the event trigger may trigger a handler that might use a group function which would return stale data.

          Show
          Damyon Wiese added a comment - Hi Sam, This is code looks great. The main thing I spotted is that there are 3 calls to: cache_helper::invalidate_by_definition('core', 'coursegroupings', array(), array($courseid)); Which should be: cache_helper::invalidate_by_definition('core', 'coursegroupdata', array(), array($course->id)); Other than that, there are a couple of comments missing full stops. I was wondering if it would be worth populating some mapping arrays when putting data into the cache to speed up the calls to: groups_get_group_by_name, groups_get_group_by_idnumber, groups_get_grouping_by_name, groups_get_grouping_by_idnumber But I searched the code and these functions get used rarely, so it's probably better not to change this. The other thing I found is that in: backup/moodle2/restore_stepslib.php and enrol/imsenterprise/lib.php There is some code that manually inserts records in the groups / groupings tables and doesn't invalidate the cache. Also groups_update_group in group/lib.php doesn't invalidate the cache. And finally - when also triggering an event, e.g. in groups_create_group (group/lib.php) - I think it would be best to invalidate the cache before triggering the event. Otherwise, the event trigger may trigger a handler that might use a group function which would return stale data.
          Hide
          Damyon Wiese added a comment -

          One other thing that might be an issue is that the performance of import.php will be bad.

          The code in group/import.php loops through the lines in the uploaded csv and for each one will call:
          groups_get_group_by_name (which will fill the cache)
          and then
          groups_create_group (which will empty the cache)

          • Damyon
          Show
          Damyon Wiese added a comment - One other thing that might be an issue is that the performance of import.php will be bad. The code in group/import.php loops through the lines in the uploaded csv and for each one will call: groups_get_group_by_name (which will fill the cache) and then groups_create_group (which will empty the cache) Damyon
          Hide
          Sam Hemelryk added a comment -

          Hi Damyon,

          Thanks for having a look, just going over the points you've raised.
          BTW it helps to use numbers to identify things, makes it easier to reply

          1. cache_helper::invalidate_by_definition('core', 'coursegroupings', array(), array($courseid));
          I've updated coursegroupings to coursegroupdata now. Originally I had written the cache for coursegroupings only but after looking more closely at the groups API decided we could combine group information for a course and speed things up more.

          2. mapping arrays.
          I tossed this up myself when writing the code but settled on the idea that doing so would provide only a very small benefit compared to the complexity of the data it would introduce.
          Essentially I imagine it would be unusual to find courses with large numbers of groups (>1000) in reality most users will only have a small number of groups (<100). The benefit of having a mapping would add only minimally to the memory used and would speed things up, but I don't imagine by much as iterating over small arrays isn't overly costly.
          In order to facilitate the mapping we would need to map both idnumber and name back to groups and groupings.
          Four more arrays to manage in the group data structure is probably overkill compared to the performance benefit and as you noted they arn't used a very often.

          3. event triggering.
          Very good point, I've updated all invalidation calls to make sure they happen before event invalidation or any further group API integration.

          4. import.php performance.
          The performance there will unfortunately get worse in that we won't benefit from the cache, reading and clearing for each line as you note.
          Having looked at it there isn't really a great solution, while a course is specificied for the import it appears that groups can be imported into any course through it. Best we create an issue to look at that script and its performance I think, deal with it separately after the integration of this.

          I've pushed an updated branch now

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi Damyon, Thanks for having a look, just going over the points you've raised. BTW it helps to use numbers to identify things, makes it easier to reply 1. cache_helper::invalidate_by_definition('core', 'coursegroupings', array(), array($courseid)); I've updated coursegroupings to coursegroupdata now. Originally I had written the cache for coursegroupings only but after looking more closely at the groups API decided we could combine group information for a course and speed things up more. 2. mapping arrays. I tossed this up myself when writing the code but settled on the idea that doing so would provide only a very small benefit compared to the complexity of the data it would introduce. Essentially I imagine it would be unusual to find courses with large numbers of groups (>1000) in reality most users will only have a small number of groups (<100). The benefit of having a mapping would add only minimally to the memory used and would speed things up, but I don't imagine by much as iterating over small arrays isn't overly costly. In order to facilitate the mapping we would need to map both idnumber and name back to groups and groupings. Four more arrays to manage in the group data structure is probably overkill compared to the performance benefit and as you noted they arn't used a very often. 3. event triggering. Very good point, I've updated all invalidation calls to make sure they happen before event invalidation or any further group API integration. 4. import.php performance. The performance there will unfortunately get worse in that we won't benefit from the cache, reading and clearing for each line as you note. Having looked at it there isn't really a great solution, while a course is specificied for the import it appears that groups can be imported into any course through it. Best we create an issue to look at that script and its performance I think, deal with it separately after the integration of this. I've pushed an updated branch now Many thanks Sam
          Hide
          Damyon Wiese added a comment -

          Hi Sam,

          I can't see the updated branch - can you double check it got pushed?

          Thanks, Damyon

          Show
          Damyon Wiese added a comment - Hi Sam, I can't see the updated branch - can you double check it got pushed? Thanks, Damyon
          Hide
          Damyon Wiese added a comment -

          I created a linked issue to check the performance of group/import.php.

          All other issues are now fixed - I'll push this for integration.

          Thanks Sam

          Show
          Damyon Wiese added a comment - I created a linked issue to check the performance of group/import.php. All other issues are now fixed - I'll push this for integration. Thanks Sam
          Hide
          Eloy Lafuente (stronk7) added a comment -

          The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week.

          TIA and ciao

          Show
          Eloy Lafuente (stronk7) added a comment - The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week. TIA and ciao
          Hide
          Dan Poltawski added a comment -

          Hi Sam,

          This looks nice, but I have one trivial point which blocks me at the moment.. I don't like the name 'coursegroupdata' for the definition. I think it would be better to use 'group', as we have defined in the Core APIs (and use for @category etc): http://docs.moodle.org/dev/Core_APIs#Groups_API_.28group.29

          group always refers to course group IMO so no need to mix with course.

          What do you think?

          [Have we thought of any conventions for the names of the names of caches - now might be a good time ;-)]

          Show
          Dan Poltawski added a comment - Hi Sam, This looks nice, but I have one trivial point which blocks me at the moment.. I don't like the name 'coursegroupdata' for the definition. I think it would be better to use 'group', as we have defined in the Core APIs (and use for @category etc): http://docs.moodle.org/dev/Core_APIs#Groups_API_.28group.29 group always refers to course group IMO so no need to mix with course. What do you think? [Have we thought of any conventions for the names of the names of caches - now might be a good time ;-)]
          Hide
          Sam Hemelryk added a comment -

          Hi Dan,,

          There aren't any conventions for naming caches at the moment.
          I think as a general rule at present they should be descriptive but not wasteful.

          In regards to this cache coursegroupdata it is indeed wasteful now that I look at it.
          Prefixing it by course is wasteful. Groups can only be created for courses presently so I think it can safetly be assumed that the group relates to courses reducing it to groupdata.
          Now as for data at the end.
          My thoughts in using data was to relay to users that this cache is a collection of information relating to the groups in a course.
          Perhaps the "data" is not necessary, but presently I'm a little hesitant in agreeing to calling it just "group" because perhaps in the future we will have another group component cache. Some means of caching group members for instance is an example.
          In thinking a little bit more about it I think perhaps it is a bad idea to encourage people to use the component as the name of any cache for that reason. I think we encourage devs to come up with something more descriptive.

          Having read that what are your thoughts?
          I'd love to see this get in so ping me when you're online and we'll make a decision so that I can make the change and you can continue it on through the integration process.

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi Dan,, There aren't any conventions for naming caches at the moment. I think as a general rule at present they should be descriptive but not wasteful. In regards to this cache coursegroupdata it is indeed wasteful now that I look at it. Prefixing it by course is wasteful. Groups can only be created for courses presently so I think it can safetly be assumed that the group relates to courses reducing it to groupdata. Now as for data at the end. My thoughts in using data was to relay to users that this cache is a collection of information relating to the groups in a course. Perhaps the "data" is not necessary, but presently I'm a little hesitant in agreeing to calling it just "group" because perhaps in the future we will have another group component cache. Some means of caching group members for instance is an example. In thinking a little bit more about it I think perhaps it is a bad idea to encourage people to use the component as the name of any cache for that reason. I think we encourage devs to come up with something more descriptive. Having read that what are your thoughts? I'd love to see this get in so ping me when you're online and we'll make a decision so that I can make the change and you can continue it on through the integration process. Many thanks Sam
          Hide
          Dan Poltawski added a comment -

          +1 to groupdata

          Show
          Dan Poltawski added a comment - +1 to groupdata
          Hide
          Sam Hemelryk added a comment -

          Thanks Dan, I've updated the branch now and its all ready to go.

          Show
          Sam Hemelryk added a comment - Thanks Dan, I've updated the branch now and its all ready to go.
          Hide
          Dan Poltawski added a comment -

          Unfortunately there is a conflict with groups_assign_grouping() MDL-30797, I don't like my chances of fixing that now i'm reopening this.

          Sam, if you are able to fix it and we've still got testing outstanding tomorrow (highly likely) i'll be happy to integrate it and test it tomorrow. But whatever works for you.

          Show
          Dan Poltawski added a comment - Unfortunately there is a conflict with groups_assign_grouping() MDL-30797 , I don't like my chances of fixing that now i'm reopening this. Sam, if you are able to fix it and we've still got testing outstanding tomorrow (highly likely) i'll be happy to integrate it and test it tomorrow. But whatever works for you.
          Hide
          CiBoT added a comment -

          Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.

          Show
          CiBoT added a comment - Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.
          Hide
          Sam Hemelryk added a comment -

          No probs at all Dan, I've created a branch based upon integration master presently to resolve the conflicts.
          It's up for integration again now, entirely up to you if you pull it in now or leave it till next week.

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - No probs at all Dan, I've created a branch based upon integration master presently to resolve the conflicts. It's up for integration again now, entirely up to you if you pull it in now or leave it till next week. Many thanks Sam
          Hide
          Dan Poltawski added a comment -

          Integrated, thanks Sam.

          Tested during integration, but i'm going to continue to try and do some more testing on it.

          Show
          Dan Poltawski added a comment - Integrated, thanks Sam. Tested during integration, but i'm going to continue to try and do some more testing on it.
          Hide
          Dan Poltawski added a comment -

          Passed. I would not suggest backporting this though until we've run some more functional testing on it, as I have been multitasking and could spend more time on it.

          Show
          Dan Poltawski added a comment - Passed. I would not suggest backporting this though until we've run some more functional testing on it, as I have been multitasking and could spend more time on it.
          Hide
          Dan Poltawski added a comment -

          Hurray! We did it! Thanks to all the reporters, testers, user and watchers for a bumper week of Moodling!

          Show
          Dan Poltawski added a comment - Hurray! We did it! Thanks to all the reporters, testers, user and watchers for a bumper week of Moodling!
          Hide
          David Mudrak added a comment -

          FYI there was a trivial typo in this cache definition - see MDL-37673.

          Show
          David Mudrak added a comment - FYI there was a trivial typo in this cache definition - see MDL-37673 .

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: