Moodle
  1. Moodle
  2. MDL-25290

MUC Stage 1: Implement some core caching architecture (MUC)

    Details

    • Rank:
      6219

      Description

      A single new caching system that can be applied whenever a developer writes code that could benefit from a cache. This consistency will give the admin more flexibility to configure Moodle to cope with very large loads by saving CPU and I/O.

      The caching system allows the admin to choose from a variety of different backends, ranging from simple files to memcached and everything else.

        Issue Links

          Activity

          Hide
          Eloy Lafuente (stronk7) added a comment -

          I'm changing this issue to be the place where discuss about any general architecture for supporting caching at different levels (static/session/db/memcached...).

          About the original title of this (slow count problems) it seems that MDL-25637 has already some analysis and work done, please follow it there.

          Show
          Eloy Lafuente (stronk7) added a comment - I'm changing this issue to be the place where discuss about any general architecture for supporting caching at different levels (static/session/db/memcached...). About the original title of this (slow count problems) it seems that MDL-25637 has already some analysis and work done, please follow it there.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Some basic ideas:

          • it should support different backends (static, session, DB, shared memory...).
          • any attempt of caching within Moodle should end using it (and not other solutions).
          • code should continue working 100% without caching enabled.
          • each use should define one fallback of backend to use so, for example, for caching CFG we could use (shared memory, DB) and the 1st one available is picked.
          • each cached element must be hash-based (hash selection will be critical to determine if it's globsl/per user/whatever).
          • each cached element must have one TTL

          Ciao

          PS: Surely Sam's implementation for accesslib (static caching) can be one good start point to abstract from and so on.

          Show
          Eloy Lafuente (stronk7) added a comment - Some basic ideas: it should support different backends (static, session, DB, shared memory...). any attempt of caching within Moodle should end using it (and not other solutions). code should continue working 100% without caching enabled. each use should define one fallback of backend to use so, for example, for caching CFG we could use (shared memory, DB) and the 1st one available is picked. each cached element must be hash-based (hash selection will be critical to determine if it's globsl/per user/whatever). each cached element must have one TTL Ciao PS: Surely Sam's implementation for accesslib (static caching) can be one good start point to abstract from and so on.
          Hide
          Tony Levi added a comment -

          Hi,

          Got some ideas and even time to implement this so I want to open the floor for discussion, is a new forum thread best?

          I've setup this page on docs also; looking for some commentary and ideas about what the best design is.
          http://docs.moodle.org/dev/Caching_system_%28proposed%29
          This doesn't quite explain fully what I have at the moment and there is more to fill out here; hope to get that added over time.

          There is some code where I have been playing with ideas but I don't want to just dump something, it has to be an acceptable design first...

          Cheers

          Show
          Tony Levi added a comment - Hi, Got some ideas and even time to implement this so I want to open the floor for discussion, is a new forum thread best? I've setup this page on docs also; looking for some commentary and ideas about what the best design is. http://docs.moodle.org/dev/Caching_system_%28proposed%29 This doesn't quite explain fully what I have at the moment and there is more to fill out here; hope to get that added over time. There is some code where I have been playing with ideas but I don't want to just dump something, it has to be an acceptable design first... Cheers
          Hide
          Tony Levi added a comment -

          Added some DB info to the docs page, probably needs a nicer overview to explain for people not in my head

          Show
          Tony Levi added a comment - Added some DB info to the docs page, probably needs a nicer overview to explain for people not in my head
          Hide
          Tony Levi added a comment -

          Looking for some discussion on this: http://moodle.org/mod/forum/discuss.php?d=189498

          Show
          Tony Levi added a comment - Looking for some discussion on this: http://moodle.org/mod/forum/discuss.php?d=189498
          Hide
          Martin Dougiamas added a comment -

          Let's get this moving again. We need something in place ASAP so that others can build on it.

          Show
          Martin Dougiamas added a comment - Let's get this moving again. We need something in place ASAP so that others can build on it.
          Hide
          Martin Dougiamas added a comment - - edited

          Thanks very much Tony! I've not looked at your code yet at all but it's probably a good starting point!

          Sam H, do you have some dev time to have a look at Tony's code and give some feedback? Plus the other linked issues.

          If you see a better way, can you come up with a quick initial text spec as to how you think this should work? Don't spend too much time on it, I think it might just be something to get everybody else thinking.

          Show
          Martin Dougiamas added a comment - - edited Thanks very much Tony! I've not looked at your code yet at all but it's probably a good starting point! Sam H, do you have some dev time to have a look at Tony's code and give some feedback? Plus the other linked issues. If you see a better way, can you come up with a quick initial text spec as to how you think this should work? Don't spend too much time on it, I think it might just be something to get everybody else thinking.
          Hide
          Mark Drechsler added a comment -

          Thanks Martin!
          As our unis ramped up in the last couple of weeks the topic of caching to support large Moodle deployments came up repeatedly, and I've just had Tony in here doing the Happy Dance in the knowledge that Sam is at least going to look at this issue.
          Please let Tony know if there's anything we can do to support the process - he's been doing plenty of analysis on areas where caching is desperately needed for large implementations.
          Cheers,
          Mark.

          Show
          Mark Drechsler added a comment - Thanks Martin! As our unis ramped up in the last couple of weeks the topic of caching to support large Moodle deployments came up repeatedly, and I've just had Tony in here doing the Happy Dance in the knowledge that Sam is at least going to look at this issue. Please let Tony know if there's anything we can do to support the process - he's been doing plenty of analysis on areas where caching is desperately needed for large implementations. Cheers, Mark.
          Hide
          Tony Levi added a comment -

          Adding a direct link here and rebased to latest 22_STABLE again.

          https://github.com/tlevi/moodle/commits/wip-mdl25290

          Show
          Tony Levi added a comment - Adding a direct link here and rebased to latest 22_STABLE again. https://github.com/tlevi/moodle/commits/wip-mdl25290
          Hide
          akfoote added a comment -

          I've been watching this issue since its inception - over a year now..

          Is there any word that moodle.org can give indicating a caching layer will be forthcoming for a/any moodle 2.x version.

          Show
          akfoote added a comment - I've been watching this issue since its inception - over a year now.. Is there any word that moodle.org can give indicating a caching layer will be forthcoming for a/any moodle 2.x version.
          Hide
          Sam Hemelryk added a comment -

          Hi akfoote,

          This issue is going to be given time very shortly (it is currently 3rd on my list to look at).
          We are very keen to get a proper caching layer into Moodle as soon as we can.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Hi akfoote, This issue is going to be given time very shortly (it is currently 3rd on my list to look at). We are very keen to get a proper caching layer into Moodle as soon as we can. Cheers Sam
          Hide
          akfoote added a comment -

          Thanks for the ping back Sam..

          Good to hear. I know this is a key factor in some moving to 2.x (myself included). I also understand that it needs to be solid.

          Show
          akfoote added a comment - Thanks for the ping back Sam.. Good to hear. I know this is a key factor in some moving to 2.x (myself included). I also understand that it needs to be solid.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Some targets, to avoid forgetting them:

          TG1 - shared - lang strings
          TG2 - shared - db metainformation
          TG3 - shared - config

          keep adding...

          Show
          Eloy Lafuente (stronk7) added a comment - Some targets, to avoid forgetting them: TG1 - shared - lang strings TG2 - shared - db metainformation TG3 - shared - config keep adding...
          Hide
          Sam Hemelryk added a comment -

          OK I'm writing up a proposal based upon my findings and research now... watch this space

          Show
          Sam Hemelryk added a comment - OK I'm writing up a proposal based upon my findings and research now... watch this space
          Hide
          Tony Levi added a comment -

          Another target - dirty contexts. cache_flags seems to be a big chunk of our DB activity.

          Show
          Tony Levi added a comment - Another target - dirty contexts. cache_flags seems to be a big chunk of our DB activity.
          Hide
          Martin Dougiamas added a comment -

          Converting to a full bug so DEV can build subtasks

          Show
          Martin Dougiamas added a comment - Converting to a full bug so DEV can build subtasks
          Hide
          Martin Dougiamas added a comment -

          Linking to Sam's earlier proposal

          Show
          Martin Dougiamas added a comment - Linking to Sam's earlier proposal
          Hide
          Sam Marshall added a comment -

          When implementing caching, please consider multi-server architectures.

          The proposal doesn't seem to mention that and it might be significant. For example, do you need to use a shared cache, and if so, will this still result in good performance?

          Or, is there a need to indicate certain pieces of cache data which must be shared and some which can be tracked entirely locally because they're never/rarely invalidated?

          I also think (this contradicts the above to some extent) that in general it is worth trying to favour an 'infinite cache, with invalidate' approach, ideally for all items. For example, you could cache config table for 1 minute and that would probably be fine in general but it could cause odd, brief problems for users when switching settings if you switch two settings that depend on each other and the switch happens to occur across the 1 minute deadline. It'll be fixed in another minute, but that's still bit sucky compared to cacheing the table infinitely and clearing it whenever somebody changes it.

          Show
          Sam Marshall added a comment - When implementing caching, please consider multi-server architectures. The proposal doesn't seem to mention that and it might be significant. For example, do you need to use a shared cache, and if so, will this still result in good performance? Or, is there a need to indicate certain pieces of cache data which must be shared and some which can be tracked entirely locally because they're never/rarely invalidated? I also think (this contradicts the above to some extent) that in general it is worth trying to favour an 'infinite cache, with invalidate' approach, ideally for all items. For example, you could cache config table for 1 minute and that would probably be fine in general but it could cause odd, brief problems for users when switching settings if you switch two settings that depend on each other and the switch happens to occur across the 1 minute deadline. It'll be fixed in another minute, but that's still bit sucky compared to cacheing the table infinitely and clearing it whenever somebody changes it.
          Hide
          Tony Levi added a comment -

          So far I handled that here by only using 'coherent' (consistent view to all concurrent requests) plugins where needed e.g. config can live in memcache but given the option of several frontend servers cannot live in xcache/apc - langcache is stored in these places though, with a mechanism to make each frontend obey purge all caches.

          I can't see a way to do this with the proposed solution as-is, but if the plugins published their coherency capability it could be an extra option when creating a new cache. It might also be handy to be able to override by config - advanced users know how many frontends they have and can specify that xcache/apc/whatever will be coherent. Or, much simpler, it has to be just required that any configured plugin instance must give this guarantee - so far I have only a couple of cases that can work without it and they would be happy enough to use memcache instead of xcache.

          TTLs are needed for the non-coherent cases. If this is not allowed, then TTLs are only useful in that they make invalidate bugs less of a disaster though also makes finding / debugging them way harder...

          Thoughts?

          Show
          Tony Levi added a comment - So far I handled that here by only using 'coherent' (consistent view to all concurrent requests) plugins where needed e.g. config can live in memcache but given the option of several frontend servers cannot live in xcache/apc - langcache is stored in these places though, with a mechanism to make each frontend obey purge all caches. I can't see a way to do this with the proposed solution as-is, but if the plugins published their coherency capability it could be an extra option when creating a new cache. It might also be handy to be able to override by config - advanced users know how many frontends they have and can specify that xcache/apc/whatever will be coherent. Or, much simpler, it has to be just required that any configured plugin instance must give this guarantee - so far I have only a couple of cases that can work without it and they would be happy enough to use memcache instead of xcache. TTLs are needed for the non-coherent cases. If this is not allowed, then TTLs are only useful in that they make invalidate bugs less of a disaster though also makes finding / debugging them way harder... Thoughts?
          Hide
          Martin Dougiamas added a comment -

          Thanks for getting involved, guys. We really want something good here so your help is really valuable.

          One point to make is that any one solution is unlikely to solve all caching needs, we might need two or more patterns.

          I'm creating sub tasks for known caching hotspots in Moodle so we can analyse each one and build up the requirements.

          Show
          Martin Dougiamas added a comment - Thanks for getting involved, guys. We really want something good here so your help is really valuable. One point to make is that any one solution is unlikely to solve all caching needs, we might need two or more patterns. I'm creating sub tasks for known caching hotspots in Moodle so we can analyse each one and build up the requirements.
          Hide
          Sam Hemelryk added a comment -

          Hi guys,

          I've made some serious headway into this now and the code is pretty much ready for review.
          I've asked Eloy + Petr to take a look at it and I invite anyone else who is keen to have a look at it as well.
          You can find the code for this at:

          https://github.com/samhemelryk/moodle/tree/wip-MDL-25290-m24-compact
          git fetch git://github.com/samhemelryk/moodle.git wip-MDL-25290-m24-compact

          That is just the MUC code without any conversions. If you want to give things a whirl to see how they fell try using:

          https://github.com/samhemelryk/moodle/tree/wip-MDL-25290-m24-combined
          git fetch git://github.com/samhemelryk/moodle.git wip-MDL-25290-m24-combined

          The combined branch includes 3 implementations:

          1. Conversion of the string cache to use MUC
          2. Introduction of caching for Postgres database meta information
          3. Introduction of caching for config data

          Please post any feedback here on this issue so that we can centralise things.
          I'm still working on the code as we speak however I feel the API has settled down presently and is in a stable enough state to start reviewing the code.

          Have a read of the readme at https://github.com/samhemelryk/moodle/tree/wip-MDL-25290-m24-combined/cache for information about this solution.

          I've also been running some performance testing over this code using JMeter (https://github.com/samhemelryk/moodle-jmeter-perfcomp)

          Test environment:

          • Test machine (dedicated to this task) default settings for everything
            • AMD Quad core, 4GB ram, 80GB 5200RPM HDD.
            • Debian 6.0.5 (no xserver)
            • Apache 2.4.2
            • PHP 5.4.5
            • PHP Memcache 3.0.6
            • PHP Memcached 2.1.0 (1.0.10)
            • PGSQL 9.1.4
            • Memcached 1.4.14
            • libevent 2.0.19
          • Development machine to run the JMeter tests

          Test scenario (by page visit, 15 pages total)

          1. Login page (student logs in)
          2. View course (World of water course http://school.demo.moodle.net/course/view.php?id=115)
          3. View forum
          4. View discussion
          5. View Course calendar
          6. View course again
          7. View forum again
          8. Add discussion
          9. Add discussion complete (confirmation screen)
          10. View forum
          11. View discussion just added
          12. Delete discussion
          13. Confirm delete discussion
          14. Log out
          15. View home page (no longer logged in)

          Test conditions

          • 50 Users at a time
          • 180 second ramp up (on average 1 student starts the scenario every 3.6 seconds)
          • Repeated 20 times (50 users + 15 pages repeated 20 times)
          • Maximum throughput of 220 requests for second.
          • Resulting total requests per page in the scenario = 1000
          • Resulting total requests in the scenario = 15000

          The results are as follows:

          Comparison of Moodle without MUC and then Moodle with MUC there but not enabled. (gives an idea of code overhead)
          http://moodev.com/?w=300&h=150&before=03.master&after=03.master.muc

          Comparison of Moodle without MUC and then Moodle with MUC and the 3 conversions. Using the default stores.
          http://moodev.com/?w=300&h=150&before=03.master&after=03.master.muc.conversions

          Comparison of Moodle without MUC and then Moodle with MUC and the 3 conversions. Using Memcached this time.
          http://moodev.com/?w=300&h=150&before=03.master&after=03.master.muc.conversions.memcached

          (noting as of the time of writing this that site is getting a memory exhausted error, I've rectified it but it takes some time to kick in. Hopefully this will work by the time others read it)

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi guys, I've made some serious headway into this now and the code is pretty much ready for review. I've asked Eloy + Petr to take a look at it and I invite anyone else who is keen to have a look at it as well. You can find the code for this at: https://github.com/samhemelryk/moodle/tree/wip-MDL-25290-m24-compact git fetch git://github.com/samhemelryk/moodle.git wip- MDL-25290 -m24-compact That is just the MUC code without any conversions. If you want to give things a whirl to see how they fell try using: https://github.com/samhemelryk/moodle/tree/wip-MDL-25290-m24-combined git fetch git://github.com/samhemelryk/moodle.git wip- MDL-25290 -m24-combined The combined branch includes 3 implementations: Conversion of the string cache to use MUC Introduction of caching for Postgres database meta information Introduction of caching for config data Please post any feedback here on this issue so that we can centralise things. I'm still working on the code as we speak however I feel the API has settled down presently and is in a stable enough state to start reviewing the code. Have a read of the readme at https://github.com/samhemelryk/moodle/tree/wip-MDL-25290-m24-combined/cache for information about this solution. I've also been running some performance testing over this code using JMeter ( https://github.com/samhemelryk/moodle-jmeter-perfcomp ) Test environment: Test machine (dedicated to this task) default settings for everything AMD Quad core, 4GB ram, 80GB 5200RPM HDD. Debian 6.0.5 (no xserver) Apache 2.4.2 PHP 5.4.5 PHP Memcache 3.0.6 PHP Memcached 2.1.0 (1.0.10) PGSQL 9.1.4 Memcached 1.4.14 libevent 2.0.19 Development machine to run the JMeter tests Test scenario (by page visit, 15 pages total) Login page (student logs in) View course (World of water course http://school.demo.moodle.net/course/view.php?id=115 ) View forum View discussion View Course calendar View course again View forum again Add discussion Add discussion complete (confirmation screen) View forum View discussion just added Delete discussion Confirm delete discussion Log out View home page (no longer logged in) Test conditions 50 Users at a time 180 second ramp up (on average 1 student starts the scenario every 3.6 seconds) Repeated 20 times (50 users + 15 pages repeated 20 times) Maximum throughput of 220 requests for second. Resulting total requests per page in the scenario = 1000 Resulting total requests in the scenario = 15000 The results are as follows: Comparison of Moodle without MUC and then Moodle with MUC there but not enabled. (gives an idea of code overhead) http://moodev.com/?w=300&h=150&before=03.master&after=03.master.muc Comparison of Moodle without MUC and then Moodle with MUC and the 3 conversions. Using the default stores. http://moodev.com/?w=300&h=150&before=03.master&after=03.master.muc.conversions Comparison of Moodle without MUC and then Moodle with MUC and the 3 conversions. Using Memcached this time. http://moodev.com/?w=300&h=150&before=03.master&after=03.master.muc.conversions.memcached (noting as of the time of writing this that site is getting a memory exhausted error, I've rectified it but it takes some time to kick in. Hopefully this will work by the time others read it) Many thanks Sam
          Hide
          Tim Hunt added a comment -

          I note in the readme you say "The file structure is still very separated. Before this gets integrated I will optimise the files, combining most things into lib.php" and that you have already done that. Argh! a 3000+ line lib.php is not in any sense optimised. It is a bloody nightmare. We should be heading towards one file per class, not the other way around!

          Show
          Tim Hunt added a comment - I note in the readme you say "The file structure is still very separated. Before this gets integrated I will optimise the files, combining most things into lib.php" and that you have already done that. Argh! a 3000+ line lib.php is not in any sense optimised. It is a bloody nightmare. We should be heading towards one file per class, not the other way around!
          Hide
          Tim Hunt added a comment -

          Overall, the is looking fantastic. However, can criticise anything, so here goes

          1. db/cachedefinitions.php -> db/caches.php or db/caching.php. After all we do not use long names like db/eventdefinitions.php.

          2. The readme would be easier to understand if the code examples came earlier in the file. Or at least if you mentioned early on that they were at the end.

          3. I assume cache names only have to be unique within a component. No, I am confused. Surely (component, area) is a unique identifier for the cache, and name is redundant and should be removed. (e.g. file areas don't have names.)

          4. I don't see why I, as a user of the API, should be forced to know about two classes cache_configuration_reader and cache_configuration_writer. I think the external API should just use a single cache_config class. I don't care what that does behind the scenes.

          5. Re TODOs. One example of why you need separate component/area is when the admin uninstalls a plugin, we should automatically clean up any caches that component owns.

          6. define('CACHE_NOW', time()); seems like a bad idea to me. It is not amenable to unit testing. cache::$now is probably better.

          7. If ::has is such a bad idea (as the phpdoc comment says) why support it at all? I would keep this new code clean.

          8. I have been told in the past by one integrator not to use the /**#@+ PHPdoc syntax.

          9. I wonder if cache_feature_lockable is a more grammatical name. Similarly cache_feature_key_aware

          10. cache_data_source::*_for_cache method names are pretty ugly. I guess you wanted to make sure there were not any name collisions with the class's other methods, but I still think we could improve things here. load_cachable, load_cachables (not sure those are actually improvements). Does get_instance_for_cache return a cache_data_source? If so, why doesn't the PHPdoc say that, and why not call the method get_cache_data_source

          11. cacheable_object -> cacheable ? Do we really need this class? Why can't we just use __sleep and __wakeup?

          12. Just reading on-screen in the github UI, some of the field comments in the cache class need to be re-line-wrapped. They are too long.

          13. static $persistentcache = array(); inside make() is a terrible idea. We know from past experience that this makes you cry when writing unit tests. Make it a protected static field of the class. (Actually, why are you manually caching these in a static field. Shouldn't you be using MUC to cache them :-p)

          14. In make, the

          $self = get_called_class();
          $cache = $self::from_definition($definition);
          

          bit is just nasty. I am pretty sure it means you have got your design wrong. Is there enough complexity here that you should create a real cache_factory class? Then cache::make can be a thin facade that just gets/creates a factory and calls it to do the hard work. That could also be a more natural home for $persistentcache.

          15. I am losing track (if I thought about it I might be able to work it out myself) but

          if ($loader instanceof cache_loader) {
              $this->loader = $loader;
          } else if ($loader instanceof cache_data_source) {
              $this->datasource = $loader;
          }
          

          also makes me suspicious. Your code is going to end of full of if ($this->loader) else if ($this->datasource) else .... Can't you make these two classes implement the same interface; and also have a null implementation of the same interface, so you can always rely on $this->datasource, even though it might not do anything?

          OK, my brain is getting full now. I will stop. (Note to self, line 850 of lib.php.)

          Show
          Tim Hunt added a comment - Overall, the is looking fantastic. However, can criticise anything, so here goes 1. db/cachedefinitions.php -> db/caches.php or db/caching.php. After all we do not use long names like db/eventdefinitions.php. 2. The readme would be easier to understand if the code examples came earlier in the file. Or at least if you mentioned early on that they were at the end. 3. I assume cache names only have to be unique within a component. No, I am confused. Surely (component, area) is a unique identifier for the cache, and name is redundant and should be removed. (e.g. file areas don't have names.) 4. I don't see why I, as a user of the API, should be forced to know about two classes cache_configuration_reader and cache_configuration_writer. I think the external API should just use a single cache_config class. I don't care what that does behind the scenes. 5. Re TODOs. One example of why you need separate component/area is when the admin uninstalls a plugin, we should automatically clean up any caches that component owns. 6. define('CACHE_NOW', time()); seems like a bad idea to me. It is not amenable to unit testing. cache::$now is probably better. 7. If ::has is such a bad idea (as the phpdoc comment says) why support it at all? I would keep this new code clean. 8. I have been told in the past by one integrator not to use the /**#@+ PHPdoc syntax. 9. I wonder if cache_feature_lockable is a more grammatical name. Similarly cache_feature_key_aware 10. cache_data_source::*_for_cache method names are pretty ugly. I guess you wanted to make sure there were not any name collisions with the class's other methods, but I still think we could improve things here. load_cachable, load_cachables (not sure those are actually improvements). Does get_instance_for_cache return a cache_data_source? If so, why doesn't the PHPdoc say that, and why not call the method get_cache_data_source 11. cacheable_object -> cacheable ? Do we really need this class? Why can't we just use __sleep and __wakeup? 12. Just reading on-screen in the github UI, some of the field comments in the cache class need to be re-line-wrapped. They are too long. 13. static $persistentcache = array(); inside make() is a terrible idea. We know from past experience that this makes you cry when writing unit tests. Make it a protected static field of the class. (Actually, why are you manually caching these in a static field. Shouldn't you be using MUC to cache them :-p) 14. In make, the $self = get_called_class(); $cache = $self::from_definition($definition); bit is just nasty. I am pretty sure it means you have got your design wrong. Is there enough complexity here that you should create a real cache_factory class? Then cache::make can be a thin facade that just gets/creates a factory and calls it to do the hard work. That could also be a more natural home for $persistentcache. 15. I am losing track (if I thought about it I might be able to work it out myself) but if ($loader instanceof cache_loader) { $ this ->loader = $loader; } else if ($loader instanceof cache_data_source) { $ this ->datasource = $loader; } also makes me suspicious. Your code is going to end of full of if ($this->loader) else if ($this->datasource) else .... Can't you make these two classes implement the same interface; and also have a null implementation of the same interface, so you can always rely on $this->datasource, even though it might not do anything? OK, my brain is getting full now. I will stop. (Note to self, line 850 of lib.php.)
          Hide
          Sam Hemelryk added a comment -

          Hi Tim,

          Thanks for having a look at the code, it is most appreciated.
          In reply to the points raised so far (I'll try keep it breif at this point).

          1. Sorry yes that has been done, Martin made the same suggestion about db/caches.php so that is what I have gone for. I had forgotted to update the readme my apologies.
          2. I'll have a look at that today
          3. Certainly the name could be removed, that is a good idea. That would actually solve a couple of other problems and I'm sure it wouldn't be too hard to convert. I'll see what others think and tentativily add it to my todo list.
          4. Interesting point, I separated the two classes because 99.9% of users will never ever need to use cache_configuration_writer. It is only used within the administration pages and I split it locallib.php so that it was not being loaded unnessecarily. I still think the separation works, perhaps I need to look at renaming things. What about cache_configuration and cache_configuration_writer?
          5. Great point, thank you.
          6. Added to my todo list for today.
          7. Heh it is there because I wrote it there, personally I think calling has is a terrible idea, its not reliable in a highly volatile situation and in 99% of cases we can be sure there is a better way to do things. That being said I can imagine the has methods will still be used, and perhaps there is a valid use case out there. It'll be interesting to hear others thoughts on these methods and whether we should allow them to remain or whether we should form a hardline and remove them. The code can still be saved to a patch file and kept in tracker in case a valid use case is found if we do decide to remove them.
          8. Personally I like the /**#@+ syntax. I will argue with the chosen integrator about their existence... we will just have to see whether they land or not. (I don't see harm in them).
          9. Brilliant, much better names, now on my todo list.
          10. I agree, I struggled with the naming there. Indeed the current names were chosen to avoid conflicts, and the instance was chosen to reflect the trend of having static instance methods. (read more on this in point 15)
          11. I spent a bit of time considering this. __sleep and __wakeup are the obvious choices but in the end I decided that they were not correct for this use. I came to the conclussion that by requiring a interface and having dedicate methods we can easily identify cacheable objects and we can allow classes to perform different cleanup+wakeup routines for caching and serialisaion. Also worth noting (perhaps relevant, perhaps not) is that cache stores are responsible handling incoming data, and for most cases I imagine this will be handled internally. Not all objects will be serialised, such is the case with APC for example. I suppose I note this in support of making it clear that preparing an object for a cache is not the same as serialising an object for storage albeit that is what is going to happen in many cases. Again this is something I would be keen to hear others thoughts on. People know __sleep and __wakeup methods and we can still use an interface so perhaps the easier route is to simply use them.
          12. I've a guide set at 132 chars which I've tried to stick to, although I do see there are a dozen or so instances where I have gone slightly over. I'll have a scroll through and see about perhaps wrapping things a little more.
          13. +
          14. Hehe it probably could be using itself for this, although I think I will explore the idea of a factory class first. I definitely agree the current solution there, the use of the static var, is not at all good.
          15. This ties in to 10. really. Originally I had them sharing an interface and that if was not needed, then however I decided it was a bad idea as it meant that classing implementing had to have get and get_many methods dedicated to loading cache information and I felt that in the case of an all in one class this would both be confusing to the end user of the class and at risk of causing name collisions. Now that I think about I am still not sure. Again I'd like to hear what others have to think. Having that shared interface would be both cleaner and easier to deal with within the cache API, it would however perhaps not be so clear for the implementing classes and at risk of causing conflict. Then again will people create all in one classes or will they create dedicated data source classes, perhaps is the question that will decide for us.

          So in summary:
          1. 2. 6. 9. 12. 13/14. on my immediate todo list.
          3. 4. 7. 11. 10/15. Will see if other have some thoughts on these issues and will then decide upon the action to take.

          I look forward to the rest of your review Tim, thanks so far it is a big help. Certainly the more we can turn up now the better, things will get hard once we have implementations for this code.
          I'll aim to make the noted changes today where practical (factory method I imagine will take more thought and process than today will allow).

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi Tim, Thanks for having a look at the code, it is most appreciated. In reply to the points raised so far (I'll try keep it breif at this point). Sorry yes that has been done, Martin made the same suggestion about db/caches.php so that is what I have gone for. I had forgotted to update the readme my apologies. I'll have a look at that today Certainly the name could be removed, that is a good idea. That would actually solve a couple of other problems and I'm sure it wouldn't be too hard to convert. I'll see what others think and tentativily add it to my todo list. Interesting point, I separated the two classes because 99.9% of users will never ever need to use cache_configuration_writer. It is only used within the administration pages and I split it locallib.php so that it was not being loaded unnessecarily. I still think the separation works, perhaps I need to look at renaming things. What about cache_configuration and cache_configuration_writer? Great point, thank you. Added to my todo list for today. Heh it is there because I wrote it there, personally I think calling has is a terrible idea, its not reliable in a highly volatile situation and in 99% of cases we can be sure there is a better way to do things. That being said I can imagine the has methods will still be used, and perhaps there is a valid use case out there. It'll be interesting to hear others thoughts on these methods and whether we should allow them to remain or whether we should form a hardline and remove them. The code can still be saved to a patch file and kept in tracker in case a valid use case is found if we do decide to remove them. Personally I like the /**#@+ syntax. I will argue with the chosen integrator about their existence... we will just have to see whether they land or not. (I don't see harm in them). Brilliant, much better names, now on my todo list. I agree, I struggled with the naming there. Indeed the current names were chosen to avoid conflicts, and the instance was chosen to reflect the trend of having static instance methods. (read more on this in point 15) I spent a bit of time considering this. __sleep and __wakeup are the obvious choices but in the end I decided that they were not correct for this use. I came to the conclussion that by requiring a interface and having dedicate methods we can easily identify cacheable objects and we can allow classes to perform different cleanup+wakeup routines for caching and serialisaion. Also worth noting (perhaps relevant, perhaps not) is that cache stores are responsible handling incoming data, and for most cases I imagine this will be handled internally. Not all objects will be serialised, such is the case with APC for example. I suppose I note this in support of making it clear that preparing an object for a cache is not the same as serialising an object for storage albeit that is what is going to happen in many cases. Again this is something I would be keen to hear others thoughts on. People know __sleep and __wakeup methods and we can still use an interface so perhaps the easier route is to simply use them. I've a guide set at 132 chars which I've tried to stick to, although I do see there are a dozen or so instances where I have gone slightly over. I'll have a scroll through and see about perhaps wrapping things a little more. + Hehe it probably could be using itself for this, although I think I will explore the idea of a factory class first. I definitely agree the current solution there, the use of the static var, is not at all good. This ties in to 10. really. Originally I had them sharing an interface and that if was not needed, then however I decided it was a bad idea as it meant that classing implementing had to have get and get_many methods dedicated to loading cache information and I felt that in the case of an all in one class this would both be confusing to the end user of the class and at risk of causing name collisions. Now that I think about I am still not sure. Again I'd like to hear what others have to think. Having that shared interface would be both cleaner and easier to deal with within the cache API, it would however perhaps not be so clear for the implementing classes and at risk of causing conflict. Then again will people create all in one classes or will they create dedicated data source classes, perhaps is the question that will decide for us. So in summary: 1. 2. 6. 9. 12. 13/14. on my immediate todo list. 3. 4. 7. 11. 10/15. Will see if other have some thoughts on these issues and will then decide upon the action to take. I look forward to the rest of your review Tim, thanks so far it is a big help. Certainly the more we can turn up now the better, things will get hard once we have implementations for this code. I'll aim to make the noted changes today where practical (factory method I imagine will take more thought and process than today will allow). Many thanks Sam
          Hide
          Matt Sharpe (Inactive) added a comment -

          As discussed on Jabber with Sam, potential backends alongside what's already been mentioned: Amazon ElastiCache (API compatible with Memcached) and Redis. Rather basic (although Redis datatypes could be interesting) however this also includes cache server clusters. Memcached and Redis both support clusters of distributed and/or replicated data, so the MUC should be able to roundrobbin off the pool somehow. ElastiCache is mostly a single IP to connect to and doesn't have this issue. Note that roundrobbining won't mean only some data is accessible, it just spreads the requests out over the cluster to avoid congestion at a single node.

          Show
          Matt Sharpe (Inactive) added a comment - As discussed on Jabber with Sam, potential backends alongside what's already been mentioned: Amazon ElastiCache (API compatible with Memcached) and Redis. Rather basic (although Redis datatypes could be interesting) however this also includes cache server clusters. Memcached and Redis both support clusters of distributed and/or replicated data, so the MUC should be able to roundrobbin off the pool somehow. ElastiCache is mostly a single IP to connect to and doesn't have this issue. Note that roundrobbining won't mean only some data is accessible, it just spreads the requests out over the cluster to avoid congestion at a single node.
          Hide
          Tony Levi added a comment -

          The two pecl memcache extensions handle the distribution themselves. The only support needed in Moodle is to set multiple servers in the configuration. Not sure about how Redis handles this, but I assume the details would be hidden behind the extension/library used in this case too.

          Show
          Tony Levi added a comment - The two pecl memcache extensions handle the distribution themselves. The only support needed in Moodle is to set multiple servers in the configuration. Not sure about how Redis handles this, but I assume the details would be hidden behind the extension/library used in this case too.
          Hide
          Sam Hemelryk added a comment -

          OK I've made the following changes of Friday and the weekend and have pushed them up now:

          • Changed definitions to component and area to identify themselves rather than a name.
          • Created a cache factory class and converted object creation to use that to allow for reuse and make it possible to reset the cache API (useful for unit testing).
          • Tidied up config initialisation making default config creation more readable.
          • Added unit tests for locking and tidied up locking as required.
          • Fixed up typos
          • Changed CACHE_NOW to cache::now()
          • Renamed a few things:
            • cache_feature_locking => cache_is_lockable
            • cache_feature_key_awareness => cache_is_key_aware
            • cache_configuration_reader => cache_config
            • cache_configuration_writer => cache_writer
          • Further work on coding style including line lenght and phpdocs
          • Added unit tests for invalidation
          • Split lib.php classes into separate files in a classes directory.
          • Tidied up README
          • Fixed a couple of misc bugs I found
          Show
          Sam Hemelryk added a comment - OK I've made the following changes of Friday and the weekend and have pushed them up now: Changed definitions to component and area to identify themselves rather than a name. Created a cache factory class and converted object creation to use that to allow for reuse and make it possible to reset the cache API (useful for unit testing). Tidied up config initialisation making default config creation more readable. Added unit tests for locking and tidied up locking as required. Fixed up typos Changed CACHE_NOW to cache::now() Renamed a few things: cache_feature_locking => cache_is_lockable cache_feature_key_awareness => cache_is_key_aware cache_configuration_reader => cache_config cache_configuration_writer => cache_writer Further work on coding style including line lenght and phpdocs Added unit tests for invalidation Split lib.php classes into separate files in a classes directory. Tidied up README Fixed a couple of misc bugs I found
          Hide
          Sam Hemelryk added a comment -

          Quite right about the memcache extensions btw, they do manage themselves and it can be controlled in a limited way through the weight property when specifying servers.
          The two plugins allow this by allowing the admin to specify servers in a text area, one per line, colon separate params for host:port:weight.

          Show
          Sam Hemelryk added a comment - Quite right about the memcache extensions btw, they do manage themselves and it can be controlled in a limited way through the weight property when specifying servers. The two plugins allow this by allowing the admin to specify servers in a text area, one per line, colon separate params for host:port:weight.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Grrr, and I got flooded with CI/testing tasks again before putting my eyes on code.

          Anyway +1 to all the comments/suggestions above, thanks everybody. I'll be on this.. soon.

          Apo & Cya

          Show
          Eloy Lafuente (stronk7) added a comment - Grrr, and I got flooded with CI/testing tasks again before putting my eyes on code. Anyway +1 to all the comments/suggestions above, thanks everybody. I'll be on this.. soon. Apo & Cya
          Hide
          James Cracknell added a comment -

          Can I make a wacky suggestion?

          Would this be able to be put into a release of 2.3 when ready with the option to run on an experimental basis because it seems to be quite a big change. Then we can get more people to test it on more platforms.

          Show
          James Cracknell added a comment - Can I make a wacky suggestion? Would this be able to be put into a release of 2.3 when ready with the option to run on an experimental basis because it seems to be quite a big change. Then we can get more people to test it on more platforms.
          Hide
          Mark Nielsen added a comment -

          Hi Sam, here is my feedback after reviewing your MUC branch (Note: I reviewed it at the beginning of August):

          As I said before, I thought the definitions introduced an unnecessary learning curve to the developer. It seems to make things less explicit and more disconnected. What is cache::make() doing? How do I influence/customize it? Did I successfully customize it or is my definition incorrectly setup? What goes into the definition (I have to find docs that could be out of date or find where the definition is processed)?

          So, an alternative would be that instead of a plugin defining definitions for its caches, a plugin could define its own cache classes that extend the core cache class or implements the core cache interface(s). The cache class would then interact with the cache store via a cache storage manager. The manager would route the calls from the cache class to the appropriate cache storage based on site configurations, etc.

          // So, the cache class would have something like this...
          
          public function get_storage_manager() {
              if (is_null($this->storage_manager) {
                  $this->set_storage_manager(new cache_storage_manager($this));
              }
              return $this->storage_manager;
          }
          
          public function get($name) {
              return $this->get_storage_manager()->get($name);
          }
          
          // Plus have various methods for things that are currently in the definition,
          // like events that clear this cache, mode, etc.
          

          Though, Moodle still needs to know about all of these custom classes and some meta data about them (EG: events that clear them). This would be similar to events, where Moodle would scan plugins for their cache classes and then store meta data in the database during upgrades. This cache meta data stored in the database would then be cached by the MUC itself to help speed it up for things like clearing caches during events. So, a plugin's db/caches.php could look similar to this:

          $caches = array(
              array(
                  'classname' => 'component_cache_foo',
                  'classfile' => 'plugin/path/cache/foo.php',
              ),
              array(
                  'classname' => 'component_cache_bar',
                  'classfile' => 'plugin/path/cache/bar.php',
              ),
          );
          

          Finally, usage would look like this:

          $cache = new component_cache_foo();
          $result = $cache->get('bar');
          // ...etc
          

          Random thoughts:

          • The cache_helper seems more like a cache manager or library. Suggest converting all static functions to non-static functions.

          Advantages:

          1. Removes some magic for how the cache instance is created. This is nice in that it should be easier to use and understand the system as a whole. Might make it easier to customize as well for specialized situations.
          2. Could make it easier to define data sources for automatically fetching the data on cache misses.
          3. Removes usages of static functions. This allows for easier mocking or completely swapping out classes (dependency injection).
          4. Seems like it would be faster (Who knows without actually testing...).

          Disadvantages:

          1. A developer might abuse the class, EG: force some sort of storage. Though, that would obviously be frowned upon. Could use final on some key methods perhaps.

          Sorry about the big post, but I hope it was clear and please feel free to discuss. Thanks so much for working on this much needed feature, really looking forward to using it!

          Cheers!

          Show
          Mark Nielsen added a comment - Hi Sam, here is my feedback after reviewing your MUC branch (Note: I reviewed it at the beginning of August): As I said before, I thought the definitions introduced an unnecessary learning curve to the developer. It seems to make things less explicit and more disconnected. What is cache::make() doing? How do I influence/customize it? Did I successfully customize it or is my definition incorrectly setup? What goes into the definition (I have to find docs that could be out of date or find where the definition is processed)? So, an alternative would be that instead of a plugin defining definitions for its caches, a plugin could define its own cache classes that extend the core cache class or implements the core cache interface(s). The cache class would then interact with the cache store via a cache storage manager. The manager would route the calls from the cache class to the appropriate cache storage based on site configurations, etc. // So, the cache class would have something like this... public function get_storage_manager() { if (is_null($this->storage_manager) { $this->set_storage_manager(new cache_storage_manager($this)); } return $this->storage_manager; } public function get($name) { return $this->get_storage_manager()->get($name); } // Plus have various methods for things that are currently in the definition, // like events that clear this cache, mode, etc. Though, Moodle still needs to know about all of these custom classes and some meta data about them (EG: events that clear them). This would be similar to events, where Moodle would scan plugins for their cache classes and then store meta data in the database during upgrades. This cache meta data stored in the database would then be cached by the MUC itself to help speed it up for things like clearing caches during events. So, a plugin's db/caches.php could look similar to this: $caches = array( array( 'classname' => 'component_cache_foo', 'classfile' => 'plugin/path/cache/foo.php', ), array( 'classname' => 'component_cache_bar', 'classfile' => 'plugin/path/cache/bar.php', ), ); Finally, usage would look like this: $cache = new component_cache_foo(); $result = $cache->get('bar'); // ...etc Random thoughts: The cache_helper seems more like a cache manager or library. Suggest converting all static functions to non-static functions. Advantages: Removes some magic for how the cache instance is created. This is nice in that it should be easier to use and understand the system as a whole. Might make it easier to customize as well for specialized situations. Could make it easier to define data sources for automatically fetching the data on cache misses. Removes usages of static functions. This allows for easier mocking or completely swapping out classes (dependency injection). Seems like it would be faster (Who knows without actually testing...). Disadvantages: A developer might abuse the class, EG: force some sort of storage. Though, that would obviously be frowned upon. Could use final on some key methods perhaps. Sorry about the big post, but I hope it was clear and please feel free to discuss. Thanks so much for working on this much needed feature, really looking forward to using it! Cheers!
          Hide
          Tim Hunt added a comment -

          Mark, have you looked at the admin settings that let the admin change the configuration for which back-end each cache uses? That is why the code is designed to be so flexible.

          The more I work with object oriented designs, the more value I see in the Maxim "prefer aggregation to inheritance". Hence, I think Sam's design is good and flexible, and your suggestion that a cache definition is a subclass of the core cache class is a really bad idea.

          Yes, factory methods can be a bit mysterious, because you don't actually know what class you are getting an instance of, but that is the whole point. You need that flexibility. Rather than destroying that flexibility, the correct solution is to add the debugging facilities so that questions like "Did I successfully customize it or is my definition incorrectly setup?" are easy to answer.

          If your claimed advantages, I think you are wrong about 3. and 4. and I don't understand the reasoning that leads to your claim of 2.

          Show
          Tim Hunt added a comment - Mark, have you looked at the admin settings that let the admin change the configuration for which back-end each cache uses? That is why the code is designed to be so flexible. The more I work with object oriented designs, the more value I see in the Maxim "prefer aggregation to inheritance". Hence, I think Sam's design is good and flexible, and your suggestion that a cache definition is a subclass of the core cache class is a really bad idea. Yes, factory methods can be a bit mysterious, because you don't actually know what class you are getting an instance of, but that is the whole point. You need that flexibility. Rather than destroying that flexibility, the correct solution is to add the debugging facilities so that questions like "Did I successfully customize it or is my definition incorrectly setup?" are easy to answer. If your claimed advantages, I think you are wrong about 3. and 4. and I don't understand the reasoning that leads to your claim of 2.
          Hide
          Mark Nielsen added a comment -

          Hi Tim, responses below:

          Mark, have you looked at the admin settings that let the admin change the configuration for which back-end each cache uses? That is why the code is designed to be so flexible.

          That is handled through the cache_storage_manager class in my example code. The idea being, the class that you create is the definition and based on that, the cache storage manager finds an appropriate cache backend based on the class (AKA definition) and site configurations.

          The more I work with object oriented designs, the more value I see in the Maxim "prefer aggregation to inheritance". Hence, I think Sam's design is good and flexible, and your suggestion that a cache definition is a subclass of the core cache class is a really bad idea.

          I'm not sure what this is all about, is this a design pattern or similar? Links?

          Yes, factory methods can be a bit mysterious, because you don't actually know what class you are getting an instance of, but that is the whole point. You need that flexibility. Rather than destroying that flexibility, the correct solution is to add the debugging facilities so that questions like "Did I successfully customize it or is my definition incorrectly setup?" are easy to answer.

          I don't think we are loosing any flexibility here because of the cache storage manager class. Maybe I'm wrong and I'm missing some detail?

          If your claimed advantages, I think you are wrong about 3. and 4. and I don't understand the reasoning that leads to your claim of 2.

          • Claim 2: Instead of having to lookup a file/callback, I could define the data source directly in my cache class.
          • Claim 3: Let's say I have a class that depends on a cache. If it was "component_cache_foo" I could easily mock it and inject it into the class for testing.
          • Claim 4: I really don't know if it would be faster or not, but it seems like not having to process db/caches.php during the factory would be a win (I believe that's happening).
          Show
          Mark Nielsen added a comment - Hi Tim, responses below: Mark, have you looked at the admin settings that let the admin change the configuration for which back-end each cache uses? That is why the code is designed to be so flexible. That is handled through the cache_storage_manager class in my example code. The idea being, the class that you create is the definition and based on that, the cache storage manager finds an appropriate cache backend based on the class (AKA definition) and site configurations. The more I work with object oriented designs, the more value I see in the Maxim "prefer aggregation to inheritance". Hence, I think Sam's design is good and flexible, and your suggestion that a cache definition is a subclass of the core cache class is a really bad idea. I'm not sure what this is all about, is this a design pattern or similar? Links? Yes, factory methods can be a bit mysterious, because you don't actually know what class you are getting an instance of, but that is the whole point. You need that flexibility. Rather than destroying that flexibility, the correct solution is to add the debugging facilities so that questions like "Did I successfully customize it or is my definition incorrectly setup?" are easy to answer. I don't think we are loosing any flexibility here because of the cache storage manager class. Maybe I'm wrong and I'm missing some detail? If your claimed advantages, I think you are wrong about 3. and 4. and I don't understand the reasoning that leads to your claim of 2. Claim 2: Instead of having to lookup a file/callback, I could define the data source directly in my cache class. Claim 3: Let's say I have a class that depends on a cache. If it was "component_cache_foo" I could easily mock it and inject it into the class for testing. Claim 4: I really don't know if it would be faster or not, but it seems like not having to process db/caches.php during the factory would be a win (I believe that's happening).
          Show
          Tim Hunt added a comment - http://c2.com/cgi/wiki?CompositionInsteadOfInheritance
          Hide
          Mark Nielsen added a comment -

          http://c2.com/cgi/wiki?CompositionInsteadOfInheritance

          I don't have a problem with that, though I'm having a hard time seeing it being implemented in this design. Perhaps I'm just missing it and could use some help. Also, in my suggestion, I don't see a conflict with the inheritance, I'm just defining the definition of my cache. Not extending a specific cache storage or even a specific cache type.

          Show
          Mark Nielsen added a comment - http://c2.com/cgi/wiki?CompositionInsteadOfInheritance I don't have a problem with that, though I'm having a hard time seeing it being implemented in this design. Perhaps I'm just missing it and could use some help. Also, in my suggestion, I don't see a conflict with the inheritance, I'm just defining the definition of my cache. Not extending a specific cache storage or even a specific cache type.
          Hide
          Sam Hemelryk added a comment -

          Hi Mark,

          My appologies yesterday was a busy day and I didn't get past drafting a reply to you.
          First up thank you for sharing your thoughts here, certainly the more perspectives we can get on this the better.
          I've read over your initial comment and the banter between yourself and Tim, and I've spent a bit of time yesterday really thinking about it.

          I appreciate where you are coming from with the idea of using classes to define caches (rather than the array structure presently used) however I still think the array structure offers a few bonuses.

          In designing this system I've tried to create a system that is used for caching irrespective of the actual "cache" and what its benefit may or may not be.
          This really stemed from the desire to create an easy to use cache API. We wanted to keep the API as simple and consistent as absolutely possible. So that developers who wanted to make use of it wern't going to be intimidated by and nor would they need to worry about the complexities of caching such as access to the cache from other areas for the purposes of invalidation, testing, and administration. This ties back into the learning curve you noted in regards to definitions, I'll address that shortly.

          The obsfucation of what is going on when interacting with the cache is very intentional in this design.
          I wanted to force the hand of the developer to recognise that they are dealing with a cache, no with how the cache works to optimise their interaction, or where the cache is storing data. Instead they interact with the cache API as if it were backend solution.
          While this doesn't naturally offer the control developers would normally like to take I think it really will make the Cache API easier to use, especially for those who are not familiar the Cache API.
          Examples are always a great aid to developers and while I completely agree that documentation can very quickly slip out of relavance I think it is still an incredibly important tool. I hope that we can very easily explain the simple uses of the Cache API.
          Having had a think about the class situation your've purposed I'm not sure it would be any easier for developers, in fact I wonder whether having to define a class and publish it through a definition anyway is going to be work even. Granted the complexity will not be in the definition it will be in the class and that as many developers have IDE's which aid then with code hints things will be more transparent, however the class is going to need to detail that same desired functionality and requirements and anyone creating these cache classes is still going to need to know about it.
          One benefit to this is as you mentioned that the end developer will know exactly what they are working with, and I think that will be true in most cases although I also think it will be prone to make life harder for those working with caches outside of the area the class was developed for (of course thats minorty of places and people).
          Personally at this point I am still leaning towards the current structure, providing a class thats chosen by Cache API logic rather than the developer but that is sure to be consistent in design and use. Of course this means that the developer is going to need to trust in the Cache API but I think that is fine.
          This design has other advantages as well, I feel this end design is going to be much easier to maintain given its logic is wrapped internally and that only in minority of cases will people be overriding/affecting these core cache classes.

          That all being said I still recognise that these definitions are still going to be a learning curve. However there are only three required fields (mode, component, area) and that its likely most caches are not going to be interested in any of the other available properties (except the persistent option I suppose).
          I've had a think about what we could do to make these easier to "develop" under the same architecture but to be truthful I didn't come up with anything other than perhaps creating a simple admin tool to allow developers to design a cache and have it spit out a definition that can be pasted into the caches.php file.

          Just to note a couple of other things as well about the current design and the other things you've mentioned:

          1/ Definition processing.
          The db/caches.php files are only searched for and processed in the following situations:

          • Install/Upgrade
          • When manually triggered through the admin interface.
          • When a developer asks for a cache that does not exist.

          Once processed they are written out to a cache config file that is stored within the Moodle data directory for the site as a single PHP file to be included when required.
          Upon the first interaction with the Cache API that file is included and parsed back into a cache config object used by the Cache API.

          2/ Abusing the API.
          You mentioned Mark that abuse was one of the disadvantages to using the class solution, this made me smile as I had manually added the ability to abuse to my design.
          One of features available throught the current definition is to specify a custom cache class to use and the file it resides in. The properties are overrideclass and overrideclassfile (horrible names I know). If an override class is provided the Cache API will attempt to load and use that class as the loader rather than the otherwise expected cache_application/session/request class.
          The override class still must override the cache loader class for the mode that it is being designed for, in turn implies it is extending the base cache class and the cache_loader interface.
          Either way for truly advanced situations it is still possible to utilise your own cache loader class and abuse things as you wish, you're tied in to the extend so you must maintain the consistent public API at a minimum however I've made sure to not use the final keyword anywhere and instead wrap essential functionality that I don't think should be touched in private methods of the cache_loader class. The developer of such a class could take nearly full control of the loader should they wish, and through this have access to what ever stores and chained loaders/data sources that may be available.
          I only envisage this being used in the most advanced caches, but it is none the less supported and has unit tests to make sure it is not forgotten.

          3/ Obsfucation hides all truth.
          This is about being given the cache class that the API decided is right for you rather than having full control over it by explicitly intialising or requesting a cache class.
          Tim's pretty much hit this one on the head I think, and certainly it is a very essential element in the current design.
          I completely agree with Tim I think that having some means of quering and/or displaying the structure of the cache API is going to be the best solution to this (as mentioned I still think the current design is the way to go).
          The cache loader you recieve may be one of 3 cache loader classes + a custom loader if you have defined one.
          That loader may be just the first element in a chain or loaders exposing several cache stores potentially over a distributed system, and possibly all backed up by a data source that is going to make things work regardless of the cache store outcomes anyway.
          This system offers use insane flexibility and while the average developer is not going to need to know anything about it for sure having some means to query + investigate + debug it is going to be essential.
          I will make a note to look at this, perhaps not before integration now, but certainly as a blocker to release.

          4/ cache_helper => Convert static to normal methods and use as an instance.
          I channeled the design of the backup helper classes as I quite like the organisation of "helpful" functions that this approach provides. I think since you've looked at the code I have introduced a cache_factory class, the factory class took the methods of the helper responsible for instantiating and holding onto the instance of essential classes.
          None of the functions belonging to it have a need for an instance any more and in that sense things appear fine as static methods.
          Certainly your point about being able to override/switchout if required is a good point though and has got me thinking.
          I'm 50/50 on whether the helper should be instanced and those static methods converted to normal methods. I'd like to know what others thought of this.

          5/ Admin interfaces
          I've attached a screenshot (Cache_administration_20120830.png) that shows the admin interface and quickly identifies a couple of things it does.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Hi Mark, My appologies yesterday was a busy day and I didn't get past drafting a reply to you. First up thank you for sharing your thoughts here, certainly the more perspectives we can get on this the better. I've read over your initial comment and the banter between yourself and Tim, and I've spent a bit of time yesterday really thinking about it. I appreciate where you are coming from with the idea of using classes to define caches (rather than the array structure presently used) however I still think the array structure offers a few bonuses. In designing this system I've tried to create a system that is used for caching irrespective of the actual "cache" and what its benefit may or may not be. This really stemed from the desire to create an easy to use cache API. We wanted to keep the API as simple and consistent as absolutely possible. So that developers who wanted to make use of it wern't going to be intimidated by and nor would they need to worry about the complexities of caching such as access to the cache from other areas for the purposes of invalidation, testing, and administration. This ties back into the learning curve you noted in regards to definitions, I'll address that shortly. The obsfucation of what is going on when interacting with the cache is very intentional in this design. I wanted to force the hand of the developer to recognise that they are dealing with a cache, no with how the cache works to optimise their interaction, or where the cache is storing data. Instead they interact with the cache API as if it were backend solution. While this doesn't naturally offer the control developers would normally like to take I think it really will make the Cache API easier to use, especially for those who are not familiar the Cache API. Examples are always a great aid to developers and while I completely agree that documentation can very quickly slip out of relavance I think it is still an incredibly important tool. I hope that we can very easily explain the simple uses of the Cache API. Having had a think about the class situation your've purposed I'm not sure it would be any easier for developers, in fact I wonder whether having to define a class and publish it through a definition anyway is going to be work even. Granted the complexity will not be in the definition it will be in the class and that as many developers have IDE's which aid then with code hints things will be more transparent, however the class is going to need to detail that same desired functionality and requirements and anyone creating these cache classes is still going to need to know about it. One benefit to this is as you mentioned that the end developer will know exactly what they are working with, and I think that will be true in most cases although I also think it will be prone to make life harder for those working with caches outside of the area the class was developed for (of course thats minorty of places and people). Personally at this point I am still leaning towards the current structure, providing a class thats chosen by Cache API logic rather than the developer but that is sure to be consistent in design and use. Of course this means that the developer is going to need to trust in the Cache API but I think that is fine. This design has other advantages as well, I feel this end design is going to be much easier to maintain given its logic is wrapped internally and that only in minority of cases will people be overriding/affecting these core cache classes. That all being said I still recognise that these definitions are still going to be a learning curve. However there are only three required fields (mode, component, area) and that its likely most caches are not going to be interested in any of the other available properties (except the persistent option I suppose). I've had a think about what we could do to make these easier to "develop" under the same architecture but to be truthful I didn't come up with anything other than perhaps creating a simple admin tool to allow developers to design a cache and have it spit out a definition that can be pasted into the caches.php file. Just to note a couple of other things as well about the current design and the other things you've mentioned: 1/ Definition processing. The db/caches.php files are only searched for and processed in the following situations: Install/Upgrade When manually triggered through the admin interface. When a developer asks for a cache that does not exist. Once processed they are written out to a cache config file that is stored within the Moodle data directory for the site as a single PHP file to be included when required. Upon the first interaction with the Cache API that file is included and parsed back into a cache config object used by the Cache API. 2/ Abusing the API. You mentioned Mark that abuse was one of the disadvantages to using the class solution, this made me smile as I had manually added the ability to abuse to my design. One of features available throught the current definition is to specify a custom cache class to use and the file it resides in. The properties are overrideclass and overrideclassfile (horrible names I know). If an override class is provided the Cache API will attempt to load and use that class as the loader rather than the otherwise expected cache_application/session/request class. The override class still must override the cache loader class for the mode that it is being designed for, in turn implies it is extending the base cache class and the cache_loader interface. Either way for truly advanced situations it is still possible to utilise your own cache loader class and abuse things as you wish, you're tied in to the extend so you must maintain the consistent public API at a minimum however I've made sure to not use the final keyword anywhere and instead wrap essential functionality that I don't think should be touched in private methods of the cache_loader class. The developer of such a class could take nearly full control of the loader should they wish, and through this have access to what ever stores and chained loaders/data sources that may be available. I only envisage this being used in the most advanced caches, but it is none the less supported and has unit tests to make sure it is not forgotten. 3/ Obsfucation hides all truth. This is about being given the cache class that the API decided is right for you rather than having full control over it by explicitly intialising or requesting a cache class. Tim's pretty much hit this one on the head I think, and certainly it is a very essential element in the current design. I completely agree with Tim I think that having some means of quering and/or displaying the structure of the cache API is going to be the best solution to this (as mentioned I still think the current design is the way to go). The cache loader you recieve may be one of 3 cache loader classes + a custom loader if you have defined one. That loader may be just the first element in a chain or loaders exposing several cache stores potentially over a distributed system, and possibly all backed up by a data source that is going to make things work regardless of the cache store outcomes anyway. This system offers use insane flexibility and while the average developer is not going to need to know anything about it for sure having some means to query + investigate + debug it is going to be essential. I will make a note to look at this, perhaps not before integration now, but certainly as a blocker to release. 4/ cache_helper => Convert static to normal methods and use as an instance. I channeled the design of the backup helper classes as I quite like the organisation of "helpful" functions that this approach provides. I think since you've looked at the code I have introduced a cache_factory class, the factory class took the methods of the helper responsible for instantiating and holding onto the instance of essential classes. None of the functions belonging to it have a need for an instance any more and in that sense things appear fine as static methods. Certainly your point about being able to override/switchout if required is a good point though and has got me thinking. I'm 50/50 on whether the helper should be instanced and those static methods converted to normal methods. I'd like to know what others thought of this. 5/ Admin interfaces I've attached a screenshot (Cache_administration_20120830.png) that shows the admin interface and quickly identifies a couple of things it does. Cheers Sam
          Hide
          Tim Hunt added a comment -

          Thanks for the screen-grab Sam. It amuses me in one way: You have had to annotate the headings to make them comprehensible. E.g. The "Installed cache plugins" for the "Plugin summaries" heading. I assume that, as part of the polishing, you plan to just use the better strings in the UI!

          Show
          Tim Hunt added a comment - Thanks for the screen-grab Sam. It amuses me in one way: You have had to annotate the headings to make them comprehensible. E.g. The "Installed cache plugins" for the "Plugin summaries" heading. I assume that, as part of the polishing, you plan to just use the better strings in the UI!
          Hide
          Tim Hunt added a comment -

          Just to make sure you see them. Three minor comments: http://docs.moodle.org/dev/Talk:Cache_API

          Show
          Tim Hunt added a comment - Just to make sure you see them. Three minor comments: http://docs.moodle.org/dev/Talk:Cache_API
          Hide
          Mark Nielsen added a comment -

          Hey Sam, thanks for the response!

          I hope you and Tim do not think that I'm suggesting that the developer would have any control over the actual cache store backend. I wasn't trying to remove any functionality, flexibility or configuration options, just re-organize the code into something that is hopefully easier to use and understand. So, the developer would basically be defining their definition in a class, the definition would make use of another class (Composition?) that would actually go and find the appropriate cache backend to use for that definition based on site configurations. So, the obsfucation is still in place.

          I'm glad you mentioned the IDE, sometimes I want to mention that, but not all folks use them and others just hate them. But, they are great for refactoring, looking for usages and the like, which I cannot do with an array definition. If I don't understand something in the array definition, I have to hunt down other definitions, documentation or dig through the processing code. Plus, I don't know when something has changed with the array definition unless the core code tells me or I find it documented somewhere. Yes, the array definition works, but I feel like it could be better.

          The admin interface looks great as it addresses another one of our concerns: if a specific definition needed some special handling, we need to be able to configure it to use a different cache backend. Wonderful!

          Thanks for considering my feedback, looking forward to having caching in Moodle.

          Show
          Mark Nielsen added a comment - Hey Sam, thanks for the response! I hope you and Tim do not think that I'm suggesting that the developer would have any control over the actual cache store backend. I wasn't trying to remove any functionality, flexibility or configuration options, just re-organize the code into something that is hopefully easier to use and understand. So, the developer would basically be defining their definition in a class, the definition would make use of another class (Composition?) that would actually go and find the appropriate cache backend to use for that definition based on site configurations. So, the obsfucation is still in place. I'm glad you mentioned the IDE, sometimes I want to mention that, but not all folks use them and others just hate them. But, they are great for refactoring, looking for usages and the like, which I cannot do with an array definition. If I don't understand something in the array definition, I have to hunt down other definitions, documentation or dig through the processing code. Plus, I don't know when something has changed with the array definition unless the core code tells me or I find it documented somewhere. Yes, the array definition works, but I feel like it could be better. The admin interface looks great as it addresses another one of our concerns: if a specific definition needed some special handling, we need to be able to configure it to use a different cache backend. Wonderful! Thanks for considering my feedback, looking forward to having caching in Moodle.
          Hide
          Mark van Hoek added a comment -

          Re: http://docs.moodle.org/dev/Talk:Cache_API

          Having the get function return false makes sense to me, Tim – it's normal flow meaning it's not found; an exception in the get would be if something goes wrong e.g. there was no back-end found. My model is Zend Cache http://framework.zend.com/manual/en/zend.cache.frontends.html

          Show
          Mark van Hoek added a comment - Re: http://docs.moodle.org/dev/Talk:Cache_API Having the get function return false makes sense to me, Tim – it's normal flow meaning it's not found; an exception in the get would be if something goes wrong e.g. there was no back-end found. My model is Zend Cache http://framework.zend.com/manual/en/zend.cache.frontends.html
          Hide
          Tim Hunt added a comment -

          Oops, there was a typo in my comment (now corrected). I meant set or delete should throw and exception for failures.

          For get, returning null is very natural. What can work even better is for get to take an optional extra parameter:

          public function get($key, $default = null);

          and return the given default if the key is not present - a bit link optional_param, or get_user_preference.

          Show
          Tim Hunt added a comment - Oops, there was a typo in my comment (now corrected). I meant set or delete should throw and exception for failures. For get, returning null is very natural. What can work even better is for get to take an optional extra parameter: public function get($key, $default = null); and return the given default if the key is not present - a bit link optional_param, or get_user_preference.
          Hide
          Rajesh Taneja added a comment -

          Hello Sam,

          Few things which I found in first round (Syntax/coding standards etc) of review, (Not major, but worth considering)

          Factory.php

          1. It might be nice to have constructor as final private https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R95. Also, it will be nice to have final private __clone.
          2. In reset shouldn't we pass forcereload = true https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R103 and it will save us from explicitly resetting array.
          3. https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R131 second parameter is not required as function accepts only one param.
          4. It might be nice to have $class initialised in if - else https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R232. Easy to track $class value depending on condition.

          It will be nice to have an abstract cache_store and rather then cache_store interface, which every store is implementing. This will avoid few things like re-defining default behaviour of few functions (get_many, delete_many etc.) and variables (like $name). At present stores/files/lib.php has different/limited checks as compared to stores/memcache/lib.php for function get_many()

          Locallib.php

          1. While saving configuration https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L12R64, shouldn't we check for return value and throw exception if configuration is not saved?
          2. Might be nice to have https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L12R64 similar to standard check in moodle (https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L12R30)

          I am still trying to understand the design and will add more inputs as I progress.

          Show
          Rajesh Taneja added a comment - Hello Sam, Few things which I found in first round (Syntax/coding standards etc) of review, (Not major, but worth considering) Factory.php It might be nice to have constructor as final private https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R95 . Also, it will be nice to have final private __clone. In reset shouldn't we pass forcereload = true https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R103 and it will save us from explicitly resetting array. https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R131 second parameter is not required as function accepts only one param. It might be nice to have $class initialised in if - else https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L6R232 . Easy to track $class value depending on condition. It will be nice to have an abstract cache_store and rather then cache_store interface , which every store is implementing. This will avoid few things like re-defining default behaviour of few functions (get_many, delete_many etc.) and variables (like $name). At present stores/files/lib.php has different/limited checks as compared to stores/memcache/lib.php for function get_many() Locallib.php While saving configuration https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L12R64 , shouldn't we check for return value and throw exception if configuration is not saved? Might be nice to have https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L12R64 similar to standard check in moodle ( https://github.com/samhemelryk/moodle/commit/016721ab6da009d2ae068da88b59c11869d54a50#L12R30 ) I am still trying to understand the design and will add more inputs as I progress.
          Hide
          Rajesh Taneja added a comment -

          Hello Sam,

          In addition to having abstracting cache_storage, not sure if we really want static_data_store and session_data_store abstract classes. Do you expect anyone extending these ever? IMO, this behaviour should be contained in cache_store_session and cache_store_static and should extend basic store behaviour by extending cache_store. Also, by theory you should have least one abstract function in abstract class, so design-wise it might be worth considering it.

          Show
          Rajesh Taneja added a comment - Hello Sam, In addition to having abstracting cache_storage , not sure if we really want static_data_store and session_data_store abstract classes. Do you expect anyone extending these ever? IMO, this behaviour should be contained in cache_store_session and cache_store_static and should extend basic store behaviour by extending cache_store . Also, by theory you should have least one abstract function in abstract class, so design-wise it might be worth considering it.
          Hide
          Sam Hemelryk added a comment -

          Hi guys,

          Sorry, I've been neglecting this tracker issue and have been sick the last couple of days and have not replied here despite having read through the comments.

          Tim,
          I looked back at the screenshot after posting it here and had a little chuckle. I will tidy up those headings before integration.
          I've since replied on the talk page and I see you've replied there as well.
          Personally I feel the returning of true|false in those situations is still the way to go. Strings were just one area where caches throwing exceptions could go a muck.
          The other big thing about returning true|false from the set and delete methods is that if most cases we really won't care whether it succeeds or not. The cache is a tool that we can use make things faster. If the data is there great. It not being there is a situation we need to deal with anyway. There are valid reasons for it failing as well, concurrency being the big one. Set may fail if a lock could not be acquired, delete could fail if the data is not there or a lock is in place. The operation failing isn't a fatal situation for the program, cache implementations can't require something come from the cache backend.
          In regards to get taking an optional argument for the default value to return, and changing to null from false I think that is probably a good idea. I was never too sold on returning false and I can't think why I didn't consider using null like we do in other places.
          As Raj and Eloy are both reviewing presently I'll talk to them about it, if both are in agreement then I shall convert things.

          Mark,
          I had imagined in what I understood of your class proposal that you were still delegating the responsiblity of selecting a backend which is a good thing. Certainly the cache definition is going to be an area that will require the developer to learn more about the cache API in order to really maximise initiative, I will continue to have a look and a think about that.

          Raj,
          Thanks for having a think about things, in reponse to your points.

          factory.php

          1. I'm always very apprehensive about marking anything private, and doubly so for marking things as final.
            If there are good reasons to, sure, but otherwise I feel it is much better to leave things protected and open to being overridden just in case some time in the future we have a need to extend.
            In this case I think having that method protected is fine, essentially in making it protected I achieve the goal I had set out to achieve which was to prevent people creating instances of the factory and in effect making it clear that it is purely a static API. I don't think at this point in time making it private final is required.
          2. good spotting looking back at that now there is no reason to manually clear things we can just force a reload.
          3. good spotting thanks will clean that up.
          4. Not sure what you mean here sorry, the class gets initialised on line 255 if we have not already got an instance of that class. The lines about it decide which class to use and ensure the default configuration is available.

          cache_store,
          It's interesting that you turned this up. The cache_store interface is something I spent quite a bit of time considering. We use abstract classes for most other plugin types throughout Moodle.
          I chose to use an interface for a couple of reasons:

          • Nearly all methods need to be implemented/overridden by the cache_store, the get_many and set_many methods included. All backends with a dedicated PHP plugin that I looked at provide some means for both setting a single item, and setting multiple items. I think providing some default translation between the two would allow people to be lazy in their plugin implementations rather than looking for an optimised means of getting/setting mutliple items. The diffence between the get_many methods of the file and memcache stores show how these methods will differ. Only a single file can be written at a time hence it makes sense for get_many to call get each time, however the memcache backend provides a means to set many items in a single transaction so we use that there and reprocess after. In regards to memcache this saves us having to make a request for each cache item individually.
            There are a couple of methods that could be better dealt with if this were an abstract class however. They are the supports_.... methods. If this were an abstract class those methods would be hardcoded there and simply call get_supported_features and process its results.
          • A class can implement multiple interfaces but only extend a single class. This factored into my thinking. Many backends now provide OO means of interaction, classes that you instantiate and work with, or PHP classes you can download and use that handle interaction with you. While none of the plugins I have created do this presently a possibility that I had considered was that you could create you cache plugin class to implement the cache_store interface and extend a backend class. For instance you could have the following `class cache_store_mongodb extends Mongo implements cache_store`. Whether that will ever be done is something to be seen, however it does reflect on the above point in that as there there is very little need to define default methods using an interface is chosing the more flexible option.

          Now that being said I'm interested to hear what others think of this and whether there are some arguments to convert cache_store from an interface to an abstract class?

          locallib.php

          1. definitely that config writing code can be made more robust.
          2. will change that at the same time as above.

          In regards to the static_data_store and session_data_store I will have a close look at these. When I first wrote the session and static stores these served an important purpose, they acted as a single gateway to the data being stored by the potentially many instance of static and session stores. However as the design progressed its ended up that these two stores don't allow for multiple instances.
          I'd bet looking at them now that they can be removed altogether.

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi guys, Sorry, I've been neglecting this tracker issue and have been sick the last couple of days and have not replied here despite having read through the comments. Tim, I looked back at the screenshot after posting it here and had a little chuckle. I will tidy up those headings before integration. I've since replied on the talk page and I see you've replied there as well. Personally I feel the returning of true|false in those situations is still the way to go. Strings were just one area where caches throwing exceptions could go a muck. The other big thing about returning true|false from the set and delete methods is that if most cases we really won't care whether it succeeds or not. The cache is a tool that we can use make things faster. If the data is there great. It not being there is a situation we need to deal with anyway. There are valid reasons for it failing as well, concurrency being the big one. Set may fail if a lock could not be acquired, delete could fail if the data is not there or a lock is in place. The operation failing isn't a fatal situation for the program, cache implementations can't require something come from the cache backend. In regards to get taking an optional argument for the default value to return, and changing to null from false I think that is probably a good idea. I was never too sold on returning false and I can't think why I didn't consider using null like we do in other places. As Raj and Eloy are both reviewing presently I'll talk to them about it, if both are in agreement then I shall convert things. Mark, I had imagined in what I understood of your class proposal that you were still delegating the responsiblity of selecting a backend which is a good thing. Certainly the cache definition is going to be an area that will require the developer to learn more about the cache API in order to really maximise initiative, I will continue to have a look and a think about that. Raj, Thanks for having a think about things, in reponse to your points. factory.php I'm always very apprehensive about marking anything private, and doubly so for marking things as final. If there are good reasons to, sure, but otherwise I feel it is much better to leave things protected and open to being overridden just in case some time in the future we have a need to extend. In this case I think having that method protected is fine, essentially in making it protected I achieve the goal I had set out to achieve which was to prevent people creating instances of the factory and in effect making it clear that it is purely a static API. I don't think at this point in time making it private final is required. good spotting looking back at that now there is no reason to manually clear things we can just force a reload. good spotting thanks will clean that up. Not sure what you mean here sorry, the class gets initialised on line 255 if we have not already got an instance of that class. The lines about it decide which class to use and ensure the default configuration is available. cache_store, It's interesting that you turned this up. The cache_store interface is something I spent quite a bit of time considering. We use abstract classes for most other plugin types throughout Moodle. I chose to use an interface for a couple of reasons: Nearly all methods need to be implemented/overridden by the cache_store, the get_many and set_many methods included. All backends with a dedicated PHP plugin that I looked at provide some means for both setting a single item, and setting multiple items. I think providing some default translation between the two would allow people to be lazy in their plugin implementations rather than looking for an optimised means of getting/setting mutliple items. The diffence between the get_many methods of the file and memcache stores show how these methods will differ. Only a single file can be written at a time hence it makes sense for get_many to call get each time, however the memcache backend provides a means to set many items in a single transaction so we use that there and reprocess after. In regards to memcache this saves us having to make a request for each cache item individually. There are a couple of methods that could be better dealt with if this were an abstract class however. They are the supports_.... methods. If this were an abstract class those methods would be hardcoded there and simply call get_supported_features and process its results. A class can implement multiple interfaces but only extend a single class. This factored into my thinking. Many backends now provide OO means of interaction, classes that you instantiate and work with, or PHP classes you can download and use that handle interaction with you. While none of the plugins I have created do this presently a possibility that I had considered was that you could create you cache plugin class to implement the cache_store interface and extend a backend class. For instance you could have the following `class cache_store_mongodb extends Mongo implements cache_store` . Whether that will ever be done is something to be seen, however it does reflect on the above point in that as there there is very little need to define default methods using an interface is chosing the more flexible option. Now that being said I'm interested to hear what others think of this and whether there are some arguments to convert cache_store from an interface to an abstract class? locallib.php definitely that config writing code can be made more robust. will change that at the same time as above. In regards to the static_data_store and session_data_store I will have a close look at these. When I first wrote the session and static stores these served an important purpose, they acted as a single gateway to the data being stored by the potentially many instance of static and session stores. However as the design progressed its ended up that these two stores don't allow for multiple instances. I'd bet looking at them now that they can be removed altogether. Many thanks Sam
          Hide
          Rajesh Taneja added a comment -

          Thanks Sam,

          Just wondering if we need to protect __clone in factory class.
          Also, I have no objection in using cache_store interface. Reason for asking it to be abstract is to have code reuse and maintainability. Often we do some enhancements/fixes and we end up missing on similar parts.

          Point 4: It's not important, but a personal preference of using if-else rather then if-if. IMO, cache should be best optimized for performance and with minimal overheads (if-else can avoid a redundant check). As I said it's not important, but worth considering for overall performance.

          Show
          Rajesh Taneja added a comment - Thanks Sam, Just wondering if we need to protect __clone in factory class. Also, I have no objection in using cache_store interface. Reason for asking it to be abstract is to have code reuse and maintainability. Often we do some enhancements/fixes and we end up missing on similar parts. Point 4: It's not important, but a personal preference of using if-else rather then if-if. IMO, cache should be best optimized for performance and with minimal overheads (if-else can avoid a redundant check). As I said it's not important, but worth considering for overall performance.
          Hide
          Sam Hemelryk added a comment -

          Putting this up for integration now.
          I've used the branch (wip-MDL-25290-m24-combined) that contains MUC + the 3 conversions I've used for reference/testing.

          If you want to check them out separately the following branches also exist:

          wip-MDL-25290-m24-compact
          Contains just the Cache API with commits intended for integration

          wip-MDL-25290-m24-MUC
          Same as above but with all of the commits I made during development, not intended for integration

          wip-MDL-25290-m24-Conversion-config
          The CFG conversion to MUC by itself.

          wip-MDL-25290-m24-Conversion-dbmeta
          The database meta information implementations of MUC. MySQL and Postgres only at the moment.

          wip-MDL-25290-m24-Conversion-string
          Conversion of the string language cache to MUC.

          Show
          Sam Hemelryk added a comment - Putting this up for integration now. I've used the branch (wip- MDL-25290 -m24-combined) that contains MUC + the 3 conversions I've used for reference/testing. If you want to check them out separately the following branches also exist: wip- MDL-25290 -m24-compact Contains just the Cache API with commits intended for integration wip- MDL-25290 -m24-MUC Same as above but with all of the commits I made during development, not intended for integration wip- MDL-25290 -m24-Conversion-config The CFG conversion to MUC by itself. wip- MDL-25290 -m24-Conversion-dbmeta The database meta information implementations of MUC. MySQL and Postgres only at the moment. wip- MDL-25290 -m24-Conversion-string Conversion of the string language cache to MUC.
          Hide
          Sam Hemelryk added a comment -

          Things still to do btw

          • Add a cron routine to purge incrementally and tidy up manual ttl systems.
          • Add support for locking into the cache_session class so that sessions support locking.
          • Consider adding support for identifiers when invalidating by event
          • Integrate definition invalidation with the event cache in order to support definition invalidation in session caches.
          • Implement APC cache store plugin
          • Implement XCache cache store plugin
          • Implement WinCache cache store plugin
          • Implement MySQL cache store plugin (prototype exists)
          • Implement PgSQL Cache store plugin
          Show
          Sam Hemelryk added a comment - Things still to do btw Add a cron routine to purge incrementally and tidy up manual ttl systems. Add support for locking into the cache_session class so that sessions support locking. Consider adding support for identifiers when invalidating by event Integrate definition invalidation with the event cache in order to support definition invalidation in session caches. Implement APC cache store plugin Implement XCache cache store plugin Implement WinCache cache store plugin Implement MySQL cache store plugin (prototype exists) Implement PgSQL Cache store plugin
          Hide
          Eloy Lafuente (stronk7) added a comment -

          This is the 1st in the list for this monkey.

          Show
          Eloy Lafuente (stronk7) added a comment - This is the 1st in the list for this monkey.
          Hide
          Adam Olley added a comment -

          Sam:

          A concern for the static cache. It takes the $data input and stores it, which means anyone using the static cache needs to be very careful not to alter any objects they've given the cache. While the set method could clone any incoming objects, that will only help for single layer objects...

          Show
          Adam Olley added a comment - Sam: A concern for the static cache. It takes the $data input and stores it, which means anyone using the static cache needs to be very careful not to alter any objects they've given the cache. While the set method could clone any incoming objects, that will only help for single layer objects...
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited

          Report 001: Definitively, there is something strange/wrong within the patch as far as the 3 times I've installed it (m24-compact), I end with an infinite redirect between admin/index and admin/upgradesettings. Fixing this is a must.

          Workaround, avoid going to admin/index and continue reviewing and testing.

          Show
          Eloy Lafuente (stronk7) added a comment - - edited Report 001: Definitively, there is something strange/wrong within the patch as far as the 3 times I've installed it (m24-compact), I end with an infinite redirect between admin/index and admin/upgradesettings. Fixing this is a must. Workaround, avoid going to admin/index and continue reviewing and testing.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Report 002: The most I think about it, the more useful I find to have separate store and lock plugins. Right now if the store implements locking, it's used, else the default "locking instance" (doh?) is used. Instead we should have stores and lock plugins implemented separately and then, on each cache instance, define which store and which lock plugins are going to be used. Allowing a default one if desired too. But there are use cases for that. Imagine one shared file store (accessed by N web servers). Perhaps (because of the sharing method used) we cannot use one file locking there, and should switch that instance to DB advisory locking or another locking plugin.

          This has been already commented with Sam and he is analyzing the impact of such a change. As said, I think it will lead to clearer implementations and much more flexibility of the system under complex environments. Surely this is my main point.

          Show
          Eloy Lafuente (stronk7) added a comment - Report 002: The most I think about it, the more useful I find to have separate store and lock plugins. Right now if the store implements locking, it's used, else the default "locking instance" (doh?) is used. Instead we should have stores and lock plugins implemented separately and then, on each cache instance, define which store and which lock plugins are going to be used. Allowing a default one if desired too. But there are use cases for that. Imagine one shared file store (accessed by N web servers). Perhaps (because of the sharing method used) we cannot use one file locking there, and should switch that instance to DB advisory locking or another locking plugin. This has been already commented with Sam and he is analyzing the impact of such a change. As said, I think it will lead to clearer implementations and much more flexibility of the system under complex environments. Surely this is my main point.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Report 003: About instances configurations. I think it's a must that all store (and lock if implemented) plugins never must assume they are going to use "moodle-resources" (i.e: dataroot, DB...) but must be able to specify their own always. The stores I've tried until now (file, mongo) seem to fulfill this point, perfect. Just annotating this property is a must for any plugin.

          And, as an extension to this... perhaps also the current dataroot/muc/config.php should be configurable to be picked from another place (this is not a must).

          Show
          Eloy Lafuente (stronk7) added a comment - Report 003: About instances configurations. I think it's a must that all store (and lock if implemented) plugins never must assume they are going to use "moodle-resources" (i.e: dataroot, DB...) but must be able to specify their own always. The stores I've tried until now (file, mongo) seem to fulfill this point, perfect. Just annotating this property is a must for any plugin. And, as an extension to this... perhaps also the current dataroot/muc/config.php should be configurable to be picked from another place (this is not a must).
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Report 004: We cannot have one subsystem called "cache" pointing to "cache" and one plugin type also called "cache" pointing to "cache/stores". NP leaving the subsystem as is and moving the plugin type to "cachestore" IMO.

          At the same time, we need the "cache" category to be made valid/available @ the Core APIs page. Linking to proper documentation if possible.

          Show
          Eloy Lafuente (stronk7) added a comment - Report 004: We cannot have one subsystem called "cache" pointing to "cache" and one plugin type also called "cache" pointing to "cache/stores". NP leaving the subsystem as is and moving the plugin type to "cachestore" IMO. At the same time, we need the "cache" category to be made valid/available @ the Core APIs page. Linking to proper documentation if possible.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Report 005: The prechecker found some indentation/whitespace/length problems (and some warnings about inline docs). Plz, review (non public-access link:).

          Show
          Eloy Lafuente (stronk7) added a comment - Report 005: The prechecker found some indentation/whitespace/length problems (and some warnings about inline docs). Plz, review ( non public-access link: ).
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Report 006: Playing with testing instances, and running the performance test, 2 little points:

          006A) After saving the memcache/memcached configuration (servers), one "blank" page is shown (blank meaning, no setting nor "settings saved" is shown).

          006B) With the memcached test execution, I'm getting:

          Warning: Invalid argument supplied for foreach() in cache/stores/memcached/lib.php on line 107
          

          and the "similar" memcache one works without warning, so I bet something is being handled differently when saving/reading the configuration or so.

          Show
          Eloy Lafuente (stronk7) added a comment - Report 006: Playing with testing instances, and running the performance test, 2 little points: 006A) After saving the memcache/memcached configuration (servers), one "blank" page is shown (blank meaning, no setting nor "settings saved" is shown). 006B) With the memcached test execution, I'm getting: Warning: Invalid argument supplied for foreach() in cache/stores/memcached/lib.php on line 107 and the "similar" memcache one works without warning, so I bet something is being handled differently when saving/reading the configuration or so.
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited

          Report 007: Other observations:

          007A) While I don't oppose to have specialized methods asking for a given feature from a cache_store (supports_native_ttl()....), there is also the get_supported_features() that leads to sort of "duplicate API" for features. IMO, all the features should be centralized somewhere and later, specialized methods (if useful for a more natural/verbose API) should be using that central place. Not critical at all.

          That or we need to rename one of the 2 groups of "features" to something different ("options" or whatever). For me the 1st solution is better central + specialized methods.

          007B) In the integration with core, I've read how it interacts when running unit tests. For sure it requires some "reset" system like we have for some of the current caches (session, dataroot, DB..). This is simply one reminder about that, surely the best person to ensure they are being reset properly and fulfilling all needs is Petr. Is the reset() method enough to guarantee everything (the stores contents too) has been cleaned for next test case execution?

          007C) About cacheable areas, I think any candidate should have a strong get/set API before adding MUC caching to it. Example, the "config" cache. There are code here and there that performs "custom" actions against the data. Some blocks or messaging are manually deleting from table the their config.... that should NOT be allowed IMO. And 1st step should be to ensure all uses are centralized and then apply the MUC to those central places, instead of spreading them to a lot of places). Easy to forget/make a mistake.

          007D) The DB columns cache reveals one question/potential problem. It's great to have one application cache for them, 100% agree, just I don't get if it really helps to use that application cache always (for each request to column specs) or it's better to continue using one static cache as intermediate (per request) cache. I mean, if in one looong request (say cron). We perform 1000 queries against the forum table, that will imply 1000 accesses to the application cache (file system by default), instead of the old 1 access to DB (really slow) + 999 accesses to static cache (blazing fast). I mean... perhaps some "hybrid mode" is more interesting there? Note I just used the DB columns as an example, I've not done any performance comparison, but in my mind it sounds like a good candidate to have 2 caches at the same time (and that may apply to other areas too). Only using it to reveal some use case where application caching could end performing worse than the original (in theory).

          Show
          Eloy Lafuente (stronk7) added a comment - - edited Report 007: Other observations: 007A) While I don't oppose to have specialized methods asking for a given feature from a cache_store (supports_native_ttl()....), there is also the get_supported_features() that leads to sort of "duplicate API" for features. IMO, all the features should be centralized somewhere and later, specialized methods (if useful for a more natural/verbose API) should be using that central place. Not critical at all. That or we need to rename one of the 2 groups of "features" to something different ("options" or whatever). For me the 1st solution is better central + specialized methods. 007B) In the integration with core, I've read how it interacts when running unit tests. For sure it requires some "reset" system like we have for some of the current caches (session, dataroot, DB..). This is simply one reminder about that, surely the best person to ensure they are being reset properly and fulfilling all needs is Petr. Is the reset() method enough to guarantee everything (the stores contents too) has been cleaned for next test case execution? 007C) About cacheable areas, I think any candidate should have a strong get/set API before adding MUC caching to it. Example, the "config" cache. There are code here and there that performs "custom" actions against the data. Some blocks or messaging are manually deleting from table the their config.... that should NOT be allowed IMO. And 1st step should be to ensure all uses are centralized and then apply the MUC to those central places, instead of spreading them to a lot of places). Easy to forget/make a mistake. 007D) The DB columns cache reveals one question/potential problem. It's great to have one application cache for them, 100% agree, just I don't get if it really helps to use that application cache always (for each request to column specs) or it's better to continue using one static cache as intermediate (per request) cache. I mean, if in one looong request (say cron). We perform 1000 queries against the forum table, that will imply 1000 accesses to the application cache (file system by default), instead of the old 1 access to DB (really slow) + 999 accesses to static cache (blazing fast). I mean... perhaps some "hybrid mode" is more interesting there? Note I just used the DB columns as an example, I've not done any performance comparison, but in my mind it sounds like a good candidate to have 2 caches at the same time (and that may apply to other areas too). Only using it to reveal some use case where application caching could end performing worse than the original (in theory).
          Hide
          Eloy Lafuente (stronk7) added a comment -

          And that's all I've found up to now. Hope it helps. Otherwise, it's looking really clever/complete/cool. Great work!

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - And that's all I've found up to now. Hope it helps. Otherwise, it's looking really clever/complete/cool. Great work! Ciao
          Hide
          Tim Barker added a comment -

          Hi, I have the perfcomp tool set up and running reliably in the nightly build. I have several scenarios running against different branches. I decided upon those scenarios myself and they work great as a proof of concept. I do however need some input into what scenarios we should be testing for MUC.

          Current scenarios are:

          1. 100 users 5 loops with 1 new thread every 10 seconds.
          2. 1 user 5 loops (no thread rampup).
          3. 5 users 30 loops with 1 new thread every 10 seconds (probably going to can this one).
          4. 1 user 30 loops (no thread rampup).

          I am running against MySQL and Postgres, both on 2.3 integration and integration master.

          Can anyone add to this?

          Show
          Tim Barker added a comment - Hi, I have the perfcomp tool set up and running reliably in the nightly build. I have several scenarios running against different branches. I decided upon those scenarios myself and they work great as a proof of concept. I do however need some input into what scenarios we should be testing for MUC. Current scenarios are: 100 users 5 loops with 1 new thread every 10 seconds. 1 user 5 loops (no thread rampup). 5 users 30 loops with 1 new thread every 10 seconds (probably going to can this one). 1 user 30 loops (no thread rampup). I am running against MySQL and Postgres, both on 2.3 integration and integration master. Can anyone add to this?
          Hide
          Tim Barker added a comment -

          FYI, the run parameters are passed as vars from the Jenkins job so it takes me all of 2 mins to create new tests.

          Show
          Tim Barker added a comment - FYI, the run parameters are passed as vars from the Jenkins job so it takes me all of 2 mins to create new tests.
          Hide
          Sam Hemelryk added a comment -

          Thanks for looking at that Eloy, and thank you Adam for pointing out the reference issue (I've added a commit to address that and unit tests for it).

          The following are my thoughts/changes re whats happened here.

          R01: Absolutely, see commit eba5cc4
          R02: Agreed - I will work on that today
          R03: Yes 100%, all plugins (at least within core) should be stand alone and should not rely on any API's within Moodle (other than the cache API obviously)
          R04: Agreed - I will rename cache plugin type to cachestore.
          R05: Will deal with those today.

          Commits since review

          R01 + R06A:
          a2011ed MDL-25290 cache: Fixed up redirect loop with admin settings for cache stores
          eba5cc4 MDL-25290 cache: Fixed up issue with unit test blowing away cachedir

          • Moved settings from server.php => plugins.php
          • Added a cache stores category in preparation of cache lock plugins coming shortly.
          • Changed require_once to include in order to resolve ping-pong between index and upgrade settings.
          • Changed rows for the test server settings to 3 rather than defaul 5.

          R04:
          5746477 MDL-25290 cache: Renamed plugin from cache => cachestore

          R02 + R07b:
          d92c77f MDL-25290 cache: Added cache locking plugin and converted locking implementations to that
          46b819b MDL-25290 cache: Added UI to view the cache lock setups and tidied up a couple of things

          R03:
          471b458 MDL-25290 cache: Added optional CFG setting to control location of cache config file

          R05:
          22dec09 MDL-25290 cache: Fixed things up per coding style

          R06b:
          7e6d2df MDL-25290 cache: Fixed up memcache server settings and a couple of minor things

          Reference issues:
          f71f5be MDL-25290 cache: Fixed up handling of objects by cache loader

          On the todo list

          R07A
          Yes agreed, this area is a little messy/convoluted at the moment. I don't have any real great ideas there presently but no doubt there will be a great solution out there to it.

          R07C
          Definitely, in areas where we have an API but it is not being used consistently the conversion to the API should be a priority as well. Although of course we will have to assess the viability in each case.

          R07D
          Certainly, each conversion needs to be closely looked at and testing for things like that.

          Show
          Sam Hemelryk added a comment - Thanks for looking at that Eloy, and thank you Adam for pointing out the reference issue (I've added a commit to address that and unit tests for it). The following are my thoughts/changes re whats happened here. R01: Absolutely, see commit eba5cc4 R02: Agreed - I will work on that today R03: Yes 100%, all plugins (at least within core) should be stand alone and should not rely on any API's within Moodle (other than the cache API obviously) R04: Agreed - I will rename cache plugin type to cachestore. R05: Will deal with those today. Commits since review R01 + R06A: a2011ed MDL-25290 cache: Fixed up redirect loop with admin settings for cache stores eba5cc4 MDL-25290 cache: Fixed up issue with unit test blowing away cachedir Moved settings from server.php => plugins.php Added a cache stores category in preparation of cache lock plugins coming shortly. Changed require_once to include in order to resolve ping-pong between index and upgrade settings. Changed rows for the test server settings to 3 rather than defaul 5. R04: 5746477 MDL-25290 cache: Renamed plugin from cache => cachestore R02 + R07b: d92c77f MDL-25290 cache: Added cache locking plugin and converted locking implementations to that 46b819b MDL-25290 cache: Added UI to view the cache lock setups and tidied up a couple of things R03: 471b458 MDL-25290 cache: Added optional CFG setting to control location of cache config file R05: 22dec09 MDL-25290 cache: Fixed things up per coding style R06b: 7e6d2df MDL-25290 cache: Fixed up memcache server settings and a couple of minor things Reference issues: f71f5be MDL-25290 cache: Fixed up handling of objects by cache loader On the todo list R07A Yes agreed, this area is a little messy/convoluted at the moment. I don't have any real great ideas there presently but no doubt there will be a great solution out there to it. R07C Definitely, in areas where we have an API but it is not being used consistently the conversion to the API should be a priority as well. Although of course we will have to assess the viability in each case. R07D Certainly, each conversion needs to be closely looked at and testing for things like that.
          Hide
          Adam Olley added a comment -

          Hi Sam,

          I've had a look at your code that checks if it needs to serialize or not, we actually attempted something similar. The problem with that approach is that if the object/array/whatever has enough elements in it, then the cost of performing is_array and instanceof's on each and every element quickly outweighs simply serializing it. For simple items it's not a problem. But still...

          Have you done any performance tests to time the difference between just [un]serializing anything non-scalar compared to your recursive check/deepclone method for the various items your muc implementation already caches?

          Show
          Adam Olley added a comment - Hi Sam, I've had a look at your code that checks if it needs to serialize or not, we actually attempted something similar. The problem with that approach is that if the object/array/whatever has enough elements in it, then the cost of performing is_array and instanceof's on each and every element quickly outweighs simply serializing it. For simple items it's not a problem. But still... Have you done any performance tests to time the difference between just [un] serializing anything non-scalar compared to your recursive check/deepclone method for the various items your muc implementation already caches?
          Hide
          Sam Hemelryk added a comment -

          Hi Adam,

          Testing is on the agenda for today, unfortunately I did not have time yesterday to finish running tests.
          In writing the recursive search + clone methods yesterday I had in the back of my mind that perhaps serialisation is simply the road to take, especially given the current approach involves have to read through the data structure twice in order to perform a clone.
          I'll run testing today and see whether we can rule out the clone/serialisation approach, for the time being the important thing is that the reference issue is now tested for and dealt with. Perhaps (and likely) the solution can still be improved of course.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Hi Adam, Testing is on the agenda for today, unfortunately I did not have time yesterday to finish running tests. In writing the recursive search + clone methods yesterday I had in the back of my mind that perhaps serialisation is simply the road to take, especially given the current approach involves have to read through the data structure twice in order to perform a clone. I'll run testing today and see whether we can rule out the clone/serialisation approach, for the time being the important thing is that the reference issue is now tested for and dealt with. Perhaps (and likely) the solution can still be improved of course. Cheers Sam
          Hide
          Tony Levi added a comment -

          Just a note - you need to handle this problem on the 'get' side as somebody can get anything out of the cache and start unsetting properties. A serial/unserialize pair in the put won't solve it - you need to pay the unserialize on every get.

          Unfortunately this all makes the static cache much more 'heavy' than some use cases would like; it would be neat if this was optional so places using static cache as a dead simple key/value store in loops don't have to pay...

          Show
          Tony Levi added a comment - Just a note - you need to handle this problem on the 'get' side as somebody can get anything out of the cache and start unsetting properties. A serial/unserialize pair in the put won't solve it - you need to pay the unserialize on every get. Unfortunately this all makes the static cache much more 'heavy' than some use cases would like; it would be neat if this was optional so places using static cache as a dead simple key/value store in loops don't have to pay...
          Hide
          Tim Hunt added a comment -

          Tim B, in those perf tests, how big in the DB? It makes a big difference if the data you mention is the only data in the DB, or if you are running a few users/courses, out of a database with 100,000 users, 10,000 courses, etc.

          Show
          Tim Hunt added a comment - Tim B, in those perf tests, how big in the DB? It makes a big difference if the data you mention is the only data in the DB, or if you are running a few users/courses, out of a database with 100,000 users, 10,000 courses, etc.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Hi,

          beyond some minor details that I shared with Sam last Friday... I'd give my thumbs up to this from a infrastructure point of view. Here are the points that should be fixed before landing, hopefully all them minor:

          • UI: getting "Notice: Undefined index: locks in /cache/forms.php on line 49"
          • UI: the counters of instances/definitions are incorrect. Sure there are some issue beyond that.
          • UI: some labels would need a bit of love (plugins => cache store plugins...).
          • PHPUnit: Tried phpunit's code coverage when running the /cache tests and it seems to be keeping a huge % uncovered.
          • Git: Somehow, each time I pull the "compact" branch it continue borking my git repository, detecting local/codechecker as a submodule, although the .gitsubmodules file isn't there any more. Perhaps something is going with those commits and they need to be replayed in some way to keep them 100% clean of any submodule handling. Or perhaps it's my local git repo, I've not tried with a fresh clone.
          • Test: Install, upgrade, phpunit, random usage should be specicified for sure, and perhaps also, playing with the cache UI and switching to the current memcache/memcached/mongodb plugins to verify all them work ok.

          Note I've not played too much with current cache definitions (lang strings, db metadata and config), but the system seems to be working here ok, apparently.

          So perhaps it's ok to allow it to land (after fixing the points above), and then define the plan to follow along the coming weeks. It should include, at least (to avoid forgetting):

          • Improved phpunit & performance testing.
          • More store and locking plugins.
          • Some more uses of the cache for 2.4, although I'm not about to spread it everywhere for 2.4.0. I think it's far better to have only 4-5 uses and really center our efforts about profiling, improving and measuring them (together with the MUC implementation itself).
          • Also I'd create a LONGTEST, not needing the whole Moodle page bootstrap, but only the cache stuff to 1) verify the cache subsystem does not have dependencies of Moodle and 2) be able to profile and compare execution of that test without interferences from other parts of Moodle. Such test could be used both for profiling (1 thread) and concurrency (via jmeter or others).

          Sure I'm missing some points and I'll perform a final review/self-testing this week. If the points above are fixed and there are not anything new causing some red-flag, this is my +1 for this to land. Then, as said, the rest.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Hi, beyond some minor details that I shared with Sam last Friday... I'd give my thumbs up to this from a infrastructure point of view. Here are the points that should be fixed before landing, hopefully all them minor: UI: getting "Notice: Undefined index: locks in /cache/forms.php on line 49" UI: the counters of instances/definitions are incorrect. Sure there are some issue beyond that. UI: some labels would need a bit of love (plugins => cache store plugins...). PHPUnit: Tried phpunit's code coverage when running the /cache tests and it seems to be keeping a huge % uncovered. Git: Somehow, each time I pull the "compact" branch it continue borking my git repository, detecting local/codechecker as a submodule, although the .gitsubmodules file isn't there any more. Perhaps something is going with those commits and they need to be replayed in some way to keep them 100% clean of any submodule handling. Or perhaps it's my local git repo, I've not tried with a fresh clone. Test: Install, upgrade, phpunit, random usage should be specicified for sure, and perhaps also, playing with the cache UI and switching to the current memcache/memcached/mongodb plugins to verify all them work ok. Note I've not played too much with current cache definitions (lang strings, db metadata and config), but the system seems to be working here ok, apparently. So perhaps it's ok to allow it to land (after fixing the points above), and then define the plan to follow along the coming weeks. It should include, at least (to avoid forgetting): Improved phpunit & performance testing. More store and locking plugins. Some more uses of the cache for 2.4, although I'm not about to spread it everywhere for 2.4.0. I think it's far better to have only 4-5 uses and really center our efforts about profiling, improving and measuring them (together with the MUC implementation itself). Also I'd create a LONGTEST, not needing the whole Moodle page bootstrap, but only the cache stuff to 1) verify the cache subsystem does not have dependencies of Moodle and 2) be able to profile and compare execution of that test without interferences from other parts of Moodle. Such test could be used both for profiling (1 thread) and concurrency (via jmeter or others). Sure I'm missing some points and I'll perform a final review/self-testing this week. If the points above are fixed and there are not anything new causing some red-flag, this is my +1 for this to land. Then, as said, the rest. Ciao
          Hide
          Sam Hemelryk added a comment -

          Hi guys,

          I've just updated the branch after having chatted to Eloy on Friday.
          I've dealt with the reference issues when returning from the cache as well as several things Eloy noted.
          In summary the following changes have been made.

          • Fixed up undefined index when editing store instance.
          • Removed the references to the code checker I had accidentally included (rewrote history to remove all traces)
          • Added many more unit tests around cache API admin.
          • Fixed up a couple of typo's and loose ends identified by the new unit tests.
          • Renamed some admin methods to reflect clearer the store instance and store plugin separation.
          • Handled reference issues when returning information and added unit tests to cover.

          Also as I was terribly slack and did not repost about the results from the testing the recursive cloning vs. serialising everything.
          Given the three conversion I have made already which are all using arrays (resulting in recusion) but containing just scalars (so not requiring cloning or serialisation) the recursion results are better.
          However those are three limited uses and for sure this is something we will have to continue monitoring.

          Just to summarise a couple of ideas Eloy had as well so that they are not forgotten:

          1. As this is stand alone we should create use scripts that are entirely separate from Moodle. These scripts could be used to much more accurately look at how the code is executing (using profilers etc) and how we could continue to improve it.
            I've reviewed it a couple of times using phpxprof but that would really paint a clearer picture.
          2. We need to continue looking at the strings, Eloy noted I switched between plugin and store several times where I shouldn't have. I've fixed the ones I noted quickly in the UI but we should keep an eye out for other places that could be corrected (as well as any general lang string improvements).
          3. More unit tests - can you ever have enough?
          4. The counts of things on the config page are out because of the defaults, I'll need to either adjust the wording or improve the counts going on there (they're basic presently).

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi guys, I've just updated the branch after having chatted to Eloy on Friday. I've dealt with the reference issues when returning from the cache as well as several things Eloy noted. In summary the following changes have been made. Fixed up undefined index when editing store instance. Removed the references to the code checker I had accidentally included (rewrote history to remove all traces) Added many more unit tests around cache API admin. Fixed up a couple of typo's and loose ends identified by the new unit tests. Renamed some admin methods to reflect clearer the store instance and store plugin separation. Handled reference issues when returning information and added unit tests to cover. Also as I was terribly slack and did not repost about the results from the testing the recursive cloning vs. serialising everything. Given the three conversion I have made already which are all using arrays (resulting in recusion) but containing just scalars (so not requiring cloning or serialisation) the recursion results are better. However those are three limited uses and for sure this is something we will have to continue monitoring. Just to summarise a couple of ideas Eloy had as well so that they are not forgotten: As this is stand alone we should create use scripts that are entirely separate from Moodle. These scripts could be used to much more accurately look at how the code is executing (using profilers etc) and how we could continue to improve it. I've reviewed it a couple of times using phpxprof but that would really paint a clearer picture. We need to continue looking at the strings, Eloy noted I switched between plugin and store several times where I shouldn't have. I've fixed the ones I noted quickly in the UI but we should keep an eye out for other places that could be corrected (as well as any general lang string improvements). More unit tests - can you ever have enough? The counts of things on the config page are out because of the defaults, I'll need to either adjust the wording or improve the counts going on there (they're basic presently). Many thanks Sam
          Hide
          Martin Dougiamas added a comment -

          I really hope this is landing this week ....

          Show
          Martin Dougiamas added a comment - I really hope this is landing this week ....
          Hide
          Eloy Lafuente (stronk7) added a comment - - edited
          domingo, 14 de octubre de 2012, 23:58:38 CEST
          

          Your hopes are holes in my armour
          (not sure what it means but sounded nice in my mind, lol)

          So here it's. It does not break install/upgrade/phpunit so... landed.
          (note there are some present problems I'm discussing with SamH yet)

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - - edited domingo, 14 de octubre de 2012, 23:58:38 CEST Your hopes are holes in my armour (not sure what it means but sounded nice in my mind, lol) So here it's. It does not break install/upgrade/phpunit so... landed. (note there are some present problems I'm discussing with SamH yet) Ciao
          Hide
          Martin Dougiamas added a comment -

          Now the subtasks are coming ...

          Show
          Martin Dougiamas added a comment - Now the subtasks are coming ...
          Hide
          Martin Dougiamas added a comment - - edited

          I've moved all the subtasks for this to a new parent bug Stage 2 (MDL-34224) so that this one can be closed normally as Stage 1.

          Show
          Martin Dougiamas added a comment - - edited I've moved all the subtasks for this to a new parent bug Stage 2 ( MDL-34224 ) so that this one can be closed normally as Stage 1.
          Hide
          Dan Poltawski added a comment -

          We've had install/upgrade/phpunit tests run by ciserver

          Show
          Dan Poltawski added a comment - We've had install/upgrade/phpunit tests run by ciserver
          Hide
          Aparup Banerjee added a comment -

          Your issue has dug up some gold.
          It works great i've been told.
          Go forth, be brave, be bold.

          yay! "All your thoughts are belong to everyone."

          Thanks and ciao!

          Show
          Aparup Banerjee added a comment - Your issue has dug up some gold. It works great i've been told. Go forth, be brave, be bold. yay! "All your thoughts are belong to everyone." Thanks and ciao!
          Hide
          Nadav Kavalerchik added a comment -

          Two links, that might be useful to anyone considering testing these new features on their own servers:

          Moodle performance testing: how much more horsepower do each new versions of Moodle require?

          http://www.iteachwithmoodle.com/2012/10/12/moodle-performance-testing-how-much-more-horsepower-do-each-new-versions-of-moodle-require/

          How to load test your Moodle server using Loadstorm

          http://www.iteachwithmoodle.com/2012/10/11/how-to-stress-test-your-moodle-server-using-loadstorm/

          Show
          Nadav Kavalerchik added a comment - Two links, that might be useful to anyone considering testing these new features on their own servers: Moodle performance testing: how much more horsepower do each new versions of Moodle require? http://www.iteachwithmoodle.com/2012/10/12/moodle-performance-testing-how-much-more-horsepower-do-each-new-versions-of-moodle-require/ How to load test your Moodle server using Loadstorm http://www.iteachwithmoodle.com/2012/10/11/how-to-stress-test-your-moodle-server-using-loadstorm/
          Hide
          Helen Foster added a comment -

          Thanks to everyone who has worked on MUC stage 1!

          'Moodle Docs for this page' links on all caching pages in Moodle 2.4 (cache/admin.php etc) now redirect to http://docs.moodle.org/en/Caching

          If anyone can help with documentation for admins on this new feature it would be much appreciated!

          Show
          Helen Foster added a comment - Thanks to everyone who has worked on MUC stage 1! 'Moodle Docs for this page' links on all caching pages in Moodle 2.4 (cache/admin.php etc) now redirect to http://docs.moodle.org/en/Caching If anyone can help with documentation for admins on this new feature it would be much appreciated!
          Hide
          Mary Cooch added a comment -

          Removing docs_required label as I think this has been documented in the link given. (Please correct me if I'm mistaken.)

          Show
          Mary Cooch added a comment - Removing docs_required label as I think this has been documented in the link given. (Please correct me if I'm mistaken.)

            People

            • Votes:
              24 Vote for this issue
              Watchers:
              38 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: