|
|
|
Ah - I had removed a paragraph about controversy - shouldn't have. I think it's fair to say that Eloy agrees on the OOP (not sure about Penny's position on OOP, her original proposal was to stick to functional).
And I'm the only one of the 3 who thinks the dual API is doable / desirable ;-) -- I think Tim was in agreement that we should at least try to make the transition easy. But beer may have been influencing him! The other thing supporting OO for this is for unit testing ... it allows unit tests to override these functions so that we can test Moodle functionality (eg gradebook) against mock objects without affecting live databases.
I'm all for the plan described above, as long as we get agreement from a lot of core developers. Make some noise! Can someone also post about the benefits of placeholders in general and why we are doing this? I think this would really help and it may as well be documented here. I also dislike the idea of the dual api, surely that just looses all the security advantages in one fell swoop?
Also, whoever writes this up must contain a link to http://xkcd.com/327/ :P I am definitely in favour of this. The destination has to be an OOP interface with placeholders. I think placeholders speaks for itself. OOP needs more justification, but here are the benefits I see:
1) Allows mocking of functions for unit testing. Without this, unit testing Moodle is hard to impossible in almost all legacy areas (e.g. quiz). Note that the one area of Moodle with good unit tests is the gradebook that was just rewritten from scratch. 2) It enables you to have connections to more than one database (possibly with different prefixes). That helps enrolment plugins. Also it enables other things. I'll add an example later. As to how we get to this destination. Well I don't have any good ideas myself, but it will clearly cause a lot of short-term pain to a lot of people, so it is good that other people are thinking about this and seem to be coming up with good thoughts. OK, an example of a interesting use of OOP datalib. This is something we have considered doing here using a nasty trick like messing with the $db global. You know the way the moodle.org grinds to a halt for a few minutes each day when it sends out the forum cron emails, because that hammers the DB. Well, to avoid that, you could consider setting up a second database that is a read-only mirror of the first, and run all the huge read-only reporting SQL against that, in cases where you don't really mind if the data show is missing the last few minutes live data. So logs and stats reports, quiz and gradebook exports (not on screen where current data is more important and it is paged) some parts of cron etc. To enable this, inside Moodle you would have two objects, $mdb and $reportingdb. Normally $reportingdb would connect to the same database as $mdb, but to ensure developers don't make mistakes, it would have to use a datalib subclass that gives an error if you call any of write datalib functions. Then gradually, you could move a few areas of the code to the new $reportingdb class for some queries, and some admins could set up the necessary replication, change the $reportingdb connection, and start moving the reporting load off of the main transactional database. That might even let you tune the two servers better. Apparently the OU uses this sort of setup for a lot of our other system, which is why we thought about doing the same for Moodle. Quick reply to Dan - "little Danny tables" Poltawski? ;-) - that particular strip is a favourite around here. And no - supporthing the old API would not be a sec problem from the DB point of view because we _can_ run most of the legacy functions through the new ones, which will use placeholders. The ones we cannot are the _sql() ones - which are extremely rare outside of core code.
In which case, we might decide it is better to remove the _sql legacy functions, so code breaks instead of becoming insecure. That is probably the better trade off, especially as the fix of switching to the new API will be very easy.
Another question. Would the switch to a new OO database API be the time to introduce more transactions to Moodle? +100 for transactions. But IIRC, (like FKs) mysql needs to switch to use the innodb storage engine for any sensible support. :-( I've not seen any past thoughts on this?
Does mssql support transactions? The only issue I have with OOP is that whenever you want to call any function that's currently available in the global scope (eg get_records) you'll have to global $mdb every time you want to use it.
However, having a new $mdb object that's INTERNAL to dmllib, much like the current $db object, I have no problem with, as long as its creation can be overridden for unit tests / the scenario that Tim mentions. I really don't think having two APIs is sensible. If scripts have to declare a constant to say they want magic quotes and aren't going to use placeholders, firstly you end up with conditional addslashes everywhere, which is stupid and not sustainable, but more importantly , the second a script forgets to say "Hey, I don't want addslashes", you immediately have a security problem. Also, at some point someone commented that there are hardly any calls to _sql, here's a quick & dirty stats from 1.9:
penny@mickey:~/src/moodle/mdl19$ grep -r '_sql(' * | wc -l 2578 Two quick points
> the second a script forgets to say "Hey, I don't want addslashes", you immediately have a security problem. If the old api routes its calls through the new placeholder-based API, then _no_, you do not have a security problem. > at some point someone commented that there are hardly any calls to _sql ... in _contrib_! Of course we want to replace the API in core Moodle -- what we are arguing here is keeping some backwards compat. A quick grep shows that there are more instances of _sql() that I expected. Still, a lot of plugins and modules don't use it. ~/public_html/moodle_contrib$ grep -r --mmap -l '_sql' * | wc -l 345 > If the old api routes its calls through the new placeholder-based API, then _no_, you do not have a security problem.
err, sorry, but no. If you're using _sql functions you can't pass through new API. If you're going to the effort of changing ALL the calls to _sql (which is what I am advocating), then why would you bother keeping the old API? I guess it hasn't been clear enough -- what we are discussing is
- Moodle going 100% new API - Still supporting parts of the old API (parts being "not the _sql() calls") to ease the transition for contrib modules We all agree that _sql() calls are not routable via tha new API, but -- I have written -- it is not a major issue because most DB usage happens with calls we can be backwards compatible with. Why would we bother? Contrib and the myriad of in-house modules and plugins out there. > Still supporting parts of the old API (parts being "not the _sql() calls") to ease the transition for contrib modules
that's all fine, except that you're still breaking the API for the old modules in two places: - removing _sql - forcing conditional addslashes/stripslashes by forcing them to declare some constant. Modules that take no action on either part will result in broken contrib mods (1) or a security problem (2) either way, it's better to do it once properly. As far as: > I guess it hasn't been clear enough -- what we are discussing is Thanks, it was very unclear what we were talking about. Silly me I thought we were arguing about mauve vs purple for the new database :P :P Also, it has just occurred to me that it's not just the _sql functions that are a problem, it's _select too (as far as talking about functions that pass raw values to database and can't go through new API)
And using your contrib since I don't have a full checkout right now :P I have: [penny@mltest] /home/martin/public_html/moodle_contrib$ grep -r '_select(' * | wc -l 717 :( |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
I mentioned the discussion on this with Tim Hunt at Moot UK - outline of what I remember
- switch core moodle + modules to a the new OO dmllib, with calls like $mdb->get_records()
- with a bit of work, it can allow us to keep legacy function calls like get_records() around without conflict for a while
- it also allows us to rework things like what parameters they take - so we can re-design the API and drop things that have been bad ideas.
The main conflict is around addslashes() and magic_quotes. Here's what I recall (IIRC!)
- during the transition (while working), keep the automatic magic_quotes on
- converted files can define NOPLEASENOMAGICQUOTES just before including config.php so lib/setup.php skips them
- the legacy dmllib, when running insert_record/update_record()/select* runs striplashes and passes the call on to the new OO dmllib (which will use placeholders) - no security risk there - at the most we might strip an extra backslash accidentally
- once we are done, disable the automatic magic quotes and put warnings in the old dmllib calls - provide a switch in admin to enable auto magicquotes for legacy modules
One of the big benefits of the above plan is that - even if we remove it for the released 2.0, it means that the work can be more gradual, while having a working HEAD. Avoids huge all-or-nothing patches and hard-to-maintain dev branches.