Moodle
  1. Moodle
  2. MDL-15635

[gsoc] database migration utility

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0
    • Fix Version/s: 2.3.2
    • Component/s: Database SQL/XMLDB
    • Labels:
      None
    • Database:
      Any
    • Affected Branches:
      MOODLE_20_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE
    • Rank:
      632

      Description

      Utility to completely export and import a moodle database in a db-neutral format/

      After a long discussion, Petr and I agreed on the following:

      1. Data should not use STATEMENT but instead we will add a new part to the XMLDB file, CONTENT or something similar which contains all the data (Eloy to help here to decide the exact format)

      2. Current pieces of data in the XMLDB file should move from STATEMENT to CONTENT so that they are handled by dml rather than ddl.

      3. The export file should contain both schema and data, but the schema should come from the code install.xml, and should not be used when importing, but used to compare with the target system code.xml (so as to not try to import data into a schema that does not fit it)

      4. Introduction of new $DB->import_record function (used to import CONTENT parts of install.xml)

      5. Introduction of new contract reset_sequence function (for given table or all tables) to be implemented per driver (postgres will set sequence value, mysql will probably do nothing, not sure about ora or mssql yet)

      6. Different ways to export and import moodle database:

      • During moodle installation, add hook to install from a file or a remote database (before first steps that insert data, ideally)
      • Admin UI - blow away current database (truncate tables) and import data from file or connection to remote database
      • Admin UI - export current database to file, or into remote database

      General notes:

      Schema should never be imported, only used for comparison.
      Export file should contain moodle version (it is in data content already but it should also be in xmldb if not already) and this should be checked at import time.

      Andrei - I would like you to create subtasks for each individual component here you can see and write a specification in the wiki before you write any more code here. I am aware this is different to what you've already implemented (although I hope you'll be able to reuse much of the code) and I want to avoid this happening in the future ( this is definitely at least my fault in part for not getting back to you sooner )

        Issue Links

          Activity

          Penny Leach created issue -
          Petr Škoda made changes -
          Field Original Value New Value
          Link This issue has a non-specific relationship to MDL-15071 [ MDL-15071 ]
          Hide
          Andrei Bautu added a comment -

          On the "Schema should never be imported, only used for comparison" statement: importing the schema might be a good thing though. Imagine this scenario: you have a user how uses Moodle without knowledge of phpmyadmin or mysql tool; (s)he installed Moodle by providing the database params received from the hosting provider. after a code update that trigger a database update, the db transfer tool will refuse to import an older database because the versions do not match.

          Show
          Andrei Bautu added a comment - On the "Schema should never be imported, only used for comparison" statement: importing the schema might be a good thing though. Imagine this scenario: you have a user how uses Moodle without knowledge of phpmyadmin or mysql tool; (s)he installed Moodle by providing the database params received from the hosting provider. after a code update that trigger a database update, the db transfer tool will refuse to import an older database because the versions do not match.
          Hide
          Andrei Bautu added a comment -

          Message from me to Penny:

          -snip-snip-
          It's OK if I'll have to rewrite code, I don't mind, but I don't agree with some of the work directions there. Maybe you'll convince me otherwise

          Here are some things I think we should discus:

          1 & 2. About the CONTENT tag:

          • first of all, why should not use the STATEMENT tag? It's there and it does what we want it to do... store database content.
          • second, I tried to make the db migration tool compatible with the existing installation code and data. Adding a separate CONTENT tag might break this compatibility, unless changes are done in existing install.xml. I believe this is a bad idea, especially if we think to contrib modules.

          3. The export file currently contains the schema for the data taken from the install.xml files (for core and modules). I think data import should be like copying a file. It should happen in one of the following ways:

          • import the data directly into a blank database (this is already done in the tool)
          • drop existing tables in a non-blank database and import the data; this is currently done only for on-the-fly transfer

          We could also add you're suggested scenario (compare schemata and import or refuse data), but I think this is the least useful case. If we leave only this situation, the tool will be too restrictive. People won't be able to use their own data backups after a database upgrade (which should be possible to do). Another example: if you take you're database snapshot from an installation with some module and try to copy it on an installation without that module, you won't be able (because you don't have the required install.xml to check it against). I can think of other bad things that might happen.

          4. Same as 1 and 2. No new tags, unless proved necessary. Therefore, no import_record method.

          But if you really want it, I think it should be in the database_manager class where all the code for processing XMLDB objects is and because it's kind of database independent. Moreover, this method isn't something for every day use, like get_records, and putting it in moodle_database will do two things: make people abuse it and increase memory usage (not much, but 1k here, 1k there, soon we'll have 64M )

          5. The reset_sequence is useless if you'll agree with my opinions on point 3 (from the db transfer tool point of view, although it might be useful for other tasks).

          I can't check right now (my work PC is broken since Friday night ), but I think sqlite does not allow a sequence to be reset (unless dropping and recreating the table that generated it). For MySQL, this can only be done with DROP+CREATE or with TRUNCATE (which is the same thing). ALTER TABLE also works but with some limitations.

          6. I like this idea: "During moodle installation, add hook to install from a file".
          This is already done: "Admin UI - blow away current database (truncate tables) and import data from file".
          This is already done: "Admin UI - export current database to file"

          Things related to remote databases can be done, but "blindly" because we can't get a correct schema from a database (i.e. if you export a database's schema, you will not get the same xml data that was used to setup the database). However, currently you can export your source database as an XML file and import it in a target database. It's a two steps process.

          If you agree, I would post a message on moodle forums to ask people to test the current tool and comment on it.
          -snip-snip-

          Reply from Penny:
          -snip-snip-
          there's a lot in this to respond to.

          I definitely respect your right to disagree with what we've talked about but I think the conversation we have about this should be open (with input from Petr and Eloy as well especially since I spent so long today arguing about it, we should all be involved in decisions from now on I think)

          So can you please add your point of view to that bug and we can discuss it there tomorrow.

          I think that discussions of this nature are good because it thrashes out the best solution, and that it's important for them to be publicly archived so that people in future can go back and read through to understand why we choose a particular way.
          -snip-snip-

          Show
          Andrei Bautu added a comment - Message from me to Penny: - snip-snip - It's OK if I'll have to rewrite code, I don't mind, but I don't agree with some of the work directions there. Maybe you'll convince me otherwise Here are some things I think we should discus: 1 & 2. About the CONTENT tag: first of all, why should not use the STATEMENT tag? It's there and it does what we want it to do... store database content. second, I tried to make the db migration tool compatible with the existing installation code and data. Adding a separate CONTENT tag might break this compatibility, unless changes are done in existing install.xml. I believe this is a bad idea, especially if we think to contrib modules. 3. The export file currently contains the schema for the data taken from the install.xml files (for core and modules). I think data import should be like copying a file. It should happen in one of the following ways: import the data directly into a blank database (this is already done in the tool) drop existing tables in a non-blank database and import the data; this is currently done only for on-the-fly transfer We could also add you're suggested scenario (compare schemata and import or refuse data), but I think this is the least useful case. If we leave only this situation, the tool will be too restrictive. People won't be able to use their own data backups after a database upgrade (which should be possible to do). Another example: if you take you're database snapshot from an installation with some module and try to copy it on an installation without that module, you won't be able (because you don't have the required install.xml to check it against). I can think of other bad things that might happen. 4. Same as 1 and 2. No new tags, unless proved necessary. Therefore, no import_record method. But if you really want it, I think it should be in the database_manager class where all the code for processing XMLDB objects is and because it's kind of database independent. Moreover, this method isn't something for every day use, like get_records, and putting it in moodle_database will do two things: make people abuse it and increase memory usage (not much, but 1k here, 1k there, soon we'll have 64M ) 5. The reset_sequence is useless if you'll agree with my opinions on point 3 (from the db transfer tool point of view, although it might be useful for other tasks). I can't check right now (my work PC is broken since Friday night ), but I think sqlite does not allow a sequence to be reset (unless dropping and recreating the table that generated it). For MySQL, this can only be done with DROP+CREATE or with TRUNCATE (which is the same thing). ALTER TABLE also works but with some limitations. 6. I like this idea: "During moodle installation, add hook to install from a file". This is already done: "Admin UI - blow away current database (truncate tables) and import data from file". This is already done: "Admin UI - export current database to file" Things related to remote databases can be done, but "blindly" because we can't get a correct schema from a database (i.e. if you export a database's schema, you will not get the same xml data that was used to setup the database). However, currently you can export your source database as an XML file and import it in a target database. It's a two steps process. If you agree, I would post a message on moodle forums to ask people to test the current tool and comment on it. - snip-snip - Reply from Penny: - snip-snip - there's a lot in this to respond to. I definitely respect your right to disagree with what we've talked about but I think the conversation we have about this should be open (with input from Petr and Eloy as well especially since I spent so long today arguing about it, we should all be involved in decisions from now on I think) So can you please add your point of view to that bug and we can discuss it there tomorrow. I think that discussions of this nature are good because it thrashes out the best solution, and that it's important for them to be publicly archived so that people in future can go back and read through to understand why we choose a particular way. - snip-snip -
          Hide
          Petr Škoda added a comment -

          STATEMENT tag was designed for very simple inserts - it does not do proper inserts; my +1 to remove the STATEMENT tag completely and replace it with new section - fortunately backwards compatibility is not an issue here. I wonder what will Eloy say, I guess he is the one to decide whether to keep current STATEMENT tag and improve it OR remove it and add new CONTENT top level tag instead.

          Ddl is really not designed to handle content inserts, dml is designed to do that - any new content import code that does the actual database inserts must go there. The main reason is lobs and oracle hacks - each driver may use different code to send these into database == it does not belong into sql_generator, sorry. We can not use current $DB->insert_record because it can not support id field.

          I personally do not think that this should be extremely "easy to use" because it would be extremely easy to shoot yourself in the foot (or both feet) My +1 to design it restrictive first and then maybe relax it a bit.

          Reset of sequences is imho required, I do not understand how would you solve it without this - am I missing something? After using inserts with ID you need to adjust the sequences in some databases, right?

          We need some consistency check tool that compares existing table structure with xml file, the test should be also done before the export - recently we had many reports of broken database tables caused by the groups upgrade in 1.8.x which was uncovered later during upgrade to 1.9.x

          Show
          Petr Škoda added a comment - STATEMENT tag was designed for very simple inserts - it does not do proper inserts; my +1 to remove the STATEMENT tag completely and replace it with new section - fortunately backwards compatibility is not an issue here. I wonder what will Eloy say, I guess he is the one to decide whether to keep current STATEMENT tag and improve it OR remove it and add new CONTENT top level tag instead. Ddl is really not designed to handle content inserts, dml is designed to do that - any new content import code that does the actual database inserts must go there. The main reason is lobs and oracle hacks - each driver may use different code to send these into database == it does not belong into sql_generator, sorry. We can not use current $DB->insert_record because it can not support id field. I personally do not think that this should be extremely "easy to use" because it would be extremely easy to shoot yourself in the foot (or both feet) My +1 to design it restrictive first and then maybe relax it a bit. Reset of sequences is imho required, I do not understand how would you solve it without this - am I missing something? After using inserts with ID you need to adjust the sequences in some databases, right? We need some consistency check tool that compares existing table structure with xml file, the test should be also done before the export - recently we had many reports of broken database tables caused by the groups upgrade in 1.8.x which was uncovered later during upgrade to 1.9.x
          Hide
          Petr Škoda added a comment - - edited

          hmm, there are probably several other problems, for example:

          1/ during the export nobody should be logged-in and working, or else the database could be full of inconsistencies - admins may tweak firewall rules, filter access by IP in config.php or we could improve the maintenance mode
          2/ how much memory would be needed to import xml several gigabytes big? How much memory does it eat now?
          3/ when changing wwroot we also need to relink existing absolute links, recode block configs, etc - this would be really nice to have

          Show
          Petr Škoda added a comment - - edited hmm, there are probably several other problems, for example: 1/ during the export nobody should be logged-in and working, or else the database could be full of inconsistencies - admins may tweak firewall rules, filter access by IP in config.php or we could improve the maintenance mode 2/ how much memory would be needed to import xml several gigabytes big? How much memory does it eat now? 3/ when changing wwroot we also need to relink existing absolute links, recode block configs, etc - this would be really nice to have
          Hide
          Andrei Bautu added a comment -

          2. Several gigabytes big the xml file? I don't have such a file, but I estimate that it will take at least 15% than the actual file size (PHP will store all the data, the name of the object fields and the references between objects). This is why I've done the on-the-fly transfer method.
          3. This is hard to do. Not only absolute links might get break, but also relative links. Also, I believe this is outside of this project's agenda. Penny?

          Show
          Andrei Bautu added a comment - 2. Several gigabytes big the xml file? I don't have such a file, but I estimate that it will take at least 15% than the actual file size (PHP will store all the data, the name of the object fields and the references between objects). This is why I've done the on-the-fly transfer method. 3. This is hard to do. Not only absolute links might get break, but also relative links. Also, I believe this is outside of this project's agenda. Penny?
          Hide
          Penny Leach added a comment -

          You're right that it's sort of outside the project's agenda but I agree that it's important. backup and restore does this, I don't think it would be too hard. Put it at the end of your task list and see what happens.

          Andrei - do you have any further comments in response to Petr?

          Show
          Penny Leach added a comment - You're right that it's sort of outside the project's agenda but I agree that it's important. backup and restore does this, I don't think it would be too hard. Put it at the end of your task list and see what happens. Andrei - do you have any further comments in response to Petr?
          Hide
          Andrei Bautu added a comment -

          Q: "After using inserts with ID you need to adjust the sequences in some databases, right?"
          A: I was about to say that "they are not necesary if things are done the way I suggested (only use blank database or drop tables if they exist, or in other words, never reuse existing tables)".
          However, from what I recall, MySQL and SQLite adjust their sequences according to inserts, but PostgreSQL doesn't (Penny, am I right?). I don't know about MSSQL.
          So, yes, a set_sequence method is necesary.
          But, what's the "reset" in the name? Am I missing something.

          Show
          Andrei Bautu added a comment - Q: "After using inserts with ID you need to adjust the sequences in some databases, right?" A: I was about to say that "they are not necesary if things are done the way I suggested (only use blank database or drop tables if they exist, or in other words, never reuse existing tables)". However, from what I recall, MySQL and SQLite adjust their sequences according to inserts, but PostgreSQL doesn't (Penny, am I right?). I don't know about MSSQL. So, yes, a set_sequence method is necesary. But, what's the "reset" in the name? Am I missing something.
          Hide
          Andrei Bautu added a comment -

          On previous post: "at least 15% than the actual file size" should be "at least 15% more than the actual file size"

          Show
          Andrei Bautu added a comment - On previous post: "at least 15% than the actual file size" should be "at least 15% more than the actual file size"
          Hide
          Penny Leach added a comment -

          Postgres doesn't, you need to set the sequence after insert. reset = reset to correct value. set_sequence would imply that you have to tell it what to set the sequence to. reset_sequence means, just reset it to the right place.

          Show
          Penny Leach added a comment - Postgres doesn't, you need to set the sequence after insert. reset = reset to correct value. set_sequence would imply that you have to tell it what to set the sequence to. reset_sequence means, just reset it to the right place.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Hi,

          well, it seems that this topic is evolving really fast, cool! B-) Here there are some comments about what "vision" I had for the migration utility:

          1) New XML files will be 100% new and completely unrelated to XMLDB.
          2) New XML files don't need to contain XMLDB structures (just records data)

          I see the process of exporting like:

          1) Under admin, there are one "Create DB Image" option. The admin selects it.
          2) Moodle checks that the current DB structure (table and fields) are EXACTLY the ones in install.xml files (this is one test that can be developed separately to use it in other places, but MUST be executed here too).
          3) If the check in 2) is OK... the export to XML begins (the export entry point will receive and array of tables to export).
          4) The export also contains one version-number (main Moodle $CFG->version) to be used later. Also, the origin of the export (MSSQL, MySQL... can be annotated, just for informational purposes).
          5) One unique moodle_db_image.xml file is generated.

          And the process of importing like:

          1) Under admin, there are one "Import DB Image" option. The admin selects it.
          2) Moodle checks the version in the moodle_db_image.xml file is the same.
          3) Moodle warns and rewarns and rerewarns that the process will DESTROY current installation (suggesting to backup and so)
          4) For each table in export, if exists: truncate table, disable sequence, insert data (with original ids) and enable sequence (with correct number).

          That's all. That would be enough for people wanting to switch DB (it's only a matter of installing a new Moodle pointing to target DB) and also to developers. Keeping it apart from XMLDB stuff (but checking things are ok before exporting).

          How does it sound?

          About the XML format we can discuss but any simple <RECORD fieldname="fieldvalue" fieldname2="fieldvalue2".....> or CONTENT or whatever is ok for me as far as that info in 100% DB independent (nulls and empties are the critical ones).

          It's only how I thought it should work, more or less. Of course, feel free to change everything. But I think the ideas above are simple enough (and well separated), so we can have export/import working, and then add efforts to the checker and so on.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Hi, well, it seems that this topic is evolving really fast, cool! B-) Here there are some comments about what "vision" I had for the migration utility: 1) New XML files will be 100% new and completely unrelated to XMLDB. 2) New XML files don't need to contain XMLDB structures (just records data) I see the process of exporting like: 1) Under admin, there are one "Create DB Image" option. The admin selects it. 2) Moodle checks that the current DB structure (table and fields) are EXACTLY the ones in install.xml files (this is one test that can be developed separately to use it in other places, but MUST be executed here too). 3) If the check in 2) is OK... the export to XML begins (the export entry point will receive and array of tables to export). 4) The export also contains one version-number (main Moodle $CFG->version) to be used later. Also, the origin of the export (MSSQL, MySQL... can be annotated, just for informational purposes). 5) One unique moodle_db_image.xml file is generated. And the process of importing like: 1) Under admin, there are one "Import DB Image" option. The admin selects it. 2) Moodle checks the version in the moodle_db_image.xml file is the same. 3) Moodle warns and rewarns and rerewarns that the process will DESTROY current installation (suggesting to backup and so) 4) For each table in export, if exists: truncate table, disable sequence, insert data (with original ids) and enable sequence (with correct number). That's all. That would be enough for people wanting to switch DB (it's only a matter of installing a new Moodle pointing to target DB) and also to developers. Keeping it apart from XMLDB stuff (but checking things are ok before exporting). How does it sound? About the XML format we can discuss but any simple <RECORD fieldname="fieldvalue" fieldname2="fieldvalue2".....> or CONTENT or whatever is ok for me as far as that info in 100% DB independent (nulls and empties are the critical ones). It's only how I thought it should work, more or less. Of course, feel free to change everything. But I think the ideas above are simple enough (and well separated), so we can have export/import working, and then add efforts to the checker and so on. Ciao
          Hide
          Petr Škoda added a comment -

          the recoding of wwwroot - we have an unsupported replace.php script in admin that does that already, there are several requests fro improvemetns some with patches also; this is not related to backup restore

          this is relatively easy to do imho

          Show
          Petr Škoda added a comment - the recoding of wwwroot - we have an unsupported replace.php script in admin that does that already, there are several requests fro improvemetns some with patches also; this is not related to backup restore this is relatively easy to do imho
          Hide
          Eloy Lafuente (stronk7) added a comment -

          yup. we can filter that both on export and import.

          Sounds easy and powerful, replacing every $CFG->wwwroot occurrence by something like $@WWWROOT@$ (and the opposite on import)

          +1 Ciao B-)

          Show
          Eloy Lafuente (stronk7) added a comment - yup. we can filter that both on export and import. Sounds easy and powerful, replacing every $CFG->wwwroot occurrence by something like $@WWWROOT@$ (and the opposite on import) +1 Ciao B-)
          Hide
          Penny Leach added a comment -

          After talking with Petr on irc, the following (related to Eloy's comment) became clear:

          • this is the intial implementation, following can come the real time db to db importexport (as opposed to just file)
          • same for install time options, to start is just existing admin interface
          • first step is to design the xml format and change dml to support import_record(s) and resetting sequences.
          Show
          Penny Leach added a comment - After talking with Petr on irc, the following (related to Eloy's comment) became clear: this is the intial implementation, following can come the real time db to db importexport (as opposed to just file) same for install time options, to start is just existing admin interface first step is to design the xml format and change dml to support import_record(s) and resetting sequences.
          Hide
          Penny Leach added a comment -

          Andrei - it is proposed that we have a meeting tomorrow to discuss this between me Petr Eloy and you. Petr has suggested 11am his and Eloy's time, CEST, which is 10am for me in the UK.

          We're currently using jabber for communication - you should be able to use your gmail account, I am not sure what Windows jabber clients there are.

          Can you have a think before then about what tasks are required and add them as subtasks to the bug that I created for you?

          Show
          Penny Leach added a comment - Andrei - it is proposed that we have a meeting tomorrow to discuss this between me Petr Eloy and you. Petr has suggested 11am his and Eloy's time, CEST, which is 10am for me in the UK. We're currently using jabber for communication - you should be able to use your gmail account, I am not sure what Windows jabber clients there are. Can you have a think before then about what tasks are required and add them as subtasks to the bug that I created for you?
          Hide
          Penny Leach added a comment -

          Eloy can you please create the room and invite us to it on the jabber server? Andrei has a gmail account that will probably do.

          Show
          Penny Leach added a comment - Eloy can you please create the room and invite us to it on the jabber server? Andrei has a gmail account that will probably do.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          I've created the:

          gsoc-20080716-db-migration

          room in talk.moodle.org jabber server. it's open, so feel free to connect there. Anyone can invite others.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - I've created the: gsoc-20080716-db-migration room in talk.moodle.org jabber server. it's open, so feel free to connect there. Anyone can invite others. Ciao
          Hide
          Andrei Bautu added a comment -

          Here is the proposed XML Schema (and example) to be used for database migration:

          <?xml version="1.0" encoding="utf-8" ?>
          <xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
          <xs:element name="moodle_database" type="databaseType" />
          <xs:complexType name="databaseType">
          <xs:sequence>
          <xs:element maxOccurs="unbounded" name="table" type="tableType" />
          </xs:sequence>
          <xs:attribute name="version" type="xs:positiveInteger" use="required" />
          <xs:attribute name="timestamp" type="xs:dateTime" use="required" />
          <xs:attribute name="comment" type="xs:string" />
          </xs:complexType>
          <xs:complexType name="tableType">
          <xs:sequence>
          <xs:element maxOccurs="unbounded" name="record" type="recordType" />
          </xs:sequence>
          <xs:attribute name="name" type="xs:NCName" use="required" />
          <xs:attribute name="schemaHash" type="xs:string" use="required" />
          </xs:complexType>
          <xs:complexType name="recordType">
          <xs:sequence>
          <xs:element maxOccurs="unbounded" name="field" type="fieldType" />
          </xs:sequence>
          </xs:complexType>
          <xs:complexType name="fieldType">
          <xs:simpleContent>
          <xs:extension base="xs:string">
          <xs:attribute name="name" type="xs:string" use="required" />
          <xs:attribute default="content" name="value" use="optional">
          <xs:simpleType>
          <xs:restriction base="xs:string">
          <xs:enumeration value="content" />
          <xs:enumeration value="null" />
          </xs:restriction>
          </xs:simpleType>
          </xs:attribute>
          </xs:extension>
          </xs:simpleContent>
          </xs:complexType>
          </xs:schema>

          <?xml version="1.0" encoding="utf-8"?>
          <moodle_database xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="untitled.xsd"
          version="20080502" timestamp="2008-06-01T02:03:04+05:00" comment="This is an optional user comment.">
          <table name="config_data" schemaHash="0123456789abcdefg0123456789abcdefg">
          <record>
          <field name="id">1</field>
          <field name="name">theme</field>
          <field name="value">chameleon</field>
          </record>
          <record>
          <field name="id">2</field>
          <field name="name">maintenance</field>
          <field name="value">0</field>
          </record>
          </table>
          <table name="users" schemaHash="abcdefg0123456789abcdefg0123456789">
          <record>
          <field name="id">1</field>
          <field name="name">admin</field>
          <field name="password">XSDFSASREWW</field>
          <field name="photo">user.png</field>
          </record>
          <record>
          <field name="id">2</field>
          <field name="name">someone</field>
          <field name="password">XSDFSXSDFSAS</field>
          <field name="photo" value="null" />
          </record>
          </table>
          </moodle_database>

          Show
          Andrei Bautu added a comment - Here is the proposed XML Schema (and example) to be used for database migration: <?xml version="1.0" encoding="utf-8" ?> <xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="moodle_database" type="databaseType" /> <xs:complexType name="databaseType"> <xs:sequence> <xs:element maxOccurs="unbounded" name="table" type="tableType" /> </xs:sequence> <xs:attribute name="version" type="xs:positiveInteger" use="required" /> <xs:attribute name="timestamp" type="xs:dateTime" use="required" /> <xs:attribute name="comment" type="xs:string" /> </xs:complexType> <xs:complexType name="tableType"> <xs:sequence> <xs:element maxOccurs="unbounded" name="record" type="recordType" /> </xs:sequence> <xs:attribute name="name" type="xs:NCName" use="required" /> <xs:attribute name="schemaHash" type="xs:string" use="required" /> </xs:complexType> <xs:complexType name="recordType"> <xs:sequence> <xs:element maxOccurs="unbounded" name="field" type="fieldType" /> </xs:sequence> </xs:complexType> <xs:complexType name="fieldType"> <xs:simpleContent> <xs:extension base="xs:string"> <xs:attribute name="name" type="xs:string" use="required" /> <xs:attribute default="content" name="value" use="optional"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:enumeration value="content" /> <xs:enumeration value="null" /> </xs:restriction> </xs:simpleType> </xs:attribute> </xs:extension> </xs:simpleContent> </xs:complexType> </xs:schema> <?xml version="1.0" encoding="utf-8"?> <moodle_database xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="untitled.xsd" version="20080502" timestamp="2008-06-01T02:03:04+05:00" comment="This is an optional user comment."> <table name="config_data" schemaHash="0123456789abcdefg0123456789abcdefg"> <record> <field name="id">1</field> <field name="name">theme</field> <field name="value">chameleon</field> </record> <record> <field name="id">2</field> <field name="name">maintenance</field> <field name="value">0</field> </record> </table> <table name="users" schemaHash="abcdefg0123456789abcdefg0123456789"> <record> <field name="id">1</field> <field name="name">admin</field> <field name="password">XSDFSASREWW</field> <field name="photo">user.png</field> </record> <record> <field name="id">2</field> <field name="name">someone</field> <field name="password">XSDFSXSDFSAS</field> <field name="photo" value="null" /> </record> </table> </moodle_database>
          Hide
          Petr Škoda added a comment -

          I do not understand the difference between string "null" and real NULL value, could you give some example, please?

          Show
          Petr Škoda added a comment - I do not understand the difference between string "null" and real NULL value, could you give some example, please?
          Hide
          Penny Leach added a comment -

          Petr, in our discussion we thought the best way to handle nulls was as follows:

          <field name="field" value="content">null</field>
          <field name="field" value="null"></field>

          the first one says that there is data, and it is the string 'null'. The second says the field had a null value.

          Most of the type, value="content" is implied (as in the above example, most cases) so it is left out (implied)

          Show
          Penny Leach added a comment - Petr, in our discussion we thought the best way to handle nulls was as follows: <field name="field" value="content">null</field> <field name="field" value="null"></field> the first one says that there is data, and it is the string 'null'. The second says the field had a null value. Most of the type, value="content" is implied (as in the above example, most cases) so it is left out (implied)
          Hide
          Petr Škoda added a comment - - edited

          aah, forgive my XML ignorance, I am no expert

          I expected something like <field name="photo" /> where no content means NULL
          and <field name="sometext" ></field> means empty string

          Show
          Petr Škoda added a comment - - edited aah, forgive my XML ignorance, I am no expert I expected something like <field name="photo" /> where no content means NULL and <field name="sometext" ></field> means empty string
          Hide
          Penny Leach added a comment -

          I'm not sure parsers would differentiate between those, could be wrong.

          Show
          Penny Leach added a comment - I'm not sure parsers would differentiate between those, could be wrong.
          Hide
          Andrei Bautu added a comment -

          Penny is right. <field name="photo" /> is a shortcut for <field name="sometext" ></field>. They mean the same thing.

          Show
          Andrei Bautu added a comment - Penny is right. <field name="photo" /> is a shortcut for <field name="sometext" ></field>. They mean the same thing.
          Hide
          Andrei Bautu added a comment -

          Summary of today meeting + some additions:

          (EXP) The export process will do the following steps:
          1. check that tables and fields in the database exist according to install.xml;
          2. create the xml file (let's call it out.xml) with the data and the md5 for tables.

          Special cases for 1:

          • additional fields are found in some table. The admin can choose: ignore these fields, export these fields, stop export. Later on, during import if fields exist in the target db, import their data, else ignore them.
            --------------------
            (IMP) The import process will do the following steps:
            1. verify md5 of tables from install.xml against md5 from out.xml; if any mismatch, then stop;
            2. check tables and fields in the database exist according to install.xml;
            3. delete all (truncate where available) from imported tables;
            4. import new data
            5. reset sequences

          Special situations for 2:

          • if tables are missing, they will be created according to install.xml;
          • if a table exist, but some fields are missing, the admin can choose to stop, or drop the table and recreate it according to install.xml
          • if a table exists, all install.xml fields are present and additional fields are found, the admin can choose to stop, keep the table schema or drop the table and recreate it according to install.xml
            --------------------
            (WR) Changing appearences of wwwroot:
            (12:55:38 PM) abautu: I don't feel comportable with that. I think we should change all (which we cant) or none.
            (12:55:58 PM) penny: I don't think that's actually that difficult
            (12:56:02 PM) penny: we do it now in backup
            (12:56:10 PM) penny: and i think it's important too
            (12:56:19 PM) penny: why do you think we can't change all#
            (12:56:21 PM) abautu: it's not about difficulty.
            (12:56:35 PM) abautu: it's about changing data we don't know nothing about.
            (12:57:02 PM) abautu: that address could be part of a password, for instance.
            (12:57:23 PM) ***penny frowns
            (12:57:39 PM) penny: I guess it could be config data
            (12:57:43 PM) abautu: or part of an resource stating how this site also has other resources.
            (12:57:53 PM) penny: like an address to a mnet host
            (12:57:56 PM) abautu: for instance,
            (12:58:12 PM) penny: hmmm
            (12:58:20 PM) abautu: I used to create a downloads folder in my moodle installation
            (12:58:34 PM) abautu: and point people to download software from there.
            (12:58:52 PM) abautu: if I switch the site, that address should be the same.
            (12:58:55 PM) penny: lets discuss on tracker
            (12:59:01 PM) abautu: ok.
            (12:59:02 PM) penny: we don't have to decide this now, there's lots else to do.
            (12:59:50 PM) abautu: just one thing. I also had installation with two addresses depending on the client's locations.
            (01:00:10 PM) abautu: it was necesary because of the network setup.
            --------------------
            Comments are welcome (on all topics).
          Show
          Andrei Bautu added a comment - Summary of today meeting + some additions: (EXP) The export process will do the following steps: 1. check that tables and fields in the database exist according to install.xml; 2. create the xml file (let's call it out.xml) with the data and the md5 for tables. Special cases for 1: additional fields are found in some table. The admin can choose: ignore these fields, export these fields, stop export. Later on, during import if fields exist in the target db, import their data, else ignore them. -------------------- (IMP) The import process will do the following steps: 1. verify md5 of tables from install.xml against md5 from out.xml; if any mismatch, then stop; 2. check tables and fields in the database exist according to install.xml; 3. delete all (truncate where available) from imported tables; 4. import new data 5. reset sequences Special situations for 2: if tables are missing, they will be created according to install.xml; if a table exist, but some fields are missing, the admin can choose to stop, or drop the table and recreate it according to install.xml if a table exists, all install.xml fields are present and additional fields are found, the admin can choose to stop, keep the table schema or drop the table and recreate it according to install.xml -------------------- (WR) Changing appearences of wwwroot: (12:55:38 PM) abautu: I don't feel comportable with that. I think we should change all (which we cant) or none. (12:55:58 PM) penny: I don't think that's actually that difficult (12:56:02 PM) penny: we do it now in backup (12:56:10 PM) penny: and i think it's important too (12:56:19 PM) penny: why do you think we can't change all# (12:56:21 PM) abautu: it's not about difficulty. (12:56:35 PM) abautu: it's about changing data we don't know nothing about. (12:57:02 PM) abautu: that address could be part of a password, for instance. (12:57:23 PM) ***penny frowns (12:57:39 PM) penny: I guess it could be config data (12:57:43 PM) abautu: or part of an resource stating how this site also has other resources. (12:57:53 PM) penny: like an address to a mnet host (12:57:56 PM) abautu: for instance, (12:58:12 PM) penny: hmmm (12:58:20 PM) abautu: I used to create a downloads folder in my moodle installation (12:58:34 PM) abautu: and point people to download software from there. (12:58:52 PM) abautu: if I switch the site, that address should be the same. (12:58:55 PM) penny: lets discuss on tracker (12:59:01 PM) abautu: ok. (12:59:02 PM) penny: we don't have to decide this now, there's lots else to do. (12:59:50 PM) abautu: just one thing. I also had installation with two addresses depending on the client's locations. (01:00:10 PM) abautu: it was necesary because of the network setup. -------------------- Comments are welcome (on all topics).
          Hide
          Dan Poltawski added a comment -

          About wwwroot changing - I understand the hesitation about doing it, but I don't think its particularly risky (i've piped db dumps through sed to achieve this many times ).

          One thing to note is that there is some stuff 'encoded' in the DB where it can't be found and replaced easily. I can't remember the major culprits off hand, but I suspect blocks are probably one - IIRC the config is serialized and stored in the db that way. But also perhaps the cache databases (perhaps cache data could be ignored from export?)

          Show
          Dan Poltawski added a comment - About wwwroot changing - I understand the hesitation about doing it, but I don't think its particularly risky (i've piped db dumps through sed to achieve this many times ). One thing to note is that there is some stuff 'encoded' in the DB where it can't be found and replaced easily. I can't remember the major culprits off hand, but I suspect blocks are probably one - IIRC the config is serialized and stored in the db that way. But also perhaps the cache databases (perhaps cache data could be ignored from export?)
          Hide
          Andrei Bautu added a comment -

          I browsed my database shortly and didn't noticed any serialized data. It's posible that I've missed it, but this a good point.
          Anyway, my point is that not all wwwroot appearences should be changed and we can't say which should and which should not. And now that I think about what Eloy said, this is a db transfer tool, not a site transfer tool. (e.g. I do my site with SQLite and later on, I move to MySQL or PostgreSQL).
          We could provide an option for the admin to decide and let him/her choose, but warn about possible problems.

          Show
          Andrei Bautu added a comment - I browsed my database shortly and didn't noticed any serialized data. It's posible that I've missed it, but this a good point. Anyway, my point is that not all wwwroot appearences should be changed and we can't say which should and which should not. And now that I think about what Eloy said, this is a db transfer tool, not a site transfer tool. (e.g. I do my site with SQLite and later on, I move to MySQL or PostgreSQL). We could provide an option for the admin to decide and let him/her choose, but warn about possible problems.
          Hide
          Petr Škoda added a comment -

          Special cases 2 - I really do not like them at all, nobody is going to stop export/import no matter how much it complains, then when it breaks several months later people would just file bug reports and expect us to fix their database somehow, which might not be possible
          Only people that really know what they are doing should be able to recover broken export file - we could add something like special attribute safetyoff="true" which would enable those special cases and extra recovery options. You would have to edit the export XML file by hand - this would be officially unsupported of course.

          yes, blocks are the encoded trouble, we also need to blacklist some columns

          Show
          Petr Škoda added a comment - Special cases 2 - I really do not like them at all, nobody is going to stop export/import no matter how much it complains, then when it breaks several months later people would just file bug reports and expect us to fix their database somehow, which might not be possible Only people that really know what they are doing should be able to recover broken export file - we could add something like special attribute safetyoff="true" which would enable those special cases and extra recovery options. You would have to edit the export XML file by hand - this would be officially unsupported of course. yes, blocks are the encoded trouble, we also need to blacklist some columns
          Hide
          Petr Škoda added a comment -

          The changing of www root is imo part of migration - at present this is not yet officially supported, but used very often.
          You do not need to implement it now, it could be implemented separately or later, no problem

          Show
          Petr Škoda added a comment - The changing of www root is imo part of migration - at present this is not yet officially supported, but used very often. You do not need to implement it now, it could be implemented separately or later, no problem
          Hide
          Andrei Bautu added a comment -

          When I say stop I mean it doesn't start at all and not that it will stop during import. The admin will answer all questions before any changes to the database are made. Is this better?

          Show
          Andrei Bautu added a comment - When I say stop I mean it doesn't start at all and not that it will stop during import. The admin will answer all questions before any changes to the database are made. Is this better?
          Hide
          Petr Škoda added a comment -

          hmm, some clicking and confirming should not enable non-supported operations. We really do have major support problems here in tracker caused by broken database schema. This is a critical operation, there can not be any guessing - either by admins or by the moodle code itself

          My +1 to make this db migration either 100% success or 0% (==prevented), nothing in between by default.

          • if tables are missing, they will be created according to install.xml; - not good, only install/upgrade process should be creating tables, we will have some post install operations, referential integrity, etc. in future
          • if a table exist, but some fields are missing - this is a major problem, this indicates major ptoblem - admin may fix the xml if needed by adding proper data there, but he/she must understand the consequences
          • if a table exists, all install.xml fields are present and additional fields are found - indication of unsupported 3rd party hack, again search replace in XML solves this

          In any case, pretty please, implement the migration of correct data now and only after it gets committed start with potential improvements. We do need smaller patches with minimal implementation first, the special cases would make it much harder to review

          Show
          Petr Škoda added a comment - hmm, some clicking and confirming should not enable non-supported operations. We really do have major support problems here in tracker caused by broken database schema. This is a critical operation, there can not be any guessing - either by admins or by the moodle code itself My +1 to make this db migration either 100% success or 0% (==prevented), nothing in between by default. if tables are missing, they will be created according to install.xml; - not good, only install/upgrade process should be creating tables, we will have some post install operations, referential integrity, etc. in future if a table exist, but some fields are missing - this is a major problem, this indicates major ptoblem - admin may fix the xml if needed by adding proper data there, but he/she must understand the consequences if a table exists, all install.xml fields are present and additional fields are found - indication of unsupported 3rd party hack, again search replace in XML solves this In any case, pretty please, implement the migration of correct data now and only after it gets committed start with potential improvements. We do need smaller patches with minimal implementation first, the special cases would make it much harder to review
          Petr Škoda made changes -
          Assignee Penny Leach [ mjollnir ] Petr ?koda (skodak) [ skodak ]
          Martin Dougiamas made changes -
          Fix Version/s 2.0.1 [ 10420 ]
          Fix Version/s 2.0 [ 10122 ]
          Martin Dougiamas made changes -
          Workflow jira [ 27430 ] MDL Workflow [ 43659 ]
          Petr Škoda made changes -
          Assignee Petr Škoda (skodak) [ skodak ] moodle.com [ moodle.com ]
          Fix Version/s DEV backlog [ 10464 ]
          Fix Version/s 2.0.1 [ 10420 ]
          Martin Dougiamas made changes -
          Workflow MDL Workflow [ 43659 ] MDL Full Workflow [ 72044 ]
          Petr Škoda made changes -
          Link This issue has a non-specific relationship to MDL-15672 [ MDL-15672 ]
          Petr Škoda made changes -
          Link This issue has a non-specific relationship to MDL-34441 [ MDL-34441 ]
          Petr Škoda made changes -
          Assignee moodle.com [ moodle.com ] Andrei Bautu [ abautu ]
          Hide
          Petr Škoda added a comment -

          Closing as fixed, the direct database migration should be finally visible in admin tree as experimental feature MDL-34441. The export/import of db data from/to files is unfinished and hidden in admin UI for now MDL-15672.

          Thanks a lot everybody and especially Andrei for the great help with the DML/DDL/DTL layers in 2.0.

          If necessary please create new issues, do not add any subtasks here, thanks and ciao.

          Show
          Petr Škoda added a comment - Closing as fixed, the direct database migration should be finally visible in admin tree as experimental feature MDL-34441 . The export/import of db data from/to files is unfinished and hidden in admin UI for now MDL-15672 . Thanks a lot everybody and especially Andrei for the great help with the DML/DDL/DTL layers in 2.0. If necessary please create new issues, do not add any subtasks here, thanks and ciao.
          Petr Škoda made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Fix Version/s 2.3.2 [ 12353 ]
          Fix Version/s DEV backlog [ 10464 ]
          Resolution Fixed [ 1 ]

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: