Moodle
  1. Moodle
  2. MDL-28666

Extend repository API to create file references to external repository contents (like Equella)

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.2.1
    • Fix Version/s: DEV backlog
    • Component/s: Repositories
    • Labels:
    • Testing Instructions:
      Hide

      This issue needs a lot of tests.

      Changes overview

      • File APIs changes to support external files
      • Repository APIs changes to serve external files
      • Ability to cache external files
      • Backup/restore external file entries on same site
      • File picker can display iframe inside and pick a file from external repository (equella)

      Boxnet repository plugin

      • Pick a file from box.net using file reference in editor, save it, and edit it again, make sure no broken files
      • Pick a file from box.net without file reference, save it, edit again, make sure no broken files
      • Do the same tasks as above in file manager

      Dropbox repository plugin

      • Pick a file from dropbox repo using file reference in editor, save it, and edit it again, make sure no broken files, check moodledata/cache/filedir, make sure there is a file copy in there
      • Do the same task as above in file manager
      • Run cron in cli, looking the output, make sure dropbox is checking external files belonging to dropbox

      Private files repository plugin

      • Pick a file from private files repo plugin using file reference in editor, save it, and edit it again, make sure no broken files
      • Do the same task as above in file manager

      Server files repository plugin

      • Pick a file from server files repo plugin using file reference in editor, save it, and edit it again, make sure no broken files
      • Do the same task as above in file manager

      File system repository plugin

      • Pick a file from file system repo plugin using file reference in editor, save it, and edit it again, make sure no broken files
      • Replace the file you previous picked in disk, make sure the file in moodle get updated too, NOTE, chrome caches images very well, make sure you clean all browser caches to reflect changes.

      Pick a file reference from private files repo

      • Edit your private files, and pick a file from file system repo plugin using file reference
      • Make sure that file is viewable in private files tree
      • Go to moodle forum, add a new post, pick this file from private area, save it, make sure this file is viewable in forum
      • Replace the file from file system plugin in disk, the references in private file area and forum should get updated too

      Hide or disable repository plugins

      • Hide a repository plugin or instance won't affect external files
      • Disable repository plugin or instances, will ask user to copy all external files to local, or do nothing, in case of doing nothing, all external files will be broken

      Backup

      • Backup a course with file references
      • And the end of backup process, you should be able to see a warning, "There are file references in this course, they can only be restored in this site"

      Restore

      • Restore a backup package contains file references in the same site, files should be restored and worked
      • Restore a backup package contains file references in different sites, at the end of restoring process, you will warned some files cannot be restored because they are references

      Choose external files in file picker

      • Testing pick a file from equella
      • equella should be displayed fully in file picker window
      • Pick a file from equella repo will get you back the renaming && license selection screen just like other repo plugins
      Show
      This issue needs a lot of tests. Changes overview File APIs changes to support external files Repository APIs changes to serve external files Ability to cache external files Backup/restore external file entries on same site File picker can display iframe inside and pick a file from external repository (equella) Boxnet repository plugin Pick a file from box.net using file reference in editor, save it, and edit it again, make sure no broken files Pick a file from box.net without file reference, save it, edit again, make sure no broken files Do the same tasks as above in file manager Dropbox repository plugin Pick a file from dropbox repo using file reference in editor, save it, and edit it again, make sure no broken files, check moodledata/cache/filedir, make sure there is a file copy in there Do the same task as above in file manager Run cron in cli, looking the output, make sure dropbox is checking external files belonging to dropbox Private files repository plugin Pick a file from private files repo plugin using file reference in editor, save it, and edit it again, make sure no broken files Do the same task as above in file manager Server files repository plugin Pick a file from server files repo plugin using file reference in editor, save it, and edit it again, make sure no broken files Do the same task as above in file manager File system repository plugin Pick a file from file system repo plugin using file reference in editor, save it, and edit it again, make sure no broken files Replace the file you previous picked in disk, make sure the file in moodle get updated too, NOTE, chrome caches images very well, make sure you clean all browser caches to reflect changes. Pick a file reference from private files repo Edit your private files, and pick a file from file system repo plugin using file reference Make sure that file is viewable in private files tree Go to moodle forum, add a new post, pick this file from private area, save it, make sure this file is viewable in forum Replace the file from file system plugin in disk, the references in private file area and forum should get updated too Hide or disable repository plugins Hide a repository plugin or instance won't affect external files Disable repository plugin or instances, will ask user to copy all external files to local, or do nothing, in case of doing nothing, all external files will be broken Backup Backup a course with file references And the end of backup process, you should be able to see a warning, "There are file references in this course, they can only be restored in this site" Restore Restore a backup package contains file references in the same site, files should be restored and worked Restore a backup package contains file references in different sites, at the end of restoring process, you will warned some files cannot be restored because they are references Choose external files in file picker Testing pick a file from equella equella should be displayed fully in file picker window Pick a file from equella repo will get you back the renaming && license selection screen just like other repo plugins
    • Affected Branches:
      MOODLE_22_STABLE
    • Rank (Obsolete):
      18372

      Description

      Our repository API does not currently support anything beyond hierarchies of simple objects.

      Need to support:

      • complex objects
      • versioned objects
      • more metadata

      while maintaining security and updatability of files.

      As proof of concept develop the Equella module.

      1. 2011-11-09 15.23.bmml
        3 kB
        Dongsheng Cai
      2. 2011-11-09 15.23.bmml
        3 kB
        Dongsheng Cai
      3. external-repo-smurf.xml
        84 kB
        Dan Poltawski
      4. File picker options.bmml
        3 kB
        Dongsheng Cai
      1. 2011-11-09 15.23.png
        18 kB
      2. File picker options.png
        28 kB

        Issue Links

          Activity

          Hide
          Dongsheng Cai added a comment - - edited

          While working on file picker API, I need to fix a few file picker bugs too: MDL-26832

          Show
          Dongsheng Cai added a comment - - edited While working on file picker API, I need to fix a few file picker bugs too: MDL-26832
          Hide
          Dongsheng Cai added a comment -

          Added UI Mockup: <2011-11-09 15.23>

          Show
          Dongsheng Cai added a comment - Added UI Mockup: <2011-11-09 15.23>
          Hide
          Dongsheng Cai added a comment -

          Added UI Mockup: <File picker options>

          Show
          Dongsheng Cai added a comment - Added UI Mockup: <File picker options>
          Hide
          Dongsheng Cai added a comment -

          Will fix some small things related to MDL-29158

          Show
          Dongsheng Cai added a comment - Will fix some small things related to MDL-29158
          Hide
          Dongsheng Cai added a comment -

          File serving issue, shall we cache the repository file all the time when select it in file picker or we streaming the external resources directly?

          Backgrund:

          1. Currently in moodle we use send_stored_file to serving files, in the low level, it requires the file ready on disk, stored_file will return a file hander to moodle for file serving, and file size is required by byteserving_send_file()
          2. When we create a file with external reference, we need to provide a file size to file table, if we don't download the file we don't the the exact file size, we could ask repository plugin, but not all repository plugin could provide this kind of information

          The solution I could think of is we store the file locally when user pick a file, then when they request it, file is ready to serve, we can check if the file is up to date at this stage, and download the file then serve it by send_stored_file (this will go through repository plugin, not like normal moodle file of course).

          Streaming is preferred, that means pluginfile.php will serve the file while downloading it, I am not sure how to implement it, php cannot do multi-thread, we cannot create one thread download the file and and another on-hold to serving user download. We can create a loop to download 100 bytes each and flush it out to client immediately, I doubt it could be better than downloading the entire file in advance, in this case we have to request partial file and establish http connection costs too.

          Show
          Dongsheng Cai added a comment - File serving issue, shall we cache the repository file all the time when select it in file picker or we streaming the external resources directly? Backgrund: Currently in moodle we use send_stored_file to serving files, in the low level, it requires the file ready on disk, stored_file will return a file hander to moodle for file serving, and file size is required by byteserving_send_file() When we create a file with external reference, we need to provide a file size to file table, if we don't download the file we don't the the exact file size, we could ask repository plugin, but not all repository plugin could provide this kind of information The solution I could think of is we store the file locally when user pick a file, then when they request it, file is ready to serve, we can check if the file is up to date at this stage, and download the file then serve it by send_stored_file (this will go through repository plugin, not like normal moodle file of course). Streaming is preferred, that means pluginfile.php will serve the file while downloading it, I am not sure how to implement it, php cannot do multi-thread, we cannot create one thread download the file and and another on-hold to serving user download. We can create a loop to download 100 bytes each and flush it out to client immediately, I doubt it could be better than downloading the entire file in advance, in this case we have to request partial file and establish http connection costs too.
          Hide
          Dongsheng Cai added a comment -

          I just added Eloy to this issue.

          Eloy, I updated the repo in my github with backup support, can you please have a look if it's ok? And can you please write some guideline on restoring?

          Thanks.

          Show
          Dongsheng Cai added a comment - I just added Eloy to this issue. Eloy, I updated the repo in my github with backup support, can you please have a look if it's ok? And can you please write some guideline on restoring? Thanks.
          Hide
          Dongsheng Cai added a comment -

          Eloy

          Can you please comment here when you see my message

          I added backup support already, now I've to finish the restore, I see there is restore_dbops::send_files_to_pool in backup/util/dbops/restore_dbops.class.php, the problem is I am not sure what's the best way to pass samesite variable, additional parameter results in change all places using restore_dbops::send_files_to_pool. Can you please suggest the best way to do this?

          Regards.

          Show
          Dongsheng Cai added a comment - Eloy Can you please comment here when you see my message I added backup support already, now I've to finish the restore, I see there is restore_dbops::send_files_to_pool in backup/util/dbops/restore_dbops.class.php, the problem is I am not sure what's the best way to pass samesite variable, additional parameter results in change all places using restore_dbops::send_files_to_pool. Can you please suggest the best way to do this? Regards.
          Hide
          Dan Poltawski added a comment -

          Hi Dongsheng,

          I just skimmed some of this code after mention in the meeting the other day and noticed a few things:

          • There are some weird bits of whitespace and missing phpdoc which need cleaning up.
          • get_system_context should be: context_system::instance();
          • The TEXT column sizes have been deprecated in XMLDB - please reload the XMLDB file and generate the new XMLDB upgrade steps without the TEXT column sizes.
          • Coding guidelines state no underscores in variables please don't introduce new variables with underscores in them.
          Show
          Dan Poltawski added a comment - Hi Dongsheng, I just skimmed some of this code after mention in the meeting the other day and noticed a few things: There are some weird bits of whitespace and missing phpdoc which need cleaning up. get_system_context should be: context_system::instance(); The TEXT column sizes have been deprecated in XMLDB - please reload the XMLDB file and generate the new XMLDB upgrade steps without the TEXT column sizes. Coding guidelines state no underscores in variables please don't introduce new variables with underscores in them.
          Hide
          Sam Hemelryk added a comment -

          Hi Dongsheng,

          I'm reopening this and removing it from this weeks integration.

          I've just had a quick check of the code and the things that Dan noted earlier haven't yet been fixed (or replied to).
          Could you please go over this once more, fix things where required, reply about the rest, and probably run the code checkers over this.
          Also once done could you please get Dan (or someone else) to give it a proper pair review as it sounds like Dan has turned those other things up having just skimmed it.

          Cheers
          Sam

          Show
          Sam Hemelryk added a comment - Hi Dongsheng, I'm reopening this and removing it from this weeks integration. I've just had a quick check of the code and the things that Dan noted earlier haven't yet been fixed (or replied to). Could you please go over this once more, fix things where required, reply about the rest, and probably run the code checkers over this. Also once done could you please get Dan (or someone else) to give it a proper pair review as it sounds like Dan has turned those other things up having just skimmed it. Cheers Sam
          Hide
          Dongsheng Cai added a comment -

          Thanks Dan and Sam

          I fixed:

          1. odd whitespace in code (4 places)
          2. replaced get_system_context in code
          3. Updated XMLDB definition and upgrade code
          4. Removed underscores in my code, NOTE, I'm not being able to fix all underscores in files API, there are probably more than 1000 lines changes, it's very risky

          Dan, can you please have another look? As this is a big patch, could please you point out the place you spot issues? It will make it much easier for me to fix it. Thanks

          Show
          Dongsheng Cai added a comment - Thanks Dan and Sam I fixed: odd whitespace in code (4 places) replaced get_system_context in code Updated XMLDB definition and upgrade code Removed underscores in my code, NOTE, I'm not being able to fix all underscores in files API, there are probably more than 1000 lines changes, it's very risky Dan, can you please have another look? As this is a big patch, could please you point out the place you spot issues? It will make it much easier for me to fix it. Thanks
          Hide
          Dan Poltawski added a comment -

          Hi Dongsheng,

          Note that comments without punctuation / slight line length warnings are not too important, but great if you do have time to fix up.

          The incorrect spacing issues and missing phpdocs are more 'hard rules' for new code.

          Show
          Dan Poltawski added a comment - Hi Dongsheng, Note that comments without punctuation / slight line length warnings are not too important, but great if you do have time to fix up. The incorrect spacing issues and missing phpdocs are more 'hard rules' for new code.
          Hide
          Dan Poltawski added a comment -

          I'll review again once the automated warnings are fixed

          Show
          Dan Poltawski added a comment - I'll review again once the automated warnings are fixed
          Hide
          Dan Poltawski added a comment -

          Oops I mixed comment to say:

          Here is output from the codecheckers!

          Show
          Dan Poltawski added a comment - Oops I mixed comment to say: Here is output from the codecheckers!
          Hide
          Dongsheng Cai added a comment -

          Dan, I fixed the issues in external-repo-smurf.xml.

          btw what's the tool are you using to check coding style per commit?

          Show
          Dongsheng Cai added a comment - Dan, I fixed the issues in external-repo-smurf.xml. btw what's the tool are you using to check coding style per commit?
          Hide
          Dan Poltawski added a comment -

          Thanks Dongsheng - the check is done using marina/tims tool with Jenkins continuos integration server doing the per commit filtering.

          Show
          Dan Poltawski added a comment - Thanks Dongsheng - the check is done using marina/tims tool with Jenkins continuos integration server doing the per commit filtering.
          Hide
          Dongsheng Cai added a comment -

          Hello guys

          Can you submit this to Integration, or it requires more peer reviews?

          We are running out time.

          Show
          Dongsheng Cai added a comment - Hello guys Can you submit this to Integration, or it requires more peer reviews? We are running out time.
          Hide
          Martin Dougiamas added a comment -

          This really needs to land urgently. The filepicker work is coming and will need rebasing on top of this.

          Show
          Martin Dougiamas added a comment - This really needs to land urgently. The filepicker work is coming and will need rebasing on top of this.
          Hide
          Petr Skoda added a comment -

          wait a bit:
          1/ $file_record -> $filerecord might be ok according to coding style but it creates and inconsistencies which is hard to review, I personally think that is absolutely not acceptable to be part of this patch and it must be reverted
          2/ I sincerely doubt that the foreign key fileid pointing from files_reference to files table is going to work because so far the fileid was NOT supposed to be constant - I guess the reversed link might work better (files table optionally pointing to references)
          3/ fileid in files_references should have unique constraint, right?
          4/ I do not like the new stored_file->update_record() because it completely changes the API design
          5/ $stored_file->repository->send_file() should be handled inside $stored_file->send_file()
          6/ I do not think it is necessary to crate second filepool for cache files - technically you could implement that the same way as the image preview david is working on: new hidden file area with special file names, in this case you would use referenceid
          7/ $stored_file->repository->send_file($stored_file, $lifetime, $filter, $forcedownload, $filename, $dontdie); followed by die;
          8/ it looks like this is either incomplete or buggy because there are no changes in places that can not use referenced files such as assignment - so either it must be disabled by default (is it?) or it is enabled by default and buggy
          9/ I do not get how the content getting methods of stored_file are going to work, those extra if ($this->is_external_file())

          { throw exception}

          look like a major pita, I was surprised that stored_file->get_content() did not get it too
          10/ there is a code out there that reads files table directly, I guess that is going to break because the records will not contain the repo info, right?

          I guess I could continue with more issues if I read the patch properly, I spent only 10 minutes looking at the patch. Sorry if any of my findings or proposals are inaccurate, I did not think much about this, but still my -3 for integration in current state.

          Show
          Petr Skoda added a comment - wait a bit: 1/ $file_record -> $filerecord might be ok according to coding style but it creates and inconsistencies which is hard to review, I personally think that is absolutely not acceptable to be part of this patch and it must be reverted 2/ I sincerely doubt that the foreign key fileid pointing from files_reference to files table is going to work because so far the fileid was NOT supposed to be constant - I guess the reversed link might work better (files table optionally pointing to references) 3/ fileid in files_references should have unique constraint, right? 4/ I do not like the new stored_file->update_record() because it completely changes the API design 5/ $stored_file->repository->send_file() should be handled inside $stored_file->send_file() 6/ I do not think it is necessary to crate second filepool for cache files - technically you could implement that the same way as the image preview david is working on: new hidden file area with special file names, in this case you would use referenceid 7/ $stored_file->repository->send_file($stored_file, $lifetime, $filter, $forcedownload, $filename, $dontdie); followed by die; 8/ it looks like this is either incomplete or buggy because there are no changes in places that can not use referenced files such as assignment - so either it must be disabled by default (is it?) or it is enabled by default and buggy 9/ I do not get how the content getting methods of stored_file are going to work, those extra if ($this->is_external_file()) { throw exception} look like a major pita, I was surprised that stored_file->get_content() did not get it too 10/ there is a code out there that reads files table directly, I guess that is going to break because the records will not contain the repo info, right? I guess I could continue with more issues if I read the patch properly, I spent only 10 minutes looking at the patch. Sorry if any of my findings or proposals are inaccurate, I did not think much about this, but still my -3 for integration in current state.
          Hide
          Martin Dougiamas added a comment -

          Thanks for taking a look, Petr.

          1) If the old File API doesn't match coding styles then that is a new bug, I think. Dongsheng did the right thing keeping it consistent originally I think, but if we are going to be anal about coding style checkers then that is the result.

          2+ 4) this is pretty crucial to the current functionality. We need to be able to update files, and that was not part of the original design.

          6) This way was suggested originally and you (and others) did not want the File API "abused" that way. I am not keen to change it again unless there are very very good reasons everyone can understand.

          10) That's why we force the File API, right?

          I'll let DS address the rest ...

          Show
          Martin Dougiamas added a comment - Thanks for taking a look, Petr. 1) If the old File API doesn't match coding styles then that is a new bug, I think. Dongsheng did the right thing keeping it consistent originally I think, but if we are going to be anal about coding style checkers then that is the result. 2+ 4) this is pretty crucial to the current functionality. We need to be able to update files, and that was not part of the original design. 6) This way was suggested originally and you (and others) did not want the File API "abused" that way. I am not keen to change it again unless there are very very good reasons everyone can understand. 10) That's why we force the File API, right? I'll let DS address the rest ...
          Hide
          Dongsheng Cai added a comment -

          Hello Petr

          I gonna add unique index to file_references table, add send_file interface to $stored_file, and for check all forms in assignment module make sure file linking is disabled.

          Show
          Dongsheng Cai added a comment - Hello Petr I gonna add unique index to file_references table, add send_file interface to $stored_file, and for check all forms in assignment module make sure file linking is disabled.
          Hide
          Petr Skoda added a comment -

          1) so let's finally fix the style guide, changing unrelated code or doing cleanup as part in one commit with significant API changes is a much much bigger nono which should be immediately rejected by integrators

          Anyway I am not going to make problems here, I just want to make known that I strongly disagree with the current patch and I am not going to start fixing any regressions later.

          Show
          Petr Skoda added a comment - 1) so let's finally fix the style guide, changing unrelated code or doing cleanup as part in one commit with significant API changes is a much much bigger nono which should be immediately rejected by integrators Anyway I am not going to make problems here, I just want to make known that I strongly disagree with the current patch and I am not going to start fixing any regressions later.
          Hide
          Martin Dougiamas added a comment - - edited

          Areas to disable file linking would be:

          • assignment submissions content and attachments (because of due dates and grades)
          • form posts (because of notifications and threads)
          • forum attachments (because of notifications and threads)
          • quiz essay questions (because of due dates and grades)
          • workshop submissions content and attachment (because of due dates and grades)

          Anything else?

          Show
          Martin Dougiamas added a comment - - edited Areas to disable file linking would be: assignment submissions content and attachments (because of due dates and grades) form posts (because of notifications and threads) forum attachments (because of notifications and threads) quiz essay questions (because of due dates and grades) workshop submissions content and attachment (because of due dates and grades) Anything else?
          Hide
          Petr Skoda added a comment -

          yes, anything that deals with the content of files is probably affected because the patch seems to consider mostly pluginfile downloading, anyway my eyes can not see anything because of the unnecessary renaming and docs polishing.

          Show
          Petr Skoda added a comment - yes, anything that deals with the content of files is probably affected because the patch seems to consider mostly pluginfile downloading, anyway my eyes can not see anything because of the unnecessary renaming and docs polishing.
          Hide
          Petr Skoda added a comment -

          my -3 for this in 2.3

          Show
          Petr Skoda added a comment - my -3 for this in 2.3
          Hide
          Martin Dougiamas added a comment -

          What areas access the content of files uploaded into fileareas? And shouldn't they be doing it via the API anyway?

          Show
          Martin Dougiamas added a comment - What areas access the content of files uploaded into fileareas? And shouldn't they be doing it via the API anyway?
          Hide
          Dongsheng Cai added a comment -

          Latest changes:

          1. added unique index to files_reference.fileid
          2. by default, filemanager/filepicker allow create file references, in forum/assignment/workshop, only copying allowed
          3. stored_file.php: allow external file being downloaed in get_content_file_location function, that means, for example, in course restore, you pick a file from dropbox, and indicate creating a file reference, it will work
          4. added send_file method to stored_file class, it will call repository to get the file
          Show
          Dongsheng Cai added a comment - Latest changes: added unique index to files_reference.fileid by default, filemanager/filepicker allow create file references, in forum/assignment/workshop, only copying allowed stored_file.php: allow external file being downloaed in get_content_file_location function, that means, for example, in course restore, you pick a file from dropbox, and indicate creating a file reference, it will work added send_file method to stored_file class, it will call repository to get the file
          Hide
          Eloy Lafuente (stronk7) added a comment -

          The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week.

          TIA and ciao

          Show
          Eloy Lafuente (stronk7) added a comment - The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week. TIA and ciao
          Hide
          Petr Skoda added a comment - - edited

          1/ the file size of stored_file may be invalid - can cause huge problems for file serving, shortened files
          2/ content hash of stored_file may be invalid - this would break etag caching (which I already implemented and tested)
          3/ the time modified flag is invalid - breaks not-changed status in file serving
          4/ public function get_file_instance() returning false is backwards incompatible, we can introduce exceptions instead of false, but not the other way around

          note: I did not do full review, there are probably a lot more problems...

          Show
          Petr Skoda added a comment - - edited 1/ the file size of stored_file may be invalid - can cause huge problems for file serving, shortened files 2/ content hash of stored_file may be invalid - this would break etag caching (which I already implemented and tested) 3/ the time modified flag is invalid - breaks not-changed status in file serving 4/ public function get_file_instance() returning false is backwards incompatible, we can introduce exceptions instead of false, but not the other way around note: I did not do full review, there are probably a lot more problems...
          Hide
          Dongsheng Cai added a comment - - edited

          Petr

          The problem we have here is we don't really have the filesize and contenthash when create file reference, and when files in external repository changes, we cannot detect that then update the db accordingly. It might be not a issue we should worry about, for instance, box.net will give us a secret url for file reference, when moodle serving this file, box.net plugin will redirect user to box.net server, box.net will send etag/modifed-since headers. I understand it's bad for file handing, but I don't know what's the best to get all the correct hash and filesize.

          about 4, I will throw exception instead of returning false.

          Show
          Dongsheng Cai added a comment - - edited Petr The problem we have here is we don't really have the filesize and contenthash when create file reference, and when files in external repository changes, we cannot detect that then update the db accordingly. It might be not a issue we should worry about, for instance, box.net will give us a secret url for file reference, when moodle serving this file, box.net plugin will redirect user to box.net server, box.net will send etag/modifed-since headers. I understand it's bad for file handing, but I don't know what's the best to get all the correct hash and filesize. about 4, I will throw exception instead of returning false.
          Hide
          Petr Skoda added a comment - - edited

          I strongly disagree, all methods of stored_file must imo return valid results otherwise everything gets very messy soon - other APIs start breaking randomly. I believe if you can not get file content then the stuff does not belong to the files table, sorry, file api just can not work without the file content. Technically you could alter all these methods to work properly, that is: each method in stored_file should first look if file is external, then verify lifetime, then fetch new content if necessary and update ALL flags and store the new content in normal file pool. That way it would be 100% accurate, 99% backwards compatible and self contained without any need for second filepool you implemented.

          Show
          Petr Skoda added a comment - - edited I strongly disagree, all methods of stored_file must imo return valid results otherwise everything gets very messy soon - other APIs start breaking randomly. I believe if you can not get file content then the stuff does not belong to the files table, sorry, file api just can not work without the file content. Technically you could alter all these methods to work properly, that is: each method in stored_file should first look if file is external, then verify lifetime, then fetch new content if necessary and update ALL flags and store the new content in normal file pool. That way it would be 100% accurate, 99% backwards compatible and self contained without any need for second filepool you implemented.
          Hide
          Petr Skoda added a comment -

          If you need to strictly link only then please, please do it with links or special placeholders in html code that gets post-proccesed by filters.

          Show
          Petr Skoda added a comment - If you need to strictly link only then please, please do it with links or special placeholders in html code that gets post-proccesed by filters.
          Hide
          Martin Dougiamas added a comment -

          Petr, we can not copy all the external file content into Moodle. That was exactly the problem with the 2.x design and why everyone hates Moodle 2.x files.

          "other APIs start breaking randomly." - that is the detailed info that is needed now.

          Show
          Martin Dougiamas added a comment - Petr, we can not copy all the external file content into Moodle. That was exactly the problem with the 2.x design and why everyone hates Moodle 2.x files. "other APIs start breaking randomly." - that is the detailed info that is needed now.
          Hide
          Petr Skoda added a comment -

          Wait, this seems to be a misunderstanding, I did not talk about the linking (or better call it synchronisation with repos), I was describing problems in the implementation submitted for integration. I proposed a different implementation that does not break the current file APIs.

          Let me sum it up:
          1/ you want moodle files to "magically" synchronise with external repositories, right?
          2/ I believe the proposed patch breaks the current file API and is inefficient
          3/ I think there is a way to implement the necessary changes in a way that is 95% backwards compatible (the 5% is difference being the http caching + performance cost), most of the changes would be directly in stored_file
          4/ I can not imagine how could the users fetch files directly from repository without moodle getting the content of the file. It would be extremely hard to do it securely with proper access control and in any case we often need the file content in stored_file for operations such as zipping, csv upload processing, grades import, etc. Also what happens when external system if offline...
          5/ Internally it should imo behave like proxy cache with redefined user access control and extra security features (such as forced download). Users would not be in direct browser/http contact with the repository server, all communication would go through moodle server.

          Here is my proposal:

          • add new fields to files table:
            • externalid - linking to table describing the external file (pointing to more file and repository info, more files could link to one external file and bulk synchronise)
            • externalverified - last time when synchronised content with ext repo
            • externallifetime - how often to sync with ext repo
          • add new table for external repository files description - id, repoid, repodata, etc.
          • implement automatic content sync in stored_file class, sync would be called before any operation that involves repo file content - content itself, contenthash, time modified, file size.
          • add new options for file picker and friends which specifies if links are allowed - UI side only (client side validation)
          • add new options for file processing in forms - this would strip all ext links when not supported in given area that does not support later changes of file content such as assignment, workshop, etc. (server side validation)
          • add new options for creation of files from existing stored file - strip ext links option
          • lower HTTP lifetime of synchronised files and use ETags to prevent repeated sending of file content

          My main concern is that you want to push this through integration right before the freeze when it is evident that the patch has major design and implementation problems that can not be fixed in a matter of days. I am not against this new feature, my previous concern was performance and UI issues, that is why I did not want to implement it myself before. After working on file sending last weekend I think the performance should not be a big problem, let's see how the new UI from Marina is able to accommodate the elements necessary for this file sync.

          Show
          Petr Skoda added a comment - Wait, this seems to be a misunderstanding, I did not talk about the linking (or better call it synchronisation with repos), I was describing problems in the implementation submitted for integration. I proposed a different implementation that does not break the current file APIs. Let me sum it up: 1/ you want moodle files to "magically" synchronise with external repositories, right? 2/ I believe the proposed patch breaks the current file API and is inefficient 3/ I think there is a way to implement the necessary changes in a way that is 95% backwards compatible (the 5% is difference being the http caching + performance cost), most of the changes would be directly in stored_file 4/ I can not imagine how could the users fetch files directly from repository without moodle getting the content of the file. It would be extremely hard to do it securely with proper access control and in any case we often need the file content in stored_file for operations such as zipping, csv upload processing, grades import, etc. Also what happens when external system if offline... 5/ Internally it should imo behave like proxy cache with redefined user access control and extra security features (such as forced download). Users would not be in direct browser/http contact with the repository server, all communication would go through moodle server. Here is my proposal: add new fields to files table: externalid - linking to table describing the external file (pointing to more file and repository info, more files could link to one external file and bulk synchronise) externalverified - last time when synchronised content with ext repo externallifetime - how often to sync with ext repo add new table for external repository files description - id, repoid, repodata, etc. implement automatic content sync in stored_file class, sync would be called before any operation that involves repo file content - content itself, contenthash, time modified, file size. add new options for file picker and friends which specifies if links are allowed - UI side only (client side validation) add new options for file processing in forms - this would strip all ext links when not supported in given area that does not support later changes of file content such as assignment, workshop, etc. (server side validation) add new options for creation of files from existing stored file - strip ext links option lower HTTP lifetime of synchronised files and use ETags to prevent repeated sending of file content My main concern is that you want to push this through integration right before the freeze when it is evident that the patch has major design and implementation problems that can not be fixed in a matter of days. I am not against this new feature, my previous concern was performance and UI issues, that is why I did not want to implement it myself before. After working on file sending last weekend I think the performance should not be a big problem, let's see how the new UI from Marina is able to accommodate the elements necessary for this file sync.
          Hide
          Martin Dougiamas added a comment -

          Petr, this is not being pushed through in "days" - it's been under development for months.

          You took yourself out of the project at the start. You refused to be involved. You have missed months of discussions about the needs that led to this design, and have chosen to get yourself involved only now. I appreciate your opinion but proposing complete alternative designs now is not helping. The proposal phase is long gone.

          The key thing you are missing in your proposal is this: we do NOT always want to store copies of all repository files in Moodle and serve them ourselves. Some will be huge. Some will have copyright restrictions and monitoring requirements. Some will be frequently updated and those updates must propogate instantly. The repository plugin must be allowed to decide where the content is. They may choose to let the repository serve the content directly using their own token systems, perhaps via a CDN. We need to let repositories do what they are designed to do - not cripple them. This is a FUNDAMENTAL criteria.

          Please - either help polish this design in 2.3 or don't help at all.

          Show
          Martin Dougiamas added a comment - Petr, this is not being pushed through in "days" - it's been under development for months. You took yourself out of the project at the start. You refused to be involved. You have missed months of discussions about the needs that led to this design, and have chosen to get yourself involved only now. I appreciate your opinion but proposing complete alternative designs now is not helping. The proposal phase is long gone. The key thing you are missing in your proposal is this: we do NOT always want to store copies of all repository files in Moodle and serve them ourselves. Some will be huge. Some will have copyright restrictions and monitoring requirements. Some will be frequently updated and those updates must propogate instantly. The repository plugin must be allowed to decide where the content is. They may choose to let the repository serve the content directly using their own token systems, perhaps via a CDN. We need to let repositories do what they are designed to do - not cripple them. This is a FUNDAMENTAL criteria. Please - either help polish this design in 2.3 or don't help at all.
          Hide
          Petr Skoda added a comment -

          The stored_file as the name suggests deals with files stored in moodle, if you suddenly remove half of the methods or make them return invalid data stuff using it is going to break. Even if you do not have the file contents you need to return valid data from stored_file somehow for our PHP methods, you could lazy download the file form server when necessary. I agree that alternative download via CDN would be nice to have and it does not conflict with my proposal.

          If the patch was implemented without bugs and regressions I would keep my mouth shut, but that is not the case and I hope the problems I reported prove it.

          Show
          Petr Skoda added a comment - The stored_file as the name suggests deals with files stored in moodle, if you suddenly remove half of the methods or make them return invalid data stuff using it is going to break. Even if you do not have the file contents you need to return valid data from stored_file somehow for our PHP methods, you could lazy download the file form server when necessary. I agree that alternative download via CDN would be nice to have and it does not conflict with my proposal. If the patch was implemented without bugs and regressions I would keep my mouth shut, but that is not the case and I hope the problems I reported prove it.
          Hide
          Dongsheng Cai added a comment - - edited

          if you suddenly remove half of the methods

          I don't understand this, stored_file::get_content_file_location will ask repository plugin to download the file if moodle is asking for file's location, so it could handle the external files, but slow.

          If the patch was implemented without bugs and regressions I would keep my mouth shut, but that is not the case and I hope the problems I reported prove it.

          Yes, it has bugs, I'm fixing while you pointing them out, can we just work together to get this done, not just arguing the design, it has been announced for months, too late to change the whole thing.

          Show
          Dongsheng Cai added a comment - - edited if you suddenly remove half of the methods I don't understand this, stored_ file::get_content_file_location will ask repository plugin to download the file if moodle is asking for file's location, so it could handle the external files, but slow. If the patch was implemented without bugs and regressions I would keep my mouth shut, but that is not the case and I hope the problems I reported prove it. Yes, it has bugs, I'm fixing while you pointing them out, can we just work together to get this done, not just arguing the design, it has been announced for months, too late to change the whole thing.
          Hide
          Martin Dougiamas added a comment -

          Could you make sure all the methods, eg get_filesize(), do the same thing, DS? Just to make sure that repositories are ABLE to satisfy everything that someone might ask for via stored_file. (Whether or not that is technically possible for the repository plugin is up to the repository, of course)

          Show
          Martin Dougiamas added a comment - Could you make sure all the methods, eg get_filesize(), do the same thing, DS? Just to make sure that repositories are ABLE to satisfy everything that someone might ask for via stored_file. (Whether or not that is technically possible for the repository plugin is up to the repository, of course)
          Hide
          Dongsheng Cai added a comment -

          Martin

          Sure, I will do it tomorrow.

          Show
          Dongsheng Cai added a comment - Martin Sure, I will do it tomorrow.
          Hide
          Petr Skoda added a comment -

          Please try to reconsider the database structure changes, the current 'files_reference' table pointing to files table via fileid seems to me like a major problem because:
          1/ it makes $fs->get_file_instance() backwards incompatible because you need to join one more table when getting file record, it does not even seem to try to fetch missing info
          2/ if somebody decides we need more columns in file_reference table you need to modify all code again - I suspect there will need to be some more caching info there soon once the repositories start exposing filesize, content, contenthash, etc.
          3/ what happens when you create a file from existing linked file? The file_references start multiplicating like rabbits, is that really expected? How is that going to affect performance?

          I do not see any valid reason to have this info in separate tables when you only allow one-to-one mapping there and you fetch it in each and every case together and you have to manually keep them in sync at all times. If there was a foreign key in files table pointing to shared file reference table it might open new possibilities in the future.

          Show
          Petr Skoda added a comment - Please try to reconsider the database structure changes, the current 'files_reference' table pointing to files table via fileid seems to me like a major problem because: 1/ it makes $fs->get_file_instance() backwards incompatible because you need to join one more table when getting file record, it does not even seem to try to fetch missing info 2/ if somebody decides we need more columns in file_reference table you need to modify all code again - I suspect there will need to be some more caching info there soon once the repositories start exposing filesize, content, contenthash, etc. 3/ what happens when you create a file from existing linked file? The file_references start multiplicating like rabbits, is that really expected? How is that going to affect performance? I do not see any valid reason to have this info in separate tables when you only allow one-to-one mapping there and you fetch it in each and every case together and you have to manually keep them in sync at all times. If there was a foreign key in files table pointing to shared file reference table it might open new possibilities in the future.
          Hide
          Petr Skoda added a comment -

          For the record:

          I did not want to work this new feature because the original requirement was to "Allow serving of files without getting the content of files from external repositories by hacking current stored_file class" which is something I have no idea how to implement. A few weeks ago this was submitted for integration and I saw that the implementation actually gets the file content from external repositories - that is a HUGE difference from the original requirement for me. Soon I realised there is a different, relatively clean way to implement this, so I started pointing out the bugs to get your attention and I proposed an alternative implementation which I think has more advanced caching mechanism and better backwards compatibility.

          Show
          Petr Skoda added a comment - For the record: I did not want to work this new feature because the original requirement was to "Allow serving of files without getting the content of files from external repositories by hacking current stored_file class" which is something I have no idea how to implement. A few weeks ago this was submitted for integration and I saw that the implementation actually gets the file content from external repositories - that is a HUGE difference from the original requirement for me. Soon I realised there is a different, relatively clean way to implement this, so I started pointing out the bugs to get your attention and I proposed an alternative implementation which I think has more advanced caching mechanism and better backwards compatibility.
          Hide
          Dongsheng Cai added a comment -

          Petr

          3/ what happens when you create a file from existing linked file? The file_references start multiplicating like rabbits, is that really expected? How is that going to affect performance?

          We don't create reference to reference, so when user pick a reference file and saying he want to create a reference to it, repository plugin will find the original file then the new reference will point to the original file directly.

          At the very beginning, we was planing to add a new field to mdl_files table, but some developers strongly against this solution, so we moved reference field to separate table, I agree this made stored_file->file_record confusing, but it's really hard to make everyone happy...Petr are you suggesting adding reference field to mdl_files or we store filearea/component/path/filename in references table?

          Show
          Dongsheng Cai added a comment - Petr 3/ what happens when you create a file from existing linked file? The file_references start multiplicating like rabbits, is that really expected? How is that going to affect performance? We don't create reference to reference, so when user pick a reference file and saying he want to create a reference to it, repository plugin will find the original file then the new reference will point to the original file directly. At the very beginning, we was planing to add a new field to mdl_files table, but some developers strongly against this solution, so we moved reference field to separate table, I agree this made stored_file->file_record confusing, but it's really hard to make everyone happy...Petr are you suggesting adding reference field to mdl_files or we store filearea/component/path/filename in references table?
          Hide
          Petr Skoda added a comment -

          Hmm, this is not about making people happy, this is about code quality and maintainability. Artificial splitting of information into two tables (current approach) is really bad for maintainability, code complexity and backwards compatibility, the fact that it lacks the checks for the missing join in input data is critical because it may result in data loss. The second possible design alternative is to normalise the storage of 'reference' info by creating foreign key in the files table - I do not know if that is better way and what exactly is stored in the reference text field, but it looks like it may get duplicated and hence the database normalisation seems like a good candidate, it would also prevent accidents with missing data in $file_record, it might be also used for caching improvements inside stored_file (we simply need something that identifies the external file that can be searched in database, encoding that info into text field only seems like a bad idea to me, sorry).

          Forgive my persistence, but stored_file->update() is really really bad because it does not do any input validation. I can easily imagine it will be the most abused function in file api and will cause bugs and even data loss. Any new developer will just modify stuff there directly without thinking about consequences. Compare it with any create file method in file storage. If you need to do repo specific change please create a new method in stored_file->import_external_file() that does only what you need and does not allow anybody to break everything.

          Show
          Petr Skoda added a comment - Hmm, this is not about making people happy, this is about code quality and maintainability. Artificial splitting of information into two tables (current approach) is really bad for maintainability, code complexity and backwards compatibility, the fact that it lacks the checks for the missing join in input data is critical because it may result in data loss. The second possible design alternative is to normalise the storage of 'reference' info by creating foreign key in the files table - I do not know if that is better way and what exactly is stored in the reference text field, but it looks like it may get duplicated and hence the database normalisation seems like a good candidate, it would also prevent accidents with missing data in $file_record, it might be also used for caching improvements inside stored_file (we simply need something that identifies the external file that can be searched in database, encoding that info into text field only seems like a bad idea to me, sorry). Forgive my persistence, but stored_file->update() is really really bad because it does not do any input validation. I can easily imagine it will be the most abused function in file api and will cause bugs and even data loss. Any new developer will just modify stuff there directly without thinking about consequences. Compare it with any create file method in file storage. If you need to do repo specific change please create a new method in stored_file->import_external_file() that does only what you need and does not allow anybody to break everything.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Ho, my 2 cents:

          • If really we are getting the files from the repos, then I've nothing against to add one (or multiple) fields to the mdl_files table. Why? Simple, because then they are real "stored" files. As simple as that.
          • More yet, if we are getting the files from the repos (external or internal), I think that there are better solutions to keep all copies 100% in sync with source, asking repo each XXX time one file is accessed AND also in an automated way. It would be simply about to allow each file to specify one repositoryfileid pointing to one table where all the information pointing to the file source would be stored. Note it's one 1:n relation, so multiple mdl_files would be pointing to the same repositoryfile->id record. Easy to maintain, sync and without "destroying" the file api. They will be perfect normal and standard stored files.
          • You can name those files retrieved from repo whatever you want, proxied files, aliased files... and you can make them to stop working if they are deleted from the source repository. No problem with that. More yet, surely you can redirect browsers to access to the source file in repository when the repository has not "strange" access methods. Or you can serve them from Moodle. Each repo decides.
          • The key point here is that it does not break the files API and stored files are always that, stored files. How they are fetched/served and kept on sync is the extra functionality we are addressing here, but without breaking solid concepts.

          I'm sure that, if we are really downloading them... we can create "easily" an alternative schema to keep everything working without much disruption.

          And once working... it's a matter of expose it with a nice API for repository developers and UI for users. Sounds simple, isn't it?

          Dears all, I know it's horribly late and freeze is days away... but really, if we can do it 1) in time and 2) in a better way... then my +1 goes to that.

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Ho, my 2 cents: If really we are getting the files from the repos, then I've nothing against to add one (or multiple) fields to the mdl_files table. Why? Simple, because then they are real "stored" files. As simple as that. More yet, if we are getting the files from the repos (external or internal), I think that there are better solutions to keep all copies 100% in sync with source, asking repo each XXX time one file is accessed AND also in an automated way. It would be simply about to allow each file to specify one repositoryfileid pointing to one table where all the information pointing to the file source would be stored. Note it's one 1:n relation, so multiple mdl_files would be pointing to the same repositoryfile->id record. Easy to maintain, sync and without "destroying" the file api. They will be perfect normal and standard stored files. You can name those files retrieved from repo whatever you want, proxied files, aliased files... and you can make them to stop working if they are deleted from the source repository. No problem with that. More yet, surely you can redirect browsers to access to the source file in repository when the repository has not "strange" access methods. Or you can serve them from Moodle. Each repo decides. The key point here is that it does not break the files API and stored files are always that, stored files. How they are fetched/served and kept on sync is the extra functionality we are addressing here, but without breaking solid concepts. I'm sure that, if we are really downloading them... we can create "easily" an alternative schema to keep everything working without much disruption. And once working... it's a matter of expose it with a nice API for repository developers and UI for users. Sounds simple, isn't it? Dears all, I know it's horribly late and freeze is days away... but really, if we can do it 1) in time and 2) in a better way... then my +1 goes to that. Ciao
          Hide
          Petr Skoda added a comment -

          I was discussing this with Eloy a bit, here are my notes describing potential solution: https://github.com/skodak/moodle/compare/master...repoproxy

          Show
          Petr Skoda added a comment - I was discussing this with Eloy a bit, here are my notes describing potential solution: https://github.com/skodak/moodle/compare/master...repoproxy
          Hide
          Dongsheng Cai added a comment -

          Petr

          Thanks for the code, get_proxy_file_by_reference is pretty much the same as get_file_by_reference, I could integrate your proposals to current patch, I will look into it and post more details here later.

          Show
          Dongsheng Cai added a comment - Petr Thanks for the code, get_proxy_file_by_reference is pretty much the same as get_file_by_reference, I could integrate your proposals to current patch, I will look into it and post more details here later.
          Hide
          Dongsheng Cai added a comment -

          Petr

          I think you code is still caching everything, yes, it has proxy_readfile in repository, but stored_file->readfile() called too late in send_stored_file function, before it being called, we already cached content to get filesize, and a lot of moodle headers already sent, for simple redirection (most repositories do this), that's too much.

          Show
          Dongsheng Cai added a comment - Petr I think you code is still caching everything, yes, it has proxy_readfile in repository, but stored_file->readfile() called too late in send_stored_file function, before it being called, we already cached content to get filesize, and a lot of moodle headers already sent, for simple redirection (most repositories do this), that's too much.
          Hide
          Martin Dougiamas added a comment -

          As I said above: We do not always want to store copies of all repository files in Moodle. We can provide a cache facility if the repository wants to support it for efficiency, but they need to also be able to choose not to. This is a requirement.

          Show
          Martin Dougiamas added a comment - As I said above: We do not always want to store copies of all repository files in Moodle. We can provide a cache facility if the repository wants to support it for efficiency, but they need to also be able to choose not to. This is a requirement.
          Hide
          Petr Skoda added a comment -

          Dongsheng: you can undo headers if they were not send yet. I did not code this part because it was colliding with my X-Sendfile+ETag patch. So yes, this would need a few lines in readfile_accel() to make it work, I will rebase and add the missing bits.

          Martin: the missing content can be implemented via file->status, I will update the patch to illustrate what I mean.

          Show
          Petr Skoda added a comment - Dongsheng: you can undo headers if they were not send yet. I did not code this part because it was colliding with my X-Sendfile+ETag patch. So yes, this would need a few lines in readfile_accel() to make it work, I will rebase and add the missing bits. Martin: the missing content can be implemented via file->status, I will update the patch to illustrate what I mean.
          Hide
          Sam Hemelryk added a comment -

          Reopening this guys, sounds like there is still more dev work going on here and we're cleaning up the integration queue.

          Show
          Sam Hemelryk added a comment - Reopening this guys, sounds like there is still more dev work going on here and we're cleaning up the integration queue.
          Hide
          Marina Glancy added a comment - - edited

          Dongsheng,
          I copied the function stored_file::update() to my branch, because I also need it.
          https://github.com/dongsheng/moodle/compare/master...dev_MDL-28666_external_repo_master#L22R91

          I noticed that it is not completely correct: you allow to update field only if (isset($this->file_record->$field)). Which means that if field was empty string ('author' in my case), I can never set it to something meaningful.

              /**
               * Update some file record fields
               *
               * @param stdClass $dataobject
               */
              public function update($dataobject) {
                  global $DB;
                  $keys = array_keys((array)$this->file_record);
                  foreach ($dataobject as $field => $value) {
                      if (in_array($field, $keys)) {
                          $this->file_record->$field = $value;
                      } else {
                          throw new coding_exception("Invalid field name, $field doesn't exist in file record");
                      }
                  }
                  $DB->update_record('files', $this->file_record);
              }
          
          Show
          Marina Glancy added a comment - - edited Dongsheng, I copied the function stored_ file::update( ) to my branch, because I also need it. https://github.com/dongsheng/moodle/compare/master...dev_MDL-28666_external_repo_master#L22R91 I noticed that it is not completely correct: you allow to update field only if (isset($this->file_record->$field)) . Which means that if field was empty string ('author' in my case), I can never set it to something meaningful. /** * Update some file record fields * * @param stdClass $dataobject */ public function update($dataobject) { global $DB; $keys = array_keys((array)$ this ->file_record); foreach ($dataobject as $field => $value) { if (in_array($field, $keys)) { $ this ->file_record->$field = $value; } else { throw new coding_exception( "Invalid field name, $field doesn't exist in file record" ); } } $DB->update_record('files', $ this ->file_record); }
          Hide
          Dongsheng Cai added a comment -

          Marina

          You change has been added to the branch, thanks.

          Show
          Dongsheng Cai added a comment - Marina You change has been added to the branch, thanks.
          Hide
          Dongsheng Cai added a comment -

          Petr

          I integrated your db changes and APIs to record last sync times, added a repository api to return repo file lifetime.

          I haven't integrated readfile() method yet, I am not sure what's your plan, currently, it has to cache file all the time, because before calling readfile(), sync_external_file already stored the file in moodle. I don't want to touch the logic in send_stored_file() at this stage yet, we are too close to 2.3 release. Please comment your plan for readfile(), we can discuss it.

          Show
          Dongsheng Cai added a comment - Petr I integrated your db changes and APIs to record last sync times, added a repository api to return repo file lifetime. I haven't integrated readfile() method yet, I am not sure what's your plan, currently, it has to cache file all the time, because before calling readfile(), sync_external_file already stored the file in moodle. I don't want to touch the logic in send_stored_file() at this stage yet, we are too close to 2.3 release. Please comment your plan for readfile(), we can discuss it.
          Hide
          Marina Glancy added a comment -

          Actually, this function is very dangerous. It does not make any checks like in file_storage::
          create_file_from_storedfile(), create_file_from_pathname(), create_file_from_string(), create_file_from_url(), etc.

          By the way, why there is so much code duplication?

          Besides, in stored_file::update() 'timemodified' should be automatically changed if not specified like it happens in
          file_storage::create_file_from_pathname()

          Show
          Marina Glancy added a comment - Actually, this function is very dangerous. It does not make any checks like in file_storage:: create_file_from_storedfile(), create_file_from_pathname(), create_file_from_string(), create_file_from_url(), etc. By the way, why there is so much code duplication? Besides, in stored_ file::update( ) 'timemodified' should be automatically changed if not specified like it happens in file_storage::create_file_from_pathname()
          Hide
          Dongsheng Cai added a comment - - edited

          Marina

          This function is intended to be used by new filemanager, and not being used anywhere else, so feel free change it or start a new one to suits your needs.

          Show
          Dongsheng Cai added a comment - - edited Marina This function is intended to be used by new filemanager, and not being used anywhere else, so feel free change it or start a new one to suits your needs.
          Hide
          Marina Glancy added a comment -

          Dongsheng, it suits my needs. I can update license and author information using it.

          I just wonder that maybe we can write safe code without thinking that "it's only for us"?
          It's a public function which means any plugin developer can do serious harm to the file system without even understanding that. "The function exists, so it's safe to use it. Let's change the file component then!".

          Show
          Marina Glancy added a comment - Dongsheng, it suits my needs. I can update license and author information using it. I just wonder that maybe we can write safe code without thinking that "it's only for us"? It's a public function which means any plugin developer can do serious harm to the file system without even understanding that. "The function exists, so it's safe to use it. Let's change the file component then!".
          Hide
          Dongsheng Cai added a comment -

          Added validation to stored_file::update, I am not sure if this is the right place to do it, probably better to create a new method file_storage::update(stored_file $file, $filerecord), and keep stored_file::update simple.

          Show
          Dongsheng Cai added a comment - Added validation to stored_ file::update , I am not sure if this is the right place to do it, probably better to create a new method file_storage::update(stored_file $file, $filerecord), and keep stored_ file::update simple.
          Hide
          Dongsheng Cai added a comment -

          Created a new branch for rebasing

          Show
          Dongsheng Cai added a comment - Created a new branch for rebasing
          Hide
          Marina Glancy added a comment -

          Hi Dongsheng,
          this is the new branch that has my changes on top of yours:
          https://github.com/marinaglancy/moodle/compare/master...wip-files23
          Two things I noticed

          1. (very critical)

          {"error":"Error writing to database","stacktrace":"* line 416 of \/lib\/dml\/moodle_database.php: dml_write_exception thrown\n* line 955 of \/lib\/dml\/mysqli_native_moodle_database.php: call to moodle_database->query_end()\n* line 997 of \/lib\/dml\/mysqli_native_moodle_database.php: call to mysqli_native_moodle_database->insert_record_raw()\n* line 1271 of \/lib\/filestorage\/file_storage.php: call to mysqli_native_moodle_database->insert_record()\n* line 299 of \/repository\/repository_ajax.php: call to file_storage->create_file_from_reference()\n","debuginfo":"Field 'fileid' doesn't have a default value\nINSERT INTO mdl_files_reference (repositoryid,reference,lastsync,lifetime) VALUES(?,?,?,?)\n[array (\n 0 => 4,\n 1 => 'YTo2OntzOjk6ImNvbnRleHRpZCI7aToxMztzOjk6ImNvbXBvbmVudCI7czo0OiJ1c2VyIjtzOjY6Iml0ZW1pZCI7aTowO3M6ODoiZmlsZWFyZWEiO3M6NzoicHJpdmF0ZSI7czo4OiJmaWxlcGF0aCI7czoxOiIvIjtzOjg6ImZpbGVuYW1lIjtzOjEwOiJGcmFuY2UuanBnIjt9',\n 2 => 0,\n 3 => 86400,\n)]","reproductionlink":"http:\/\/marina.moodle.local\/master\/"}

          2. (a suggestion)
          I think you should throw an exception either in stored_file::rename or stored_file::update when trying to change filepath/filename to already existing. I check it in my code, but it should be checked in API as well

          And also I corrected the declaration of repository::cache_file_by_reference() in .../repository/dropbox/lib.php on line 457 by removing default null for stored file;
          it is now the same as in repository_dropbox::cache_file_by_reference()

          Show
          Marina Glancy added a comment - Hi Dongsheng, this is the new branch that has my changes on top of yours: https://github.com/marinaglancy/moodle/compare/master...wip-files23 Two things I noticed 1. (very critical) {"error":"Error writing to database","stacktrace":"* line 416 of \/lib\/dml\/moodle_database.php: dml_write_exception thrown\n* line 955 of \/lib\/dml\/mysqli_native_moodle_database.php: call to moodle_database->query_end()\n* line 997 of \/lib\/dml\/mysqli_native_moodle_database.php: call to mysqli_native_moodle_database->insert_record_raw()\n* line 1271 of \/lib\/filestorage\/file_storage.php: call to mysqli_native_moodle_database->insert_record()\n* line 299 of \/repository\/repository_ajax.php: call to file_storage->create_file_from_reference()\n","debuginfo":"Field 'fileid' doesn't have a default value\nINSERT INTO mdl_files_reference (repositoryid,reference,lastsync,lifetime) VALUES(?,?,?,?)\n[array (\n 0 => 4,\n 1 => 'YTo2OntzOjk6ImNvbnRleHRpZCI7aToxMztzOjk6ImNvbXBvbmVudCI7czo0OiJ1c2VyIjtzOjY6Iml0ZW1pZCI7aTowO3M6ODoiZmlsZWFyZWEiO3M6NzoicHJpdmF0ZSI7czo4OiJmaWxlcGF0aCI7czoxOiIvIjtzOjg6ImZpbGVuYW1lIjtzOjEwOiJGcmFuY2UuanBnIjt9',\n 2 => 0,\n 3 => 86400,\n)]","reproductionlink":"http:\/\/marina.moodle.local\/master\/"} 2. (a suggestion) I think you should throw an exception either in stored_ file::rename or stored_ file::update when trying to change filepath/filename to already existing. I check it in my code, but it should be checked in API as well And also I corrected the declaration of repository::cache_file_by_reference() in .../repository/dropbox/lib.php on line 457 by removing default null for stored file; it is now the same as in repository_dropbox::cache_file_by_reference()
          Hide
          Dan Poltawski added a comment -

          To help the collaboration here I suggest just working on top of wip-files23, if we need to tidy up commits for integration we can do this at last stage, so lets just fix things on top.

          Show
          Dan Poltawski added a comment - To help the collaboration here I suggest just working on top of wip-files23, if we need to tidy up commits for integration we can do this at last stage, so lets just fix things on top.
          Hide
          Marina Glancy added a comment -

          Dongsheng,
          another problem from the mismatching lib/upgrade.php and lib/install.xml:
          1.
          Notice: Undefined property: stdClass::$referencelifetime in /var/www/repositories/master/moodle/lib/filestorage/file_storage.php on line 856

          2.
          Error writing to database

          More information about this error
          Debug info: Unknown column 'referencelifetime' in 'field list'
          UPDATE mdl_files SET contenthash = ?, filesize = ?, referencelastsync = ?, referencelifetime = ?, timemodified = ? WHERE referencefileid = ? AND contenthash <> ?
          [array (
          0 => '905892d700978f5477cc097012429316cf38b4d9',
          1 => '6837',
          2 => 1336970772,
          3 => NULL,
          4 => 1336970772,
          5 => '4',
          6 => '905892d700978f5477cc097012429316cf38b4d9',
          )]
          Stack trace:

          line 416 of /lib/dml/moodle_database.php: dml_write_exception thrown
          line 782 of /lib/dml/mysqli_native_moodle_database.php: call to moodle_database->query_end()
          line 2170 of /repository/lib.php: call to mysqli_native_moodle_database->execute()
          line 468 of /lib/filestorage/stored_file.php: call to repository::sync_external_file()
          line 553 of /lib/filestorage/stored_file.php: call to stored_file->sync_external_file()
          line 589 of /lib/filelib.php: call to stored_file->get_filesize()
          line 334 of /lib/form/filemanager.php: call to file_get_drafarea_files()
          line 255 of /lib/form/filemanager.php: call to form_filemanager->__construct()
          line 183 of /lib/pear/HTML/QuickForm/Renderer/Tableless.php: call to MoodleQuickForm_filemanager->toHtml()
          line 2377 of /lib/formslib.php: call to HTML_QuickForm_Renderer_Tableless->renderElement()
          line 403 of /lib/pear/HTML/QuickForm/element.php: call to MoodleQuickForm_Renderer->renderElement()
          line 1631 of /lib/pear/HTML/QuickForm.php: call to HTML_QuickForm_element->accept()
          line 1407 of /lib/formslib.php: call to HTML_QuickForm->accept()
          line 1674 of /lib/pear/HTML/QuickForm.php: call to MoodleQuickForm->accept()
          line 435 of /lib/pear/HTML/Common.php: call to HTML_QuickForm->toHtml()
          line 204 of /lib/pear/HTML/QuickForm/DHTMLRulesTableless.php: call to HTML_Common->display()
          line 889 of /lib/formslib.php: call to HTML_QuickForm_DHTMLRulesTableless->display()
          line 658 of /course/modedit.php: call to moodleform->display()

          Show
          Marina Glancy added a comment - Dongsheng, another problem from the mismatching lib/upgrade.php and lib/install.xml: 1. Notice: Undefined property: stdClass::$referencelifetime in /var/www/repositories/master/moodle/lib/filestorage/file_storage.php on line 856 2. Error writing to database More information about this error Debug info: Unknown column 'referencelifetime' in 'field list' UPDATE mdl_files SET contenthash = ?, filesize = ?, referencelastsync = ?, referencelifetime = ?, timemodified = ? WHERE referencefileid = ? AND contenthash <> ? [array ( 0 => '905892d700978f5477cc097012429316cf38b4d9', 1 => '6837', 2 => 1336970772, 3 => NULL, 4 => 1336970772, 5 => '4', 6 => '905892d700978f5477cc097012429316cf38b4d9', )] Stack trace: line 416 of /lib/dml/moodle_database.php: dml_write_exception thrown line 782 of /lib/dml/mysqli_native_moodle_database.php: call to moodle_database->query_end() line 2170 of /repository/lib.php: call to mysqli_native_moodle_database->execute() line 468 of /lib/filestorage/stored_file.php: call to repository::sync_external_file() line 553 of /lib/filestorage/stored_file.php: call to stored_file->sync_external_file() line 589 of /lib/filelib.php: call to stored_file->get_filesize() line 334 of /lib/form/filemanager.php: call to file_get_drafarea_files() line 255 of /lib/form/filemanager.php: call to form_filemanager->__construct() line 183 of /lib/pear/HTML/QuickForm/Renderer/Tableless.php: call to MoodleQuickForm_filemanager->toHtml() line 2377 of /lib/formslib.php: call to HTML_QuickForm_Renderer_Tableless->renderElement() line 403 of /lib/pear/HTML/QuickForm/element.php: call to MoodleQuickForm_Renderer->renderElement() line 1631 of /lib/pear/HTML/QuickForm.php: call to HTML_QuickForm_element->accept() line 1407 of /lib/formslib.php: call to HTML_QuickForm->accept() line 1674 of /lib/pear/HTML/QuickForm.php: call to MoodleQuickForm->accept() line 435 of /lib/pear/HTML/Common.php: call to HTML_QuickForm->toHtml() line 204 of /lib/pear/HTML/QuickForm/DHTMLRulesTableless.php: call to HTML_Common->display() line 889 of /lib/formslib.php: call to HTML_QuickForm_DHTMLRulesTableless->display() line 658 of /course/modedit.php: call to moodleform->display()
          Hide
          Marina Glancy added a comment - - edited

          (deleted this comment because it does not reproduce after DB fix)

          Show
          Marina Glancy added a comment - - edited (deleted this comment because it does not reproduce after DB fix)
          Hide
          Marina Glancy added a comment -

          stored_file::is_external_file() does not seem to work for draft area files.
          Below is the dump of stored_file object that is a reference. Field 'repository' is empty, but reference exists (referencefileid, etc. are not empty):

          stored_file Object ( [fs:stored_file:private] => file_storage Object ( [filedir:file_storage:private] => /home/marina/web/moodledata/master/filedir [trashdir:file_storage:private] => /home/marina/web/moodledata/master/trashdir [tempdir:file_storage:private] => /home/marina/web/moodledata/master/temp/filestorage [dirpermissions:file_storage:private] => 511 [filepermissions:file_storage:private] => 438 ) [file_record:stored_file:private] => stdClass Object ( [id] => 24041 [contenthash] => 905892d700978f5477cc097012429316cf38b4d9 [pathnamehash] => 0942a22cdf661453e5e5266b758be8e2b6cb349a [contextid] => 13 [component] => user [filearea] => draft [itemid] => 702212956 [filepath] => / [filename] => France.jpg [userid] => 2 [filesize] => 6837 [mimetype] => image/jpeg [status] => 0 [source] => O:8:"stdClass":2:

          {s:6:"source";N;s:8:"original";s:220:"YTo2OntzOjk6ImNvbnRleHRpZCI7aToyOTY7czo5OiJjb21wb25lbnQiO3M6MTA6Im1vZF9mb2xkZXIiO3M6NjoiaXRlbWlkIjtpOjA7czo4OiJmaWxlYXJlYSI7czo3OiJjb250ZW50IjtzOjg6ImZpbGVwYXRoIjtzOjE6Ii8iO3M6ODoiZmlsZW5hbWUiO3M6MTA6IkZyYW5jZS5qcGciO30=";}

          [author] => Admin User [license] => allrightsreserved [timecreated] => 1336971687 [timemodified] => 1336971817 [sortorder] => 0 [referencefileid] => 11 [referencelastsync] => 1336971688 [referencelifetime] => 86400 ) [filedir:stored_file:private] => /home/marina/web/moodledata/master/filedir [repository] => )

          Show
          Marina Glancy added a comment - stored_ file::is_external_file( ) does not seem to work for draft area files. Below is the dump of stored_file object that is a reference. Field 'repository' is empty, but reference exists (referencefileid, etc. are not empty): stored_file Object ( [fs:stored_file:private] => file_storage Object ( [filedir:file_storage:private] => /home/marina/web/moodledata/master/filedir [trashdir:file_storage:private] => /home/marina/web/moodledata/master/trashdir [tempdir:file_storage:private] => /home/marina/web/moodledata/master/temp/filestorage [dirpermissions:file_storage:private] => 511 [filepermissions:file_storage:private] => 438 ) [file_record:stored_file:private] => stdClass Object ( [id] => 24041 [contenthash] => 905892d700978f5477cc097012429316cf38b4d9 [pathnamehash] => 0942a22cdf661453e5e5266b758be8e2b6cb349a [contextid] => 13 [component] => user [filearea] => draft [itemid] => 702212956 [filepath] => / [filename] => France.jpg [userid] => 2 [filesize] => 6837 [mimetype] => image/jpeg [status] => 0 [source] => O:8:"stdClass":2: {s:6:"source";N;s:8:"original";s:220:"YTo2OntzOjk6ImNvbnRleHRpZCI7aToyOTY7czo5OiJjb21wb25lbnQiO3M6MTA6Im1vZF9mb2xkZXIiO3M6NjoiaXRlbWlkIjtpOjA7czo4OiJmaWxlYXJlYSI7czo3OiJjb250ZW50IjtzOjg6ImZpbGVwYXRoIjtzOjE6Ii8iO3M6ODoiZmlsZW5hbWUiO3M6MTA6IkZyYW5jZS5qcGciO30=";} [author] => Admin User [license] => allrightsreserved [timecreated] => 1336971687 [timemodified] => 1336971817 [sortorder] => 0 [referencefileid] => 11 [referencelastsync] => 1336971688 [referencelifetime] => 86400 ) [filedir:stored_file:private] => /home/marina/web/moodledata/master/filedir [repository] => )
          Hide
          Marina Glancy added a comment -

          stored_file::search_reference_count() returns too many results. Seems that it counts draft area files as well

          Show
          Marina Glancy added a comment - stored_ file::search_reference_count( ) returns too many results. Seems that it counts draft area files as well
          Hide
          Dongsheng Cai added a comment -

          I created https://github.com/dongsheng/moodle/tree/wip-files23 branch.

          There is a new commit 40282b5, that fixed db regression and added more unit tests for draft files

          Show
          Dongsheng Cai added a comment - I created https://github.com/dongsheng/moodle/tree/wip-files23 branch. There is a new commit 40282b5, that fixed db regression and added more unit tests for draft files
          Hide
          Dongsheng Cai added a comment -

          File manager not loading, is it broken?

          Show
          Dongsheng Cai added a comment - File manager not loading, is it broken?
          Hide
          Marina Glancy added a comment -

          Re last comment (though we discussed it a lot yesterday in the meeting). Javascript files are cached in browser when using developer mode

          Show
          Marina Glancy added a comment - Re last comment (though we discussed it a lot yesterday in the meeting). Javascript files are cached in browser when using developer mode
          Hide
          Dan Poltawski added a comment -

          Hi Marina,

          That comment isn't quite accurate, moodle will cache JS if $CFG->cachejs is on (setting Appearance > AJAX and Javascript). In that case you'll need to either a) purge caches or b) have a moodle version number increment to get the new file served.

          If you have cachejs set to off, then the file will be served directly by your webserver (and expiry headers etc set by it. In this case perhaps we might need to tell the webserver to set cache to expire immediately).

          Actually after initially thinking that behavior was silly I understand it now.

          Show
          Dan Poltawski added a comment - Hi Marina, That comment isn't quite accurate, moodle will cache JS if $CFG->cachejs is on (setting Appearance > AJAX and Javascript). In that case you'll need to either a) purge caches or b) have a moodle version number increment to get the new file served. If you have cachejs set to off, then the file will be served directly by your webserver (and expiry headers etc set by it. In this case perhaps we might need to tell the webserver to set cache to expire immediately). Actually after initially thinking that behavior was silly I understand it now.
          Hide
          Marina Glancy added a comment -

          Dongsheng, one more comment to your code:

          file_storage::get_file_instance says in phpdocs: "@param stdClass $filerecord record from the files table"

          Although you actually expect it to be record from files left join files_reference table

          I suggest you to change phpdocs and display developer warning if passed object does not have additional fields from files_reference table.

          Show
          Marina Glancy added a comment - Dongsheng, one more comment to your code: file_storage::get_file_instance says in phpdocs: "@param stdClass $filerecord record from the files table" Although you actually expect it to be record from files left join files_reference table I suggest you to change phpdocs and display developer warning if passed object does not have additional fields from files_reference table.
          Hide
          Dongsheng Cai added a comment -

          phpdoc changes: c35f42b

          I don't think we should warn them, devs are not always create stored_file with reference support.

          Show
          Dongsheng Cai added a comment - phpdoc changes: c35f42b I don't think we should warn them, devs are not always create stored_file with reference support.
          Hide
          Dongsheng Cai added a comment -

          Seems there are changes in moodleform, not only private files editing form not saved, I just spotted the repository instance editing form not saved too, here is a fix: a9ef861, please integrate.

          Show
          Dongsheng Cai added a comment - Seems there are changes in moodleform, not only private files editing form not saved, I just spotted the repository instance editing form not saved too, here is a fix: a9ef861, please integrate.
          Hide
          Dan Poltawski added a comment -

          Broken url changes are related to MDL-30686

          Show
          Dan Poltawski added a comment - Broken url changes are related to MDL-30686
          Hide
          Petr Skoda added a comment -

          Dongsheng: it is not a correct fix, the admin/repositoryinstances.php does not set the actual url properly in

          admin_externalpage_setup($pagename)
          

          should be

          admin_externalpage_setup($pagename, '', null, new moodle_url('/admin/repositoryinstances.php'));
          

          I have created new issue for this MDL-33015.

          Show
          Petr Skoda added a comment - Dongsheng: it is not a correct fix, the admin/repositoryinstances.php does not set the actual url properly in admin_externalpage_setup($pagename) should be admin_externalpage_setup($pagename, '', null , new moodle_url('/admin/repositoryinstances.php')); I have created new issue for this MDL-33015 .
          Hide
          Marina Glancy added a comment -

          I picked Petr's commits in branch wip-files23. There was small merge conflict with the second commit.

          Show
          Marina Glancy added a comment - I picked Petr's commits in branch wip-files23. There was small merge conflict with the second commit.
          Hide
          Dan Poltawski added a comment -

          I have this hasn't been submitted for integration, but i've started reviewing.

          Show
          Dan Poltawski added a comment - I have this hasn't been submitted for integration, but i've started reviewing.
          Hide
          Martin Dougiamas added a comment -

          Adding a diff link for reference. Why was this never done before?

          Show
          Martin Dougiamas added a comment - Adding a diff link for reference. Why was this never done before?
          Show
          Helen Foster added a comment - Documentation on creating an alias added to the 2.3 docs: http://docs.moodle.org/23/en/Working_with_files http://docs.moodle.org/23/en/Private_files http://docs.moodle.org/23/en/Recent_files_repository http://docs.moodle.org/23/en/Server_files_repository http://docs.moodle.org/23/en/File_system_repository http://docs.moodle.org/23/en/Box.net_repository http://docs.moodle.org/23/en/EQUELLA_repository
          Hide
          Dongsheng Cai added a comment -

          This issue was assigned to me automatically, however I will not be able to work on this issue in the immediate future. In order to create a truer sense of the state of this issue and to allow other developers to have chance to become involved, I am removing myself as the assignee of this issue.
          For more information, see http://docs.moodle.org/dev/Changes_to_issue_assignment

          Show
          Dongsheng Cai added a comment - This issue was assigned to me automatically, however I will not be able to work on this issue in the immediate future. In order to create a truer sense of the state of this issue and to allow other developers to have chance to become involved, I am removing myself as the assignee of this issue. For more information, see http://docs.moodle.org/dev/Changes_to_issue_assignment
          Hide
          Frédéric Massart added a comment -

          Closing as this has been introduced in 2.3.

          Show
          Frédéric Massart added a comment - Closing as this has been introduced in 2.3.

            People

            • Votes:
              36 Vote for this issue
              Watchers:
              34 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: