Moodle
  1. Moodle
  2. MDL-41580

Allow an imsmanifest.xml file to be selected from a file system repository and allow relative linking

    Details

    • Testing Instructions:
      Hide

      set up a file system repository
      Add an extracted SCORM package to this repository
      Add a new SCORM to a course and select the imsmanifest.xml file from within the extracted SCORM package (make sure you select the option to make it an alias/shortcut (not a copy) - if alias is not selected when saving the SCORM page it should throw an error to say imsmanifest.xml is only supported when using an alias.

      Enter the SCORM and make sure all resources/images/files load as expected.

      more docs (screenshots) of this are available here:
      http://danmarsden.com/blog/2013/09/24/managing-scorm-content-in-moodle-2-6/

      Show
      set up a file system repository Add an extracted SCORM package to this repository Add a new SCORM to a course and select the imsmanifest.xml file from within the extracted SCORM package (make sure you select the option to make it an alias/shortcut (not a copy) - if alias is not selected when saving the SCORM page it should throw an error to say imsmanifest.xml is only supported when using an alias. Enter the SCORM and make sure all resources/images/files load as expected. more docs (screenshots) of this are available here: http://danmarsden.com/blog/2013/09/24/managing-scorm-content-in-moodle-2-6/
    • Affected Branches:
      MOODLE_26_STABLE
    • Fixed Branches:
      MOODLE_26_STABLE
    • Pull Master Branch:
      master_MDL-41580
    • Rank:
      52686

      Description

      This is a replacement for the imsrepository that was possible in Moodle 1.9 - it allows the teacher to have a repository of extracted SCORM content and allows them to select the imsmanifest.xml instead of the full SCORM zip - once this is done it loads all subsequent content from the repository. This does have security implications so will use an admin setting that is disabled by default.

        Issue Links

          Activity

          Hide
          Dan Marsden added a comment -

          initial version of this - code is quite hacky around the file handling but shows what I am trying to achieve here.

          Show
          Dan Marsden added a comment - initial version of this - code is quite hacky around the file handling but shows what I am trying to achieve here.
          Hide
          Dan Marsden added a comment -

          adding a few watchers here - would be great if I could get some feedback on this:.

          To get the root path of a repository a stored file is linked to I've added a function that returns the full repository for a stored file - is there a better suggested method of obtaining this information?

          The actual file handling stuff I've added needs a few more security related checks but I need to work out the best way of obtaining the basepath to use - does anyone have any feedback on the way I'm doing this or have suggestions on a better way?

          Show
          Dan Marsden added a comment - adding a few watchers here - would be great if I could get some feedback on this:. To get the root path of a repository a stored file is linked to I've added a function that returns the full repository for a stored file - is there a better suggested method of obtaining this information? The actual file handling stuff I've added needs a few more security related checks but I need to work out the best way of obtaining the basepath to use - does anyone have any feedback on the way I'm doing this or have suggestions on a better way?
          Hide
          Dan Marsden added a comment -

          bouncing up for peer review - I expect there will be feedback on the way I'm getting information from the repository - keen for any suggestions on how to improve this!

          Show
          Dan Marsden added a comment - bouncing up for peer review - I expect there will be feedback on the way I'm getting information from the repository - keen for any suggestions on how to improve this!
          Hide
          Dan Marsden added a comment -

          submitting for integration - have waited for peer review for over 7 days.

          Show
          Dan Marsden added a comment - submitting for integration - have waited for peer review for over 7 days.
          Hide
          Marina Glancy added a comment -

          Hi Dan, as we talked about in jabber, I reopen this issue and will do a peer review shortly.

          Show
          Marina Glancy added a comment - Hi Dan, as we talked about in jabber, I reopen this issue and will do a peer review shortly.
          Hide
          Marina Glancy added a comment - - edited

          Hi Dan,
          this is a good idea but implementation could be better.
          Frédéric Massart maybe you can also have a look and add something.

          This is my suggestion:

          Define interface inside repository/lib.php that contains only one function - something like get_linked_file(stored_file $mainfile, string $relativepath) that returns a file that is referenced to in $mainfile. Repository will make sure that file exists and accessible. Btw not only scorm but potentially file resource can use it to display html with embedded images.

          Repository implementing it is also responsible for checking capabilities to do such action. For example, repository_filesystem can define additional capability for users to be able to access linked files.

          In this case you don't check things like "get_class($file->get_repository()) == 'repository_filesystem'" or even worse "!empty($file->get_repository() – >root_path)" but instead just check if "$file->get_repository() instanceof YOURNEWINTERFACE"

          Also, this function should return file contents and never disclose an actual file location. Also you are not supposed to know outside of the repo the actual location of file in filesystem repository. What if repository changes the syntax of 'reference' field in the future or make root_path private like it should be?

          New function must have unittests that ensure that no injections are allowed. For example, relativepath must be cleaned (using clean_param), parent directories are not allowed, etc, etc. Make sure that nobody can use things like `command` inside the path, and obviously path must be strictly inside the filesystem repository.

          Show
          Marina Glancy added a comment - - edited Hi Dan, this is a good idea but implementation could be better. Frédéric Massart maybe you can also have a look and add something. This is my suggestion: Define interface inside repository/lib.php that contains only one function - something like get_linked_file(stored_file $mainfile, string $relativepath) that returns a file that is referenced to in $mainfile. Repository will make sure that file exists and accessible. Btw not only scorm but potentially file resource can use it to display html with embedded images. Repository implementing it is also responsible for checking capabilities to do such action. For example, repository_filesystem can define additional capability for users to be able to access linked files. In this case you don't check things like "get_class($file->get_repository()) == 'repository_filesystem'" or even worse "!empty($file->get_repository() – >root_path)" but instead just check if "$file->get_repository() instanceof YOURNEWINTERFACE" Also, this function should return file contents and never disclose an actual file location. Also you are not supposed to know outside of the repo the actual location of file in filesystem repository. What if repository changes the syntax of 'reference' field in the future or make root_path private like it should be? New function must have unittests that ensure that no injections are allowed. For example, relativepath must be cleaned (using clean_param), parent directories are not allowed, etc, etc. Make sure that nobody can use things like `command` inside the path, and obviously path must be strictly inside the filesystem repository.
          Hide
          Marina Glancy added a comment - - edited

          I changed my mind a little, there is no need for interface, just add this function to repository class with the default implementation returning null or throwing exception "not supported".

          class repository {
              // ...
              public function get_linked_file_contents(stored_file $mainfile, string $relativepath, $strictness = IGNORE_MISSING) {
                  // Default implementation: either return null or throw exception depending on $strictness
              }
          }
          
          class stored_file {
              // ...
              function get_linked_file_contents(string $relativepath, $strictness = IGNORE_MISSING) {
                  if ($this->repository) {
                      $relativepath = clean_param($relativepath, PARAM_PATH);
                      return $this->repository->get_linked_file_contents($this, $relativepath, $strictness);
                  } else if ($strictness == MUST_EXIST) {
                      throw new moodle_exception ....
                  } else {
                      return null;
                  }
              }
          }
          
          class repository_filesystem {
              // ...
              public function get_linked_file_contents(stored_file $mainfile, string $relativepath, $strictness = IGNORE_MISSING) {
                  // Check capabilities, check that file is present, etc.
              }
          }
          

          also we won't need to add function stored_file::get_repository() - it is better to keep it internal

          Show
          Marina Glancy added a comment - - edited I changed my mind a little, there is no need for interface, just add this function to repository class with the default implementation returning null or throwing exception "not supported". class repository { // ... public function get_linked_file_contents(stored_file $mainfile, string $relativepath, $strictness = IGNORE_MISSING) { // Default implementation: either return null or throw exception depending on $strictness } } class stored_file { // ... function get_linked_file_contents(string $relativepath, $strictness = IGNORE_MISSING) { if ($ this ->repository) { $relativepath = clean_param($relativepath, PARAM_PATH); return $ this ->repository->get_linked_file_contents($ this , $relativepath, $strictness); } else if ($strictness == MUST_EXIST) { throw new moodle_exception .... } else { return null ; } } } class repository_filesystem { // ... public function get_linked_file_contents(stored_file $mainfile, string $relativepath, $strictness = IGNORE_MISSING) { // Check capabilities, check that file is present, etc. } } also we won't need to add function stored_ file::get_repository( ) - it is better to keep it internal
          Hide
          Frédéric Massart added a comment -

          I am not sure I like the idea of adding a new method to the repository mechanism, I find that very specific to the SCORM activity (even though it could be used for the file resource). Also, I find the current patch a bit too hacky and relying on the file_system in a wrong way (root_path?).

          For now, I would suggest implementing a more robust function in SCORM lib, that ensures that the repository is the file system one, and uses repository::get_file/send_file() instead of constructing the path that way. Direct access to the file system should really be prevented.

          Also, this has highlighted another issue: MDL-41807. And the way this patch works also allows for that exploit.

          Show
          Frédéric Massart added a comment - I am not sure I like the idea of adding a new method to the repository mechanism, I find that very specific to the SCORM activity (even though it could be used for the file resource). Also, I find the current patch a bit too hacky and relying on the file_system in a wrong way (root_path?). For now, I would suggest implementing a more robust function in SCORM lib, that ensures that the repository is the file system one, and uses repository::get_file/send_file() instead of constructing the path that way. Direct access to the file system should really be prevented. Also, this has highlighted another issue: MDL-41807. And the way this patch works also allows for that exploit.
          Hide
          Dan Marsden added a comment -

          Frederic - I'm aware current patch is hacky - as mentioned above I didn't like it much either. I'd asked for feedback but it didn't come so I sent through integration to force peer review and get some feedback.

          The files that SCORM is trying to access might be changed at any point - we don't want to create new file records for those files - we simply want to serve them to the browser as is. So a function in SCORM would need to get access to something like root_path so it could serve the file without creating a file record or the repository needs a method to return the CONTENT of the file as Marina mentions. Implementing a function in SCORM lib seems to me like it would continue to look hacky.

          A function in repository code as Marina mentions seems to make more sense to me - we also need to be very careful to make sure this doesn't impact performance - the code should be very efficient as it is going to be called multiple times on a single page and is likely to deal with large files as well - many SCORM packages have 200MB+ video files.

          Show
          Dan Marsden added a comment - Frederic - I'm aware current patch is hacky - as mentioned above I didn't like it much either. I'd asked for feedback but it didn't come so I sent through integration to force peer review and get some feedback. The files that SCORM is trying to access might be changed at any point - we don't want to create new file records for those files - we simply want to serve them to the browser as is. So a function in SCORM would need to get access to something like root_path so it could serve the file without creating a file record or the repository needs a method to return the CONTENT of the file as Marina mentions. Implementing a function in SCORM lib seems to me like it would continue to look hacky. A function in repository code as Marina mentions seems to make more sense to me - we also need to be very careful to make sure this doesn't impact performance - the code should be very efficient as it is going to be called multiple times on a single page and is likely to deal with large files as well - many SCORM packages have 200MB+ video files.
          Hide
          Dan Marsden added a comment -

          Hi Marina, I've updated the code again - it would be good if you were able to take another look.
          I haven't implemented Unit tests yet - will do that when we have the other stuff right - Also I haven't implemented a new capability as I think that makes it a lot harder for users to configure - if we added a new capability on scorm edit/creation we would need to:
          check all roles that can submit SCORMS and make sure all those roles also have the extra capability to view linked content and then display some form of useful error to the teacher if it detects a role that can submit but can't view the linked files - I'll implement it if you really think it's needed but I'm not so sure.

          Thanks!

          Show
          Dan Marsden added a comment - Hi Marina, I've updated the code again - it would be good if you were able to take another look. I haven't implemented Unit tests yet - will do that when we have the other stuff right - Also I haven't implemented a new capability as I think that makes it a lot harder for users to configure - if we added a new capability on scorm edit/creation we would need to: check all roles that can submit SCORMS and make sure all those roles also have the extra capability to view linked content and then display some form of useful error to the teacher if it detects a role that can submit but can't view the linked files - I'll implement it if you really think it's needed but I'm not so sure. Thanks!
          Hide
          Marina Glancy added a comment -

          Dan, this is exactly the approach I was suggesting.

          Big comment:

          I don't know if SCORM setting is necessary, I'll leave it up to you (but imho we just should always allow xml file, what could be the problem with them?). But we definitely need something in filesystem repository - either capability or setting that is set for particular instance. Also repository_filesystem::get_linked_file_contents() should better check that file is actually inside the repository because relativename can start with "../". Fred can do better review on that
          I already see another usage for this - in User private files, obviously also with setting.

          Small comments:

          • stored_file::get_linked_file_contents() should return false and not null and in phpdocs it should be "@return false|string" (same in other get_linked_file_contents's). Also all phpdocs should say "@throws ...." if the function throws an exception
          • Please add more documentation to repository::get_linked_file_contents() - why is it needed, that repository can overwrite it but has to be extra careful, see filesystem.
          • Instead of "substr($relativepath, strrpos($relativepath, '/')+1)" please use "basename($relativepath)"
          • repository/upgrade.txt should mention new function

          But....
          We already have disagreement with Fred on what would be the best solution. So let's ask for more opinions, for example what Petr Škoda or Eloy Lafuente (stronk7) think?

          Show
          Marina Glancy added a comment - Dan, this is exactly the approach I was suggesting. Big comment: I don't know if SCORM setting is necessary, I'll leave it up to you (but imho we just should always allow xml file, what could be the problem with them?). But we definitely need something in filesystem repository - either capability or setting that is set for particular instance. Also repository_filesystem::get_linked_file_contents() should better check that file is actually inside the repository because relativename can start with "../". Fred can do better review on that I already see another usage for this - in User private files, obviously also with setting. Small comments: stored_ file::get_linked_file_contents( ) should return false and not null and in phpdocs it should be "@return false|string" (same in other get_linked_file_contents's). Also all phpdocs should say "@throws ...." if the function throws an exception Please add more documentation to repository::get_linked_file_contents() - why is it needed, that repository can overwrite it but has to be extra careful, see filesystem. Instead of "substr($relativepath, strrpos($relativepath, '/')+1)" please use "basename($relativepath)" repository/upgrade.txt should mention new function But.... We already have disagreement with Fred on what would be the best solution. So let's ask for more opinions, for example what Petr Škoda or Eloy Lafuente (stronk7) think?
          Hide
          Dan Marsden added a comment -

          Good point - we probably don't need a SCORM setting - bit of setting overkill there - removed that from the patch.

          I like the idea of an instance level setting in the repository more than a capability - before I implement that I'll wait on further feedback. have tidied up some of the other things you mentioned there too - I really need feedback on this asap so I can get it in before 2.6 freeze.

          Show
          Dan Marsden added a comment - Good point - we probably don't need a SCORM setting - bit of setting overkill there - removed that from the patch. I like the idea of an instance level setting in the repository more than a capability - before I implement that I'll wait on further feedback. have tidied up some of the other things you mentioned there too - I really need feedback on this asap so I can get it in before 2.6 freeze.
          Hide
          Dan Marsden added a comment -

          added realpath/sanity check to filesystem:get_linked_file_contents to make sure the file is inside the current repository.

          Show
          Dan Marsden added a comment - added realpath/sanity check to filesystem:get_linked_file_contents to make sure the file is inside the current repository.
          Hide
          Dan Marsden added a comment -

          added instance level config for repositories to enable relative path linking. Hopefully it makes sense.

          Show
          Dan Marsden added a comment - added instance level config for repositories to enable relative path linking. Hopefully it makes sense.
          Hide
          Dan Marsden added a comment -

          in coversation with Frederic - it makes more sense for this new function to actually send the file so we can use proper send_file functions instead of passing large file contents into vars.. updated patch to send_file instead of contents.

          Show
          Dan Marsden added a comment - in coversation with Frederic - it makes more sense for this new function to actually send the file so we can use proper send_file functions instead of passing large file contents into vars.. updated patch to send_file instead of contents.
          Hide
          Frédéric Massart added a comment -

          Hi Dan, Marina,

          I finally decided to like this approach . Looking at the new patch it looks really good, I've been chatting with Dan suggesting the following:

          • Renaming the 'external' fake filearea for pluginfile to something more explicit, we thought imsmanifest was the best.
          • Changing the method get_linked_file_contents() to send_relative_file()
            • Firstly, to avoid the confusion with reference/alias/shortcut in repository
            • Secondly, to use the low-level API to serve files straight from there, copying the logic from filesystem::send_file(). This prevents the load of a big file (like a video file of 200MB) in memory to then be sent.
            • Thirdly, I don't think there is a need (yet) for a method that returns the content of the file directly.
          • A new callback method supports_relative_file(), returning false in repository:: and custom logic in filesystem::
          • The stored_file method and/or the repository method should check that the repository supports relative file serving, or send a 404 I suppose.

          That's it for me .
          Fred

          Show
          Frédéric Massart added a comment - Hi Dan, Marina, I finally decided to like this approach . Looking at the new patch it looks really good, I've been chatting with Dan suggesting the following: Renaming the 'external' fake filearea for pluginfile to something more explicit, we thought imsmanifest was the best. Changing the method get_linked_file_contents() to send_relative_file() Firstly, to avoid the confusion with reference/alias/shortcut in repository Secondly, to use the low-level API to serve files straight from there, copying the logic from filesystem::send_file(). This prevents the load of a big file (like a video file of 200MB) in memory to then be sent. Thirdly, I don't think there is a need (yet) for a method that returns the content of the file directly. A new callback method supports_relative_file(), returning false in repository:: and custom logic in filesystem:: The stored_file method and/or the repository method should check that the repository supports relative file serving, or send a 404 I suppose. That's it for me . Fred
          Hide
          Marina Glancy added a comment -

          awesome!
          just a question - what's the point in callback supports_relative_file() - why do we need two new methods and not just one ?

          Show
          Marina Glancy added a comment - awesome! just a question - what's the point in callback supports_relative_file() - why do we need two new methods and not just one ?
          Hide
          Frédéric Massart added a comment -

          I thought that it would be ideal for any plugin to be able to know if a repository was supporting this. Otherwise you have to call send_relative_file() with a dummy path in order to check if it works. https://github.com/danmarsden/moodle/compare/moodle:master...master_MDL-41580#L5R361

          Show
          Frédéric Massart added a comment - I thought that it would be ideal for any plugin to be able to know if a repository was supporting this. Otherwise you have to call send_relative_file() with a dummy path in order to check if it works. https://github.com/danmarsden/moodle/compare/moodle:master...master_MDL-41580#L5R361
          Hide
          Marina Glancy added a comment -

          ok maybe it can be useful in the global perspective.

          In case of scorm however - why do we need to check this? Why not allowing the .xml file from any source (shortcut or not)? It is possible to have only global URLs in it, isn't it?

          Show
          Marina Glancy added a comment - ok maybe it can be useful in the global perspective. In case of scorm however - why do we need to check this? Why not allowing the .xml file from any source (shortcut or not)? It is possible to have only global URLs in it, isn't it?
          Hide
          Dan Marsden added a comment -

          Marina - imsmanifest.xml file is just a configuration file - if we can't access any other files using relative linking then the SCORM package won't have any content to display. You also can't use a global URL as the SCORM must sit on the same domain as Moodle otherwise you run into issues with cross-domain communication security in the browser.

          Show
          Dan Marsden added a comment - Marina - imsmanifest.xml file is just a configuration file - if we can't access any other files using relative linking then the SCORM package won't have any content to display. You also can't use a global URL as the SCORM must sit on the same domain as Moodle otherwise you run into issues with cross-domain communication security in the browser.
          Hide
          Dan Marsden added a comment -

          I've just pushed a commit that has a unit test to check that files outside the repo are not available - not sure if this is quite right.

          Also - I can't see a way to test a succesful access which results in a send_file - as headers are set by send_file and I can't easily stop those from being set.

          Show
          Dan Marsden added a comment - I've just pushed a commit that has a unit test to check that files outside the repo are not available - not sure if this is quite right. Also - I can't see a way to test a succesful access which results in a send_file - as headers are set by send_file and I can't easily stop those from being set.
          Hide
          Frédéric Massart added a comment -

          Just commenting to say that I am happy with the patch and the unit tests. It is tricky (or impossible) to properly test send_file() so it seems good to me to only do a security check on the serving of files outside the repository. If Marina agrees on this I think it's good for integration.

          Cheers,
          Fred

          Show
          Frédéric Massart added a comment - Just commenting to say that I am happy with the patch and the unit tests. It is tricky (or impossible) to properly test send_file() so it seems good to me to only do a security check on the serving of files outside the repository. If Marina agrees on this I think it's good for integration. Cheers, Fred
          Hide
          Dan Marsden added a comment -

          ping Marina - are you happy with this now? - I need to get this into integration for next week.

          Show
          Dan Marsden added a comment - ping Marina - are you happy with this now? - I need to get this into integration for next week.
          Hide
          Marina Glancy added a comment - - edited

          Hi Dan, sorry took a while.
          There is something not consistent now after combining my and Fred's suggestions
          What is function send_relative_file() supposed to do now - return file content or send it to output?
          In some cases you treat it as if it returns the contents but the actual implementation for filesystem sends the contents to output. You need to pick one. If you choose to send contents to output, all 3 functions should not throw exception or return anything - they just either call send_file() or send_file_not_found() in the end (and you won't need all error strings and $strictness any more).

          Another thing - should not repository_filesystem check that file actually exists before calling send_file() ?

          The rest looks good.

          Show
          Marina Glancy added a comment - - edited Hi Dan, sorry took a while. There is something not consistent now after combining my and Fred's suggestions What is function send_relative_file() supposed to do now - return file content or send it to output? In some cases you treat it as if it returns the contents but the actual implementation for filesystem sends the contents to output. You need to pick one. If you choose to send contents to output, all 3 functions should not throw exception or return anything - they just either call send_file() or send_file_not_found() in the end (and you won't need all error strings and $strictness any more). Another thing - should not repository_filesystem check that file actually exists before calling send_file() ? The rest looks good.
          Hide
          Dan Marsden added a comment -

          Thanks Marina - great point. I've tidied that up but that means it's no longer testable as send_file_not_found also sets headers which make it incompatible with unit testing.

          filesystem does a file_exists call after running realpath and making sure it still sits within the repository.

          unless there are any objections I'll push this through for integration - thanks to everyone who helped with this!

          Show
          Dan Marsden added a comment - Thanks Marina - great point. I've tidied that up but that means it's no longer testable as send_file_not_found also sets headers which make it incompatible with unit testing. filesystem does a file_exists call after running realpath and making sure it still sits within the repository. unless there are any objections I'll push this through for integration - thanks to everyone who helped with this!
          Hide
          Dan Marsden added a comment -

          no objections so pushing up for integration.

          Show
          Dan Marsden added a comment - no objections so pushing up for integration.
          Hide
          Paul Holden added a comment -
          Show
          Paul Holden added a comment - There is an undefined $strictness argument remaining: https://github.com/danmarsden/moodle/compare/moodle:master...master_MDL-41580#L0R1009
          Hide
          Dan Marsden added a comment -

          thanks for spotting that Paul - fixed!

          Show
          Dan Marsden added a comment - thanks for spotting that Paul - fixed!
          Hide
          Dan Marsden added a comment -

          adding a link to some docs I've written on this so far - probably need to be copied into docs.moodle.org when we set up a 2.6 version.

          Show
          Dan Marsden added a comment - adding a link to some docs I've written on this so far - probably need to be copied into docs.moodle.org when we set up a 2.6 version.
          Hide
          Sam Hemelryk added a comment -

          Thanks Dan - this has been integrated now.

          Show
          Sam Hemelryk added a comment - Thanks Dan - this has been integrated now.
          Hide
          Adrian Greeve added a comment -

          Tested on the master integration branch.
          Works exactly as described.
          Test passed.

          Show
          Adrian Greeve added a comment - Tested on the master integration branch. Works exactly as described. Test passed.
          Hide
          Dan Marsden added a comment -

          thanks Adrian & Sam!

          Show
          Dan Marsden added a comment - thanks Adrian & Sam!
          Hide
          Marina Glancy added a comment -

          And THANK YOU again for making Moodle better every day!

          Another weekly release has been released.

          Show
          Marina Glancy added a comment - And THANK YOU again for making Moodle better every day! Another weekly release has been released.
          Hide
          Mary Cooch added a comment -

          Removing docs_required label as this is documented here http://docs.moodle.org/26/en/SCORM_settings with thanks to and a link to Dan's blog post documentation.

          Show
          Mary Cooch added a comment - Removing docs_required label as this is documented here http://docs.moodle.org/26/en/SCORM_settings with thanks to and a link to Dan's blog post documentation.
          Hide
          Dan Marsden added a comment -

          Thanks Mary!

          Show
          Dan Marsden added a comment - Thanks Mary!

            People

            • Votes:
              1 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: