Moodle
  1. Moodle
  2. MDL-33857

Server files repository runs out of memory on big site

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.3.3
    • Component/s: Repositories
    • Labels:
    • Testing Instructions:
      Hide

      Test 1 (speed):
      You need a really big site (for example moodle.org with a lot of folder)
      Open Filepicker and navigate inside 'Server files'. Should be faster than it was
      Please note that filepicker Javascript is still slow for rendering files (rendering starts when loading gif stops spinning)

      Test 2 (regression):

      1. Add one/several courses with at least one module of each changed module type: book, data, forum, glossary, imscp, workshop
      2. Add data with files of different types (images and non-images) in those modules
      3. Enable legacy file area in one of the courses and add files there
      4. As admin/teacher create a folder and make sure you can navigate in filepicker Server files through all modules
      5. Open insert image filepicker in some textarea and make sure you can navigate and Server files only shows folders with image files
      Show
      Test 1 (speed): You need a really big site (for example moodle.org with a lot of folder) Open Filepicker and navigate inside 'Server files'. Should be faster than it was Please note that filepicker Javascript is still slow for rendering files (rendering starts when loading gif stops spinning) Test 2 (regression): Add one/several courses with at least one module of each changed module type: book, data, forum, glossary, imscp, workshop Add data with files of different types (images and non-images) in those modules Enable legacy file area in one of the courses and add files there As admin/teacher create a folder and make sure you can navigate in filepicker Server files through all modules Open insert image filepicker in some textarea and make sure you can navigate and Server files only shows folders with image files
    • Affected Branches:
      MOODLE_23_STABLE
    • Fixed Branches:
      MOODLE_23_STABLE
    • Pull Master Branch:
      wip-MDL-33857-master
    • Rank:
      41957

      Description

      On test upgrade of 2.3 on moodle.org, logged in as admin the server runs out of memory when clicking on server files.

      I'm pretty sure its doing something stupid there, previously have mentioned this issue in MDL-27236 and MDL-27236.

      I did one 'blind' bugfix to improve the situation in the past, we need to resolve the problem properly.

        Issue Links

          Activity

          Hide
          Marina Glancy added a comment -

          I noticed that the most time-consuming operation is checking that folder in Server files is empty (in order to decide whether to display it or not).

          I introduced some functions inside file_info to check the emptiness faster.

          This is still not a perfect solution but it does make things faster.

          Show
          Marina Glancy added a comment - I noticed that the most time-consuming operation is checking that folder in Server files is empty (in order to decide whether to display it or not). I introduced some functions inside file_info to check the emptiness faster. This is still not a perfect solution but it does make things faster.
          Hide
          Marina Glancy added a comment -

          TO INTEGRATORS: please cherry-pick to 23_STABLE

          Show
          Marina Glancy added a comment - TO INTEGRATORS: please cherry-pick to 23_STABLE
          Hide
          Michael Woods added a comment -

          Is there a non-git way for me to download the patch files as a zip to test on our site? I don't have git setup at the moment. Cheers. (And yes, I know I should have it!)

          Show
          Michael Woods added a comment - Is there a non-git way for me to download the patch files as a zip to test on our site? I don't have git setup at the moment. Cheers. (And yes, I know I should have it!)
          Show
          Dan Poltawski added a comment - Hi Michael, Try: https://github.com/marinaglancy/moodle/commit/edbb5c151579a08449d7d13bcc719c11890c7705.patch
          Hide
          Marina Glancy added a comment -

          Dan, how did you do it?
          I created a gist for Michael: https://gist.github.com/3001965

          Show
          Marina Glancy added a comment - Dan, how did you do it? I created a gist for Michael: https://gist.github.com/3001965
          Hide
          Dan Poltawski added a comment -

          Put .patch at the end of a commit url.

          Show
          Dan Poltawski added a comment - Put .patch at the end of a commit url.
          Hide
          Michael Woods added a comment -

          Hi Marina and Dan,

          I'm afraid to say I don't think the fix worked. The patches applied fine, but I'm still getting the following message.

          Invalid JSON string
          Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 144535 bytes) in /mnt/moodle/moodle2c/lib/dml/mysqli_native_moodle_database.php on line 868

          It's as a teacher, when clicking on the 'System' link in the filepicker breadcrumb, or clicking the tree-view (which I suppose has to try and display the System files also).

          I'll attach a screenshot also, just to be sure we are talking about the same bug.

          Thanks for your help.
          Michael

          Show
          Michael Woods added a comment - Hi Marina and Dan, I'm afraid to say I don't think the fix worked. The patches applied fine, but I'm still getting the following message. Invalid JSON string Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 144535 bytes) in /mnt/moodle/moodle2c/lib/dml/mysqli_native_moodle_database.php on line 868 It's as a teacher, when clicking on the 'System' link in the filepicker breadcrumb, or clicking the tree-view (which I suppose has to try and display the System files also). I'll attach a screenshot also, just to be sure we are talking about the same bug. Thanks for your help. Michael
          Hide
          Michael Woods added a comment -

          Screenshot of JSON error

          Show
          Michael Woods added a comment - Screenshot of JSON error
          Hide
          Eloy Lafuente (stronk7) added a comment -

          Reopening both based on Michael tests and not being able to find any memory improvement in the patch (but only speed, apparently). Are we mixing things here?

          Ciao

          Show
          Eloy Lafuente (stronk7) added a comment - Reopening both based on Michael tests and not being able to find any memory improvement in the patch (but only speed, apparently). Are we mixing things here? Ciao
          Hide
          CiBoT added a comment -

          Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.

          Show
          CiBoT added a comment - Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.
          Hide
          Eloy Lafuente (stronk7) added a comment -

          I think the problem here is that both icon and list views show things properly (current path), and you're able to move up using the "breadcrumb" on top of the picker.

          But when you switch to to tree view, it always goes to root (system) and then display everything from it recursively and it's a really expensive process, not needing a big site to notice it.

          IMO, we should make tree view to:

          1) display tree under a given (current) node only.
          2) make it to have the standard "breadcrumb" to allow going up (like the icon and list views allow).

          and that would alleviate a lot of extra work. For sure this won't fix the problem (if we go up to System for big sites using the breadcrumb), but at least it won't break on each tree view request and also will show the information in a more focussed way.

          Just one opinion, ciao

          Show
          Eloy Lafuente (stronk7) added a comment - I think the problem here is that both icon and list views show things properly (current path), and you're able to move up using the "breadcrumb" on top of the picker. But when you switch to to tree view, it always goes to root (system) and then display everything from it recursively and it's a really expensive process, not needing a big site to notice it. IMO, we should make tree view to: 1) display tree under a given (current) node only. 2) make it to have the standard "breadcrumb" to allow going up (like the icon and list views allow). and that would alleviate a lot of extra work. For sure this won't fix the problem (if we go up to System for big sites using the breadcrumb), but at least it won't break on each tree view request and also will show the information in a more focussed way. Just one opinion, ciao
          Hide
          Susan Mangan added a comment - - edited

          Any progress on this one? I just upgraded to the newest weekly build [2.3.1 2012-07-26] and am still seeing this error in IE and FF.

          It seems as though the problem was fixed on our development server by increasing the PHP memory limit to 512M but this did not work on one of our production servers... I wonder what a reasonable limit is for a large site?

          I just want to make sure we're not missing anything here on our own web server settings. Thanks!!

          Show
          Susan Mangan added a comment - - edited Any progress on this one? I just upgraded to the newest weekly build [2.3.1 2012-07-26] and am still seeing this error in IE and FF. It seems as though the problem was fixed on our development server by increasing the PHP memory limit to 512M but this did not work on one of our production servers... I wonder what a reasonable limit is for a large site? I just want to make sure we're not missing anything here on our own web server settings. Thanks!!
          Hide
          Marina Glancy added a comment -

          It will hopefully be finished by the end of next week but it will take a while to integrate

          Show
          Marina Glancy added a comment - It will hopefully be finished by the end of next week but it will take a while to integrate
          Hide
          Eloy Lafuente (stronk7) added a comment -

          The integration of this issue has been delayed to next week because the integration period is over (Monday, Tuesday) and testing must happen on Wednesday.

          This change to a more rigid timeframe on each integration/testing cycle aims to produce a better and clear separation and organization of tasks for everybody.

          This is a bulk-automated message, so if you want to blame somebody/thing/where, don't do it here (use git instead) :-D :-P

          Apologizes for the inconvenient, this will be integrated next week. Thanks for your collaboration & ciao

          Show
          Eloy Lafuente (stronk7) added a comment - The integration of this issue has been delayed to next week because the integration period is over (Monday, Tuesday) and testing must happen on Wednesday. This change to a more rigid timeframe on each integration/testing cycle aims to produce a better and clear separation and organization of tasks for everybody. This is a bulk-automated message, so if you want to blame somebody/thing/where, don't do it here (use git instead) :-D :-P Apologizes for the inconvenient, this will be integrated next week. Thanks for your collaboration & ciao
          Hide
          Sam Hemelryk added a comment -

          This is first on list for this arvo

          Show
          Sam Hemelryk added a comment - This is first on list for this arvo
          Hide
          Sam Hemelryk added a comment -

          Hi Marina,

          Interesting improvements here, I spent part of yesterday arvo looking them over and most of this morning wrapping my head around things in that area.
          There is some tough code in there and I imagine it wasn't easy to monitor and observe performance and code effect in there.
          All in all I like this solution. The following are the things I've noted. Mostly minor, feel free to correct me on any of them.

          lib/filebrowser/file_info_context_coursecat.php

          • file_info_context_coursecat::count_non_empty_children
            • Returns 0 or 2? There is no count going on here. Surely it would be better to have this function actually do the counting so that should it ever be used anywhere else it won't result in problems. Perhaps I havn't seen enough to understand what is going on there presently.
            • Both the get_records calls there for course_categories and course should be using record sets and should probably be preloading the database queries to avoid the 1+n database queries that would be going on in there in order to load each context separately if we don't have it.
            • Alternativily to the above (and perhaps Eloy/Petr will confirm this as evil) you could get the category context for the current category and use its context path + depth and run a search on the database to generate a single resultset on immediate child contexts (sub categories and courses). Done as a recordset you would have just a single DB query and preload as you go to hopefully further reduce the interaction.

          lib/filebrowser/file_info_context_module.php

          • file_info_context_module::get_filtered_children
            • Does that fast pre-check really help? assign has 3 areas, lesson 2, and workshop 4. Plus the two areas that forced (intro + backup/activity) makes for 5, 4, and 6 queries for those activities respectivily just to pre-check an area.
            • The use of get_file_info there has to be horribly ineffiecent despite being convienient right? I see it was doing that before hand anyway so no probs but perhaps if performance is the nature of this issue that could be looked at. Totally up to you this one.
            • Again the count thing catches my eye, 0, 1 or 2 for many. It may be true to what ever uses we have at the moment, but it doesn't seem true to the functionality.
            • sql2 and params2 probably could just be sql and params now.
          • get_non_empty_children + count_non_empty_children
            • unused global $DB
            • comman after $extensions in the phpdoc block

          lib/filebrowser/file_info.php

          • file_info::count_non_empty_children
            • Thought occurs to me when looking at this method. Given the way count works presently 0,1, and 2 this method could be perhaps further optimised. The list of children could be double processed looking first to files that are not a directory and second looking at directories. That way we avoid calling count_non_empty_childen on child directories unless we actually need to. In the situation a file_info object already contains 2..n files this will result in a quicker operation I guess.

          repository/local/lib.php

          • Double copyright on the repository_local class. Just noting as I need to check that, normally first to create it gets the copyright, I think we do that more to avoid long lists of copyrights. I'll chat to Eloy about it.
          • repository_local::get_listing
            • $context should probably be defaulted to something like the other properties picked up from the params.
            • where you are calling get_context_info_array you could probably just be calling $this->context->get_course_context by looks. I see that that code was already there, so up to you if you fix it or not, just something funny going on with the diffs.
            • Looks like strtolower is coming from the browser, perhaps best to use our textlib strtolower there.
          • Typo in comment for get_non_empty_children `subfolers`
          • Typo in comment for can_skip `Wether`
          • can_skip unused static var $skipcategories

          Most of that stuff is really pretty trivial.
          The only thing that really struck me is how `count` is operating within these changes, particually as it is reserving the count_non_empty_children method and is not returning a true count.
          I wonder whether perhaps we should be implementing two methods has_non_empty_children to return bool, and count_non_empty_children to return the count. Or perhaps we could look to implement an argument that says I don't care how many more children there are then 2 and operate on that. Anyway just throwing around ideas here I'd like to know what you think Marina.
          As there are a couple of other minor changes I'll send this back this week so that they can be made and see what your thoughts are on the count situation.

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Hi Marina, Interesting improvements here, I spent part of yesterday arvo looking them over and most of this morning wrapping my head around things in that area. There is some tough code in there and I imagine it wasn't easy to monitor and observe performance and code effect in there. All in all I like this solution. The following are the things I've noted. Mostly minor, feel free to correct me on any of them. lib/filebrowser/file_info_context_coursecat.php file_info_context_coursecat::count_non_empty_children Returns 0 or 2? There is no count going on here. Surely it would be better to have this function actually do the counting so that should it ever be used anywhere else it won't result in problems. Perhaps I havn't seen enough to understand what is going on there presently. Both the get_records calls there for course_categories and course should be using record sets and should probably be preloading the database queries to avoid the 1+n database queries that would be going on in there in order to load each context separately if we don't have it. Alternativily to the above (and perhaps Eloy/Petr will confirm this as evil) you could get the category context for the current category and use its context path + depth and run a search on the database to generate a single resultset on immediate child contexts (sub categories and courses). Done as a recordset you would have just a single DB query and preload as you go to hopefully further reduce the interaction. lib/filebrowser/file_info_context_module.php file_info_context_module::get_filtered_children Does that fast pre-check really help? assign has 3 areas, lesson 2, and workshop 4. Plus the two areas that forced (intro + backup/activity) makes for 5, 4, and 6 queries for those activities respectivily just to pre-check an area. The use of get_file_info there has to be horribly ineffiecent despite being convienient right? I see it was doing that before hand anyway so no probs but perhaps if performance is the nature of this issue that could be looked at. Totally up to you this one. Again the count thing catches my eye, 0, 1 or 2 for many. It may be true to what ever uses we have at the moment, but it doesn't seem true to the functionality. sql2 and params2 probably could just be sql and params now. get_non_empty_children + count_non_empty_children unused global $DB comman after $extensions in the phpdoc block lib/filebrowser/file_info.php file_info::count_non_empty_children Thought occurs to me when looking at this method. Given the way count works presently 0,1, and 2 this method could be perhaps further optimised. The list of children could be double processed looking first to files that are not a directory and second looking at directories. That way we avoid calling count_non_empty_childen on child directories unless we actually need to. In the situation a file_info object already contains 2..n files this will result in a quicker operation I guess. repository/local/lib.php Double copyright on the repository_local class. Just noting as I need to check that, normally first to create it gets the copyright, I think we do that more to avoid long lists of copyrights. I'll chat to Eloy about it. repository_local::get_listing $context should probably be defaulted to something like the other properties picked up from the params. where you are calling get_context_info_array you could probably just be calling $this->context->get_course_context by looks. I see that that code was already there, so up to you if you fix it or not, just something funny going on with the diffs. Looks like strtolower is coming from the browser, perhaps best to use our textlib strtolower there. Typo in comment for get_non_empty_children `subfolers` Typo in comment for can_skip `Wether` can_skip unused static var $skipcategories Most of that stuff is really pretty trivial. The only thing that really struck me is how `count` is operating within these changes, particually as it is reserving the count_non_empty_children method and is not returning a true count. I wonder whether perhaps we should be implementing two methods has_non_empty_children to return bool, and count_non_empty_children to return the count. Or perhaps we could look to implement an argument that says I don't care how many more children there are then 2 and operate on that. Anyway just throwing around ideas here I'd like to know what you think Marina. As there are a couple of other minor changes I'll send this back this week so that they can be made and see what your thoughts are on the count situation. Many thanks Sam
          Hide
          CiBoT added a comment -

          Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.

          Show
          CiBoT added a comment - Moving this reopened issue out from current integration. Please, re-submit it for integration once ready.
          Hide
          Dan Marsden added a comment -

          I just cherry-picked Marina's patch onto a 2.3 latest stable running php 5.4.7/IIS/wincache (don't ask) and it doesn't help when the filepicker goes to the root. eg - when you use private files block on /my/ hit "manage my private files" - then try to add a file and select the server files repo. It just sits there for ages and eventually throws a json error. So it might help improve things but there's still further work needed somewhere.

          thanks!

          Show
          Dan Marsden added a comment - I just cherry-picked Marina's patch onto a 2.3 latest stable running php 5.4.7/IIS/wincache (don't ask) and it doesn't help when the filepicker goes to the root. eg - when you use private files block on /my/ hit "manage my private files" - then try to add a file and select the server files repo. It just sits there for ages and eventually throws a json error. So it might help improve things but there's still further work needed somewhere. thanks!
          Hide
          Marina Glancy added a comment -

          Hi Sam,

          thanks for the throughout review. I added new commit to the branches.

          The most of your suggestions I have implemented. I changed the interface of count_non_empty_children to make it more clear.

          lib/filebrowser/file_info_context_coursecat.php

          file_info_context_coursecat::count_non_empty_children
          I changed to recordsets and retrieve context ids. But searching children contexts is less indexed than searching children courses/subcategories. Also there are no public functions in context object to help me, so I have to pass context id and it will retrieve record from

          {context}

          again.

          lib/filebrowser/file_info_context_module.php

          file_info_context_module::get_filtered_children
          [Does that fast pre-check really help? The use of get_file_info there has to be horribly ineffiecent....]
          Yes it helps. This function may be called from parent course to find out whether we need to show module. So it will be repeated for each module inside a course. So it is 5-6 queries times number of modules in a course. Yes, terribly inefficient but absolutely necessary since it calls the callback function from particular module type.

          lib/filebrowser/file_info.php

          file_info::count_non_empty_children
          [The list of children could be double processed looking first to files that are not a directory and second looking at directories]
          I changed it but it won't help much. The only type of file_info that can contain BOTH files and folders is file_info_stored_file, and there this function is overwritten anyway.

          repository/local/lib.php

          [Double copyright on the repository_local class.]
          MD suggested it. There is nothing left from DS code in this class. For the last half year I re-written tons of his code and all my code is now under his copyright.

          [$context should probably be defaulted to something like the other properties picked up from the params.]
          sorry not quite sure what you mean

          Show
          Marina Glancy added a comment - Hi Sam, thanks for the throughout review. I added new commit to the branches. The most of your suggestions I have implemented. I changed the interface of count_non_empty_children to make it more clear. lib/filebrowser/file_info_context_coursecat.php file_info_context_coursecat::count_non_empty_children I changed to recordsets and retrieve context ids. But searching children contexts is less indexed than searching children courses/subcategories. Also there are no public functions in context object to help me, so I have to pass context id and it will retrieve record from {context} again. lib/filebrowser/file_info_context_module.php file_info_context_module::get_filtered_children [Does that fast pre-check really help? The use of get_file_info there has to be horribly ineffiecent....] Yes it helps. This function may be called from parent course to find out whether we need to show module. So it will be repeated for each module inside a course. So it is 5-6 queries times number of modules in a course. Yes, terribly inefficient but absolutely necessary since it calls the callback function from particular module type. lib/filebrowser/file_info.php file_info::count_non_empty_children [The list of children could be double processed looking first to files that are not a directory and second looking at directories] I changed it but it won't help much. The only type of file_info that can contain BOTH files and folders is file_info_stored_file, and there this function is overwritten anyway. repository/local/lib.php [Double copyright on the repository_local class.] MD suggested it. There is nothing left from DS code in this class. For the last half year I re-written tons of his code and all my code is now under his copyright. [$context should probably be defaulted to something like the other properties picked up from the params.] sorry not quite sure what you mean
          Hide
          Dan Poltawski added a comment -

          The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week.

          TIA and ciao

          Show
          Dan Poltawski added a comment - The main moodle.git repository has just been updated with latest weekly modifications. You may wish to rebase your PULL branches to simplify history and avoid any possible merge conflicts. This would also make integrator's life easier next week. TIA and ciao
          Hide
          Eloy Lafuente (stronk7) added a comment -

          The integration of this issue has been delayed to next week because the integration period is over (Monday, Tuesday) and testing must happen on Wednesday.

          This change to a more rigid timeframe on each integration/testing cycle aims to produce a better and clear separation and organization of tasks for everybody.

          This is a bulk-automated message, so if you want to blame somebody/thing/where, don't do it here (use git instead) :-D :-P

          Apologises for the inconvenient, this will be integrated next week. Thanks for your collaboration & ciao

          Show
          Eloy Lafuente (stronk7) added a comment - The integration of this issue has been delayed to next week because the integration period is over (Monday, Tuesday) and testing must happen on Wednesday. This change to a more rigid timeframe on each integration/testing cycle aims to produce a better and clear separation and organization of tasks for everybody. This is a bulk-automated message, so if you want to blame somebody/thing/where, don't do it here (use git instead) :-D :-P Apologises for the inconvenient, this will be integrated next week. Thanks for your collaboration & ciao
          Hide
          Didier Raboud added a comment -

          Hi,

          we tested Marina's patches here and we are (sometimes) getting the following error when trying to upload images:

          Error updating user preference 'filepicker_recentrepository' using ajax. Clicking this link will repeat the Ajax call that failed so you can see the error:

          Show
          Didier Raboud added a comment - Hi, we tested Marina's patches here and we are (sometimes) getting the following error when trying to upload images: Error updating user preference 'filepicker_recentrepository' using ajax. Clicking this link will repeat the Ajax call that failed so you can see the error:
          Hide
          Sam Hemelryk added a comment -

          Thanks Marina, this has been integrated now.

          Didier, I've just been chatting to Marina about the error that you were seeing, we couldn't map it back to any changes in this patch. We've decided that this can land and we'll open a new issue for the JSON changes.
          I'll add you as a watcher to that, if you possible could you please add anything that may help us in diagnosing the issue further. What was being done at the time, whether there was any pattern to it occurring that you could recognise etc.

          Many thanks
          Sam

          Show
          Sam Hemelryk added a comment - Thanks Marina, this has been integrated now. Didier, I've just been chatting to Marina about the error that you were seeing, we couldn't map it back to any changes in this patch. We've decided that this can land and we'll open a new issue for the JSON changes. I'll add you as a watcher to that, if you possible could you please add anything that may help us in diagnosing the issue further. What was being done at the time, whether there was any pattern to it occurring that you could recognise etc. Many thanks Sam
          Hide
          Mark Nelson added a comment -

          Test 1:

          I did not do this as I do not have access to a large site that is currently using integration (this was discussed in developer chat).

          Test 2:

          Works as expected. Found no issues. I am passing based on this.

          Show
          Mark Nelson added a comment - Test 1: I did not do this as I do not have access to a large site that is currently using integration (this was discussed in developer chat). Test 2: Works as expected. Found no issues. I am passing based on this.
          Hide
          Aparup Banerjee added a comment -

          Your issue has dug up some gold.
          It works great i've been told.
          Go forth, be brave, be bold.

          yay! "All your thoughts are belong to everyone."

          Thanks and ciao!

          Show
          Aparup Banerjee added a comment - Your issue has dug up some gold. It works great i've been told. Go forth, be brave, be bold. yay! "All your thoughts are belong to everyone." Thanks and ciao!
          Hide
          Tim Lock added a comment -

          My testing found that using a mod_resource with many sub folders and an images folder with 50+ files under tree view was 2 to 3 times worse with the patch ontop of Moodle 2.3.2.

          Any suggestions how to improve this and/or Automated backups folder under tree view?

          Show
          Tim Lock added a comment - My testing found that using a mod_resource with many sub folders and an images folder with 50+ files under tree view was 2 to 3 times worse with the patch ontop of Moodle 2.3.2. Any suggestions how to improve this and/or Automated backups folder under tree view?

            People

            • Votes:
              10 Vote for this issue
              Watchers:
              16 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: