  1. Moodle
  2. MDL-38151

"File-less" course backups

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 1.9.19, 2.1.8, 2.1.9, 2.1.10, 2.2.5, 2.2.6, 2.2.7, 2.3.2, 2.3.3, 2.3.4, 2.4, 2.4.1
    • Fix Version/s: None
    • Component/s: Backup
    • Affected Branches:
      MOODLE_19_STABLE, MOODLE_21_STABLE, MOODLE_22_STABLE, MOODLE_23_STABLE, MOODLE_24_STABLE

      Description

      Many mission-critical Moodle installations use storage file systems that support snapshots. Many larger installations report that they cannot use the scheduled course backup system because of the load created by generating course backup files. Moving a course's files into the course zip file uses both IOPS and CPU heavily. Because the zip files are binary blobs, backup systems cannot determine that only minor changes have been made to a course. This increases the load on the backup system and greatly increases the storage used on both the Moodle production system and the backup system. Scheduled course backups are a convenient self-service means for teachers to restore accidentally deleted or modified courses.

      Solution
      Implement a version of course backup that doesn't copy the course files. Restoring one of these "file-less" course backups would use the snapshot copy of the data root from the correct time period, so that file I/O is only generated when a restore happens, which is much less often than a backup.

      Many SANs now implement online access to their snapshots via a .snapshot folder. Some mechanism is needed to define the master path and the naming rules for snapshots so that this can work with the majority of SANs. An example would be ./snapshot-21022013 for a snapshot taken on February 21st. I can provide more detail on the exact format used for both NetApp and GPFS based snapshots.
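
      As a rough illustration of the "master path and rules" idea, the sketch below builds a snapshot directory from two hypothetical site settings: a master path and a strftime-style name format. The function names, the settings, the example paths and the choice of Python are assumptions made for illustration only; they are not existing Moodle settings or APIs, and the real NetApp and GPFS naming schemes may differ.

```python
from datetime import date
from pathlib import PurePosixPath


def snapshot_path(master_path: str, name_format: str, taken_on: date) -> PurePosixPath:
    """Return the directory of the snapshot taken on `taken_on`.

    `master_path` and `name_format` are hypothetical site-level settings.
    `name_format` uses strftime codes, e.g. "snapshot-%d%m%Y".
    """
    return PurePosixPath(master_path) / taken_on.strftime(name_format)


def snapshot_file(master_path: str, name_format: str, taken_on: date,
                  relative_file: str) -> PurePosixPath:
    """Return where a data root file would be found inside that snapshot."""
    return snapshot_path(master_path, name_format, taken_on) / relative_file


if __name__ == "__main__":
    # Hypothetical data root with snapshots exposed under .snapshot
    print(snapshot_path("/moodledata/.snapshot", "snapshot-%d%m%Y", date(2013, 2, 21)))
    # prints /moodledata/.snapshot/snapshot-21022013
```

      A restore of a file-less backup would then read each referenced file from the resolved snapshot directory rather than from a zip archive.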

      Zipless backup
      Another variation of this concept, for large sites not using a SAN, would be to use zipless course backups. Rsync-style differential backups could then be used to reduce I/O and CPU usage. There would be an increase in file storage used, but many large courses are primarily video and graphic content that doesn't compress well anyway. Hardlink replacement methods could be used to provide multiple points in time. Course backup zip files would be generated in real time only when a user attempted to download the file.
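
      The hardlink idea can be sketched as follows: each point-in-time copy is a full directory tree, but any file that is unchanged since the previous copy is hard-linked to it instead of being copied again, so it costs no extra data storage or read I/O. This is a minimal sketch in the spirit of rsync-style backups, not a proposed implementation. The function names are hypothetical, "unchanged" is approximated by comparing size and modification time, and all copies are assumed to live on one file system that supports hard links.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def unchanged(src: Path, old: Path) -> bool:
    """Treat a file as unchanged if size and modification time both match."""
    if not old.is_file():
        return False
    a, b = src.stat(), old.stat()
    return a.st_size == b.st_size and a.st_mtime_ns == b.st_mtime_ns


def snapshot_tree(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`."""
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and unchanged(src, old):
            os.link(old, target)       # reuse the existing copy, no data is read
        else:
            shutil.copy2(src, target)  # new or changed file: copy data and mtime
```

      Calling snapshot_tree once per day, with the previous day's directory as previous, would give one browsable tree per day while storing each distinct version of a file only once.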

              People

              Assignee:
              moodle.com moodle.com
              Reporter:
              moorejon Jonathan Moore
              Participants:
              Component watchers:
              Adrian Greeve, Jake Dallimore, Mathew May, Mihail Geshoski, Peter Dias, Sujith Haridasan
              Votes:
              2
              Watchers:
              8

                Dates

                Created:
                Updated:
                Resolved: