  1. Moodle Community Sites
  2. MDLSITE-4902

Enhance "Anonymise" plugin for Project Inspire Phase I



    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Component/s: research.moodle.org
    • Labels:


      Description of purpose: Moodle Project Inspire is intended to identify and validate indicators of student, teacher, and institutional engagement in educational activities for the purpose of developing learning analytics software features with the following functions:

      • description of learning engagement and progress,
      • diagnosis of learning engagement and progress,
      • prediction of learning progress, and
      • recommendations for improvement of learning progress.

      Project Inspire will provide learning analytics tools within Moodle Core. These analytics will be based on inputs that will be extracted and validated using data from as many participants from the Moodle community as possible. In order to facilitate participation, we need a tool to extract and de-identify data from production Moodle sites.

      Michael de Raadt's "Anonymise" plugin (https://github.com/deraadt/moodle-local_anonymise, latest version in https://github.com/moodlehq/moodle-local_anonymise) is expected to serve as the basis of this extraction tool, with enhancements as noted below. The "Anonymise" plugin, as currently designed, operates on a copy of a production site. It already replaces many potentially identifying fields with anonymised identifiers, but this process needs to be more complete in order to satisfy data protection laws in many countries, especially in the US (FERPA) and EU (Directive 95/46/EC). All short text fields need to be anonymised, and all long text fields and attachments need to be replaced with "dummy" data of the same size (as the size and type of user activity may be an indicator in our analytics model).

      Enhancements requested:

      1. Replace all Personally Identifiable Information in User, Course, and Category records (all short text fields, e.g. names, email addresses, course and category names and ID numbers) with unique, consistent identifiers that are not derived from user-identifiable information in any reversible way, e.g. keyed hash values with a literal field identifier such as “_firstname” appended.
      2. Replace all long text fields (e.g. forum posts, activity descriptions) with “dummy” text of the same length (e.g. repeated null words, lorem ipsum, etc.).
      3. Replace all attached files with “dummy” files of approximately the same size and type.
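
      The three enhancements above could be sketched roughly as follows. This is an illustrative Python sketch only, not the plugin's code (the plugin itself is a Moodle local plugin). The names SECRET_KEY, pseudonymise, dummy_text, and dummy_bytes are hypothetical, as is the choice of HMAC-SHA256 for the keyed hash. The sketch matches text length in characters, and it reproduces only the size of an attachment; producing a valid file of the same type is left out.

```python
import hashlib
import hmac

# Hypothetical secret: assumed to be generated per site and never exported
# together with the anonymised data.
SECRET_KEY = b"example-site-secret"


def pseudonymise(value: str, field: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash of `value` followed by a literal field suffix.

    The same input always maps to the same identifier, so records stay
    linkable, but the original value cannot be read back from the output.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{digest}_{field}"


def dummy_text(original: str, filler: str = "lorem ipsum ") -> str:
    """Return filler text with the same number of characters as `original`."""
    if not original:
        return ""
    repeats = len(original) // len(filler) + 1
    return (filler * repeats)[: len(original)]


def dummy_bytes(size: int) -> bytes:
    """Return `size` zero bytes to stand in for an attachment's content."""
    return b"\x00" * size


if __name__ == "__main__":
    print(pseudonymise("Alice", "firstname"))
    print(dummy_text("This forum post will be replaced."))
    print(len(dummy_bytes(2048)))
```

      Keeping the key secret matters here: an unkeyed hash of a name or email address could be reversed by hashing a list of candidate values and comparing the results.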

      (Edited to reference the plugin's latest version)




            • Votes: 0
            • Watchers: 2


              • Created: