  1. Moodle
  2. MDL-65202

filter_out_training_samples will not work well with time-splitting methods that generate incremental training data

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 3.7
    • None
    • Analytics
    • None
    • MOODLE_37_STABLE

    Description

      filter_out_training_samples runs during analysis (training, prediction and evaluation) and filters out samples that have already been used for training. The function was designed on the assumption that all the training data for a sample is processed in a single run. This holds for models whose time-splitting methods know the number of ranges beforehand: in such cases the samples are not used for training until the end of the analysable. Examples of such time-splitting methods are quarters and quarters accumulative.

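      3.7 introduces time-splitting methods whose number of ranges is not known beforehand, for example a weekly time-splitting method. In these cases new training data becomes available every week, so a sample that was already used for training in an earlier range would be filtered out and its later ranges would not be used for training.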
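
      The difference between filtering by sample and filtering by sample plus range can be sketched as follows. This is an illustrative Python model only, not Moodle's PHP implementation; the function names and the (sample id, range index) tuple layout are assumptions made for the example.

```python
# Illustrative model only; Moodle's real code is PHP and is structured differently.
# Each candidate is a hypothetical (sample_id, range_index) pair.

def filter_by_sample(candidates, trained_sample_ids):
    """Behaviour as described in the issue: drop any sample already used for training."""
    return [(sid, rng) for sid, rng in candidates if sid not in trained_sample_ids]


def filter_by_sample_and_range(candidates, trained_pairs):
    """Proposed behaviour: drop only the (sample, range) pairs already used for training."""
    return [pair for pair in candidates if pair not in trained_pairs]


# Week 0 has already been used for training for samples 1 and 2.
trained_pairs = {(1, 0), (2, 0)}
trained_sample_ids = {sid for sid, _ in trained_pairs}

# Week 1 produces new training data for the same two samples.
week1_candidates = [(1, 1), (2, 1)]

lost = filter_by_sample(week1_candidates, trained_sample_ids)
kept = filter_by_sample_and_range(week1_candidates, trained_pairs)
```

      In this model, filtering by sample id alone leaves `lost` empty, so the week 1 data is never trained on, while filtering by the pair keeps both week 1 entries in `kept`.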

      We need to add a rangeindex field to mdl_analytics_train_samples, plus an upgrade step that repeats each existing record for every range index available in the time-splitting method used by the model.
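
      The upgrade step could look roughly like the sketch below, again as a Python model rather than an actual Moodle upgrade script. The row layout (`modelid`, `sampleid`) and the `ranges_per_model` lookup are hypothetical simplifications; the issue does not describe the table's real columns.

```python
# Illustrative model of the upgrade step; not an actual Moodle upgrade script.
# Row fields and the ranges_per_model mapping are assumptions for this example.

def expand_train_samples(records, ranges_per_model):
    """Repeat each existing record once per range index of its model's time-splitting method."""
    expanded = []
    for rec in records:
        total_ranges = ranges_per_model[rec["modelid"]]
        for rangeindex in range(total_ranges):
            # Copy the original row and attach the new rangeindex field.
            expanded.append({**rec, "rangeindex": rangeindex})
    return expanded


# Two existing records for a model whose time-splitting method has four ranges.
existing = [
    {"modelid": 1, "sampleid": 10},
    {"modelid": 1, "sampleid": 11},
]
ranges_per_model = {1: 4}

upgraded = expand_train_samples(existing, ranges_per_model)
```

      Repeating existing records across every range index matches the old assumption that a sample recorded as trained had all of its ranges trained in the same run.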

            People

              Assignee: Unassigned
              Reporter: David Monllaó (dmonllao)
              Participants: Amaia Anabitarte, Carlos Escobedo, Ferran Recio, Ilya Tregubov, Sara Arjona (@sarjona)