We store datasets containing indicator and target calculations. These datasets are fed into the machine learning backends.
There is one case where we deal with a single file and do not need to merge anything: when the model is based on a sitewide analyser there is only one analysable, so there are no per-analysable datasets to combine. At the moment we still generate a per-analysable dataset, as we do in the by_course analyser, and later "merge" it with nothing else. This step is unnecessary; we could simply move or copy the file somewhere else in the file system.
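The optimisation described above can be sketched as follows. This is a hedged illustration, not Moodle's actual PHP implementation: the helper name `build_final_dataset` and the CSV-style concatenation are assumptions made for the example. The point is the short-circuit: when there is exactly one dataset file (the sitewide-analyser case), a plain copy replaces the merge.

```python
import shutil
from pathlib import Path

def build_final_dataset(dataset_files: list[Path], dest: Path) -> Path:
    """Combine per-analysable dataset files into one final dataset.

    Hypothetical sketch: with a single analysable (e.g. a sitewide
    analyser) there is nothing to merge, so just copy the file.
    """
    if len(dataset_files) == 1:
        # Single analysable: skip the merge entirely.
        shutil.copyfile(dataset_files[0], dest)
        return dest

    # Multiple analysables: concatenate, keeping the header line once.
    with dest.open("w") as out:
        for i, path in enumerate(dataset_files):
            with path.open() as src:
                header = src.readline()
                if i == 0:
                    out.write(header)
                out.writelines(src)
    return dest
```

Moving (rather than copying) the single file would save disk space as well, at the cost of leaving the per-analysable location empty.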