Improvement
Resolution: Fixed
Minor
3.0, 3.2
MOODLE_30_STABLE, MOODLE_32_STABLE
MOODLE_32_STABLE
MDL-50888-master
Scanning files through a Unix socket is significantly faster than invoking the command-line scanner with an exec call, but it is only available on Unix-like systems. This will be implemented as an option in the ClamAV plugin, letting the user choose between sockets and the command-line utility. The command line will remain the default option.
How scanning works
For more details on the socket commands that ClamAV accepts, see the manual. In this case we use the SCAN command with the full file path as its parameter. The ClamAV user must be able to read the file; adding the clamav user to the www-data group should do the trick. Unfortunately there is no easier way to resolve access: granting everyone read permission on uploaded files is not a good option, so it is not used here. The FILDES command could potentially remove the need to resolve access, but using it would require bumping the minimum PHP version to 5.5, because it needs the socket_sendmsg function (which implements https://wiki.php.net/rfc/sendrecvmsg) to build a BSD 4.4-style message with the file descriptor encapsulated.
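The SCAN exchange described above can be sketched as follows. This is an illustration in Python, not the plugin's PHP code. The socket path `/var/run/clamav/clamd.ctl`, the function names, and the `n` (newline-delimited) command prefix are assumptions chosen for the example; the reply format handled here (`<path>: OK`, `<path>: <signature> FOUND`, anything else treated as an error) follows the clamd protocol as documented in its manual.

```python
import socket
from typing import Tuple


def build_scan_command(path: str) -> bytes:
    # The "n" prefix asks clamd for newline-delimited commands and replies.
    return b"nSCAN " + path.encode("utf-8") + b"\n"


def parse_reply(reply: str) -> Tuple[str, str]:
    """Map a clamd reply line to (status, detail)."""
    reply = reply.strip()
    if reply.endswith(" OK"):
        return ("clean", "")
    if reply.endswith(" FOUND"):
        # Reply looks like "<path>: <signature> FOUND".
        detail = reply[: -len(" FOUND")].rsplit(": ", 1)[-1]
        return ("infected", detail)
    # Anything else (e.g. "... ERROR") is reported back unchanged.
    return ("error", reply)


def scan_file(path: str,
              socket_path: str = "/var/run/clamav/clamd.ctl",  # assumed location
              timeout: float = 30.0) -> Tuple[str, str]:
    """Send SCAN for an absolute path over the Unix socket and parse the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(build_scan_command(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            if data.endswith(b"\n"):
                break
    return parse_reply(b"".join(chunks).decode("utf-8", "replace"))
```

Because only the path is sent, the daemon opens the file itself, which is why the clamav user needs read access to it.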
TCP sockets could also be used, e.g. if ClamAV runs on a different system, but this performs poorly (every file would need to be transferred over the network for scanning), so this option has not been implemented.
Statistical analysis
To verify that the difference between the two running methods is statistically significant, a test script was designed. The test was run 100 times for each file size (1 MB, 10 MB, 50 MB, 100 MB, 500 MB) and for each running method (command line and socket), and the time taken to scan the file was recorded in milliseconds. General descriptive statistics (M = mean, SD = standard deviation) and a graph representing them are shown below.
File size | Command line | Unix socket |
---|---|---|
1 MB | M = 16.738, SD = 5.384 | M = 9.360, SD = 2.15 |
10 MB | M = 35.148, SD = 13.890 | M = 29.533, SD = 5.270 |
50 MB | M = 8.943, SD = 2.898 | M = 0.928, SD = 0.229 |
100 MB | M = 8.967, SD = 2.796 | M = 0.619, SD = 0.334 |
500 MB | M = 9.332, SD = 3.101 | M = 1.312, SD = 0.388 |
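The measurement procedure described above (repeat a scan a fixed number of times, record each duration in milliseconds, then summarise with mean and standard deviation) might look roughly like this. It is a sketch in Python, not the original test script; `time_scans` and `describe` are hypothetical names, and `scan` stands for whichever running method is being measured.

```python
import statistics
import time
from typing import Callable, List, Tuple


def time_scans(scan: Callable[[], object], runs: int = 100) -> List[float]:
    """Call scan() `runs` times and return each duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        scan()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def describe(samples: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of the recorded times."""
    return statistics.mean(samples), statistics.stdev(samples)
```

Keeping the raw samples, rather than only the summary, is what makes a later significance test on the two methods possible.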
A two-sample t-test was applied to the pairs of samples recorded for the same file scanned with the two running methods. The results showed a significant difference at the 95% confidence level in all compared groups. See the attached PDF for detailed results.
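As a rough cross-check, a two-sample t statistic can be recomputed from the summary statistics alone. The sketch below uses Welch's unequal-variance form, which is an assumption: the text does not say which variant was applied, and the attached PDF remains the authoritative result. The inputs are the 1 MB row of the table with n = 100 runs per method; `welch_t` is a hypothetical helper name.

```python
import math
from typing import Tuple


def welch_t(mean1: float, sd1: float, n1: int,
            mean2: float, sd2: float, n2: int) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    a = sd1 ** 2 / n1  # squared standard error of the first mean
    b = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# 1 MB row: command line vs Unix socket, 100 runs each.
t, df = welch_t(16.738, 5.384, 100, 9.360, 2.15, 100)
print(f"t = {t:.2f}, df = {df:.1f}")
```

At the 5% two-sided level the critical value is about 1.98 for 120 or more degrees of freedom, so a |t| above that indicates a significant difference. With equal group sizes the pooled-variance (Student) form yields the same t value and differs only in its degrees of freedom.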