Issue Details (XML | Word | Printable)

Key: MDL-2307
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Minor Minor
Assignee: Petr Skoda
Reporter: Daniel Miksik
Votes: 12
Watchers: 5
Operations

Add/Edit UI Mockup to this issue
If you were logged in you would be able to see more operations.
Moodle

Special characters left in filenames after unzipping

Created: 23/Dec/04 01:01 AM   Updated: 11/May/09 11:48 PM
Return to search
Component/s: Administration
Affects Version/s: 1.5.3
Fix Version/s: 1.8.9, 1.9.5

File Attachments: 1. File unzip_clean_filenames.diff (0.9 kB)
2. File unzip_clean_filenames.diff (1.0 kB)

Environment: All
Issue Links:
Duplicate
 
Relates
 

Participants: Charles Fulton, Daniel Miksik, Deb Sarlin, Kenneth Newquist, Martin Dougiamas, Petr Skoda, Sara Arjona and Teresa Gibbison
Security Level: None
Resolved date: 08/May/09
Affected Branches: MOODLE_15_STABLE
Fixed Branches: MOODLE_18_STABLE, MOODLE_19_STABLE


 Description  « Hide
When an uploaded zip file is unzipped in the Files area, special characters are left in the unzipped files - resulting in Sorry, the requested file could not be found message when trying to access those files. Tested at my site and moodle.org today.

 All   Comments   Change History   Version Control      Sort Order: Ascending order - Click to sort in descending order
Martin Dougiamas added a comment - 17/Nov/05 07:39 AM
From Jaroslav Šeděnka (jaroslav at sedenka.net) Sunday, 16 October 2005, 03:24 AM:

I uploaded a simple patch replacing ilegal characters with underscore (on Linux; might be neccessary on other Unixes as well, but I can't test it).

From Jaroslav Šeděnka (jaroslav at sedenka.net) Sunday, 16 October 2005, 03:25 AM:

PS. the code was written by Dan Miksik, I't not like to forget the credits


Michael Blake made changes - 21/Aug/06 03:57 PM
Field Original Value New Value
Assignee Martin Dougiamas [ martindougiamas ] Martin Dougiamas [ dougiamas ]
Martin Dougiamas made changes - 23/Aug/06 11:48 PM
Status In Progress [ 3 ] Open [ 1 ]
Kenneth Newquist added a comment - 05/Mar/08 11:38 PM
This problem still exists in Moodle 1.8.4+.

Teresa Gibbison added a comment - 30/Jul/08 07:44 AM
This also still exists in Moodle 1.9 - it would be great if this can be fixed as the files are basically useless.

Petr Skoda made changes - 08/Jan/09 08:14 PM
Link This issue has been marked as being related by MDL-17820 [ MDL-17820 ]
Charles Fulton added a comment - 14/Jan/09 11:35 AM
The problem can be traced to the fact that the file display function and the unzip function don't use the same "cleaning" method: unzip_clean_filename() has several built in regexs, while file.php calls clean_param (with PARAM_PATH), which has its own more extensive regexs. I've added a patch which changes unzip_clean_filename(); it now also calls clean_param() as part of its cleaning routine.

The desired behavior, then, is to remove strange characters from the get-go, so they'll never cause problems down the road.


Charles Fulton made changes - 14/Jan/09 11:35 AM
Attachment unzip_clean_filenames.diff [ 16058 ]
Kenneth Newquist added a comment - 14/Jan/09 11:26 PM
Thanks for looking into this; I'd meant to look into this and never got around to it. I agree that the desired behavior is for it to strip any special characters, which is what happens when you upload a single file. This bug does render zip upload useless for a goodly number of faculty who end up using some sort of special character in their file name (who are then extremely annoyed at their inability to move/delete a file).

I'll check out the patch ASAP.


Martin Dougiamas made changes - 15/Jan/09 05:16 PM
Assignee Martin Dougiamas [ dougiamas ] Petr ?koda [ skodak ]
Martin Dougiamas made changes - 15/Jan/09 05:16 PM
Affects Version/s 2.0 [ 10122 ]
Affects Version/s 1.9.4 [ 10300 ]
Martin Dougiamas made changes - 15/Jan/09 05:17 PM
Affects Version/s 1.9.4 [ 10300 ]
Fix Version/s 2.0 [ 10122 ]
Fix Version/s 1.9.4 [ 10300 ]
Affects Version/s 2.0 [ 10122 ]
Paul Holden made changes - 15/Jan/09 06:45 PM
Link This issue is duplicated by MDL-11468 [ MDL-11468 ]
Deb Sarlin added a comment - 16/Jan/09 12:48 AM
Thanks Charles,
Really helpful - will try unzip_clean_filenames.diff as well.
Deb

Petr Skoda added a comment - 22/Jan/09 03:01 AM
this is already solved in HEAD

Petr Skoda made changes - 22/Jan/09 03:01 AM
Fix Version/s 1.9.4 [ 10300 ]
Fix Version/s 1.9.5 [ 10320 ]
Fix Version/s 2.0 [ 10122 ]
Sara Arjona added a comment - 11/Mar/09 05:41 PM
This solution doesn't work for us. We work with Moodle 1.8.7 with Oracle database and this solution still left some special characters (like ç, ñ, ' ...). Besides, we have solved this problem adding the following lines in the unzip_cleanfilename function (lib/moodlelib.php file):

...
$p_header['filename'] = cleardoubleslashes($p_header['filename']); //normalize the slashes/backslashes
// ******* Added
$tmp = split("/", $p_header['filename']);
$tmp[sizeof($tmp)-1]=clean_filename($tmp[sizeof($tmp)-1]);
$p_header['filename']=implode("/", $tmp);
// ***** End added
return 1;
...

It would be possible to revise the code and add this lines to solve definitively the problem? Thank you in advance!


Charles Fulton added a comment - 15/Apr/09 05:21 AM
Updated per Sara Arjona's comment. clean_filename() is even more restrictive than clean_param(), even when PARAM_FILE is passed (which doesn't make much sense to me).

Charles Fulton made changes - 15/Apr/09 05:21 AM
Attachment unzip_clean_filenames.diff [ 16887 ]
Petr Skoda committed 1 file to 'Moodle CVS' on branch 'MOODLE_19_STABLE' - 06/May/09 04:48 PM
MDL-2307, MDL-18415 - improved unzipping - now properly removing forbidden chars and symlinks
MODIFY lib/moodlelib.php   Rev. 1.960.2.130    (+71 -2 lines)
Petr Skoda committed 1 file to 'Moodle CVS' on branch 'MOODLE_18_STABLE' - 06/May/09 04:49 PM
MDL-2307, MDL-18415 - improved unzipping - now properly removing forbidden chars and symlinks; backported from MOODLE_19_STABLE
MODIFY lib/moodlelib.php   Rev. 1.837.2.97    (+72 -3 lines)
Petr Skoda added a comment - 06/May/09 04:52 PM
should be fixed now, please test latest cvs
thanks the report

Petr Skoda made changes - 06/May/09 04:52 PM
Fix Version/s 1.8.9 [ 10322 ]
Resolution Fixed [ 1 ]
Status Open [ 1 ] Resolved [ 5 ]
Daniel Miksik added a comment - 07/May/09 06:03 PM
I have tested the latest MOODLE_18_STABLE code from CVS, still getting illegal characters (like space or Czech characters with diacritic marks) in the names of the unzipped files.

Petr Skoda added a comment - 07/May/09 06:17 PM
which zip method are you using?

Petr Skoda made changes - 07/May/09 06:17 PM
Status Resolved [ 5 ] Reopened [ 4 ]
Resolution Fixed [ 1 ]
Petr Skoda committed 1 file to 'Moodle CVS' on branch 'MOODLE_19_STABLE' - 07/May/09 06:47 PM
MDL-2307 improved file name cleanup during unzip
MODIFY lib/moodlelib.php   Rev. 1.960.2.131    (+23 -28 lines)
Petr Skoda committed 1 file to 'Moodle CVS' on branch 'MOODLE_18_STABLE' - 07/May/09 06:48 PM
MDL-2307 improved file name cleanup during unzip; backported from MOODLE-19_STABLE
MODIFY lib/moodlelib.php   Rev. 1.837.2.98    (+16 -28 lines)
Petr Skoda added a comment - 07/May/09 06:50 PM
reclosing, the internal zipping should work fine too now.
International characters, spaces, etc. are not forbidden, only chars like '"<> are - this is required for backwards compatibility and also they will be allowed in 2.0

thanks a lot for the testing


Petr Skoda made changes - 07/May/09 06:50 PM
Resolution Fixed [ 1 ]
Status Reopened [ 4 ] Resolved [ 5 ]
Daniel Miksik added a comment - 07/May/09 08:03 PM
I don't know what exactly is meant by "zip method", sorry. I'm creating the ZIP files using 7-Zip (compression method: deflate, dictionary size: 32 KB, word size: 64).

I think something has got broken by the last fix. I get the following error trying to unzip any ZIP file now:
Notice: Undefined variable: temppath in /home/elfdev/lib/moodlelib.php on line 6562
PCLZIP_ERR_INVALID_PARAMETER (-3) : Invalid optional parameter '' for this method

And a final note: If international characters are not forbidden, then why are they removed from the names of files uploaded directly to course files?


Petr Skoda added a comment - 08/May/09 12:39 AM
zip mehods are two, internal using pclzip and external infozib binary,
reopening and sorry for the regression in 1.8.x branch

Petr Skoda made changes - 08/May/09 12:39 AM
Resolution Fixed [ 1 ]
Status Resolved [ 5 ] Reopened [ 4 ]
Petr Skoda committed 2 files to 'Moodle CVS' on branch 'MOODLE_18_STABLE' - 08/May/09 12:44 AM
MDL-2307 fixing unzip regression, sorry
MODIFY lib/moodlelib.php   Rev. 1.837.2.100    (+3 -3 lines)
MODIFY lib/moodlelib.php   Rev. 1.837.2.99    (+14 -5 lines)
Petr Skoda added a comment - 08/May/09 12:45 AM
fixed, thanks

Petr Skoda made changes - 08/May/09 12:45 AM
Resolution Fixed [ 1 ]
Status Reopened [ 4 ] Resolved [ 5 ]
Daniel Miksik added a comment - 11/May/09 09:26 PM
Thanks for the quick fixes, Petr. Could you please also respond to the question from above: If international characters are not forbidden, then why are they removed from the names of files uploaded directly to course files? Thank you.

Petr Skoda added a comment - 11/May/09 11:48 PM
They are not forbidden for security reasons. They do break often if you use them. Unfortunately we can not remove them because it would break links in uploaded html files.

In 2.0 all the "not forbidden" chars should finally "sort of" work.