Moodle

Special characters left in filenames after unzipping

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Fixed
  • Affects Version/s: 1.5.3
  • Fix Version/s: 1.8.9, 1.9.5
  • Component/s: Administration
  • Labels:
    None
  • Environment:
    All
  • Affected Branches:
    MOODLE_15_STABLE
  • Fixed Branches:
    MOODLE_18_STABLE, MOODLE_19_STABLE

Description

When an uploaded zip file is unzipped in the Files area, special characters are left in the unzipped files - resulting in Sorry, the requested file could not be found message when trying to access those files. Tested at my site and moodle.org today.

Issue Links

Activity

Hide
Martin Dougiamas added a comment -

From Jaroslav Šeděnka (jaroslav at sedenka.net) Sunday, 16 October 2005, 03:24 AM:

I uploaded a simple patch replacing ilegal characters with underscore (on Linux; might be neccessary on other Unixes as well, but I can't test it).

From Jaroslav Šeděnka (jaroslav at sedenka.net) Sunday, 16 October 2005, 03:25 AM:

PS. the code was written by Dan Miksik, I't not like to forget the credits

Show
Martin Dougiamas added a comment - From Jaroslav Šeděnka (jaroslav at sedenka.net) Sunday, 16 October 2005, 03:24 AM: I uploaded a simple patch replacing ilegal characters with underscore (on Linux; might be neccessary on other Unixes as well, but I can't test it). From Jaroslav Šeděnka (jaroslav at sedenka.net) Sunday, 16 October 2005, 03:25 AM: PS. the code was written by Dan Miksik, I't not like to forget the credits
Hide
Kenneth Newquist added a comment -

This problem still exists in Moodle 1.8.4+.

Show
Kenneth Newquist added a comment - This problem still exists in Moodle 1.8.4+.
Hide
Teresa Gibbison added a comment -

This also still exists in Moodle 1.9 - it would be great if this can be fixed as the files are basically useless.

Show
Teresa Gibbison added a comment - This also still exists in Moodle 1.9 - it would be great if this can be fixed as the files are basically useless.
Hide
Charles Fulton added a comment -

The problem can be traced to the fact that the file display function and the unzip function don't use the same "cleaning" method: unzip_clean_filename() has several built in regexs, while file.php calls clean_param (with PARAM_PATH), which has its own more extensive regexs. I've added a patch which changes unzip_clean_filename(); it now also calls clean_param() as part of its cleaning routine.

The desired behavior, then, is to remove strange characters from the get-go, so they'll never cause problems down the road.

Show
Charles Fulton added a comment - The problem can be traced to the fact that the file display function and the unzip function don't use the same "cleaning" method: unzip_clean_filename() has several built in regexs, while file.php calls clean_param (with PARAM_PATH), which has its own more extensive regexs. I've added a patch which changes unzip_clean_filename(); it now also calls clean_param() as part of its cleaning routine. The desired behavior, then, is to remove strange characters from the get-go, so they'll never cause problems down the road.
Hide
Kenneth Newquist added a comment -

Thanks for looking into this; I'd meant to look into this and never got around to it. I agree that the desired behavior is for it to strip any special characters, which is what happens when you upload a single file. This bug does render zip upload useless for a goodly number of faculty who end up using some sort of special character in their file name (who are then extremely annoyed at their inability to move/delete a file).

I'll check out the patch ASAP.

Show
Kenneth Newquist added a comment - Thanks for looking into this; I'd meant to look into this and never got around to it. I agree that the desired behavior is for it to strip any special characters, which is what happens when you upload a single file. This bug does render zip upload useless for a goodly number of faculty who end up using some sort of special character in their file name (who are then extremely annoyed at their inability to move/delete a file). I'll check out the patch ASAP.
Hide
Deb Sarlin added a comment -

Thanks Charles,
Really helpful - will try unzip_clean_filenames.diff as well.
Deb

Show
Deb Sarlin added a comment - Thanks Charles, Really helpful - will try unzip_clean_filenames.diff as well. Deb
Hide
Petr Škoda (skodak) added a comment -

this is already solved in HEAD

Show
Petr Škoda (skodak) added a comment - this is already solved in HEAD
Hide
Sara Arjona added a comment -

This solution doesn't work for us. We work with Moodle 1.8.7 with Oracle database and this solution still left some special characters (like ç, ñ, ' ...). Besides, we have solved this problem adding the following lines in the unzip_cleanfilename function (lib/moodlelib.php file):

...
$p_header['filename'] = cleardoubleslashes($p_header['filename']); //normalize the slashes/backslashes
// ******* Added
$tmp = split("/", $p_header['filename']);
$tmp[sizeof($tmp)-1]=clean_filename($tmp[sizeof($tmp)-1]);
$p_header['filename']=implode("/", $tmp);
// ***** End added
return 1;
...

It would be possible to revise the code and add this lines to solve definitively the problem? Thank you in advance!

Show
Sara Arjona added a comment - This solution doesn't work for us. We work with Moodle 1.8.7 with Oracle database and this solution still left some special characters (like ç, ñ, ' ...). Besides, we have solved this problem adding the following lines in the unzip_cleanfilename function (lib/moodlelib.php file): ... $p_header['filename'] = cleardoubleslashes($p_header['filename']); //normalize the slashes/backslashes // ******* Added $tmp = split("/", $p_header['filename']); $tmp[sizeof($tmp)-1]=clean_filename($tmp[sizeof($tmp)-1]); $p_header['filename']=implode("/", $tmp); // ***** End added return 1; ... It would be possible to revise the code and add this lines to solve definitively the problem? Thank you in advance!
Hide
Charles Fulton added a comment -

Updated per Sara Arjona's comment. clean_filename() is even more restrictive than clean_param(), even when PARAM_FILE is passed (which doesn't make much sense to me).

Show
Charles Fulton added a comment - Updated per Sara Arjona's comment. clean_filename() is even more restrictive than clean_param(), even when PARAM_FILE is passed (which doesn't make much sense to me).
Hide
Petr Škoda (skodak) added a comment -

should be fixed now, please test latest cvs
thanks the report

Show
Petr Škoda (skodak) added a comment - should be fixed now, please test latest cvs thanks the report
Hide
Daniel Miksik added a comment -

I have tested the latest MOODLE_18_STABLE code from CVS, still getting illegal characters (like space or Czech characters with diacritic marks) in the names of the unzipped files.

Show
Daniel Miksik added a comment - I have tested the latest MOODLE_18_STABLE code from CVS, still getting illegal characters (like space or Czech characters with diacritic marks) in the names of the unzipped files.
Hide
Petr Škoda (skodak) added a comment -

which zip method are you using?

Show
Petr Škoda (skodak) added a comment - which zip method are you using?
Hide
Petr Škoda (skodak) added a comment -

reclosing, the internal zipping should work fine too now.
International characters, spaces, etc. are not forbidden, only chars like '"<> are - this is required for backwards compatibility and also they will be allowed in 2.0

thanks a lot for the testing

Show
Petr Škoda (skodak) added a comment - reclosing, the internal zipping should work fine too now. International characters, spaces, etc. are not forbidden, only chars like '"<> are - this is required for backwards compatibility and also they will be allowed in 2.0 thanks a lot for the testing
Hide
Daniel Miksik added a comment -

I don't know what exactly is meant by "zip method", sorry. I'm creating the ZIP files using 7-Zip (compression method: deflate, dictionary size: 32 KB, word size: 64).

I think something has got broken by the last fix. I get the following error trying to unzip any ZIP file now:
Notice: Undefined variable: temppath in /home/elfdev/lib/moodlelib.php on line 6562
PCLZIP_ERR_INVALID_PARAMETER (-3) : Invalid optional parameter '' for this method

And a final note: If international characters are not forbidden, then why are they removed from the names of files uploaded directly to course files?

Show
Daniel Miksik added a comment - I don't know what exactly is meant by "zip method", sorry. I'm creating the ZIP files using 7-Zip (compression method: deflate, dictionary size: 32 KB, word size: 64). I think something has got broken by the last fix. I get the following error trying to unzip any ZIP file now: Notice: Undefined variable: temppath in /home/elfdev/lib/moodlelib.php on line 6562 PCLZIP_ERR_INVALID_PARAMETER (-3) : Invalid optional parameter '' for this method And a final note: If international characters are not forbidden, then why are they removed from the names of files uploaded directly to course files?
Hide
Petr Škoda (skodak) added a comment -

zip mehods are two, internal using pclzip and external infozib binary,
reopening and sorry for the regression in 1.8.x branch

Show
Petr Škoda (skodak) added a comment - zip mehods are two, internal using pclzip and external infozib binary, reopening and sorry for the regression in 1.8.x branch
Hide
Petr Škoda (skodak) added a comment -

fixed, thanks

Show
Petr Škoda (skodak) added a comment - fixed, thanks
Hide
Daniel Miksik added a comment -

Thanks for the quick fixes, Petr. Could you please also respond to the question from above: If international characters are not forbidden, then why are they removed from the names of files uploaded directly to course files? Thank you.

Show
Daniel Miksik added a comment - Thanks for the quick fixes, Petr. Could you please also respond to the question from above: If international characters are not forbidden, then why are they removed from the names of files uploaded directly to course files? Thank you.
Hide
Petr Škoda (skodak) added a comment -

They are not forbidden for security reasons. They do break often if you use them. Unfortunately we can not remove them because it would break links in uploaded html files.

In 2.0 all the "not forbidden" chars should finally "sort of" work.

Show
Petr Škoda (skodak) added a comment - They are not forbidden for security reasons. They do break often if you use them. Unfortunately we can not remove them because it would break links in uploaded html files. In 2.0 all the "not forbidden" chars should finally "sort of" work.
Hide
Susana Leitão added a comment -

We have 1.9.5 and uploading and unziping zip file using IE (6 and 8) browser the problem persists. Can anyone confirm it?
(ex: çãéõà...)

Show
Susana Leitão added a comment - We have 1.9.5 and uploading and unziping zip file using IE (6 and 8) browser the problem persists. Can anyone confirm it? (ex: çãéõà...)

Dates

  • Created:
    Updated:
    Resolved: