View Issue Details

ID: 14091
Project: Bug reports
Category: [All Projects] Survey taking
View Status: public
Last Update: 2019-01-15 10:37
Reporter: tbart
Assigned To: LouisGac
Priority: none
Severity: minor
Status: assigned
Resolution: reopened
Product Version: 3.13.x
Target Version:
Fixed in Version:
Summary: 14091: Filenames of uploads starting with special characters truncated
Description

Non-ASCII characters (e.g. umlauts such as äöü) at the beginning of uploaded file names are cut off.

Steps To Reproduce

When taking a survey that contains a file upload field:

  1. Choose a file whose name starts with an umlaut
  2. Observe that the leading umlaut is cut off from the uploaded file's name
Additional Information

This is due to locale-awareness of PHP's basename().
Also see https://stackoverflow.com/questions/45268499/php-basename-and-pathinfo-with-multibytes-utf-8-file-names/45268539#45268539 for other related functions with the same feature.

Adding

setlocale(LC_ALL, 'en_US.UTF8');

in index.php fixes it, but I am sure this is not the right place to add this.
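A narrower variant of the workaround above, as a sketch only: it changes just LC_CTYPE (character classification) rather than LC_ALL, and tries several UTF-8 locales in turn. The helper name `use_utf8_ctype` and the candidate list are assumptions for illustration, not LimeSurvey code; which locales exist depends on the server.

```php
<?php
/**
 * Try to switch character classification (LC_CTYPE) to a UTF-8 locale.
 * Returns the locale name that was set, or false if none is installed.
 * The candidate list is an assumption; adjust it to the locales on the server.
 */
function use_utf8_ctype(array $candidates = array('en_US.UTF-8', 'C.UTF-8'))
{
    // setlocale() tries each candidate in order and returns the first that works.
    return setlocale(LC_CTYPE, $candidates);
}

$active = use_utf8_ctype();
if ($active === false) {
    // Nothing was changed; the previous locale is still in effect.
    error_log('No UTF-8 locale available; basename() may still mangle file names.');
}
```

Limiting the change to LC_CTYPE leaves number, date and collation settings untouched, which may matter if other code relies on them.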

My system locale (on debian) is correctly set (apart from LC_ALL, which normally is just an override and should not be set).

Please either respect system's LANG or set the locale to some UTF-8 compatible one.
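Another option that avoids touching the locale at all would be to extract the file name without basename(). The following is a minimal sketch under that assumption; `utf8_basename` is a hypothetical helper, not an existing PHP or LimeSurvey function.

```php
<?php
/**
 * Locale-independent replacement for basename() (hypothetical helper).
 * "/" and "\" are single bytes that never occur inside a UTF-8 multibyte
 * sequence, so splitting on them at byte level is safe for UTF-8 paths.
 */
function utf8_basename($path)
{
    // Drop trailing separators so "dir/äöü.png/" still yields "äöü.png".
    $trimmed = rtrim($path, '/\\');
    // Split on either separator and keep the last component.
    $parts = preg_split('#[/\\\\]#', $trimmed);
    return (string) end($parts);
}

echo utf8_basename('/tmp/uploads/äöü.png'), "\n"; // prints "äöü.png"
```

Note that this sketch treats a backslash as a separator on every platform, which differs from basename() on Unix-like systems; any security filtering of the name would still have to happen afterwards.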

Tags: No tags attached.
Complete LimeSurvey version number (& build): 3.14.9+180917
I will donate to the project if issue is resolved: No
Browser:
Database & DB-Version: mySQL 5.5.60
Server OS (if known): debian
Webserver software & version (if known): apache 2.4.10
PHP Version: 5.6.36

Activities

LouisGac

2019-01-10 17:40

manager   ~50169

Yes, we have hard filtering of file names for security reasons.

tbart

2019-01-14 17:40

reporter   ~50188

I understand file names get filtered, but this does not seem related. Why does setting the correct locale circumvent the filtering?
I feel like filtering should be done after correctly interpreting the code points with the correct locale.

But I may be wrong.

If only ASCII (or even a more reduced subset) is allowed, a note should advise users to rename their files (and only list allowed characters). Though this seems pretty uncommon. There should be standard functions to sanitize the string and frankly, umlauts don't seem all too dangerous :)

DenisChenu

2019-01-15 08:40

developer   ~50192

Last edited: 2019-01-15 08:51

«yes we have hard filtering of file names for security reason. » ???

Filtering äöü has nothing to do with security, and file names are not filtered like this in the 'Upload question' type.

@tbart: where is it filtered like this? And what is the broken locale?

DenisChenu

2019-01-15 08:40

developer

[Attached file: Capture d’écran du 2019-01-15 08-40-13.png]

tbart

2019-01-15 09:06

reporter   ~50194

Your filenames do not start with special characters. See screenshot attached.

The locale should be en_US.UTF-8, at least that's what the environment sets for LANG:

set | egrep "^(LANG|LC)"

LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_NUMERIC=de_AT.utf8
LC_PAPER=de_AT.UTF-8
LC_TIME=de_AT.utf8



DenisChenu

2019-01-15 10:00

developer   ~50205

My filename was äöü.png ;), but my locale is OK.

Can you show your current locale in PHP?

tbart

2019-01-15 10:37

reporter   ~50207

Sorry, I misread your screenshot and thought the left column were the filenames, my bad.

phpinfo() gives me:
LANGUAGE en_US:en
LANG C

It's still strange why only umlauts at the beginning get stripped off. This should not be locale dependent!
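To compare setups, a small diagnostic along these lines could be run on both servers. It is only a sketch: `locale_report` is a hypothetical helper, and the sample file name is just the one used in this thread.

```php
<?php
/**
 * Collect the locale-related facts relevant to this report.
 * Passing '0' as the locale makes setlocale() return the current setting
 * without changing it.
 */
function locale_report($sample = 'äöü.png')
{
    return array(
        'LC_CTYPE' => setlocale(LC_CTYPE, '0'), // locale PHP is actually using
        'LANG env' => getenv('LANG'),           // what the web server environment passes in
        'basename' => basename($sample),        // what basename() makes of the sample name
    );
}

var_export(locale_report());
```

If 'LC_CTYPE' shows C or POSIX while the shell reports en_US.UTF-8, the web server process is presumably not inheriting the shell's environment.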

Issue History

Date Modified Username Field Change
2018-09-24 16:21 tbart New Issue
2019-01-10 17:40 LouisGac Assigned To => LouisGac
2019-01-10 17:40 LouisGac Status new => closed
2019-01-10 17:40 LouisGac Resolution open => no change required
2019-01-10 17:40 LouisGac Note Added: 50169
2019-01-14 17:40 tbart Note Added: 50188
2019-01-15 08:40 DenisChenu Status closed => feedback
2019-01-15 08:40 DenisChenu Resolution no change required => reopened
2019-01-15 08:40 DenisChenu Note Added: 50192
2019-01-15 08:40 DenisChenu File Added: Capture d’écran du 2019-01-15 08-40-13.png
2019-01-15 08:51 DenisChenu Note Edited: 50192
2019-01-15 09:06 tbart File Added: umlauts_at_the_beginning_cut_off.png
2019-01-15 09:06 tbart Note Added: 50194
2019-01-15 09:06 tbart Status feedback => assigned
2019-01-15 10:00 DenisChenu Note Added: 50205
2019-01-15 10:37 tbart Note Added: 50207