View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
05988 | Bug reports | Conditions | public | 2012-04-05 19:53 | 2012-08-04 14:51 |
Reporter | Lise94 | Assigned To | c_schmitz | ||
Priority | normal | Severity | minor | ||
Status | closed | Resolution | fixed | ||
Product Version | 1.92+ | ||||
Fixed in Version | 1.92+ | ||||
Summary | 05988: Problem displaying words with French accent within Expression Manager | ||||
Description | Hi, I want to use Expression Manager to display a question title that depends on a previous answer. That works, but when the expression contains an accented character, the question is not displayed properly and the respondent sees &eacute; instead of é. The problem appears only on the respondent screen, not in the admin interface. See the attached file for an example. | ||||
Tags | No tags attached. | ||||
Attached Files | |||||
Bug heat | 12 | ||||
Complete LimeSurvey version number (& build) | 120330 | ||||
I will donate to the project if issue is resolved | No | ||||
Browser | Firefox | ||||
Database type & version | mysql | ||||
Server OS (if known) | Windows | ||||
Webserver software & version (if known) | ? | ||||
PHP Version | 5 | ||||
The respondent sees & e a c u t e ; instead of é |
|
c_schmitz - the database has the "wrong" value - it contains "Pourquoi {if((boisson.code=='Y'),'aimez-vous le th&eacute;','non')} ?" EM uses htmlspecialchars_decode() when processing database values. Should it be using html_entity_decode() instead, or should the database be storing non-entity-encoded values? |
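To make the question above concrete, here is a minimal JavaScript sketch of the difference between decoding only the special characters and decoding named entities as well. It assumes the stored text contains the entity `&eacute;` rather than é. The helper names and the small entity table are hypothetical, simplified analogues of PHP's htmlspecialchars_decode() and html_entity_decode(), not LimeSurvey's actual code.

```javascript
// Hypothetical helper: decodes only the five "special" characters,
// roughly what PHP's htmlspecialchars_decode() does with ENT_QUOTES.
function decodeSpecialChars(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#0?39;/g, "'")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

// Illustrative subset of named entities (assumption: a real decoder
// covers the complete HTML entity table).
const NAMED_ENTITIES = { eacute: "é", egrave: "è", agrave: "à", ccedil: "ç", pound: "£" };

// Hypothetical helper: decodes the named entities above, then the special characters.
function decodeNamedEntities(text) {
  const withNames = text.replace(/&([a-zA-Z]+);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name) ? NAMED_ENTITIES[name] : match
  );
  return decodeSpecialChars(withNames);
}

const stored = "aimez-vous le th&eacute;";
console.log(decodeSpecialChars(stored));  // entity is left untouched
console.log(decodeNamedEntities(stored)); // entity becomes é
```

With the special-characters-only decoder the entity survives into the output, which is consistent with the symptom described in this report.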
|
The database should always store the pure UTF-8 encoded non-entity-encoded value. |
|
OK, so EM is working properly, and somehow the wrong value is being inserted into the database. |
|
Maybe it is something with the database. I tried importing the lss file, then viewing and saving the question text.
Then I tested with: Does LEM have to accept both UTF-8 encoding AND HTML encoding? If so, don't replace & with & amp; |
|
Javascript is good: LEMif(LEManyNA('boisson.code'),'',(LEMif((LEMval('boisson.code') == 'Y'), 'aimez-vous le th& eacute;', 'non')))); (without & amp;) It's a javascript issue , see some example : |
|
TMSWhite: I am sorry, when I said 'should' I did not mean it in an absolute way. As long as it is valid HTML, entity-encoded text is also allowed, though non-encoded is preferred. |
|
One core question is how to properly protect EM from cross-site scripting attacks. I used htmlspecialchars, which takes care of '>','<','"',"'", and '&'. Sounds like we may not need to escape '&'. However the main issue is that the database is getting the wrong value stored. If you fix the database contents, you'll see the following generated: LEMif(LEManyNA('boisson.code'),'',(LEMif((LEMval('boisson.code') == 'Y'), 'aimez-vous le thé', 'non')))); |
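The trade-off around escaping '&' can be sketched as follows. This is a minimal JavaScript illustration under stated assumptions: `escapeSpecialChars` is a hypothetical helper modelled on PHP's htmlspecialchars() with ENT_QUOTES, and it is only one plausible way the respondent-side symptom could arise, not a trace of EM's actual code path.

```javascript
// Hypothetical helper: escapes the five characters named above.
function escapeSpecialChars(text) {
  return text
    .replace(/&/g, "&amp;") // first, so the entities produced below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Markup is neutralised, which is the point of the XSS protection.
console.log(escapeSpecialChars("<script>alert('x')</script>"));

// Raw UTF-8 text passes through unchanged.
console.log(escapeSpecialChars("aimez-vous le thé"));

// Text that is already entity-encoded has its '&' escaped again,
// so a browser would render the literal characters "&eacute;".
console.log(escapeSpecialChars("aimez-vous le th&eacute;"));
```

Skipping the '&' replacement would let existing entities through, at the cost of no longer being able to show a literal "&eacute;" typed by an author.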
|
Carsten- We could have EM process all content through html_entity_decode(), but that is potentially risky. It will work if the original is encoded through htmlentities(), but if the original is only processed through htmlspecialchars(), we might get the wrong result. It would be nice to ensure we're consistent in the database and always encode/decode with the same function. |
|
I don't understand why we might get the wrong result. AFAIK html_entity_decode() does not care whether the encoding was done with htmlspecialchars() or htmlentities(). If we assume that valid HTML is always used, it should be fine. |
|
Carsten: it's JavaScript functionality. If you make : You get: EDIT: maybe try with https://developer.mozilla.org/fr/JavaScript/R%C3%A9f%C3%A9rence_JavaScript/R%C3%A9f%C3%A9rence_JavaScript/Fonctions_globales/decodeURIComponent |
|
Denis - if so, where in the source code is the problem? Somewhere before insertion into the database, I presume. Carsten- You are right - I thought there might be a way to do injection attacks by first encoding with htmlspecialchars() and then decoding with html_entity_decode(), but I can't create a working test case of that. |
|
Carsten- I just tried a dozen permutations of changing some or all of the htmlspecialchars_decode() calls within EM to html_entity_decode(). All made it worse. I also remember spending nearly 20 hours last year trying to get the right balance of specialchars and entity encode and decode. I think we should insist that the database be htmlspecialchars() encoded and not htmlentities() encoded. Trying to mix and match the two types of encoding is a big mess. So, the core fix would be to ensure that entity-encoded data doesn't get into the database. |
|
I tried to get HTML entities into the db; the only way was to import the lss file and not change the question text (don't save). The only way to fix it seems to be in em_javascript.js:
|
|
I think we need a group decision on this since it affects how we handle cross-site scripting protection. This we want to automatically do: Things I'm not sure about: |
|
1) [...]be processed through htmlspecialchars() on output, yes 1.) Think so, yes. It should obey the XSS global filter setting in the admin, though. So if someone tries to save an equation like and the corresponding filter setting is activated, it should be filtered accordingly. 2.) People? Who is that? 3.) Store as entered. Consider raw data always unsafe to be displayed, but safe to store (assuming that SQL-injection-safe storing methods are used). |
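Point 3 ("store as entered, treat raw data as unsafe to display") can be sketched generically as below. Everything here is an assumption for illustration: the in-memory `valueStore`, `saveValue`, `renderValue` and `escapeHtml` are hypothetical names, and the thread does not state which LimeSurvey values this rule was meant to cover.

```javascript
// Hypothetical in-memory stand-in for a database table.
const valueStore = new Map();

// Hypothetical escaper, modelled on PHP's htmlspecialchars() with ENT_QUOTES.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Store exactly what was entered: no entity encoding at write time.
function saveValue(id, text) {
  valueStore.set(id, text);
}

// Escape only at the moment the value is placed into HTML output.
function renderValue(id) {
  return "<p>" + escapeHtml(valueStore.get(id)) + "</p>";
}

saveValue("a1", "<b>thé</b> & 'café'");
console.log(valueStore.get("a1")); // unchanged raw text
console.log(renderValue("a1"));    // escaped for display
```

Keeping the stored form raw means there is a single encoding step, on output, so encode and decode calls cannot drift out of sync.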
|
For 1 & 2, people = survey authors. |
|
Then 2.) Yes, they should be able to do that. |
|
OK. That isn't how EM currently handles this, so I (or someone else) will need to look through all of the existing calls to htmlspecialchars() within EM and make the needed adjustments. In general, seems like: |
|
That proposal sounds fine to me. |
|
attached sample survey for testing special characters. It is possible it won't import correctly. It uses the following text repeatedly: This <question> has "special chars" including '&'; foreign chars like ßüöäÜÖħéèàçù£, and entities like < > é " ' £ é ò ô However, that text gets converted at many points, especially during editing. So, consistent handling of entities may be more pervasive than just EM. |
|
Hmm, even Mantis does conversion - the section after "and entities like" spells out the actual entities: Here they are with the ampersand replaced by a tilde so that they don't get converted: and entities like ~lt; ~gt; ~eacute; ~quot; ~apos; ~pound; ~eacute; ~ograve; ~ocirc; |
|
Can you point out where it is still going 'wrong' besides EM? |
|
The short answer is that the problem appears to be isolated to EM, but looking at the code, I'm worried it may be more pervasive (which might explain why it took so long to get it "right" in EM in the first place). When I do a regular-expression search for /html(specialchars|_entity|entities)/ in the 1.92 codebase (searching all .php and .js source), I get 542 matches across 78 files. Among them, we have several versions of these JavaScript functions: We should probably move the phpjs.org functions out of em_javascript.js and maintain them separately. At present, I didn't want to touch this since I don't know whether standardizing those functions would break anything. Also, I'm not clear on some basics, such as, "how should strings be natively stored within": And what is the proper way to map from one to the other? Furthermore, if we do encoding, should we be doing double encoding and double decoding, or not? For example, "&" => "&"? My recommendation would be to have someone create succinct coding guidance documentation to indicate the proper way to manage special characters and entities within LS. That way we can (perhaps slowly) look through those 542 matches and make sure they are all correct and consistent. |
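The kind of search described above can be sketched in JavaScript as follows. The regular expression is the one quoted in the comment; the function name `countEncodingCalls`, the shape of its input (an object mapping file names to source text), and the sample sources are assumptions made for this illustration and do not reproduce the 542/78 figures.

```javascript
// Pattern quoted in the comment above; the "g" flag collects every occurrence.
const ENCODING_CALL = /html(specialchars|_entity|entities)/g;

// Hypothetical helper: counts matches and matching files among .php and .js sources.
function countEncodingCalls(sources) {
  let matches = 0;
  let files = 0;
  for (const [name, text] of Object.entries(sources)) {
    if (!/\.(php|js)$/.test(name)) continue; // only .php and .js sources
    const found = text.match(ENCODING_CALL);
    if (found) {
      matches += found.length;
      files += 1;
    }
  }
  return { matches, files };
}

// Made-up sample sources, for illustration only.
const sample = {
  "a.php": "echo htmlspecialchars($x); echo html_entity_decode($y);",
  "b.js": "function htmlentities(s) {} function htmlspecialchars_decode(s) {}",
  "c.txt": "htmlspecialchars",
  "d.php": "echo 'no match';",
};
console.log(countEncodingCalls(sample));
```

Grouping the hits per function name instead of summing them would be a natural next step for an audit of this kind.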
|
As mentioned earlier: Strings should always be stored raw, be it in database, fieldmap or session. On display: |
|
Fix committed to master branch: http://bugs.limesurvey.org/plugin.php?page=Source/view&id=8200 |
|
Fix committed to Yii branch: http://bugs.limesurvey.org/plugin.php?page=Source/view&id=8201 |
|
New 1.92+ build released |
|
LimeSurvey: master acf57bc2 2012-04-20 02:24 |
Fixed issue 05988: Problem displaying words with French accent within Expression Manager |
Affected Issues 05988 |
|
mod - scripts/em_javascript.js |
LimeSurvey: Yii 1cdf15df 2012-04-20 02:26 |
Fixed issue 05988: Problem displaying words with French accent within Expression Manager |
Affected Issues 05988 |
|
mod - scripts/expressions/em_javascript.js |
Date Modified | Username | Field | Change |
---|---|---|---|
2012-04-05 19:53 | Lise94 | New Issue | |
2012-04-05 19:53 | Lise94 | Status | new => assigned |
2012-04-05 19:53 | Lise94 | Assigned To | => lemeur |
2012-04-05 19:53 | Lise94 | File Added: limesurvey_survey_44141.lss | |
2012-04-05 19:55 | Lise94 | Issue Monitored: Lise94 | |
2012-04-05 19:58 | Lise94 | Note Added: 18236 | |
2012-04-05 21:11 | c_schmitz | Assigned To | lemeur => c_schmitz |
2012-04-07 12:55 | DenisChenu | Issue Monitored: DenisChenu | |
2012-04-07 16:32 | TMSWhite | Note Added: 18247 | |
2012-04-07 17:08 | c_schmitz | Note Added: 18250 | |
2012-04-07 17:09 | TMSWhite | Note Added: 18251 | |
2012-04-07 17:48 | DenisChenu | Note Added: 18253 | |
2012-04-07 18:05 | DenisChenu | Note Added: 18254 | |
2012-04-07 18:38 | c_schmitz | Note Added: 18257 | |
2012-04-07 18:39 | TMSWhite | Note Added: 18258 | |
2012-04-07 18:41 | c_schmitz | Note Edited: 18257 | |
2012-04-07 18:48 | TMSWhite | Note Added: 18260 | |
2012-04-07 19:08 | c_schmitz | Note Added: 18261 | |
2012-04-07 19:14 | DenisChenu | Note Added: 18262 | |
2012-04-07 19:18 | DenisChenu | Note Edited: 18262 | |
2012-04-07 19:30 | TMSWhite | Note Added: 18263 | |
2012-04-07 21:41 | TMSWhite | Note Added: 18266 | |
2012-04-07 21:48 | TMSWhite | Note Edited: 18266 | |
2012-04-08 12:06 | DenisChenu | Note Added: 18278 | |
2012-04-09 15:10 | TMSWhite | Note Added: 18284 | |
2012-04-09 15:43 | c_schmitz | Note Added: 18285 | |
2012-04-09 15:45 | c_schmitz | Note Edited: 18285 | |
2012-04-09 15:48 | TMSWhite | Note Added: 18286 | |
2012-04-09 16:12 | c_schmitz | Note Added: 18289 | |
2012-04-09 16:12 | c_schmitz | Note Edited: 18289 | |
2012-04-09 16:41 | TMSWhite | Note Added: 18292 | |
2012-04-09 17:08 | c_schmitz | Note Added: 18295 | |
2012-04-11 19:10 | TMSWhite | File Added: ls2_specialchars.lss | |
2012-04-11 19:13 | TMSWhite | Note Added: 18321 | |
2012-04-11 19:15 | TMSWhite | Note Added: 18322 | |
2012-04-11 19:15 | TMSWhite | Note Edited: 18322 | |
2012-04-16 09:37 | c_schmitz | Note Added: 18361 | |
2012-04-16 17:21 | TMSWhite | Note Added: 18379 | |
2012-04-17 09:45 | c_schmitz | Note Added: 18402 | |
2012-04-17 09:45 | c_schmitz | Note Edited: 18402 | |
2012-04-20 09:23 | c_schmitz | Status | assigned => resolved |
2012-04-20 09:23 | c_schmitz | Fixed in Version | => 1.92+ |
2012-04-20 09:23 | c_schmitz | Resolution | open => fixed |
2012-04-20 09:24 | c_schmitz | Changeset attached | => LimeSurvey master acf57bc2 |
2012-04-20 09:24 | c_schmitz | Note Added: 18437 | |
2012-04-20 09:26 | c_schmitz | Changeset attached | => LimeSurvey Yii 1cdf15df |
2012-04-20 09:26 | c_schmitz | Note Added: 18438 | |
2012-05-01 11:56 | c_schmitz | Note Added: 18519 | |
2012-05-01 11:56 | c_schmitz | Status | resolved => closed |
2021-08-03 06:11 | guest | Bug heat | 8 => 12 |