View Issue Details

ID: 16531
Project: Bug reports
Category: Conditions
View Status: public
Last Update: 2020-08-07 16:50
Reporter: gabrieljenik
Assigned To:
Priority: none
Severity: minor
Status: confirmed
Resolution: open
Product Version: 4.3.6
Summary: 16531: Validation regex including unicode characters fails
Description

This is a clone of 16273.
We should review whether the same error appears here.


I'm testing this simple regex to validate the content of a user response:

/^[A-Z0-9\s]+$/

It matches capital letters, digits and whitespace. It seems to be correct and works fine.

But assume I would like to include the Unicode character "á".

I've tested the following regexes without success:

/^[áA-Z0-9\s]+$/

/^[\x00E1A-Z0-9\s]+$/

/^[\x{00E1}A-Z0-9\s]+$/

/^[\u00E1A-Z0-9\s]+$/

0x00E1 is the hexadecimal code point of "á" (U+00E1).

The validation test fails in all the cases above.
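
For reference, here is a minimal sketch of how a standalone JavaScript engine (for example Node.js) reads each escape form listed above, plus the two-digit `\xE1` form. It runs outside LimeSurvey, so it says nothing about the validation pipeline itself; the sample string and the `patterns` and `results` names are illustrative assumptions. In plain JavaScript without the `u` flag, `\x` takes exactly two hex digits, so `\x00E1` is read as NUL followed by the literal characters "E" and "1", and `\x{00E1}` is read as the literal characters "x", "{", "0", "E", "1", "}".

```javascript
// Standalone check; this does not go through LimeSurvey's validation code.
const sample = "CAF\u00E1 1"; // "CAFá 1": capitals, á, whitespace, a digit

// Hypothetical labels for the escape forms discussed in this issue.
const patterns = {
  literal: /^[áA-Z0-9\s]+$/,          // á written directly in the class
  x4: /^[\x00E1A-Z0-9\s]+$/,          // \x00 (NUL), then literal "E" and "1"
  xBrace: /^[\x{00E1}A-Z0-9\s]+$/,    // literal "x", "{", "0", "E", "1", "}"
  u4: /^[\u00E1A-Z0-9\s]+$/,          // \u takes four hex digits: U+00E1
  x2: /^[\xE1A-Z0-9\s]+$/,            // \x takes two hex digits: U+00E1
};

const results = {};
for (const [name, re] of Object.entries(patterns)) {
  results[name] = re.test(sample);
}

console.log(results);
// { literal: true, x4: false, xBrace: false, u4: true, x2: true }
```

Since the literal and `\u00E1` forms match in a plain engine, a failure of those two forms inside LimeSurvey would have to come from how the pattern or the answer is processed before the regex runs.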

Steps To Reproduce

Using a test survey, apply this validation regex to any response field and test...

Tags: No tags attached.
Complete LimeSurvey version number (& build): 4.3.6
I will donate to the project if issue is resolved: No
Browser:
Database & DB-Version: Mysql
Server OS (if known):
Webserver software & version (if known):
PHP Version: 7

Relationships

related to 16273 confirmed Validation regex including unicode characters fails 

Activities

gabrieljenik

2020-07-29 19:58

developer   ~59160

PR: https://github.com/LimeSurvey/LimeSurvey/pull/1521

The PR decodes HTML before running the regex. The resulting decoded string is similar to what the PHP-side regex function receives.
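
As an illustration of why decoding matters, here is a minimal sketch that is not the code from the PR. It assumes the pattern reaches the client-side check with "á" encoded as the entity `&aacute;`; that failure mode, the `decodeHtmlEntities` helper, and the tiny entity table are all hypothetical. An encoded pattern puts the characters "&", "a", "c", "u", "t", "e" and ";" into the character class instead of "á".

```javascript
// Hypothetical entity table: only the few names needed for this example.
const NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', aacute: "\u00E1" };

// Hypothetical helper: decodes numeric entities and the names listed above.
// Unknown named entities are left untouched.
function decodeHtmlEntities(text) {
  return text.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = isHex ? parseInt(body.slice(2), 16) : parseInt(body.slice(1), 10);
      return String.fromCodePoint(codePoint);
    }
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

// Assumed input: the validation pattern with "á" stored as an entity.
const encodedPattern = "^[&aacute;A-Z0-9\\s]+$";
const decodedPattern = decodeHtmlEntities(encodedPattern);
const answer = "CAF\u00E1"; // "CAFá"

const beforeDecoding = new RegExp(encodedPattern).test(answer);
const afterDecoding = new RegExp(decodedPattern).test(answer);

console.log(beforeDecoding, afterDecoding); // false true
```

A browser-based implementation could rely on the DOM for decoding instead of a hand-written table; the table here only keeps the sketch self-contained.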

sushmanadendla

2020-08-06 15:11

manager   ~59343

I tested the issue before pulling the PR, and the issue exists. I tested it again after pulling the PR; my findings are below.
The scenario still fails in the following cases:
/^[\x00E1A-Z0-9\s]+$/

/^[\x{00E1}A-Z0-9\s]+$/

/^[\u00E1A-Z0-9\s]+$/

Please refer to the attachments for more details.

16531_AfterPR_Unicode.png (232,740 bytes)
16531_AfterPR_Hexcode.png (117,974 bytes)
sushmanadendla

2020-08-07 16:47

manager   ~59372

Actually, the codes mentioned above were wrong. I tried the following instead:
^[\u00E1A-Z0-9\s]+$

^[\xE1A-Z0-9\s]+$

These work as expected.

sushmanadendla

2020-08-07 16:50

manager   ~59373

Please refer to the attachment for more details.

HexaCode.png (73,573 bytes)   

Issue History

Date Modified Username Field Change
2020-07-28 02:20 gabrieljenik New Issue
2020-07-28 02:20 gabrieljenik Issue generated from: 16273
2020-07-28 02:20 gabrieljenik Relationship added related to 16273
2020-07-29 14:40 cdorin Status new => confirmed
2020-07-29 19:58 gabrieljenik Note Added: 59160
2020-08-06 15:11 sushmanadendla Note Added: 59343
2020-08-06 15:11 sushmanadendla File Added: 16531_AfterPR_Unicode.png
2020-08-06 15:11 sushmanadendla File Added: 16531_AfterPR_Hexcode.png
2020-08-06 15:11 sushmanadendla File Added: 16531_BeforePR_Unicode.png
2020-08-07 16:47 sushmanadendla Note Added: 59372
2020-08-07 16:50 sushmanadendla Note Added: 59373
2020-08-07 16:50 sushmanadendla File Added: HexaCode.png