View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
18709 | Feature requests | Security | public | 2023-03-29 11:09 | 2023-04-08 11:52 |
Reporter | r0bis | Assigned To | |||
Priority | none | Severity | feature | ||
Status | new | Resolution | open | ||
Summary | 18709: response encryption security - suggestion for improvement | ||||
Description | I really like that LS can encrypt responses using the sodium library. There is a good use case for this in healthcare environments, where we ask patients to submit textual responses. In free text people can inadvertently (or deliberately) submit information that would allow them to be identified - even if their responses are linked to a secret id number and personally identifiable information is not collected on purpose. I understand that storing both the private and the public key in a configuration file is needed so that encrypted responses can be displayed in the LS response browsing interface. I would like, however, an option where the secret key is kept locally and only used to decrypt the data once they have been downloaded to the computer performing the data analysis. In my case I use R with its excellent limer library. I could easily add the sodium library and do the analysis locally. In the current implementation the response encryption protects the data from anyone who is able to access the database with superuser rights. That is a good step, but if an attacker has compromised the web server (i.e. is able to read the secret key), the data are no longer protected. My proposal is to implement a second option for encrypting responses - one where keeping the private key secure and available is wholly the user's responsibility. In that case the response data on the server would always be encrypted, and they would stay encrypted when downloaded as .CSV. Holch said this would likely be too difficult/confusing for regular users and I agree. However, it could be a valuable feature for data analysts in big organisations dealing with sensitive data (e.g. NHS). | ||||
Additional Information | Discussion with Holch: Many thanks for the excellent project, it is great to see limesurvey develop. Best wishes, Roberts | ||||
Tags | No tags attached. | ||||
Bug heat | 256 | ||||
Story point estimate | 0 | ||||
Users affected % | 0 | ||||
:+1: Maybe have a look at the https://github.com/SamMousa/limesurvey-encrypt system: the whole response line was encrypted, and the current response was deleted. |
|
Another idea: 1. We need 2 different settings for encryption: the current one, and the "private" one. |
|
Discussion on forum, and idea for this new feature
I'm happy to create a Pull Request for this if it's OK to implement in core. |
|
Just to clarify why I thought this is needed. If you deal with potentially sensitive personal data, you do not really wish to store direct personal information (names, e-mails, etc.) in a MySQL database on a server, even encrypted. As long as the private key is on the server and can become available to an attacker, they can decrypt the personal data. So for such material, direct personal information will be stored locally on the user's system, and the only link with the LS survey records will be a one-way hash (the link between the local, protected participant record and the LS answers table). Denis asked: why encrypt then, if no direct personal information is on the server? Not encrypting is relatively OK with numeric data. But if you want to go a step further and allow patients to submit free text responses, you can have a situation where people disclose personal data, either their own or that of others, inadvertently (or deliberately). Let us call it indirect personal information - you were not asking for it, but people submitted it anyway. You still have the duty to protect it as best you can. For an example, think of mental healthcare settings, where the desire to protect vulnerable people's information is high and people do not always have the capacity to properly evaluate what they are writing. If the text responses are encrypted, you do not need to worry about them. Even if LimeSurvey were to be compromised, the responses would be safe. And if encryption is available, why not also encrypt numeric answers for good measure? I am not sure if people know just how difficult it is to be trusted with personal data in healthcare settings, but healthcare providers are big organisations and they need good open-source mechanisms to collect and use patient feedback. The setup with LS and R or Python would fit the need very well, especially if responses can be encrypted, as decrypting them locally in an automated way is trivial if you have the private key. |
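As an illustration of the one-way link described above, here is a minimal Python sketch using only the standard library. It uses a keyed hash (HMAC-SHA256) rather than a plain hash, which is a substitution on my part: short id numbers are easy to guess, so a plain hash could be reversed by brute force. The function name `link_code`, the example ids and the secret are all hypothetical; the secret would stay on the analyst's machine.

```python
import hashlib
import hmac


def link_code(participant_id: str, local_secret: bytes, length: int = 16) -> str:
    """Derive a one-way code that links a local participant record to survey answers.

    Assumption: the code is the only value ever sent to the survey server.
    """
    digest = hmac.new(local_secret, participant_id.encode("utf-8"), hashlib.sha256)
    # Truncate the hex digest so the code stays short enough for a link or QR code.
    return digest.hexdigest()[:length]


# Hypothetical local data: this table and the secret never leave the local system.
secret = b"replace-with-a-random-locally-stored-secret"
participants = {"P-0001": "record kept locally", "P-0002": "record kept locally"}

# Local lookup from code back to participant id, used when analysing downloaded responses.
lookup = {link_code(pid, secret): pid for pid in participants}
```

Without the local secret, a code found in the answers table cannot be recomputed from a guessed participant id.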
|
Just to add - if direct personal information is NOT stored on the LS server, there is no need to worry about participant table or token encryption. They are not used, because making the respondent links (links with id codes embedded, QR codes) and distributing them is handled locally. It is pretty easy to do that and I'd be happy to publish a scheme showing how this is done. |
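A rough sketch of how respondent links with an embedded id code might be generated locally, in Python with the standard library only. The base URL, the survey id and the `pid` query parameter are placeholders made up for illustration; the actual parameter would depend on how the survey is set up to receive the code.

```python
from urllib.parse import urlencode


def respondent_link(base_url: str, survey_id: int, code: str, param: str = "pid") -> str:
    """Build a survey link that carries only the pseudonymous id code."""
    # urlencode escapes the code so it is safe to place in a query string.
    query = urlencode({param: code})
    return f"{base_url.rstrip('/')}/{survey_id}?{query}"


# Hypothetical values; no name or e-mail address appears in the link.
link = respondent_link("https://survey.example.org/index.php", 123456, "a1b2c3d4e5f60718")
```

The resulting string can then be printed, sent out, or passed to any QR code generator.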
|
Date Modified | Username | Field | Change |
---|---|---|---|
2023-03-29 11:09 | r0bis | New Issue | |
2023-04-01 17:59 | DenisChenu | Issue Monitored: DenisChenu | |
2023-04-01 17:59 | DenisChenu | Bug heat | 250 => 252 |
2023-04-01 18:01 | DenisChenu | Note Added: 74331 | |
2023-04-01 18:01 | DenisChenu | Bug heat | 252 => 254 |
2023-04-03 15:52 | DenisChenu | Note Added: 74350 | |
2023-04-05 08:54 | DenisChenu | Note Added: 74372 | |
2023-04-08 11:48 | r0bis | Note Added: 74435 | |
2023-04-08 11:48 | r0bis | Bug heat | 254 => 256 |
2023-04-08 11:52 | r0bis | Note Added: 74436 |