View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
09611 | Bug reports | Import/Export | public | 2015-04-20 23:30 | 2015-05-08 09:16 |
Reporter | nwinter | Assigned To | mfaber | ||
Priority | normal | Severity | minor | ||
Status | closed | Resolution | fixed | ||
Product Version | 2.05+ | ||||
Fixed in Version | 2.05+ | ||||
Summary | 09611: Stata XML Export fails when output file > ~15MB | ||||
Description | When exporting to Stata XML, the export fails when the exported file is greater than about 15MB. I've experimented with several surveys; by limiting the range of observations exported the export succeeds. The maximum number of observations that can be exported successfully appears to vary from survey to survey, and does not appear to correspond to a specific output file size. For the tests I've done, the failures begin when the exported file gets bigger than about 15 MB. | ||||
Steps To Reproduce | Create a survey, populate with thousands of responses, export to Stata XML. | ||||
Tags | No tags attached. | ||||
Attached Files | |||||
Bug heat | 12 | ||||
Complete LimeSurvey version number (& build) | 150413 | ||||
I will donate to the project if issue is resolved | No | ||||
Browser | several | ||||
Database type & version | 178 | ||||
Server OS (if known) | x86_64-redhat-linux-gnu | ||||
Webserver software & version (if known) | Apache/2.2.17 | ||||
PHP Version | 5.3.8 | ||||
nwinter: Unfortunately I do not have data to test this with, but your problems could be due to PHP or SQL memory settings. Are there any error messages in LS's debug mode? Alternatively, can you provide me with an (anonymized) dataset so I can run some tests myself? |
|
Ah, progress! When I turn on debugging, the console says: Resource interpreted as Document but transferred with MIME type application/download: "http://SITEURL/index.php/admin/export/sa/exportresults/surveyid/631256", and a small file is downloaded. The downloaded file has the following contents: Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 4 bytes) in /web.pri/SITEURL/application/core/plugins/ExportSTATAxml/STATAxmlWriter.php on line 439. In other tests the line number varies, of course; in my second test with a slightly different dataset it was 394, for example. So it seems (I think) that PHP is running out of memory. Is there a way the plugin can/should be changed to limit its memory use? As it stands, the final XML file it is creating would be 16 or 17 MB, so if it is exhausting the 256 MB limit (268435456 bytes), I wonder if there is some inefficiency somewhere? |
|
A memory leak in the Stata export generation? |
|
There are a lot of conversions going on, and the data is held multiple times in different arrays, so the plugin might be quite memory intensive with large datasets. Of course I cannot say that there is no room for improvement in the plugin. ;) |
|
I don't think it's a memory leak. However, the function updateCustomresponsemap() uses a lot of memory along the way. One quick change I found that reduced that overhead considerably is to iterate through the responses with a for loop and a counter variable, rather than with foreach. I replaced lines 368-369 from this: foreach ($this->customResponsemap as $iRespId => $aResponses) to this: $keys = array_keys($this->customResponsemap); It seemed to work and to considerably reduce the peak memory usage of the export routine. However, I don't trust my knowledge of PHP enough to be sure this isn't having some side effect (though in a couple of examples the output datasets were the same). I also don't have any experience with GitHub, so even if I trusted my change I don't know how to post it there. But maybe, if this seems wise, it could be incorporated? |
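For readers following along, here is a minimal sketch of the counter-based iteration described in the note above. It is not the actual patch: the sample data, the plain `$customResponsemap` variable standing in for `$this->customResponsemap`, and the `trim` step are assumptions made purely for illustration.

```php
<?php
// Hypothetical stand-in for $this->customResponsemap:
// response ID => list of answers for that response.
$customResponsemap = array(
    101 => array('A1', '-oth-', ' 3 '),
    102 => array('A2', '', '5'),
    105 => array('A1', 'Y', '1'),
);

// Collect the keys once, then walk them with a counter instead of foreach.
$keys  = array_keys($customResponsemap);
$count = count($keys);

for ($i = 0; $i < $count; $i++) {
    $iRespId = $keys[$i];
    // Update the stored row directly through its key
    // (illustrative processing only: strip surrounding whitespace).
    $customResponsemap[$iRespId] = array_map('trim', $customResponsemap[$iRespId]);
}
```

The trade-off is one extra array holding the keys, in exchange for never binding a whole row to a loop variable.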
|
That's interesting, thanks for investigating. Using your favorite search engine, you can actually find a lot of reports of memory problems using foreach loops over large arrays. Your solution seems fine and I am happy to make the changes for you. Maybe I should replace other foreach loops in the process. Can you tell me how you checked for memory usage while the plugin ran? |
|
Thanks--this is great! The short answer is, very laboriously. I made a copy of the plugin, commented out the "header" and actual output code, and added lots of lines like echo "memory at point 1: ".memory_get_usage(); or if ($counter%200==0) { echo "memory after ".$counter." records: ".memory_get_usage(); and so on... |
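The checkpoint approach described above can be sketched roughly as follows. `memory_get_usage()` and `memory_get_peak_usage()` are standard PHP functions; the helper name `memoryCheckpoint` and the dummy records are hypothetical and not part of the plugin.

```php
<?php
// Hypothetical helper: formats current and peak memory use (in MB) for a label.
function memoryCheckpoint($label)
{
    return sprintf(
        "%s: current %.2f MB, peak %.2f MB\n",
        $label,
        memory_get_usage() / 1048576,
        memory_get_peak_usage() / 1048576
    );
}

echo memoryCheckpoint('memory at point 1');

$rows = array();
for ($counter = 1; $counter <= 1000; $counter++) {
    $rows[] = str_repeat('x', 1000); // dummy record standing in for a response
    if ($counter % 200 == 0) {
        echo memoryCheckpoint('memory after ' . $counter . ' records');
    }
}
```

Reporting the peak alongside the current value helps spot temporary spikes that a single reading between steps would miss.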
|
Thanks! Could you check if adding a "&" before the "$aResponses" works ok for you? So replacing by |
|
I was wondering about that approach. On my system it maxes out just below the counter approach--the difference is probably the memory taken up by the $keys array. Would it make sense to do some of the other loops as references too? Thank you! (One off-topic question: I made some custom code in my production version of the Stata writer to recode optional-other responses from '-oth-' to a numeric value, which allows otherwise-all-numeric responses to be labelled. That might be of broader interest--should I post that as a feature request, or is there some other way to communicate that sort of thing? Or should I figure out GitHub?) |
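As a rough illustration of the by-reference variant discussed in the two notes above, the sketch below iterates with "&" in front of the loop variable. The sample data and the `strtoupper` step are assumptions for demonstration only, and a plain variable again stands in for `$this->customResponsemap`.

```php
<?php
// Hypothetical stand-in for $this->customResponsemap.
$customResponsemap = array(
    1 => array('a', 'b'),
    2 => array('c', 'd'),
);

// The "&" binds $aResponses to the stored row itself, so assigning to it
// changes $customResponsemap directly.
foreach ($customResponsemap as $iRespId => &$aResponses) {
    // Illustrative processing only: upper-case every answer in the row.
    $aResponses = array_map('strtoupper', $aResponses);
}
// Break the reference so later use of $aResponses cannot alter the last row.
unset($aResponses);
```

The `unset()` after the loop matters: without it, a later assignment to `$aResponses` would silently overwrite the last element of the array.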
|
-- About your off-topic question -- Perhaps Carsten, Denis or Sam are the appropriate people to answer this question. Obviously, creating a pull request on GitHub is the best way, but sometimes we merge patches suggested in these bug reports, so feel free to create a new feature issue and paste your patch in diff format. Don't forget to describe how to test the feature or improvement. |
|
Hi, if you are asking me, I say: move the whole export to an external core plugin ;) And actually, I don't have any opinion on particular plugins. Stata is one example; I don't have any advice on Stata ;). |
|
Denis: no problem, it already IS a core plugin ;) |
|
Yes, I know ... and I say we can move it to an "External plugin not distributed by default with LS core" that is downloadable through a clean "download plugin" system. For LS3 or later ;) |
|
nwinter: could you check the attached version of the plugin? Memory use should be considerably less now. Thanks! |
|
mfaber: that one isn't working for me at all. Even with a small dataset, I get a "This webpage is not available |
|
That's strange. I downloaded the attached file again and put it into another test installation. Tested on different sets of responses and it runs without problems here. |
|
I explored some more. The problem seems to be with the max() functions in two lines: $aStatatypelist[$this->headersSGQA[$iVarid]]['type'] = max($iDatatype, $aStatatypelist[$this->headersSGQA[$iVarid]]['type']); and $aStatatypelist[$this->headersSGQA[$iVarid]]['format'] = max($iStringlength, $aStatatypelist[$this->headersSGQA[$iVarid]]['format']); The function seems to choke when one of the arguments is not yet set; i.e., on the first time through. I wrapped each of those lines with a check on whether the array value is set, and then things work perfectly. Surely this isn't the most elegant approach, but it worked and did demonstrate that on my system at least that is the problem: |
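The isset() guard described above might look roughly like the sketch below. It is not the committed fix: the `$observations` sample data and the use of a plain variable name as the array key (instead of `$this->headersSGQA[$iVarid]`) are assumptions for illustration.

```php
<?php
$aStatatypelist = array();

// Hypothetical observations: (variable name, data type code, string length).
$observations = array(
    array('Q1', 1, 3),
    array('Q1', 2, 10),
    array('Q2', 1, 4),
    array('Q1', 1, 7),
);

foreach ($observations as $obs) {
    list($sVar, $iDatatype, $iStringlength) = $obs;

    // Only call max() once a previous value exists; otherwise store the first one.
    if (isset($aStatatypelist[$sVar]['type'])) {
        $aStatatypelist[$sVar]['type'] = max($iDatatype, $aStatatypelist[$sVar]['type']);
    } else {
        $aStatatypelist[$sVar]['type'] = $iDatatype;
    }

    if (isset($aStatatypelist[$sVar]['format'])) {
        $aStatatypelist[$sVar]['format'] = max($iStringlength, $aStatatypelist[$sVar]['format']);
    } else {
        $aStatatypelist[$sVar]['format'] = $iStringlength;
    }
}
```

With the guard in place, the first pass for each variable never reads an array element that has not been set yet.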
|
Fix committed to master branch: http://bugs.limesurvey.org/plugin.php?page=Source/view&id=15125 |
|
Fix committed to 2.06 branch: http://bugs.limesurvey.org/plugin.php?page=Source/view&id=15126 |
|
2.05+ Build 150508 released |
|
LimeSurvey: master 1f1f3cc8 2015-05-02 22:19 Committer: mfaber |
Fixed issue 09611: High memory use of STATA export plugin Dev: refactored plugin to get rid of some memory intensive arrays. |
Affected Issues 09611 |
|
mod - application/core/plugins/ExportSTATAxml/STATAxmlWriter.php | ||
LimeSurvey: 2.06 4b0d6170 2015-05-02 22:19 Committer: mfaber |
Fixed issue 09611: High memory use of STATA export plugin Dev: refactored plugin to get rid of some memory intensive arrays. |
Affected Issues 09611 |
|
mod - application/core/plugins/ExportSTATAxml/STATAxmlWriter.php |
Date Modified | Username | Field | Change |
---|---|---|---|
2015-04-20 23:30 | nwinter | New Issue | |
2015-04-21 14:26 | mfaber | Note Added: 32031 | |
2015-04-21 14:26 | mfaber | Issue Monitored: mfaber | |
2015-04-21 21:01 | nwinter | Note Added: 32034 | |
2015-04-21 21:02 | nwinter | Note Edited: 32034 | |
2015-04-22 00:36 | aesteban | Note Added: 32035 | |
2015-04-22 15:55 | mfaber | Note Added: 32039 | |
2015-04-22 20:09 | nwinter | Note Added: 32040 | |
2015-04-22 21:29 | mfaber | Note Added: 32041 | |
2015-04-22 21:30 | mfaber | Assigned To | => mfaber |
2015-04-22 21:30 | mfaber | Status | new => assigned |
2015-04-22 22:07 | nwinter | Note Added: 32042 | |
2015-04-22 23:15 | mfaber | Note Added: 32043 | |
2015-04-22 23:55 | nwinter | Note Added: 32044 | |
2015-04-23 00:33 | aesteban | Note Added: 32045 | |
2015-04-23 12:16 | DenisChenu | Note Added: 32047 | |
2015-04-23 14:51 | mfaber | Note Added: 32048 | |
2015-04-23 14:53 | DenisChenu | Note Added: 32049 | |
2015-04-24 18:43 | mfaber | File Added: STATAxmlWriter.php | |
2015-04-24 18:44 | mfaber | Note Added: 32055 | |
2015-04-24 19:23 | nwinter | Note Added: 32056 | |
2015-04-26 13:59 | mfaber | Note Added: 32061 | |
2015-04-26 22:57 | nwinter | Note Added: 32064 | |
2015-05-02 22:20 | mfaber | Changeset attached | => LimeSurvey master 1f1f3cc8 |
2015-05-02 22:20 | mfaber | Note Added: 32084 | |
2015-05-02 22:20 | mfaber | Resolution | open => fixed |
2015-05-02 22:44 | mfaber | Changeset attached | => LimeSurvey 2.06 4b0d6170 |
2015-05-02 22:44 | mfaber | Note Added: 32085 | |
2015-05-02 22:48 | mfaber | Status | assigned => resolved |
2015-05-02 22:48 | mfaber | Fixed in Version | => 2.05+ |
2015-05-08 09:16 | c_schmitz | Note Added: 32117 | |
2015-05-08 09:16 | c_schmitz | Status | resolved => closed |
2021-08-02 20:50 | guest | Bug heat | 10 => 12 |