I have users who cannot or do not want to connect to relational databases, and instead prefer to work with data exported to Excel files. The recordsets exported from these databases can become rather large. (I also export to CSV files.)
My question is related to this one: Handling java.lang.OutOfMemoryError when writing to Excel from R.
As recommended in the accepted answer to that question (or rather its first comment), I now use the Rcpp-based openxlsx package to export some views from the database. It works when the export has ~67000 rows, but it fails for larger datasets (~1 million rows, ~20 parameters, all numeric except a few datetimes).
openxlsx::write.xlsx(data, file = "data.2008-2016.xlsx") # 800000 rows
Error: zipping up workbook failed. Please make sure Rtools is installed or a zip application is available to R.
Try installr::install.rtools() on Windows
(I'm using a Linux PC, and /usr/bin/zip is available to R)
Can I give the openxlsx package more memory? Or set some tuneable options to perform better with large datasets?
For openxlsx, is there something like the options(java.parameters = "-Xmx1000m") setting for the Java-based xlsx package?
The openxlsx vignette does not mention any options. But maybe there are some undocumented ways or options? (e.g. showing a progress bar during saving)
At this point I proceed like this: close all unneeded apps, restart RStudio, keep few or no large objects in the global environment, query the database, then run write.xlsx(). With a "clean slate" like this, it succeeded in exporting the 800000-row dataset to a 93 MB xlsx file.
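The clean-slate routine described above could be sketched roughly as follows. The data frame `data` is a small made-up stand-in for the real query result and the output path is a temporary file; both are assumptions for illustration. The sketch also assumes openxlsx is installed and skips the write otherwise.

```r
# Stand-in for the query result (hypothetical data; the real export has ~800000 rows).
data <- data.frame(
  time  = as.POSIXct("2008-01-01", tz = "UTC") + 3600 * (0:999),
  value = rnorm(1000)
)

# Drop every other object in the current environment, then trigger garbage collection.
# Careful: this deletes everything except `data`.
rm(list = setdiff(ls(), "data"))
invisible(gc())

# Write to a temporary file so the sketch does not touch the working directory.
out_file <- tempfile(fileext = ".xlsx")
if (requireNamespace("openxlsx", quietly = TRUE)) {
  openxlsx::write.xlsx(data, file = out_file)
}
```

This only frees memory held by R objects; it does not change how much memory openxlsx needs while building the workbook.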
Your problem isn't the memory. openxlsx requires Rtools or a similar zip utility to be installed in order to save larger Excel files.
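Since the error message is about zipping, it may help to check which zip program R can actually see before writing. The following is a minimal sketch using base R only. It assumes that the installed openxlsx version zips through `utils::zip()`, which takes its command from the `R_ZIPCMD` environment variable; that may not hold for every release.

```r
# Where is a zip executable on the PATH that R sees? Empty string if none is found.
zip_path <- unname(Sys.which("zip"))

# utils::zip() uses the command named in R_ZIPCMD, falling back to plain "zip".
# If R_ZIPCMD is unset but a zip binary was found, point R at it explicitly.
if (!nzchar(Sys.getenv("R_ZIPCMD")) && nzchar(zip_path)) {
  Sys.setenv(R_ZIPCMD = zip_path)
}

message("zip on PATH: ", zip_path, "; R_ZIPCMD: ", Sys.getenv("R_ZIPCMD"))
```

If both values come back empty, installing a zip utility (Rtools on Windows) would be the first thing to try.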
I had the same problem, with the same error you're seeing, just yesterday. Below is a link to the Windows installer:
https://cran.r-project.org/bin/windows/Rtools/index.html
The following site further explains the requirements:
https://www.r-project.org/nosvn/pandoc/openxlsx.html