Description
I have a function that converts large SAS datasets (e.g., over 5 GB) to Parquet format in chunks. It writes the first chunk with nanoparquet::write_parquet and appends the remaining chunks with nanoparquet::append_parquet. After appending about 10 chunks, I get this error: "Wrong number of bytes written for parquet column @ lib/ParquetOutFile.cpp:321".
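For context, the write-then-append pattern described above might look roughly like the sketch below. This is not the actual conversion function from this report: the data frame `df`, the `chunk_size` value, and the output path are hypothetical stand-ins (the real input is a SAS file read in chunks), and this small example is not expected to trigger the error.

```r
library(nanoparquet)

# Hypothetical stand-in for the SAS data; the real input is not shown here.
df <- data.frame(id = 1:1000, value = as.numeric(1:1000))

chunk_size <- 100L                          # assumed chunk size, illustration only
out_file <- tempfile(fileext = ".parquet")  # assumed output location

# First row index of every chunk.
starts <- seq(1L, nrow(df), by = chunk_size)

for (i in seq_along(starts)) {
  rows <- starts[i]:min(starts[i] + chunk_size - 1L, nrow(df))
  chunk <- df[rows, , drop = FALSE]
  rownames(chunk) <- NULL

  if (i == 1L) {
    # The first chunk creates the file.
    nanoparquet::write_parquet(chunk, out_file)
  } else {
    # Every later chunk is appended to the existing file.
    nanoparquet::append_parquet(chunk, out_file)
  }
}

# Read the file back to check that all chunks were written.
result <- nanoparquet::read_parquet(out_file)
```

In the real workflow each chunk would come from the SAS reader rather than from slicing an in-memory data frame, so the column types of each chunk may differ from this toy example.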
My sessionInfo() output is:
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Red Hat Enterprise Linux 8.10 (Ootpa)
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.15.so; LAPACK version 3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Edmonton
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] dplyr_1.1.4 nanoparquet_0.4.2