Wrong number of bytes written for parquet column @ lib/ParquetOutFile.cpp:321 #146

@YipengUva

Description

I have a function that converts large SAS datasets (e.g., over 5 GB) to Parquet format in chunks. It uses nanoparquet::write_parquet to write the first chunk and nanoparquet::append_parquet to append the remaining chunks. After appending about 10 chunks, I get this error message: "Wrong number of bytes written for parquet column @ lib/ParquetOutFile.cpp:321".

My sessionInfo() output is:
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Red Hat Enterprise Linux 8.10 (Ootpa)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.15.so; LAPACK version 3.9.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Edmonton
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
[1] dplyr_1.1.4 nanoparquet_0.4.2
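
For reference, the write-then-append pattern described above can be sketched roughly as follows. This is not the reporter's actual code: the helper `write_in_chunks`, the toy data frame, and the chunk size are all hypothetical, and an in-memory data frame stands in for the SAS source. It assumes a non-empty data frame and that every chunk has the same column types.

```r
library(nanoparquet)

# Hypothetical helper (not from the report): write a non-empty data frame
# to `path` in chunks of `chunk_rows` rows.
write_in_chunks <- function(df, path, chunk_rows) {
  starts <- seq(1, nrow(df), by = chunk_rows)
  for (i in seq_along(starts)) {
    last <- min(starts[i] + chunk_rows - 1, nrow(df))
    chunk <- df[starts[i]:last, , drop = FALSE]
    if (i == 1) {
      # The first chunk creates the file.
      write_parquet(chunk, path)
    } else {
      # Later chunks are appended to the existing file.
      append_parquet(chunk, path)
    }
  }
  invisible(path)
}

# Toy stand-in for the SAS data: 25 rows written as chunks of 10, 10 and 5.
df <- data.frame(id = 1:25, value = as.numeric(1:25) / 2)
path <- tempfile(fileext = ".parquet")
write_in_chunks(df, path, chunk_rows = 10)
result <- read_parquet(path)
```

A small sketch like this is not expected to trigger the error by itself; it only shows the call sequence, so that a failing case can be narrowed down by swapping in the real chunks.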
