Segfault when reading an empty Parquet file #138

@thisisnic

With the dev version of nanoparquet, reading an empty Parquet file segfaults. To reproduce:

tf <- tempfile()
arrow::write_parquet(data.frame(), tf)
nanoparquet::read_parquet(tf)

I ran it with gdb attached; the output is shown below.

(gdb) run
Starting program: /usr/lib/R/bin/exec/R --no-save --no-restore-data --quiet
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching after vfork from child process 10261]
[Detaching after vfork from child process 10263]
> devtools::load_all()
[New Thread 0x7ffff31ff640 (LWP 10275)]
[Detaching after vfork from child process 10281]
[Detaching after vfork from child process 10283]
[Detaching after vfork from child process 10286]
ℹ Loading nanoparquet
> library(arrow)
[New Thread 0x7fffe95ff640 (LWP 10298)]
[Detaching after vfork from child process 10299]

Attaching package: ‘arrow’

The following objects are masked from ‘package:nanoparquet’:

    read_parquet, write_parquet

The following object is masked from ‘package:testthat’:

    matches

The following object is masked from ‘package:utils’:

    timestamp

> tf <- tempfile()
> arrow::write_parquet(data.frame(), tf)
> read_parquet(tf)
[New Thread 0x7fffe8dfe640 (LWP 10321)]
[New Thread 0x7fffe3fff640 (LWP 10322)]
data frame with 0 columns and 0 rows
> nanoparquet::read_parquet(tf)

Thread 1 "R" received signal SIGSEGV, Segmentation fault.
0x00007ffff7c3b8f5 in Rf_type2char () from /usr/lib/R/lib/libR.so
