Data Explorer: emptiness in column names #3084

Open
EmilHvitfeldt opened this issue May 9, 2024 · 5 comments
Labels
area: data explorer Issues related to Data Explorer category. bug Something isn't working

Comments

@EmilHvitfeldt

Positron Version:

Positron Version: 2024.05.0 (Universal) build 1157
Code - OSS Version: 1.89.0
Commit: ed7ad00
Date: 2024-05-07T08:14:42.800Z
Electron: 28.2.8
Chromium: 120.0.6099.291
Node.js: 18.18.2
V8: 12.0.267.19-electron.0
OS: Darwin arm64 23.4.0

Steps to reproduce the issue:

  1. Run the following code:
example <- data.frame(1:10, 1:10, 1:10)

names(example) <- c("", "age", "age ")

View(example)

Screenshot 2024-05-09 at 11 13 42 AM

What did you expect to happen?

I don't have a good solution, but I still have nightmares from the time I had a dataset whose column names were padded with spaces.

Were there any error messages in the output or Developer Tools console?

Nope

@EmilHvitfeldt EmilHvitfeldt added the bug Something isn't working label May 9, 2024
@juliasilge juliasilge added the area: data explorer Issues related to Data Explorer category. label May 9, 2024
@EmilHvitfeldt EmilHvitfeldt changed the title Data Viewer: emptiness in column names Data Explorer: emptiness in column names May 9, 2024
@jthomasmock
Contributor

jthomasmock commented May 9, 2024

A) It's kind of good that it "works" right now, but I don't think we are safe in this example.
B) I bet this will break our eventual summary statistics.
C) In pandas, this is also a problem:

image
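
For reference, a minimal pandas sketch of the same three column names used in the R reproduction above. The variable names are illustrative, and this only shows that pandas accepts the names; it says nothing about how the Data Explorer renders them.

```python
import pandas as pd

# Same three column names as the R example: an empty name, "age",
# and "age " with a trailing space.
example = pd.DataFrame(
    {"": range(1, 11), "age": range(1, 11), "age ": range(1, 11)}
)

# repr() makes the empty name and the trailing space visible.
column_names = [repr(name) for name in example.columns]
print(column_names)  # ["''", "'age'", "'age '"]
```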

@jthomasmock
Contributor

jthomasmock commented May 9, 2024

Although, to be fair, an empty string is not really a valid data.frame column name:

> names(example)
[1] ""     "age"  "age "

> example[""]
Error in `[.data.frame`:
! undefined columns selected

> tibble::tibble(example)
Error in `env[[name]] <- x`:
! attempt to use zero-length variable name

But it does work in pandas:

# blank names
import pandas as pd
num_print = pd.DataFrame({'Name':['Ashika', 'Tanu', 'Ashwin', 'Mohit', 'Sourabh'],
        '': [100000000, 100025000, 210000000, 190000000, 0.100000000115151]})

>>> num_print[""]
0    100000000.0
1    100025000.0
2    210000000.0
3    190000000.0
4            0.1
Name: , dtype: float64

It even works for summary statistics:

# blank names: rename both columns to the empty string, so the names are also duplicated
import pandas as pd
num_print = pd.DataFrame({'Name':['Ashika', 'Tanu', 'Ashwin', 'Mohit', 'Sourabh'],
        'age': [100000000, 100025000, 210000000, 190000000, 0.100000000115151]})

num_print.rename(columns = {'age': '', 'Name': ''}, inplace=True)

image
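
One consequence of the rename above worth noting: both columns now share the empty name. A small sketch, assuming the same data as the snippet above and using the non-`inplace` form of `rename` so the original frame is kept, shows how pandas treats the duplicate:

```python
import pandas as pd

num_print = pd.DataFrame(
    {
        "Name": ["Ashika", "Tanu", "Ashwin", "Mohit", "Sourabh"],
        "age": [100000000, 100025000, 210000000, 190000000, 0.100000000115151],
    }
)

# Rename both columns to the empty string, as in the snippet above.
renamed = num_print.rename(columns={"age": "", "Name": ""})

# The column labels are no longer unique.
print(renamed.columns.is_unique)  # False

# Selecting a duplicated label returns every matching column,
# so the result is a DataFrame rather than a Series.
selected = renamed[""]
print(type(selected).__name__)  # DataFrame
```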

@jennybc
Member

jennybc commented May 9, 2024

Several years ago I spent a lot of time ruminating on names in the R world, most especially in the context of data.frames. I'll link the write-up of where all of that ended up, in case ideas or vocabulary are helpful in working out what we're going to support here:

https://design.tidyverse.org/names.html
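
To make the vocabulary concrete, here is a rough Python sketch of the kind of "unique" name repair discussed at that link: empty or duplicated names get a position-based `...j` suffix. `repair_names` is a hypothetical helper written for illustration only. It is not Positron's or vctrs's implementation, and it skips details a real version would need, such as handling names that already end in `...j`.

```python
def repair_names(names):
    """Hypothetical helper: return non-empty, de-duplicated display names."""
    # Count how often each name occurs so duplicates can be detected.
    counts = {}
    for name in names:
        counts[name] = counts.get(name, 0) + 1

    repaired = []
    for position, name in enumerate(names, start=1):
        if name == "" or counts[name] > 1:
            # Empty or duplicated: append a suffix based on the 1-based position.
            repaired.append(f"{name}...{position}")
        else:
            repaired.append(name)
    return repaired


print(repair_names(["", "age", "age "]))  # ['...1', 'age', 'age ']
```

Note that `"age"` and `"age "` are left alone because they are technically distinct; the trailing-whitespace case needs separate handling.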

@jthomasmock
Contributor

@EmilHvitfeldt I split off your other example of leading/trailing whitespace into a separate issue: #3089

@wesm
Contributor

wesm commented Dec 6, 2024

A strawman solution has been started here; it can be refined into a final strategy once we gather feedback and ideas: #5653

@wesm wesm modified the milestones: Future, 2025.01.0 Pre-Release Dec 6, 2024