Open
Description
Describe the bug
I'm not sure if this is a bug.
I see that a FeatureType object has an attribute called self.dtype, but this attribute is not defined when the feature is a Sequence or a List.
When I try to run a multilabel classification with this example script from the transformers library:
I get this error on the linked line:
AttributeError: 'List' object has no attribute 'dtype'. Did you mean: '_type'?
Looking at the check that the script is attempting to perform, could we perhaps add a self.dtype = "list" attribute for list-like feature types such as Sequence and List?
Steps to reproduce the bug
For example, this code works for me:
from datasets import ClassLabel, Features, Sequence, Value
features = {'text': Value('string'), 'label': ClassLabel(names=['No', 'Yes'])}
print(features["text"].dtype)
print(features["label"].dtype)
Output:
'string'
'int64'
and this code does not work for me:
from datasets import ClassLabel, Features, Sequence, Value
features = {'text': Value('string'), 'label': Sequence(ClassLabel(names=['No', 'Yes']))}
print(features["label"].dtype)  # it could be equal to "list"?
Output:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'List' object has no attribute 'dtype'. Did you mean: '_type'?
Expected behavior
The dtype attribute should be equal to "list" for objects of type Sequence.
from datasets import ClassLabel, Features, Sequence, Value
features = {'text': Value('string'), 'label': Sequence(ClassLabel(names=['No', 'Yes']))}
print(features["label"].dtype)
Expected output:
'list'
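Until something like this exists, a caller-side workaround might look like the sketch below. It does not change datasets itself. The helper name dtype_or_default is made up for illustration, and FakeValue and FakeList are hypothetical stand-ins for Value and List so that the snippet runs on its own; with the real library, the helper would be called on the feature objects directly.

```python
def dtype_or_default(feature, default="list"):
    """Return feature.dtype if the attribute exists, otherwise a fallback label.

    Hypothetical helper, not part of datasets: it only guards the attribute
    access that currently raises AttributeError for list-like features.
    """
    return getattr(feature, "dtype", default)


# Hypothetical stand-ins so the sketch is self-contained.
class FakeValue:
    """Mimics a feature that exposes a dtype attribute."""

    def __init__(self, dtype):
        self.dtype = dtype


class FakeList:
    """Mimics a list-like feature that has no dtype attribute."""

    def __init__(self, feature):
        self.feature = feature


features = {
    "text": FakeValue("string"),
    "label": FakeList(FakeValue("int64")),
}

text_dtype = dtype_or_default(features["text"])
label_dtype = dtype_or_default(features["label"])
print(text_dtype)
print(label_dtype)
```

The fallback value "list" just mirrors the suggestion above; a script that needs to distinguish Sequence from other dtype-less features would probably want an isinstance check instead.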
Environment info
I have installed datasets==4.5.0.