task: text_classification
base_model: google-bert/bert-base-multilingual-uncased
project_name: products-to-categories-finetuned
log: tensorboard
backend: local

data:
  path: data/
  train_split: train # this must be either train.csv or train.json
  valid_split: validate # this must be either validate.csv or validate.json
  column_mapping:
    text_column: name # this must be the name of the column containing the text
    target_column: category_id # this must be the name of the column containing the target

params: # Default values...
  max_seq_length: 512
  epochs: 3
  batch_size: 4
  lr: 2e-5
  optimizer: adamw_torch
  scheduler: linear
  gradient_accumulation: 1
  # mixed_precision: fp16

hub:
  username: ${HF_USERNAME}
  token: ${HF_TOKEN}
  push_to_hub: false
Error Logs
INFO | 2024-10-01 14:22:42 | __main__:train:70 - loading dataset from disk
ERROR | 2024-10-01 14:22:42 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
  File "/Users/anthony/.pyenv/versions/3.11.2/lib/python3.11/site-packages/autotrain/trainers/common.py", line 117, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anthony/.pyenv/versions/3.11.2/lib/python3.11/site-packages/autotrain/trainers/text_classification/__main__.py", line 98, in train
    raise ValueError(
ValueError: Number of classes in train and valid are not the same. Training has 1936 and valid has 1064
ERROR | 2024-10-01 14:22:42 | autotrain.trainers.common:wrapper:121 - Number of classes in train and valid are not the same. Training has 1936 and valid has 1064
Additional Information
Replacing the check with if num_classes_valid > num_classes: (or removing it, because a previous check makes sure that there are no classes in the validation data that are not in the training data) does not seem to cause any additional issues.
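The relaxed check described above could look roughly like this. It is a minimal sketch, not the actual AutoTrain code: the function name `check_label_sets` and its arguments are hypothetical, and it only assumes that the labels of each split are available as plain Python iterables.

```python
def check_label_sets(train_labels, valid_labels):
    """Allow the validation classes to be a subset of the training classes.

    Hypothetical helper for illustration only; not part of AutoTrain.
    """
    train_classes = set(train_labels)
    valid_classes = set(valid_labels)

    # Reject only labels the model never saw during training.
    unseen = valid_classes - train_classes
    if unseen:
        raise ValueError(
            f"Validation data contains {len(unseen)} class(es) "
            f"not present in training data: {sorted(unseen)}"
        )

    # Fewer classes in validation than in training is accepted.
    return len(train_classes), len(valid_classes)
```

With this shape, a split like the one in the logs (1936 training classes, 1064 validation classes) would pass as long as every validation class also appears in the training data.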
Is there a reason for this check?
Is it possible to make this change permanent?
Thank you!
We cannot calculate metrics if the validation classes are not the same as the training classes.
Are your validation classes a subset of the training classes, and you are still getting this error?
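One way to answer that question is to compare the label sets of the two splits directly. The following is a minimal sketch, assuming both splits are CSV files with a `category_id` target column as in the config; the helper names `read_classes` and `compare_classes` are hypothetical and not part of AutoTrain.

```python
import csv
import io


def read_classes(csv_file, target_column="category_id"):
    """Return the set of distinct labels found in `target_column`."""
    return {row[target_column] for row in csv.DictReader(csv_file)}


def compare_classes(train_file, valid_file, target_column="category_id"):
    """Report how the validation label set relates to the training label set."""
    train_classes = read_classes(train_file, target_column)
    valid_classes = read_classes(valid_file, target_column)
    return {
        "valid_is_subset": valid_classes <= train_classes,
        "only_in_valid": sorted(valid_classes - train_classes),
        "only_in_train": sorted(train_classes - valid_classes),
    }


# Tiny in-memory demo with made-up rows; real data would come from files.
train_csv = io.StringIO("name,category_id\napple,1\nbanana,2\ncarrot,3\n")
valid_csv = io.StringIO("name,category_id\npear,1\nplum,2\n")
report = compare_classes(train_csv, valid_csv)
print(report)
```

For the real splits, the same function could be called with `open("data/train.csv", newline="")` and `open("data/validate.csv", newline="")`. An empty `only_in_valid` list means the validation classes are a subset of the training classes.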
Prerequisites
Backend
Local
Interface Used
CLI
CLI Command
autotrain --config training.yml