Description
Hi
When training FedAS on Tiny-Imagenet with 20 clients and Dirichlet(α=0.1) data partitioning, I encountered NaN values in the FIM trace after 182 rounds. This eventually caused a ValueError in roc_auc_score during evaluation of the global model.
Setup
Dataset: Tiny-Imagenet
Clients: 20
Data distribution: Dirichlet(α=0.1)
Algorithm: FedAS
Rounds: 300
-------------Round number: 182-------------
Evaluate global model
Traceback (most recent call last):
  File "/home/infor/Code/Master-thesis/system/main.py", line 554, in <module>
    run(args)
  File "/home/infor/Code/Master-thesis/system/main.py", line 367, in run
    server.train()
  File "/home/infor/Code/Master-thesis/system/flcore/servers/serveras.py", line 72, in train
    self.evaluate()
  File "/home/infor/Code/Master-thesis/system/flcore/servers/serverbase.py", line 229, in evaluate
    stats = self.test_metrics()
            ^^^^^^^^^^^^^^^^^^^
  File "/home/infor/Code/Master-thesis/system/flcore/servers/serverbase.py", line 203, in test_metrics
    ct, ns, auc = c.test_metrics()
                  ^^^^^^^^^^^^^^^^
  File "/home/infor/Code/Master-thesis/system/flcore/clients/clientbase.py", line 118, in test_metrics
    auc = metrics.roc_auc_score(y_true, y_prob, average='micro')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/infor/miniconda3/envs/PFL/lib/python3.11/site-packages/sklearn/utils/_param_validation.py", line 218, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/infor/miniconda3/envs/PFL/lib/python3.11/site-packages/sklearn/metrics/_ranking.py", line 665, in roc_auc_score
    y_score = check_array(y_score, ensure_2d=False)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/infor/miniconda3/envs/PFL/lib/python3.11/site-packages/sklearn/utils/validation.py", line 1105, in check_array
    _assert_all_finite(
  File "/home/infor/miniconda3/envs/PFL/lib/python3.11/site-packages/sklearn/utils/validation.py", line 120, in _assert_all_finite
    _assert_all_finite_element_wise(
  File "/home/infor/miniconda3/envs/PFL/lib/python3.11/site-packages/sklearn/utils/validation.py", line 169, in _assert_all_finite_element_wise
    raise ValueError(msg_err)
ValueError: Input contains NaN.
Additional information
Let me know if more logs or config files are needed.
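To narrow down where the values first become non-finite, a small diagnostic like the one below could be called once per round, before roc_auc_score is reached. This is only a sketch: the helper names `first_non_finite` and `check_round` are hypothetical and not part of the repository, and it assumes the FIM trace is available as a single scalar per client and that `y_prob` can be flattened into a plain list (for a NumPy array, e.g. `y_prob.ravel().tolist()`).

```python
import math
from typing import Iterable, Optional


def first_non_finite(values: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN/inf entry, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i
    return None


def check_round(round_idx: int, fim_trace: float, y_prob_flat: Iterable[float]) -> list[str]:
    """Collect human-readable warnings for one evaluation round.

    Hypothetical helper: `fim_trace` is assumed to be a scalar and
    `y_prob_flat` a flattened sequence of predicted probabilities.
    """
    problems = []
    if not math.isfinite(fim_trace):
        problems.append(f"round {round_idx}: FIM trace is {fim_trace}")
    bad = first_non_finite(y_prob_flat)
    if bad is not None:
        problems.append(
            f"round {round_idx}: y_prob has a non-finite value at flat index {bad}"
        )
    return problems


if __name__ == "__main__":
    # Toy values only, not taken from the actual run.
    for msg in check_round(182, float("nan"), [0.2, 0.8]):
        print(msg)
```

Logging the result every round would show whether the FIM trace turns NaN before the predicted probabilities do, which would help separate a problem in the FedAS update from one in the evaluation path.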