Distributed support (rework) #996

Draft
wants to merge 21 commits into
base: master
21 commits
6bd297b
Merge remote-tracking branch 'upstream/master'
lrzpellegrini Apr 8, 2022
55a5480
Reworking distributed support (WIP).
lrzpellegrini Apr 12, 2022
b0ce2e3
Working strategy composition and example (naive, replay, scheduler).
lrzpellegrini Apr 21, 2022
976e5c5
Fixed pep8 issues.
lrzpellegrini Apr 22, 2022
efb7f86
Fixed typing error. Removed debug code.
lrzpellegrini Apr 22, 2022
e13f067
Merge remote-tracking branch 'upstream/master' into distributed_suppo…
lrzpellegrini Apr 22, 2022
3017aeb
Removed debug prints.
lrzpellegrini Apr 22, 2022
f8882d7
Implemented lazy creation of the default logger.
lrzpellegrini Apr 29, 2022
8571b91
[Distributed] Simplified internal API and example. Added in-code guide.
lrzpellegrini Apr 29, 2022
b752568
Added support for general use_local in strategies.
lrzpellegrini Apr 29, 2022
f5eaf96
Merge remote-tracking branch 'upstream/master' into distributed_suppo…
lrzpellegrini Apr 29, 2022
b13cc9b
Merge remote-tracking branch 'upstream/master' into distributed_suppo…
lrzpellegrini Jul 19, 2022
d1b9d28
Add type hints to _make_data_loader. Fix distributed training example.
lrzpellegrini Jul 19, 2022
f104a0e
Partial merge remote-tracking branch 'upstream/master' into distribut…
lrzpellegrini Nov 10, 2022
88f75a9
Integrated distributed training with RNGManager, new collate system. …
lrzpellegrini Nov 22, 2022
1717b8d
Improved management of dataloader arguments in strategies. Improved d…
lrzpellegrini Nov 23, 2022
da5c58c
Improved distributed strategy unit tests. Fixed PEP8 issues.
lrzpellegrini Nov 23, 2022
cdcd8c4
Aligned environment update action content.
lrzpellegrini Nov 23, 2022
2a93ad8
Fix multitask issues. Improve distributed training support and tests.
lrzpellegrini Dec 11, 2022
1174f33
Added additional unit tests. Issue with all_gather to be fixed.
lrzpellegrini Jan 10, 2023
6a3dd1f
Tests for DistributedHelper. Distributed support field in plugins.
lrzpellegrini Jan 16, 2023
6 changes: 5 additions & 1 deletion .github/workflows/environment-update.yml
@@ -56,7 +56,11 @@ jobs:
         id: unittest
         shell: bash -l -c "conda run -n avalanche-env --no-capture-output bash {0}"
         run: |
-          python -m unittest discover tests
+          python -m unittest discover tests &&
+          echo "Running checkpointing tests..." &&
+          bash ./tests/checkpointing/test_checkpointing.sh &&
+          echo "Running distributed training tests..." &&
+          python ./tests/run_dist_tests.py
       - name: checkout avalanche-docker repo
         if: always()
         uses: actions/checkout@v3
2 changes: 2 additions & 0 deletions .github/workflows/unit-test.yml
@@ -55,6 +55,8 @@ jobs:
           python -m unittest discover tests &&
           echo "Running checkpointing tests..." &&
           bash ./tests/checkpointing/test_checkpointing.sh &&
+          echo "Running distributed training tests..." &&
+          python ./tests/run_dist_tests.py &&
           echo "While running unit tests, the following datasets were downloaded:" &&
           ls ~/.avalanche/data
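
The `&&` chain in the workflow step above means each command runs only if the previous one exited with status 0, so a failure in the unit tests skips the checkpointing and distributed tests. A minimal Python sketch of the same short-circuit behaviour follows. It uses placeholder commands rather than the real test scripts, and the `run_chain` helper is hypothetical (it is not part of Avalanche or of this PR).

```python
import subprocess
import sys
from typing import List, Sequence


def run_chain(commands: Sequence[List[str]]) -> int:
    """Run commands in order, stopping at the first non-zero exit code.

    Mirrors `cmd1 && cmd2 && ...` in bash. Returns the exit code of the
    last command that was actually executed (0 for an empty chain).
    """
    code = 0
    for cmd in commands:
        code = subprocess.run(cmd).returncode
        if code != 0:
            break  # later commands are skipped, as with `&&`
    return code


if __name__ == "__main__":
    # Placeholder commands (assumption): in CI these would be the unittest,
    # checkpointing and distributed test invocations shown in the diff.
    ok = [sys.executable, "-c", "print('step ok')"]
    fail = [sys.executable, "-c", "import sys; sys.exit(3)"]
    # The third command never runs because the second one fails.
    sys.exit(run_chain([ok, fail, ok]))
```

Keeping the distributed tests last in the chain means they only consume CI time when the cheaper test stages have already passed.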

15 changes: 6 additions & 9 deletions avalanche/benchmarks/scenarios/classification_scenario.py
@@ -1,5 +1,6 @@
 import copy
 import re
+import warnings
 from abc import ABC
 from typing import (
     Generic,
@@ -18,10 +19,8 @@
     Mapping,
 )

-from typing_extensions import Protocol
-
-import warnings
 from torch.utils.data.dataset import Dataset
+from typing_extensions import Protocol

 from avalanche.benchmarks.scenarios.generic_definitions import (
     TCLExperience,
@@ -32,10 +31,8 @@
 from avalanche.benchmarks.scenarios.lazy_dataset_sequence import (
     LazyDatasetSequence,
 )
-from avalanche.benchmarks.utils import make_classification_dataset
-from avalanche.benchmarks.utils.classification_dataset import (
-    ClassificationDataset,
-)
+from avalanche.benchmarks.utils import \
+    make_classification_dataset, AvalancheDataset
 from avalanche.benchmarks.utils.dataset_utils import manage_advanced_indexing

 TGenericCLClassificationScenario = TypeVar(
@@ -494,7 +491,7 @@ def _check_and_adapt_user_stream_def(
             # exp_data[0] must contain the generator
             stream_length = exp_data[1]
             is_lazy = True
-        elif isinstance(exp_data, ClassificationDataset):
+        elif isinstance(exp_data, AvalancheDataset):
             # Single element
             exp_data = [exp_data]
             is_lazy = False
@@ -506,7 +503,7 @@ def _check_and_adapt_user_stream_def(

         if not is_lazy:
             for i, dataset in enumerate(exp_data):
-                if not isinstance(dataset, ClassificationDataset):
+                if not isinstance(dataset, AvalancheDataset):
                     raise ValueError(
                         "All experience datasets must be subclasses of"
                         " AvalancheDataset"
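
The two `isinstance` changes above widen the accepted type from `ClassificationDataset` to `AvalancheDataset`, which also brings the check in line with the existing error message. A minimal sketch of the effect follows. It uses empty stand-in classes rather than the real Avalanche types. `RegressionDataset` and `check_experience_datasets` are hypothetical names, and the subclass relationship between `ClassificationDataset` and `AvalancheDataset` is an assumption made for illustration.

```python
from typing import Sequence


class AvalancheDataset:
    """Stand-in for the Avalanche base dataset class (not the real one)."""


class ClassificationDataset(AvalancheDataset):
    """Stand-in; assumed here to derive from AvalancheDataset."""


class RegressionDataset(AvalancheDataset):
    """Hypothetical non-classification dataset, for illustration only."""


def check_experience_datasets(datasets: Sequence[object]) -> None:
    """Raise ValueError unless every element is an AvalancheDataset."""
    for dataset in datasets:
        if not isinstance(dataset, AvalancheDataset):
            raise ValueError(
                "All experience datasets must be subclasses of"
                " AvalancheDataset"
            )


# Both kinds of dataset pass the widened check.
check_experience_datasets([ClassificationDataset(), RegressionDataset()])

# A plain list is still rejected.
try:
    check_experience_datasets([[1, 2, 3]])
except ValueError as err:
    print(err)
```

Under the previous check, an instance such as `RegressionDataset()` would have been rejected even though it derives from the base class named in the error message.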