Introduce functionality for chunking experiments #10

mcw92 · 2024-10-24T11:32:03Z

This PR introduces functionality for chunking experiments. The following changes have been made:

Make synthetic data generation consistent throughout the code. In the serial case, the dataset generated with generate_and_distribute_synthetic_dataset without local or global imbalances now equals the completely balanced dataset generated with make_classification_dataset when both use the same random state. This keeps the strong scaling experiment series with and without chunking comparable, since the same random state produces the same dataset.
Fix passing additional keyword arguments in both train_parallel_on_synthetic_data and train_parallel_on_balanced_synthetic_data. This was missing entirely in the former. In addition, the argument parser lacked some of the keyword arguments of sklearn's make_classification and train_test_split, which are used under the hood.
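
The two changes above can be illustrated with a minimal sketch. It assumes nothing about the actual signatures in specialcouscous: make_dataset is a hypothetical helper written only for this example, and it calls sklearn's make_classification and train_test_split directly. It shows that forwarding extra keyword arguments and reusing the same random state yields identical datasets across calls.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


def make_dataset(
    n_samples,
    n_features,
    n_classes,
    random_state,
    make_classification_kwargs=None,
    train_test_split_kwargs=None,
):
    """Hypothetical helper (not part of specialcouscous).

    Generates a synthetic classification dataset and splits it into train
    and test sets, forwarding any additional keyword arguments to sklearn.
    """
    make_classification_kwargs = make_classification_kwargs or {}
    train_test_split_kwargs = train_test_split_kwargs or {}
    x, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_classes=n_classes,
        random_state=random_state,
        **make_classification_kwargs,
    )
    # Returns [x_train, x_test, y_train, y_test].
    return train_test_split(
        x, y, random_state=random_state, **train_test_split_kwargs
    )


# Two calls with the same random state and the same extra kwargs.
first = make_dataset(100, 10, 2, 0, {"n_informative": 5}, {"test_size": 0.2})
second = make_dataset(100, 10, 2, 0, {"n_informative": 5}, {"test_size": 0.2})

# Every returned array matches element by element.
same = all(np.array_equal(a, b) for a, b in zip(first, second))
print("identical datasets:", same)
```

If the extra keyword arguments were silently dropped in one code path, as described for the former train function, the two paths would generate different datasets even with the same random state.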


github-actions · 2024-10-24T11:39:18Z

Name	Stmts	Miss	Cover	Missing
specialcouscous/__init__.py	0	0	100%
specialcouscous/rf_parallel.py	119	9	92%	89-93, 194, 198, 271-273, 447
specialcouscous/synthetic_classification_data.py	215	49	77%	88-90, 185, 304-324, 358, 469, 471, 561-567, 585, 871-885, 1095-1151, 1224-1246
specialcouscous/train.py	260	1	99%	525
specialcouscous/utils/__init__.py	61	33	46%	31, 81-82, 106-287
specialcouscous/utils/plot.py	136	74	46%	152, 277-302, 319-405, 421-547
specialcouscous/utils/result_handling.py	22	1	95%	79
specialcouscous/utils/slurm.py	79	72	9%	22-116, 133-149, 166-177
specialcouscous/utils/timing.py	35	0	100%
TOTAL	927	239	74%

codecov-commenter · 2024-10-24T11:39:26Z

Codecov Report

Attention: Patch coverage is 93.10345% with 2 lines in your changes missing coverage. Please review.

Project coverage is 74.21%. Comparing base (864d122) to head (0e91832).
Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
specialcouscous/utils/__init__.py	0.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #10      +/-   ##
==========================================
+ Coverage   73.62%   74.21%   +0.59%     
==========================================
  Files           8        8              
  Lines         906      927      +21     
==========================================
+ Hits          667      688      +21     
  Misses        239      239


mcw92 added 6 commits September 24, 2024 10:29

add job script generation script for chunking experiment series

c3c760c

Merge branch 'feature/inference_flavor' into feature/chunking

27202a0

update chunking job script generation script with more data seeds

337df8a

make synthetic dataset generation consistent over example scripts and…

ab78953

… different train functions

update target columns

2bdc3ec

merge main into branch

1817cc3

mcw92 requested a review from fluegelk October 24, 2024 11:32

mcw92 self-assigned this Oct 24, 2024

mcw92 added 14 commits October 25, 2024 13:42

fix bug with random state for model

8ca8fa1

add plotting scripts for strong and weak scaling results

dae1b5d

remove testing for MacOS for now

f9f1367

consistify plotting style

d6ce7cb

add efficiency plot

3c90685

add chunking plotting script

1855e69

consistify plotting style with error bars

5034e52

consistify plotting style

b01cc93

consistify plotting style

ae3c1f2

refactor plotting settings

43ed00d

add bash script for all the plotting

1a48ec8

clean up scripts (WIP) and consistify plotting style

06399bc

set memory correctly

6c038ec

rename script

0e91832

mcw92 closed this Dec 9, 2024
