Pull requests: NVIDIA/NeMo-Curator
Pull requests list
Clean up internal column logic in _run_classifier_helper function
gpuci
Run GPU CI/CD on PR
#457
opened Dec 23, 2024 by
sarahyurick
•
Draft
Update get_all_files_paths_under examples to include keep_extensions
#450
opened Dec 20, 2024 by
sarahyurick
Support the new minhash 25.02 api
gpuci
Run GPU CI/CD on PR
#445
opened Dec 20, 2024 by
praateekmahajan
3 tasks
[WIP] Add RAPIDS Nightly to GPU CI
gpuci
Run GPU CI/CD on PR
#436
opened Dec 17, 2024 by
praateekmahajan
•
Draft
3 tasks
Bump nltk from 3.8.1 to 3.9 in /tutorials/dapt-curation/code
dependencies
Pull requests that update a dependency file
#429
opened Dec 13, 2024 by
dependabot
bot
Create notebook tutorials for distributed data classifiers
documentation
Improvements or additions to documentation
#415
opened Dec 6, 2024 by
sarahyurick
3 tasks done
Create separate files for each deduplication class
gpuci
Run GPU CI/CD on PR
#409
opened Dec 3, 2024 by
sarahyurick
[WIP] Efficient Exact Duplicate Removal Code
#404
opened Dec 2, 2024 by
praateekmahajan
•
Draft
3 tasks
Fix GPU error messages for fuzzy deduplication
gpuci
Run GPU CI/CD on PR
#387
opened Nov 22, 2024 by
sarahyurick
2 tasks done
Fuzzy Dedup: Make skipping the False positive check the default
enhancement
New feature or request
gpuci
Run GPU CI/CD on PR
#386
opened Nov 21, 2024 by
ayushdg
2 of 3 tasks
Remove max_text_bytes_per_part
gpuci
Run GPU CI/CD on PR
#385
opened Nov 20, 2024 by
sarahyurick
Global cache_dir variable for exact, fuzzy, and semantic deduplication
gpuci
Run GPU CI/CD on PR
#384
opened Nov 19, 2024 by
sarahyurick
3 tasks done
Convert translation_example.py into a Jupyter Notebook tutorial
#336
opened Oct 29, 2024 by
sarahyurick
•
Draft
Added example notebook for translation with ct2 model.
documentation
Improvements or additions to documentation
Adding an example for executing NeMo modules using kubernetes Python …
documentation
Improvements or additions to documentation
#148
opened Jul 9, 2024 by
dpadmanabhan03
2 of 3 tasks