pytorch / FBGEMM Public

Pull requests: pytorch/FBGEMM

Labels 19 Milestones 0

356 Open 2,812 Closed

Pull requests list

Improve FP8 BMM heuristic for large shapes and MoE E2E performance cla signed fb-exported

#3344 opened Nov 8, 2024 by jiawenliu64

Add new optimizer state row_counter for Adam [Backend] cla signed fb-exported

#3342 opened Nov 8, 2024 by spcyppt

Remove unused-variable in /fbcode/deeplearning/fbgemm/fbgemm_gpu/src/ssd_split_embeddings_cache/kv_db_table_batched_embeddings.cpp cla signed fb-exported

#3335 opened Nov 6, 2024 by r-barnes

Fix global namespace pollution in ATen/Dispatch.h cla signed fb-exported

#3334 opened Nov 6, 2024 by slyfox3

open-source SLL jagged_dense_bmm cla signed fb-exported

#3331 opened Nov 5, 2024 by brad-mengchi

Remove unused-variable in some generated code cla signed fb-exported

#3327 opened Nov 5, 2024 by r-barnes

Add support for int32_t indices in TBE training (2/N) cla signed fb-exported

#3326 opened Nov 5, 2024 by q10

Add template info into generated files cla signed fb-exported

#3325 opened Nov 5, 2024 by q10

- Kernel support for multiple buckets per rank cla signed fb-exported

#3323 opened Nov 5, 2024 by dstaay-fb

Add support for int32_t indices cla signed fb-exported

#3319 opened Nov 3, 2024 by q10

Update benchmark test for Int32_t Indicies cla signed fb-exported

#3317 opened Nov 3, 2024 by q10

Unitifed Prefetching API for CPU TBE cla signed fb-exported

#3314 opened Nov 2, 2024 by excelle08

Allow registering SSD prefetcher after initiliaztion of the TBE module cla signed fb-exported

#3313 opened Nov 2, 2024 by excelle08

Store the prefetching results in a queue in SSD prefetcher, retrieve later in .forward() method cla signed fb-exported

#3312 opened Nov 2, 2024 by excelle08

Add manual loop unroll for rocm devices in fwd pass cla signed module: rocm

#3309 opened Nov 1, 2024 by avbokovoy

Do not call scalar_type cla signed fb-exported

#3308 opened Nov 1, 2024 by malfet

Clean up sparse bucketize cla signed fb-exported

#3302 opened Oct 31, 2024 by q10

Reorganize sparse block bucketize macros cla signed fb-exported

#3296 opened Oct 30, 2024 by q10

Refactor repeat code in sparse_block_bucketize cla signed fb-exported

#3295 opened Oct 30, 2024 by q10

Add large my_size support in _block_bucketize_pooled_sparse_features_cuda_kernel2 cla signed fb-exported

#3294 opened Oct 30, 2024 by sryap

Add feature knobs to FBGEMM cla signed fb-exported

#3290 opened Oct 29, 2024 by q10

Use c10::irange in deeplearning/fbgemm/BUCK +10 cla signed fb-exported

#3288 opened Oct 29, 2024 by q10

Amd 2 cla signed fb-exported module: rocm

#3279 opened Oct 25, 2024 by jianyuh • Draft

repo cla signed fb-exported

#3274 opened Oct 24, 2024 by laithsakka

re-warmup cla signed fb-exported

#3271 opened Oct 23, 2024 by minhua-chen
