[WIP] Merging AutoSP into DeepSpeed #7860

Draft
neeldani wants to merge 3 commits into deepspeedai:master from neeldani:autosp

neeldani commented Feb 19, 2026

This is a WIP PR to merge AutoSP into DeepSpeed. AutoSP enables Ulysses-style sequence parallelism at the compiler level for transformer-based torch models.

AutoSP's passes operate at the Torch IR level, before the forward/backward graph is partitioned, and are mainly responsible for:
(1) auto-sharding the input sequence, position ids and labels
(2) inserting the all-to-all (a2a) before and after attention for Ulysses-style sequence parallelism
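
To make the two responsibilities above concrete, here is a rough single-process sketch of the data movement, using plain Python lists in place of tensors and collectives. It is not taken from the AutoSP passes: the function names (`shard_sequence`, `a2a_seq_to_head`, `a2a_head_to_seq`) are hypothetical, and it assumes the sequence length and head count both divide evenly by the number of ranks.

```python
# Illustrative only: simulates Ulysses-style resharding on one process.
# A "tensor" is a nested list indexed as x[seq_position][head].

def shard_sequence(x, world_size):
    """Split the full sequence into contiguous per-rank chunks (step 1)."""
    seq_len = len(x)
    assert seq_len % world_size == 0, "sequence must divide evenly"
    chunk = seq_len // world_size
    return [x[r * chunk:(r + 1) * chunk] for r in range(world_size)]


def a2a_seq_to_head(shards):
    """Before attention: [seq/P, heads] per rank -> [seq, heads/P] per rank."""
    world_size = len(shards)
    num_heads = len(shards[0][0])
    assert num_heads % world_size == 0, "heads must divide evenly"
    hc = num_heads // world_size
    out = []
    for dst in range(world_size):
        rows = []
        # Rank `dst` receives its head slice from every source rank,
        # concatenated in rank order so the full sequence is rebuilt.
        for src in range(world_size):
            for row in shards[src]:
                rows.append(row[dst * hc:(dst + 1) * hc])
        out.append(rows)
    return out


def a2a_head_to_seq(shards):
    """After attention: [seq, heads/P] per rank -> [seq/P, heads] per rank."""
    world_size = len(shards)
    seq_len = len(shards[0])
    sc = seq_len // world_size
    out = []
    for dst in range(world_size):
        rows = []
        for s in range(dst * sc, (dst + 1) * sc):
            row = []
            # Gather every rank's head slice for this sequence position.
            for src in range(world_size):
                row.extend(shards[src][s])
            rows.append(row)
        out.append(rows)
    return out


# Toy input: 4 positions, 4 heads, each entry tagged with (position, head).
x = [[(s, h) for h in range(4)] for s in range(4)]
local = shard_sequence(x, world_size=2)   # each rank holds 2 positions, 4 heads
gathered = a2a_seq_to_head(local)         # each rank holds 4 positions, 2 heads
restored = a2a_head_to_seq(gathered)      # back to the sequence-sharded layout
```

In a real run the two reshuffles would be all-to-all collectives across devices; the point here is only that attention sees the full sequence for a subset of heads, and the second a2a undoes the first.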

Pending items/ discussion:

  1. Users currently need to call prepare_autosp_inputs. This API marks the sequence dimension as symbolic, which helps propagate the shape information to other ops after the inputs are sharded.
  2. Better interface to enable AutoSP from DeepSpeed's config [WIP]
  3. Integrate AutoSP's selective activation checkpointing [WIP]
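
As a toy illustration of why a symbolic sequence dimension helps with item 1, the sketch below propagates a shape through a shard step and an embedding-like op without ever fixing the sequence length. Everything in it (`SymDim`, `embedding_shape`, `resolve_shape`) is hypothetical and written only for this comment; it does not show what prepare_autosp_inputs does internally or how torch.compile represents symbolic shapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymDim:
    """Hypothetical symbolic dimension meaning `name // divisor`."""
    name: str
    divisor: int = 1

    def shard(self, world_size):
        # Sharding divides the symbolic length; no concrete value is needed.
        return SymDim(self.name, self.divisor * world_size)

    def resolve(self, **sizes):
        value = sizes[self.name]
        assert value % self.divisor == 0, "sequence must divide evenly"
        return value // self.divisor


def embedding_shape(ids_shape, hidden):
    # (batch, seq) -> (batch, seq, hidden); the seq entry passes through as-is.
    batch, seq = ids_shape
    return (batch, seq, hidden)


def resolve_shape(shape, **sizes):
    # Substitute concrete sizes for any symbolic entries.
    return tuple(d.resolve(**sizes) if isinstance(d, SymDim) else d for d in shape)


seq = SymDim("seq")
local_ids = (2, seq.shard(4))                        # input sharded over 4 ranks
local_hidden = embedding_shape(local_ids, hidden=8)  # downstream op sees seq // 4
```

The same symbolic shape can then be resolved for different sequence lengths, for example `resolve_shape(local_hidden, seq=1024)` and `resolve_shape(local_hidden, seq=2048)`, which is the property that lets downstream ops stay consistent after the inputs are sharded.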

For reviewers

The compiler passes will remain the same; however, the interface from the DeepSpeed config will be updated, so the initial review can be limited to the design of the passes.

Here is where the backend for AutoSP is initialized and passes are triggered

The directory bench_dc_ulysses contains the benchmarking scripts and will not be merged. However, run_acc_lm.py contains an example of how users could use AutoSP.

@neeldani neeldani changed the title [WIP] Merging AutoSP into Deepspeed [WIP] Merging AutoSP into DeepSpeed Feb 19, 2026
tohtana (Collaborator) commented Feb 20, 2026

Hi @neeldani,
Thank you for opening this PR! This is truly exciting.

Since this is a large PR, let’s proceed step by step. Here are my suggestions:

  • Code Location: This PR contains a significant amount of client code in bench_dc_ulysses. Could we move that to DeepSpeedExamples instead? Feel free to open a separate PR there for it.
  • Documentation: The README in bench_dc_ulysses appears to be outdated. Could you update it with instructions so we can reproduce the results?
  • API Design: Could you share the current API design? As you mentioned, we should discuss this further. You can either add the details to this PR or start a new Discussion in this repo.
