[PyTorch] Adjusted the logic of MHA and DPA to enable speculative decoding #3206

Workflow file for this run

# Copyright (c) 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# See LICENSE for license information.
# A workflow to trigger TE build on GitHub
name: 'Build'
on:
  pull_request:
  workflow_dispatch:
jobs:
  pytorch:
    name: 'PyTorch'
    runs-on: ubuntu-latest
    if: false # NGC PyTorch container does not fit on GitHub runner
    container:
      image: nvcr.io/nvidia/pytorch:23.03-py3
      options: --user root
    steps:
      - name: 'Checkout'
        uses: actions/checkout@v3
        with:
          submodules: recursive
      - name: 'Build'
        run: pip install . -v --no-deps
        env:
          NVTE_FRAMEWORK: pytorch
          MAX_JOBS: 1
      - name: 'Sanity check'
        run: python tests/pytorch/test_sanity_import.py
  jax:
    name: 'JAX'
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/nvidia/jax:latest
      options: --user root
    steps:
      - name: 'Checkout'
        uses: actions/checkout@v3
        with:
          submodules: recursive
      - name: 'Build'
        run: pip install . -v
        env:
          NVTE_FRAMEWORK: jax
      - name: 'Sanity check'
        run: python tests/jax/test_sanity_import.py