Commit

Update README.md
jeffra authored Nov 3, 2023
1 parent c08c52b commit 4fc45b8
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -27,11 +27,11 @@

<!-- tocstop -->

-# DeepSpeed Model Implementations for Inference
+# DeepSpeed Model Implementations for Inference (MII)

-Introducing DeepSpeed-MII, an open-source Python library designed by DeepSpeed to democratize powerful model inference with a focus on high-throughput, low latency, and cost-effectiveness.
+Introducing MII, an open-source Python library designed by DeepSpeed to democratize powerful model inference with a focus on high-throughput, low latency, and cost-effectiveness.

-* DeepSpeed-MII v0.1 introduces several features such as blocked KV-caching, continuous batching, Dynamic SplitFuse, tensor parallelism, and high-performance CUDA kernels to support fast high throughput text-generation for LLMs such as Llama-2-70B. DeepSpeed-MII delivers up to 2.3 times higher effective throughput compared to leading systems such as vLLM. For detailed performance results please see our [DeepSpeed-FastGen blog](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen).
+* MII v0.1 introduces several features such as blocked KV-caching, continuous batching, Dynamic SplitFuse, tensor parallelism, and high-performance CUDA kernels to support fast high throughput text-generation for LLMs such as Llama-2-70B. MII delivers up to 2.3 times higher effective throughput compared to leading systems such as vLLM. For detailed performance results please see our [DeepSpeed-FastGen blog](https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-fastgen).

<div align="center">
<img src="docs/images/fastgen-hero-light.png#gh-light-mode-only" width="800px">
@@ -44,7 +44,7 @@ Introducing DeepSpeed-MII, an open-source Python library designed by DeepSpeed t

## MII for High-Throughput Text Generation

-DeepSpeed-MII provides accelerated text-generation inference through the use of four key technologies:
+MII provides accelerated text-generation inference through the use of four key technologies:

* Blocked KV Caching
* Continuous Batching