From 8dfe96cdedd6c18b3dff27c70e11c2b7fdceff65 Mon Sep 17 00:00:00 2001
From: Giles Bathgate
Date: Thu, 21 Dec 2023 20:24:35 +0000
Subject: [PATCH] Fix typo/spelling in README.md

sunch -> some

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2cb8df6..668fe01 100644
--- a/README.md
+++ b/README.md
@@ -124,7 +124,7 @@ For this we can choose as chunk size the window size. For each chunk, we thus ne
 
 # Sparse Mixture of Experts (SMoE)
 
-Sparse Mixture of Experts allows one to decouple throughput from memory costs by only activating subsets of the overall model for each token. In this approach, each token is assigned to one or more "experts" -- a separate set of weights -- and only processed by sunch experts. This division happens at feedforward layers of the model. The expert models specialize in different aspects of the data, allowing them to capture complex patterns and make more accurate predictions.
+Sparse Mixture of Experts allows one to decouple throughput from memory costs by only activating subsets of the overall model for each token. In this approach, each token is assigned to one or more "experts" -- a separate set of weights -- and only processed by some experts. This division happens at feedforward layers of the model. The expert models specialize in different aspects of the data, allowing them to capture complex patterns and make more accurate predictions.
 
 ![SMoE](assets/smoe.png)
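The README paragraph touched by this patch describes SMoE routing: each token is scored by a gate, assigned to one or more experts (separate feedforward weights), and only those experts run. A minimal sketch of that idea, with hypothetical names and top-2 gating assumed (the README itself only says "one or more"), could look like:

```python
import numpy as np

def smoe_forward(x, gate_w, experts, top_k=2):
    """Route each token to its top_k experts and combine their outputs.

    x:       (n_tokens, d) token activations at a feedforward layer
    gate_w:  (d, n_experts) router weights (hypothetical name)
    experts: list of callables, each mapping a (d,) vector to a (d,) vector
    """
    logits = x @ gate_w                         # (n_tokens, n_experts) gate scores
    out = np.zeros_like(x)
    for i, tok in enumerate(x):
        top = np.argsort(logits[i])[-top_k:]    # indices of the chosen experts
        weights = np.exp(logits[i][top])
        weights /= weights.sum()                # softmax over the chosen experts only
        # Only the selected experts run, so per-token compute scales with
        # top_k, not with the total number of experts held in memory.
        out[i] = sum(w * experts[e](tok) for w, e in zip(weights, top))
    return out
```

This is the sense in which throughput is decoupled from memory cost: all expert weights are resident, but each token pays for only `top_k` expert evaluations.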