Remove Megatron Large sparsity data

The numbers published as Megatron Large with Sparsity were captured from the Megatron Large dense model by mistake.

Signed-off-by: Vincent Huang <[email protected]>
hiennguyen9874 · Oct 28, 2022 · 3e345ef
1 parent 007c347 · commit 3e345ef
Showing 1 changed file with 0 additions and 54 deletions.
diff --git a/demo/BERT/README.md b/demo/BERT/README.md
@@ -476,33 +476,6 @@ Results were obtained by running `scripts/inference_benchmark.sh --gpu Ampere` o
 | 384 | 64 | 45.62 | 45.63 | 45.40 | 84.26 | 84.56 | 83.63 |
 | 384 | 128 | 89.51 | 89.55 | 89.01 | 164.56 | 164.95 | 163.70 |
 
-##### Megatron Large with Sparsity
-
-| Sequence Length | Batch Size | INT8 QAT Latency (ms) |               |         |
-|-----------------|------------|-----------------|-----------------|---------|
-|                 |            | 95th Percentile | 99th Percentile | Average |
-| 128 | 1 | 1.17 | 1.18 | 1.14 |
-| 128 | 2 | 1.43 | 1.82 | 1.43 |
-| 128 | 4 | 1.90 | 1.90 | 1.90 |
-| 128 | 8 | 3.08 | 3.08 | 3.05 |
-| 128 | 12 | 3.36 | 3.36 | 3.36 |
-| 128 | 16 | 4.42 | 4.42 | 4.42 |
-| 128 | 24 | 6.01 | 6.01 | 6.00 |
-| 128 | 32 | 7.75 | 7.76 | 7.75 |
-| 128 | 64 | 13.91 | 14.04 | 13.81 |
-| 128 | 128 | 27.11 | 27.12 | 26.85 |
-| 384 | 1 | 1.71 | 1.71 | 1.71 |
-| 384 | 2 | 2.37 | 2.37 | 2.37 |
-| 384 | 4 | 3.92 | 3.92 | 3.92 |
-| 384 | 8 | 6.80 | 6.80 | 6.80 |
-| 384 | 12 | 9.02 | 9.03 | 9.02 |
-| 384 | 16 | 12.15 | 12.16 | 12.15 |
-| 384 | 24 | 17.54 | 17.55 | 17.41 |
-| 384 | 32 | 22.94 | 22.96 | 22.71 |
-| 384 | 64 | 43.88 | 43.90 | 43.61 |
-| 384 | 128 | 85.42 | 85.45 | 84.89 |
-
-
 #### Inference performance: NVIDIA A30
 
 Results were obtained by running `scripts/inference_benchmark.sh --gpu Ampere` on NVIDIA A30.
@@ -559,33 +532,6 @@ Results were obtained by running `scripts/inference_benchmark.sh --gpu Ampere` o
 | 384 | 64 | 92.04 | 92.37 | 91.21 | 174.21 | 174.91 | 173.29 |
 | 384 | 128 | 180.77 | 181.11 | 179.78 | 343.25 | 343.80 | 342.30 |
 
-##### Megatron Large with Sparsity
-
-| Sequence Length | Batch Size | INT8 QAT Latency (ms) |               |         |
-|-----------------|------------|-----------------|-----------------|---------|
-|                 |            | 95th Percentile | 99th Percentile | Average |
-| 128 | 1 | 1.43 | 1.43 | 1.43 |
-| 128 | 2 | 1.90 | 1.90 | 1.90 |
-| 128 | 4 | 3.12 | 3.13 | 3.09 |
-| 128 | 8 | 4.79 | 4.79 | 4.78 |
-| 128 | 12 | 6.38 | 6.39 | 6.35 |
-| 128 | 16 | 8.63 | 8.67 | 8.55 |
-| 128 | 24 | 11.99 | 12.00 | 11.92 |
-| 128 | 32 | 16.42 | 16.43 | 16.37 |
-| 128 | 64 | 30.11 | 30.12 | 29.91 |
-| 128 | 128 | 58.93 | 59.03 | 58.39 |
-| 384 | 1 | 2.70 | 2.70 | 2.70 |
-| 384 | 2 | 4.18 | 4.18 | 4.17 |
-| 384 | 4 | 7.33 | 7.35 | 7.26 |
-| 384 | 8 | 13.78 | 13.79 | 13.63 |
-| 384 | 12 | 19.47 | 19.48 | 19.30 |
-| 384 | 16 | 25.55 | 25.56 | 25.34 |
-| 384 | 24 | 37.13 | 37.15 | 36.55 |
-| 384 | 32 | 48.76 | 48.78 | 48.20 |
-| 384 | 64 | 95.57 | 95.85 | 94.96 |
-| 384 | 128 | 186.36 | 186.83 | 185.37 |
-
-
 #### Inference performance: NVIDIA T4 (16GB)
 
 Results were obtained by running `scripts/inference_benchmark.sh --gpu Turing` on NVIDIA T4 (16G).