|
1 | 1 | # Tiled layout |
2 | 2 |
|
3 | | -!!! Warning |
4 | | - Tiled layout is *pre-release* and this describes how it's intended to |
5 | | - work. Errors may be silently ignored. |
| 3 | +> **Caution:** Tiled layout is *pre-release* and this describes how it's |
| 4 | +> intended to work. Errors may be silently ignored. |
6 | 5 |
|
7 | 6 |  |
8 | 7 | <br>Figure 1 |
9 | 8 |
|
10 | 9 | Figure 1 shows how an array F32[3,5] is laid out in memory with 2x2 tiling. A |
11 | 10 | shape with this layout is written as F32[3,5]{1,0:T(2,2)}, where 1,0 relates to |
12 | | -the physical order of dimensions (minor_to_major field in Layout) while (2,2) |
| 11 | +the physical order of dimensions (`minor_to_major` field in Layout) while (2,2) |
13 | 12 | after the colon indicates tiling of the physical dimensions by a 2x2 tile. |
14 | 13 |
|
15 | | -Intuitively tiles are laid out to cover the shape and then within each tile, |
| 14 | +Intuitively, tiles are laid out to cover the shape and then within each tile, |
16 | 15 | elements are then laid out without tiling, as in the example above, where the |
17 | 16 | right part of the example shows the layout in memory, including the white |
18 | 17 | padding elements that are added in order to have complete 2x2 tiles even though |
@@ -70,9 +69,9 @@ array, padding is inserted as in Figure 1. Both the tiles and elements within |
70 | 69 | tiles are laid out recursively without tiling. |
71 | 70 |
|
72 | 71 | For the example in Figure 1, element (2,3) has tile index (1,1), and within-tile |
73 | | -index (0,1), for a combined coordinate vector of (1, 1, 0, 1). The tile indices |
74 | | -have bounds (2, 3) and the tile itself is (2, 2) for a combined vector of (2, 3, |
75 | | -2, 2). The linear index with tile for the element with index (2, 3) in the |
| 72 | +index (0,1), for a combined coordinate vector of (1,1,0,1). The tile indices |
| 73 | +have bounds (2,3) and the tile itself is (2,2) for a combined vector of |
| 74 | +(2,3,2,2). The linear index with tile for the element with index (2,3) in the |
76 | 75 | logical shape is then |
77 | 76 |
|
78 | 77 | linear_index_with_tile((2,3), (3,5), (2,2)) <br> |
@@ -125,12 +124,12 @@ XLA's tiling becomes even more flexible by applying it repeatedly. |
125 | 124 |
|
126 | 125 | Figure 2 shows how an array of size 4x8 is tiled by two levels of tiling (first |
127 | 126 | 2x4 then 2x1). We represent this repeated tiling as (2,4)(2,1). Each color |
128 | | -indicates a 2x4 tile and each red border box is a 2x1 tile. The numbers |
129 | | -indicates the linear index in memory of that element in the tiled format. This |
130 | | -format matches the format used for BF16 on TPU, except that the initial tile is |
131 | | -bigger, namely the tiling is (8,128)(2,1), where the purpose of the second |
132 | | -tiling by 2x1 is to collect together two 16 bit values to form one 32 bit value |
133 | | -in a way that aligns with the architecture of a TPU. |
| 127 | +indicates a 2x4 tile and each red border box is a 2x1 tile. The numbers indicate |
| 128 | +the linear index in memory of that element in the tiled format. This format |
| 129 | +matches the format used for BF16 on TPU, except that the initial tile is bigger, |
| 130 | +namely the tiling is (8,128)(2,1), where the purpose of the second tiling by 2x1 |
| 131 | +is to collect together two 16-bit values to form one 32-bit value in a way that |
| 132 | +aligns with the architecture of a TPU. |
134 | 133 |
|
135 | 134 | Note that a second or later tile can refer to both the minor within-tile |
136 | 135 | dimensions, which just rearranges data within the tile, as in this example with |
|
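The Figure 1 index computation quoted in the second hunk can be checked with a minimal Python sketch. It is not XLA code: the helper `linear_index` and the restriction to a single tile whose rank equals the array rank are assumptions made for illustration, and it assumes the row-major `minor_to_major` order {1,0} from the example.

```python
def linear_index(index, bounds):
    """Row-major linear index of `index` within an array of size `bounds`."""
    linear = 0
    for i, b in zip(index, bounds):
        if not 0 <= i < b:
            raise IndexError(f"index {index} is out of bounds {bounds}")
        linear = linear * b + i
    return linear


def linear_index_with_tile(element, dims, tile):
    """Linear index of `element` in a `dims` array tiled once by `tile`.

    Assumption: len(tile) == len(dims), as in the Figure 1 example.
    """
    # Which tile the element falls in, and where it sits inside that tile.
    tile_index = tuple(e // t for e, t in zip(element, tile))
    within_tile = tuple(e % t for e, t in zip(element, tile))
    # Number of tiles per dimension, rounded up so padding completes the tiles.
    tile_bounds = tuple(-(-d // t) for d, t in zip(dims, tile))
    # Combined coordinate vector and combined bounds, e.g. (1,1,0,1) and (2,3,2,2).
    return linear_index(tile_index + within_tile, tile_bounds + tuple(tile))


print(linear_index_with_tile((2, 3), (3, 5), (2, 2)))  # prints 17
```

For element (2,3) the sketch builds the combined coordinate vector (1,1,0,1) and the combined bounds (2,3,2,2) described in the text, which gives (1∙3 + 1)∙2∙2 + (0∙2 + 1) = 17.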