This repository was archived by the owner on Aug 30, 2024. It is now read-only.

Commit d6e36e3

zhenwei-intel authored and VincyZhang committed

update readme path and copy hidden files (#185)

* move hidden files
* update readme path

Signed-off-by: zhenwei-intel <[email protected]>

1 parent 4824186 · commit d6e36e3

File tree

11 files changed (+52, -33 lines)

.clang-format

Lines changed: 7 additions & 0 deletions
````diff
@@ -0,0 +1,7 @@
+Language: Cpp
+BasedOnStyle: Google
+DerivePointerAlignment: false
+ColumnLimit: 120
+SpaceBeforeParens: ControlStatements
+SpaceBeforeRangeBasedForLoopColon: true
+SortIncludes: false
````

.editorconfig

Lines changed: 12 additions & 0 deletions
````diff
@@ -0,0 +1,12 @@
+root = true
+
+[*]
+charset = utf-8
+indent_style = space
+indent_size = 2
+end_of_line = lf
+insert_final_newline = true
+trim_trailing_whitespace = true
+
+[*.py]
+indent_size = 4
````

docs/Installation.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -22,7 +22,7 @@ pip install intel-extension-for-transformers
 git clone https://github.com/intel/intel-extension-for-transformers.git itrex
 cd itrex
 git submodule update --init --recursive
-cd intel_extension_for_transformers/backends/neural_engine/
+cd intel_extension_for_transformers/llm/runtime/
 mkdir build
 cd build
 cmake .. -DPYTHON_EXECUTABLE=$(which python3) -DNE_WITH_SPARSELIB=True
````

docs/add_customized_pattern.md

Lines changed: 15 additions & 15 deletions
````diff
@@ -5,17 +5,17 @@
 - [Fuse Pattern and Set Attributes of New Pattern after Fusion](#fuse-pattern-and-set-attributes-of-new-pattern-after-fusion)
 
 ## Introduction
-The `Neural Engine` in `Intel® Extension for Transformers` support user to add customized pattern of model, which means you can compile your own pretrained model to `Neural Engine` IR (Intermediate Representation) just by adding the specific patterns which the [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile) does not contain.
+The `Neural Engine` in `Intel® Extension for Transformers` support user to add customized pattern of model, which means you can compile your own pretrained model to `Neural Engine` IR (Intermediate Representation) just by adding the specific patterns which the [`compile`](/intel_extension_for_transformers/llm/runtime/compile) does not contain.
 
 The intermediate graph in `Neural Engine` can be treated as a `list` that stores all nodes of the model under control flow. Some certain nodes may compose a pattern which needs to be fused for speeding up inference. For simplifying the network structure, we also design different attributes attached to fused nodes. To aim at adding a customized pattern, there are three steps: **1. register the nodes' op_types; 2. set the pattern mapping config and register the pattern; 3. fuse pattern and set attributes of the new pattern after fusion.**
 
 ![](imgs/layernorm_distilbert_base_onnx.png)
 
-Above is a `LayerNorm` pattern in the `Distilbert_Base` onnx model. Assume it is a customized pattern in your model that need to be added in [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile). Follow the steps below to make `Neural Engine` support this pattern, and fuse these 9 nodes to one node called `LayerNorm`.
+Above is a `LayerNorm` pattern in the `Distilbert_Base` onnx model. Assume it is a customized pattern in your model that need to be added in [`compile`](/intel_extension_for_transformers/llm/runtime/compile). Follow the steps below to make `Neural Engine` support this pattern, and fuse these 9 nodes to one node called `LayerNorm`.
 
 ## Register the Nodes' Op Types
 
-First, you should check whether the nodes' op_types in the pattern are registered in `Engine` or not. If not, you need to add the op_type class for [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile) loading and extracting the origin model. All the ops can be found from the [`compile.ops`](/intel_extension_for_transformers/backends/neural_engine/compile/ops). For quick check, use the commands below.
+First, you should check whether the nodes' op_types in the pattern are registered in `Engine` or not. If not, you need to add the op_type class for [`compile`](/intel_extension_for_transformers/llm/runtime/compile) loading and extracting the origin model. All the ops can be found from the [`compile.ops`](/intel_extension_for_transformers/llm/runtime/compile/ops). For quick check, use the commands below.
 
 ```python
 # make sure you have cloned intel_extension_for_transformers repo and installed intel_extension_for_transformers
````
````diff
@@ -30,11 +30,11 @@ The print result will show all registered ops, for example:
 {'Gelu': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.gelu.Gelu'>, 'Unsqueeze': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.unsqueeze.Unsqueeze'>, 'OptimizeDataset': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.optimize_dataset.OptimizeDataset'>, 'IteratorV2': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.iterator_v2.IteratorV2'>, 'QuantizeLinear': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.quantize_linear.QuantizeLinear'>, 'Gather': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.gather.Gather'>, 'GatherV2': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.gather.GatherV2'>, 'GatherElements': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.gather_elements.GatherElements'>, 'Unpack': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.unpack.Unpack'>, 'MapAndBatchDataset': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.map_and_batch_dataset.MapAndBatchDataset'>, 'Concat': <class 'intel_extension_for_transformers.llm.runtime.compile.ops.concat.Concat'>, ...}
 ```
 
-These ops can be roughly divided into two categories, the one is without attributes, like `Mul`, the other one is with attributes, for example, `Reshape` has the attributes `dst_shape`. You can look through the [`executor`](/intel_extension_for_transformers/backends/neural_engine/executor) for more info about the `Neural Engine` ops' attribute settings.
+These ops can be roughly divided into two categories, the one is without attributes, like `Mul`, the other one is with attributes, for example, `Reshape` has the attributes `dst_shape`. You can look through the [`executor`](/intel_extension_for_transformers/llm/runtime/executor) for more info about the `Neural Engine` ops' attribute settings.
 
-Assume the `Sqrt` and `ReduceMean` in `LayerNorm` pattern are new op_types for [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile). Here are the examples that show how to register them.
+Assume the `Sqrt` and `ReduceMean` in `LayerNorm` pattern are new op_types for [`compile`](/intel_extension_for_transformers/llm/runtime/compile). Here are the examples that show how to register them.
 
-`Sqrt` has no attributes. You can add this op class in [`compile.ops.empty_ops`](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/backends/neural_engine/compile/ops/empty_ops.py).
+`Sqrt` has no attributes. You can add this op class in [`compile.ops.empty_ops`](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/compile/ops/empty_ops.py).
 
 ```python
 # register the 'Sqrt' class in OPERATORS
````
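For reference, a minimal sketch of what such an attribute-free op class can look like, built from the `Operator` and `operator_registry` names that appear later in this diff; the decorator's exact signature is an assumption, not a verified API:

```python
# Minimal sketch, assuming operator_registry takes the op_type name.
# For an op with no attributes, registration alone is enough for
# `compile` to recognize the node while loading the origin model.
from .op import Operator, operator_registry

@operator_registry(operator_type='Sqrt')
class Sqrt(Operator):
    def __init__(self):
        super().__init__()
```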
````diff
@@ -47,9 +47,9 @@ class Sqrt(Operator):
 
 `ReduceMean` has `keep_dims` and `axis` two attributes, you need to set them by extracting the node from the origin model.
 
-Create a python file (for example, name can be `reduce_mean.py`) in [`compile.ops`](/intel_extension_for_transformers/backends/neural_engine/compile/ops) and add the `ReduceMean` op class.
+Create a python file (for example, name can be `reduce_mean.py`) in [`compile.ops`](/intel_extension_for_transformers/llm/runtime/compile/ops) and add the `ReduceMean` op class.
 
-In this `LayerNorm` pattern, the `ReduceMean` node in origin onnx model just has `axes` value which is a list, that is the value of `axis` attribute comes from. The `keep_dims` attribute is `False` by default in [`executor`](/intel_extension_for_transformers/backends/neural_engine/executor), so if the `ReduceMean` node has the `keep_dims` attribute, you should extract and set it. Otherwise, you can just ignore it.
+In this `LayerNorm` pattern, the `ReduceMean` node in origin onnx model just has `axes` value which is a list, that is the value of `axis` attribute comes from. The `keep_dims` attribute is `False` by default in [`executor`](/intel_extension_for_transformers/llm/runtime/executor), so if the `ReduceMean` node has the `keep_dims` attribute, you should extract and set it. Otherwise, you can just ignore it.
 
 ```python
 from .op import Operator, operator_registry
````
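The doc's own example is truncated by the hunk. Going by the description above, a hedged sketch of such an attribute-bearing op might extract `axis` from the onnx node's `axes` list and set `keep_dims` only when present; the `set_attr` hook and the framework tag below are assumptions, not a verified API:

```python
from collections import OrderedDict
from .op import Operator, operator_registry

@operator_registry(operator_type='ReduceMean')
class ReduceMean(Operator):
    def __init__(self):
        super().__init__()

    def set_attr(self, framework, node):
        # Hypothetical extraction: an onnx NodeProto carries its 'axes'
        # as an ints list; keep_dims defaults to False in the executor,
        # so it is only set when the node actually has it.
        self._attr = OrderedDict()
        if framework == 'onnxruntime':
            for attr in node.attribute:
                if attr.name == 'axes' and len(attr.ints) == 1:
                    self._attr['axis'] = attr.ints[0]
                elif attr.name == 'keepdims':
                    self._attr['keep_dims'] = bool(attr.i)
```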
````diff
@@ -99,9 +99,9 @@ If nothing wrong, the output result should be `True`.
 
 ## Set the Pattern Mapping Config and Register the Pattern
 
-In `Neural Engine`, we treat the pattern fusion as the process of pattern mapping: from a group nodes to another group nodes. In this step, you need to provide a config for `pattern_mapping` function and register your pattern, in order to make sure the [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile) implements pattern fusion correctly.
+In `Neural Engine`, we treat the pattern fusion as the process of pattern mapping: from a group nodes to another group nodes. In this step, you need to provide a config for `pattern_mapping` function and register your pattern, in order to make sure the [`compile`](/intel_extension_for_transformers/llm/runtime/compile) implements pattern fusion correctly.
 
-- Create a python file (for example, name can be `layer_norm.py`) in [`compile.sub_graph`](/intel_extension_for_transformers/backends/neural_engine/compile/sub_graph) and add the `LayerNorm` pattern mapping config.
+- Create a python file (for example, name can be `layer_norm.py`) in [`compile.sub_graph`](/intel_extension_for_transformers/llm/runtime/compile/sub_graph) and add the `LayerNorm` pattern mapping config.
 
 For the above `LayerNorm` pattern, the config example can be like this:
````
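The config body itself falls outside this hunk's context window. Purely as an illustration of the general shape such a mapping dict takes — every key name below is hypothetical, not the repo's actual schema:

```python
# Hypothetical shape only: maps the 9-node source chain to one fused node.
pattern_mapping_config = {
    'LayerNorm': [{
        'patterns': {
            'in': [[(0, 'ReduceMean'), (1, 'Sub'), (2, 'Pow')]],  # chain abbreviated
            'out': [[(0, 'LayerNorm')]],
        },
        # plus bookkeeping that wires node names and input/output tensors
    }],
}
```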

````diff
@@ -139,7 +139,7 @@ In `Neural Engine`, we treat the pattern fusion as the process of pattern mappin
 }
 ```
 
-The dict in the config will guide the `pattern_mapping` function on how to find all the group nodes that belong to `LayerNorm` pattern in intermediate graph and how to replace them with new pattern. We use this config to store many dicts because different models (even the same model) could have different representations for a certain pattern. If you want to delve into it, please see [pattern_recognize](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/backends/neural_engine/docs/pattern_recognize.md) and [graph_fusion](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/backends/neural_engine/docs/graph_fusion.md) docs for more details.
+The dict in the config will guide the `pattern_mapping` function on how to find all the group nodes that belong to `LayerNorm` pattern in intermediate graph and how to replace them with new pattern. We use this config to store many dicts because different models (even the same model) could have different representations for a certain pattern. If you want to delve into it, please see [pattern_recognize](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/docs/pattern_recognize.md) and [graph_fusion](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/docs/graph_fusion.md) docs for more details.
 
 - Register the `LayerNorm` pattern
````
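Registration itself presumably mirrors the op registry. A sketch based on the `class LayerNorm(Pattern)` signature and `return model` line visible later in this diff; the `pattern_registry` decorator name and import path are assumptions:

```python
from .pattern import Pattern, pattern_registry

@pattern_registry(pattern_type='LayerNorm')
class LayerNorm(Pattern):
    def __call__(self, model):
        # look up the mapping config, run pattern_mapping over the graph,
        # then set the fused node's attributes before returning
        return model
```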

````diff
@@ -213,9 +213,9 @@ In `Neural Engine`, we treat the pattern fusion as the process of pattern mappin
 
 - Define the pattern fusion order
 
-Fusing patterns should follow specific order if a model has multiple patterns. For example, if the model has A pattern (nodes: a-->b) and B pattern (nodes: a-->b-->c), and B pattern is actually equivalent to A pattern + c node. So you should fuse A pattern first, then B pattern (more info and details please see the [graph_fusion](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/backends/neural_engine/docs/graph_fusion.md)).
+Fusing patterns should follow specific order if a model has multiple patterns. For example, if the model has A pattern (nodes: a-->b) and B pattern (nodes: a-->b-->c), and B pattern is actually equivalent to A pattern + c node. So you should fuse A pattern first, then B pattern (more info and details please see the [graph_fusion](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/docs/graph_fusion.md)).
 
-There is a list called `supported_patterns` in [`compile.sub_graph.pattern`](/intel_extension_for_transformers/backends/neural_engine/compile/sub_graph/pattern.py). It controls the order of pattern fusion. You need to add your customized pattern name (the `pattern_type` you register in step 2) into `supported_patterns` at appropriate location (If a pattern does not influence other patterns, you can put it at an arbitrary location).
+There is a list called `supported_patterns` in [`compile.sub_graph.pattern`](/intel_extension_for_transformers/llm/runtime/compile/sub_graph/pattern.py). It controls the order of pattern fusion. You need to add your customized pattern name (the `pattern_type` you register in step 2) into `supported_patterns` at appropriate location (If a pattern does not influence other patterns, you can put it at an arbitrary location).
 
 For example, change the `supported_patterns` like:
````
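The changed list itself is cut off by the hunk. As an illustration only — the neighboring entries are hypothetical — the insertion would look something like:

```python
supported_patterns = [
    'InputData',   # hypothetical neighbor entry
    # ... earlier patterns ...
    'LayerNorm',   # fuse before any pattern that consumes the fused node
    # ... later patterns ...
]
```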

````diff
@@ -245,7 +245,7 @@ In `Neural Engine`, we treat the pattern fusion as the process of pattern mappin
 
 - Set the attributes of new pattern
 
-Every new pattern generated after fusion could have its attributes (when we talk about pattern attributes, it stands for the operator's attributes in the pattern, which are defined by the [`executor`](/intel_extension_for_transformers/backends/neural_engine/executor) ). As for `LayerNorm` pattern, the above 9 nodes are fused to one node with op_type `LayerNorm`. This operation has an attribute `epsilon` in [`executor`](/intel_extension_for_transformers/backends/neural_engine/executor), which is a value added to the denominator for numerical stability.
+Every new pattern generated after fusion could have its attributes (when we talk about pattern attributes, it stands for the operator's attributes in the pattern, which are defined by the [`executor`](/intel_extension_for_transformers/llm/runtime/executor) ). As for `LayerNorm` pattern, the above 9 nodes are fused to one node with op_type `LayerNorm`. This operation has an attribute `epsilon` in [`executor`](/intel_extension_for_transformers/llm/runtime/executor), which is a value added to the denominator for numerical stability.
 
 We recommend to write a `_set_attr` function and call it after pattern mapping to set the nodes' attributes. Here is the example for `LayerNorm` pattern.
````
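The example referenced here lies outside the hunk. A minimal sketch of such a helper, assuming graph accessors like `get_node_id` (not a verified API):

```python
from collections import OrderedDict

def _set_attr(epsilon, node_names, model):
    # epsilon stabilizes the denominator of the fused LayerNorm
    attr = OrderedDict()
    attr['epsilon'] = float(epsilon)
    ln_idx = model.get_node_id(node_names[0])  # assumed accessor
    model.nodes[ln_idx].attr = attr
```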

````diff
@@ -335,7 +335,7 @@ class LayerNorm(Pattern):
         return model
 ```
 
-After finishing these three steps in [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile), reinstall `intel_extension_for_transformers` and then use [`compile`](/intel_extension_for_transformers/backends/neural_engine/compile) function would compile your model with the customized pattern.
+After finishing these three steps in [`compile`](/intel_extension_for_transformers/llm/runtime/compile), reinstall `intel_extension_for_transformers` and then use [`compile`](/intel_extension_for_transformers/llm/runtime/compile) function would compile your model with the customized pattern.
 
 >**Note**:
 >
````
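As a quick usage sketch after reinstalling — the module path matches the registered-ops printout earlier in this diff, while the model filename and the `save` call are placeholders:

```python
from intel_extension_for_transformers.llm.runtime.compile import compile

graph = compile('model.onnx')  # IR now contains the fused LayerNorm node
graph.save('./ir')             # hypothetical output location
```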

docs/deploy_and_integration.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -123,12 +123,12 @@ mkdir engine_integration && cd engine_integration
 git init
 git submodule add https://github.com/intel/intel-extension-for-transformers itrex
 git submodule update --init --recursive
-cp itrex/intel_extension_for_transformers/backends/neural_engine/CMakeLists.txt .
-cp itrex/intel_extension_for_transformers/backends/neural_engine/executor/src/nlp_executor.cc neural_engine_example.cc
+cp itrex/intel_extension_for_transformers/llm/runtime/CMakeLists.txt .
+cp itrex/intel_extension_for_transformers/llm/runtime/executor/src/nlp_executor.cc neural_engine_example.cc
 ```
 Modify the NE_ROOT in the CmakeLists.txt.
 ```cmake
-set(NE_ROOT "${PROJECT_SOURCE_DIR}/itrex/intel_extension_for_transformers/backends/neural_engine")
+set(NE_ROOT "${PROJECT_SOURCE_DIR}/itrex/intel_extension_for_transformers/llm/runtime")
 ```
 
 Compile neural_engine_example.cc as binary named neural_engine_example and link Nerual Engine include/lib into neural_engine_example.
````
