Conversion to torchscript or ONNX #24
Looking at the code in pytorch master, it seems like Dict[Tuple[str,str], _] should be covered in (roughly) that recursion... but maybe not in:
Update: all the above was with torch.jit.script(), which is actually a sort of transpiler rather than a tracer. I am progressing with torch.jit.trace().
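For readers following along, a minimal toy illustration of the difference between the two entry points (the module below is just a placeholder, not the model discussed in this thread):

```python
import torch

class Toy(torch.nn.Module):
    # toy placeholder module, not the network from this issue
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x.relu()

m = Toy().eval()

# torch.jit.script() compiles the Python source itself (a transpiler-like
# path), so every annotation and container type must be expressible in
# TorchScript.
scripted = torch.jit.script(m)

# torch.jit.trace() records the tensor operations executed on an example
# input, so unsupported Python constructs are never seen, at the cost of
# baking in the traced control flow.
traced = torch.jit.trace(m, torch.randn(1, 3, 32, 32))
```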
Hi @drewm1980, unfortunately I don't have much experience with torchscript and torch.jit, so I am not sure how to solve that problem. However, I recommend checking the .export() method of the EquivariantModules. Unfortunately, not all equivariant modules support this functionality yet, but the most commonly used ones do. Let me know if this helps! Gabriele
Thanks for pointing that out; I'll see if I can adapt my code to use it (I currently inherit from torch.nn.Module). One obstacle might be that my code uses torch.nn.ModuleList. There are no hits for "ModuleList" in the e2cnn repo, so I'm guessing there is no equivalent.
I see; anyways, if your torch.nn.Module contains some EquivariantModules, you could just manually call the .export() method of the equivariant submodules.
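For illustration, a hedged sketch of what calling .export() on an equivariant submodule looks like; the group, field types, and layers below are placeholders, not the actual model discussed in this thread:

```python
import torch
from e2cnn import gspaces, nn as enn

# illustrative group and field-type choices, not the thread's actual model
r2 = gspaces.Rot2dOnR2(N=8)
feat_in = enn.FieldType(r2, 3 * [r2.trivial_repr])
feat_out = enn.FieldType(r2, 8 * [r2.regular_repr])

equivariant = enn.SequentialModule(
    enn.R2Conv(feat_in, feat_out, kernel_size=3),
    enn.ReLU(feat_out),
)

# export() requires eval mode; it returns a plain torch.nn.Module
# that no longer depends on e2cnn's GeometricTensor wrappers.
equivariant.eval()
exported = equivariant.export()

x = torch.randn(1, 3, 33, 33)
y = exported(x)  # plain tensors in, plain tensors out
```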
My Unet is structured as a module that depends on two other modules. They all call operators in their forward() methods that don't have Module subclass equivalents, and they're also statically typed. To do this "right" it seems like I would need to define non-e2cnn versions of each of those, along with ports of the forward() functions with correct static type signatures.

As you suggest, I could try recursively export()'ing all of the owned modules in place. The dynamic type signatures will deviate from the static ones if I do that, but maybe something is possible with generics.

Is there some way to build a DAG of modules in pytorch that I'm missing, or is subclassing (along with the consequences for composability I'm hitting here) really the only way? Are there any inference-time computations or abstractions that you're certain wouldn't just get pruned out by the model optimizer anyway? If it's probably going to boil down to the same inference network, I'll skip .export() for now and just hope for the best with torchscript tracing and the tensorrt compiler.
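One possible way to do the in-place recursive export is sketched below; the helper name is invented for illustration, it assumes every equivariant child actually implements export(), and after the replacement the surrounding forward() methods would still need to pass plain tensors rather than GeometricTensors:

```python
import torch
from e2cnn import nn as enn

def export_equivariant_submodules(module: torch.nn.Module) -> torch.nn.Module:
    """Recursively replace every e2cnn EquivariantModule child with its
    exported, plain-PyTorch counterpart (helper name is illustrative).
    The wrapping module keeps its structure, including torch.nn.ModuleList
    containers."""
    for name, child in module.named_children():
        if isinstance(child, enn.EquivariantModule):
            child.eval()
            setattr(module, name, child.export())
        else:
            export_equivariant_submodules(child)
    return module

# usage sketch:
#   model.eval()
#   model = export_equivariant_submodules(model)
```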
After setting .eval() mode, my code should be a bit optimized such that useless computations are skipped, but the code of the library still carries a lot of additional structure and data (and many asserts I used to implement a manual form of static typing, e.g. to ensure the tensors passed to a module have the right FieldType) which you may not want to have at deployment.

Unfortunately, I don't know enough about how torchscript works to be able to answer your question properly :( I would still recommend using the .export() method; that seems the cleanest and safest option to me. I don't see a simple solution to this for the moment, though. Let me know if you find some nicer solutions.

Best,

P.S.: In some future release, I am thinking of relaxing the strongly typed structure of the equivariant modules, such that they can accept both PyTorch tensors and GeometricTensors, so that one doesn't need to wrap them necessarily.
I'm back working on getting my model into tensorrt... Currently trying torch.jit.trace() followed by torch.onnx.export(). My current blocker is that ONNX supports einsum starting in opset 12, but tensorrt only supports up to opset 11. My own code isn't calling einsum directly, so I need to look into how hard it would be to convert the einsum calls in e2cnn into opset 11 operators.
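A rough sketch of that conversion path, with the opset pinned to 11 so that any opset-12-only operators (such as einsum) should fail loudly at export time; the placeholder network, input shape, and file name are assumptions:

```python
import torch

# placeholder network standing in for the model discussed above
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3),
    torch.nn.ReLU(),
).eval()
example = torch.randn(1, 3, 256, 256)

# tracing first is a quick sanity check that the model is traceable
traced = torch.jit.trace(model, example)

# torch.onnx.export() runs its own trace internally; pinning opset_version
# to 11 (the highest opset TensorRT parsed at the time of this thread)
# should surface any unsupported operators during export.
torch.onnx.export(
    model,
    example,
    "model.onnx",
    opset_version=11,
    input_names=["input"],
    output_names=["output"],
)
```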
Still working on tracing which einsum calls in e2cnn I'm actually hitting, but these seem like some likely candidates: e2cnn/e2cnn/nn/geometric_tensor.py, line 357 (commit 1abf950)
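To illustrate the kind of rewrite that would be needed, a contraction-style einsum can often be re-expressed with reshape and matmul, both available in opset 11. The subscripts below are hypothetical, not the exact ones used in e2cnn:

```python
import torch

# hypothetical shapes and subscripts, not taken from e2cnn
b, c, d, h, w = 2, 6, 4, 5, 5
x = torch.randn(b, c, h, w)
m = torch.randn(d, c)

# einsum version (needs ONNX opset >= 12)
y_einsum = torch.einsum("dc,bchw->bdhw", m, x)

# equivalent rewrite with reshape + matmul (opset-11 friendly)
y_matmul = (m @ x.reshape(b, c, h * w)).reshape(b, d, h, w)

assert torch.allclose(y_einsum, y_matmul, atol=1e-6)
```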
TensorRT's list of supported ONNX operators is here: |
Hi @drewm1980, nice to hear from you again! I think your problem is the einsum in

The one inside GeometricTensor is not usually called inside a neural network. Anyways, if you use

Regarding

Best,
I was actually looking at the code for the wrong branch; your comment was fine :) Thanks for the confirmation that I'm probably going in the right direction! Cheers, |
Hi @drewm1980

Best,
I'm working on optimizing my model inference, trying conversion to torchscript as a first step. When I call torch.jit.script() on my model, I hit:
This pytorch code resides here:
https://github.com/pytorch/pytorch/blob/22902b9242853a4ce319e7c5c4a1c94bc00ccb7a/torch/jit/_recursive.py#L126
torch.jit.script() can't handle a Dict[Tuple[str,str], _] attribute, which is used here:
e2cnn/e2cnn/nn/modules/r2_conv/basisexpansion_blocks.py, line 295 (commit b2c26e3)
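For context, a minimal sketch that should hit the same limitation (the class and attribute names are invented, loosely mimicking the structure referenced above, not the actual e2cnn code), since TorchScript dictionaries only accept str, int, or float keys:

```python
from typing import Dict, Tuple

import torch

class BlockExpansion(torch.nn.Module):
    # invented class/attribute names; a dict keyed by pairs of strings,
    # loosely mimicking what basisexpansion_blocks.py stores
    def __init__(self):
        super().__init__()
        self.blocks: Dict[Tuple[str, str], torch.Tensor] = {
            ("irrep_0", "irrep_1"): torch.randn(3, 3),
        }

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.blocks[("irrep_0", "irrep_1")].sum()

# torch.jit.script should reject this module: TorchScript dictionaries
# support str, int, or float keys, not Tuple[str, str].
torch.jit.script(BlockExpansion())
```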
My goal is to get the model to run as fast as possible on NVIDIA hardware, probably using tensorrt. Is there another known-good conversion path?
The above error was with torch 1.6.0, e2cnn v0.1.