IACA analysis incompatible with clang (3.8 and 4.0) #58
Update: This seems to occur when the index increment is exactly 1, which happens when the compiler neither vectorizes nor otherwise unrolls the loop:
In this case the index register (here rdx) is not incremented by an addq instruction but by a simple incq. It's not exclusive to clang; if I prevent vectorization with the Intel compiler, the same error occurs.
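To illustrate the point about incq versus addq, here is a minimal sketch of an increment matcher that accepts both forms in AT&T syntax. The function name `register_increment` and the regular expressions are hypothetical and written only for this comment; they are not kerncraft's actual detection code.

```python
import re

# Hypothetical patterns for illustration, not taken from kerncraft.
# Matches e.g. "addq $4, %rdx" or "subq $8, %rcx".
ADD_RE = re.compile(r"^\s*(add|sub)[bwlq]?\s+\$(-?\d+)\s*,\s*%(\w+)")
# Matches e.g. "incq %rdx" or "decl %eax".
INC_RE = re.compile(r"^\s*(inc|dec)[bwlq]?\s+%(\w+)")


def register_increment(line):
    """Return (register, increment) for one AT&T-syntax line, or None."""
    m = ADD_RE.match(line)
    if m:
        op, imm, reg = m.groups()
        value = int(imm)
        return reg, value if op == "add" else -value
    m = INC_RE.match(line)
    if m:
        op, reg = m.groups()
        # inc/dec carry an implicit step of 1, with no immediate operand.
        return reg, 1 if op == "inc" else -1
    return None
```

A detector that only looks for the addq form would return nothing for a scalar loop that steps with incq, which matches the failure described above.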
Still does not work reliably in 0.6.0:
The loop mechanics look like this:
Can you paste me the whole assembly block? There is more than just the loop mechanics being used for the detection.
Here we are: Had to rename it - GitHub does not allow .s files as attachments :-/
Difficult one. There are two stores in the loop, one of which goes onto the stack. That confuses the increment detector, because the offset of the stack store does not change from one iteration to the next, and therefore the loop increment would be 0. One workaround would be to ignore anything related to the stack pointer register, but who knows if another compiler will decide to make use of it in another way?
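The workaround mentioned above could look roughly like the following sketch, which filters out stores addressed through the stack pointer before any increment detection runs. The name `relevant_stores`, the `ignored_bases` parameter, and the assumption that a store is a mov-type instruction with a memory destination in AT&T syntax are all illustrative and not part of kerncraft.

```python
import re

# Hypothetical pattern: a memory operand at the end of the line, i.e. the
# destination in AT&T syntax. Group 1 is the base register, if any.
DEST_MEM_RE = re.compile(r"\(\s*(%\w+)?[^)]*\)\s*$")


def relevant_stores(lines, ignored_bases=("%rsp",)):
    """Return mov-type stores whose destination is not based on an ignored register."""
    result = []
    for line in lines:
        code = line.split("#", 1)[0].strip()  # drop trailing comments
        if not code:
            continue
        mnemonic = code.split()[0]
        # Assumption: only mov/vmov variants are treated as stores here.
        if not mnemonic.lstrip("v").startswith("mov"):
            continue
        m = DEST_MEM_RE.search(code)
        if m is None:
            continue  # destination is a register, so this is not a store
        if m.group(1) in ignored_bases:
            continue  # stack store: constant offset, says nothing about the loop increment
        result.append(code)
    return result
```

This only covers the stack pointer case. As noted, a compiler that spills through another register, such as a frame pointer, would need that register added to `ignored_bases`, so the filter is a heuristic rather than a general fix.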
I have tried kerncraft (current checkout, 0.5.7) with the himeno.c code and clang:
This happens with clang 3.8 and 4.0.