Lea followed by an add is generated
#121064
Comments
Prior to Icelake, on Intel CPUs, 3-source LEAs have a latency of 3 and a reciprocal throughput of 1. 2-source LEAs and ADD have a latency of 1 and a reciprocal throughput of 0.25, so we split 3-source LEAs into two instructions. I think -mtune=icelake or newer, or tuning for AMD CPUs, will disable this.
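To make the trade-off concrete: by the figures quoted above, one 3-source LEA puts 3 cycles on the dependency chain, while a 2-source LEA followed by a dependent ADD puts 1 + 1 = 2 cycles on it. The sketch below is only an illustration of the two shapes of the address computation, not the reporter's code (which was not shared). The function names, the scale of 4 and the displacement of 16 are all hypothetical, and which instructions a compiler actually emits depends on its version and tuning flags.

```cpp
#include <cstdint>

// Hypothetical example: base + index*4 + 16 has three sources
// (base register, scaled index register, displacement), so it is a
// candidate for a single 3-source LEA such as: lea rax, [rdi + 4*rsi + 16]
constexpr std::uint64_t three_source(std::uint64_t base, std::uint64_t index) {
    return base + index * 4 + 16;
}

// The same value computed in two steps. The first step has two sources
// (base and scaled index) and maps to a 2-source LEA.
constexpr std::uint64_t two_source(std::uint64_t base, std::uint64_t index) {
    return base + index * 4;
}

// The second step adds the displacement, corresponding to the separate ADD.
constexpr std::uint64_t split_form(std::uint64_t base, std::uint64_t index) {
    return two_source(base, index) + 16;
}

// Both forms compute the same value; only the instruction selection differs.
static_assert(three_source(100, 5) == split_form(100, 5),
              "split form must match the single-expression form");
```

At the source level the two forms are equivalent, so an optimizing compiler is free to pick either shape for both functions. Comparing the assembly produced under different -mtune values (for example with clang++ -O2 -S) is one way to see which shape a given tuning selects.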
@llvm/issue-subscribers-backend-x86 Author: Denis Yaroshevskiy (DenisYaroshevskiy)
Hi
I observe the following codegen (the input code is tricky to share)
This "add" following the "lea" instruction looks weird to me. Is this expected?
Ran the experiment. My CPU:
With -march=native, -mtune=native:
With -march=native -mtune=icelake-client:
The latter is worse than the default on my machine, so I guess the LEA split is correct.
Numbers (the padding indicates different code alignment; I test alignments from 0 to 56 bytes in increments of 8):
-mtune=native
-mtune=icelake-client
Curious that the best case for icelake is slightly better than default tuning, but that's probably just noise.
@topperc I happened to observe this as well, and it seems like I think LLVM should remove TuningSlow3OpsLEA for IceLake+ CPUs? (uops.info also confirms that IceLake+ no longer suffers from Slow3opsLea, see https://uops.info/html-instr/LEA_B_I_D32_R64.html).
@sillycross IceLake has already removed it. But please note that Alderlake is a hybrid architecture. Its E-core still has slow 3-op LEA, as shown in your link. So I think we still need to keep it there.
@phoebewang Thanks for the quick reply! Indeed, if I change the flags to Interestingly,
We haven't defined new tuning for Arrowlake. It's using the Alderlake tuning for now.