Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic wrapping arithmetic on pointers is pessimized compared to integers #135798

Open
orlp opened this issue Jan 20, 2025 · 5 comments
Open

Basic wrapping arithmetic on pointers is pessimized compared to integers #135798

orlp opened this issue Jan 20, 2025 · 5 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@orlp
Copy link
Contributor

orlp commented Jan 20, 2025

Consider these two very simple functions:

pub fn foo_int(x: usize, y: usize, z: isize) -> (usize, usize) {
    let mask = (z.wrapping_sub(15) >> (isize::BITS - 1)) as usize;
    (x.wrapping_sub(mask), y ^ mask)
}

pub fn foo_ptr(x: *const u8, y: usize, z: isize) -> (*const u8, usize) {
    let mask = (z.wrapping_sub(15) >> (isize::BITS - 1)) as usize;
    (x.wrapping_sub(mask), y ^ mask)
}

I would expect these to generate identical codegen. Here we are treating a pointer as an ordinary number, with wrapping_sub. But we actually see a pessimization in the codegen for the pointer version:

foo_int:
        mov     rax, rdi
        add     rdx, -15
        sar     rdx, 63
        sub     rax, rdx
        xor     rdx, rsi
        ret

foo_ptr:
        add     rdx, -15
        mov     rax, rdx
        shr     rax, 63
        sar     rdx, 63
        add     rax, rdi
        xor     rdx, rsi
        ret

I think this has something to do with provenance. Consider the following two modified examples:

pub fn foo_expose_ptr(x: *const u8, y: usize, z: isize) -> (*const u8, usize) {
    let mask = (z.wrapping_sub(15) >> (isize::BITS - 1)) as usize;
    let u = x.expose_provenance().wrapping_sub(mask);
    (std::ptr::with_exposed_provenance(u), y ^ mask)
}

pub fn foo_map_addr_ptr(x: *const u8, y: usize, z: isize) -> (*const u8, usize) {
    let mask = (z.wrapping_sub(15) >> (isize::BITS - 1)) as usize;
    (x.map_addr(|a| a.wrapping_sub(mask)), y ^ mask)
}

foo_expose_ptr is efficient like foo_int, but foo_map_addr_ptr is pessimized like foo_ptr. I think this is a bug, provenance should not pessimize basic wrapping integer arithmetic, even if it occurs on pointers.

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 20, 2025
@saethlin
Copy link
Member

alive2 does not accept the transformation from the optimized IR of foo_expose_ptr to food_map_addr_ptr: https://alive2.llvm.org/ce/z/qH9pKL
Transforming the unoptimized IR is similarly rejected: https://alive2.llvm.org/ce/z/Evkhgg
So I believe there is a real problem here. Whether alive2 is pointing at the problem, I don't know.

@saethlin
Copy link
Member

After squinting at this for a while, I'm growing quite confident that this is due to our implementation of wrapping_sub:

    pub const fn wrapping_sub(self, count: usize) -> Self
    where
        T: Sized,
    {
        self.wrapping_offset((count as isize).wrapping_neg())
    }

@saethlin
Copy link
Member

I think all pointer arithmetic in LLVM is getelementptr, which is effectively just add. So to do wrapping_sub, we need a negation operation somewhere all the way through LLVM IR. Maybe the LLVM x86 backend can recognize what's going on here and lower to a sub instead of an add with a negation.

@saethlin saethlin added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jan 21, 2025
@jieyouxu jieyouxu added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Jan 21, 2025
@orlp
Copy link
Contributor Author

orlp commented Jan 21, 2025

@saethlin I can confirm that the integer version is indeed pessimized as well if I replace wrapping_sub(mask) with wrapping_add(mask.wrapping_neg()).

@scottmcm
Copy link
Member

Sounds like something to file under https://github.com/llvm/llvm-project/issues then, since as @saethlin says we can't emit anything differently, really. (Doing wrapping_sub by always exposing would be clearly wrong.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants