WIP: use a trampoline for FLF functions to intercept timings #3595
base: master
1976c3c to 222e3a4
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##           master    #3595      +/-   ##
==========================================
- Coverage   62.01%   62.00%   -0.02%
==========================================
  Files         140      140
  Lines       13312    13312
  Branches     1762     1762
==========================================
- Hits         8256     8254       -2
- Misses       4267     4270       +3
+ Partials      789      788       -1

see 1 file with indirect coverage changes
morrisonlevi
left a comment
I assume this can run afoul of permissions somehow, since we're making executable code at runtime? I'm not well versed here, and yes, if the JIT is enabled you'd have to have that capability anyway, but I'm trying to understand implications of this WIP.
Benchmarks [ profiler ]
Benchmark execution time: 2026-01-23 10:57:23
Comparing candidate commit 2f81b3a in PR branch
Found 0 performance improvements and 2 performance regressions! Performance is the same for 27 metrics, 7 unstable metrics.
scenario:php-profiler-timeline-memory-control
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
222e3a4 to 0eb96ee
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
dc825a7 to c6bdefc
@morrisonlevi dynasmrt takes care of setting RX permissions after compiling the code. As long as you're not running this under a hardened runtime (like App Store apps or Android (?)), there's no fundamental problem. Also I currently call |
95f422a to 7471599
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
7471599 to 4358ab8
profiling/src/wall_time.rs
Outdated
dynasm!(assembler
    ; mov rax, QWORD original as i64
    ; call rax
    ; mov rax, QWORD interrupt_addr as i64
I assume original is not returning anything, because you're writing over RAX.
Correct
profiling/src/wall_time.rs
Outdated
#[cfg(target_arch = "aarch64")]
dynasm!(assembler
    ; mov x16, original as u64
    ; blr x16
This overwrites x30/lr. Aren't you going to lose the original return location of the handler? When blr x16 returns, execution continues in the trampoline with the call to interrupt_addr, and the caller's return address is gone.
Ah, right. call on x86_64 pushes the return address (%rip) onto the stack, but blr only stores it in x30/lr.
Fixed up, is it correct now?
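To make the x30/lr issue above concrete, here is a toy Rust model of the link register. It is not the trampoline from this PR: the `Cpu` struct, `run_trampoline`, and all addresses are hypothetical, and the only assumption about the fix is that it saves lr before the first `blr` and restores it before `ret` (the usual `stp`/`ldp` pattern).

```rust
// Hypothetical addresses inside the generated trampoline (illustration only).
const AFTER_FIRST_BLR: u64 = 0x4008;
const AFTER_SECOND_BLR: u64 = 0x4010;

/// Toy model of the AArch64 state that matters here: the link register
/// and a stack that can hold a saved copy of it.
struct Cpu {
    lr: u64,
    stack: Vec<u64>,
}

impl Cpu {
    /// `blr` writes the address of the next instruction into lr.
    /// The callee is assumed to return there and leave lr at that value.
    fn blr(&mut self, next_instruction: u64) {
        self.lr = next_instruction;
    }
}

/// Returns the address the trampoline's final `ret` jumps to.
fn run_trampoline(save_lr: bool, caller_return: u64) -> u64 {
    // On entry, lr holds the address the trampoline should return to.
    let mut cpu = Cpu { lr: caller_return, stack: Vec::new() };

    if save_lr {
        cpu.stack.push(cpu.lr); // models: stp x29, x30, [sp, #-16]!
    }

    cpu.blr(AFTER_FIRST_BLR); // call the original handler
    cpu.blr(AFTER_SECOND_BLR); // call the interrupt function

    if save_lr {
        cpu.lr = cpu.stack.pop().expect("lr was saved on entry"); // models: ldp x29, x30, [sp], #16
    }

    cpu.lr // `ret` branches to whatever lr holds now
}
```

In this model, `run_trampoline(false, 0x1000)` ends up "returning" into the trampoline itself, while `run_trampoline(true, 0x1000)` returns to the caller at `0x1000`.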
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
);
#[cfg(target_arch = "x86_64")]
dynasm!(assembler
    ; mov rax, QWORD *orig as i64
I think you also need a sub rsp, 8 to align the stack to 16 bytes and satisfy the ABI requirements.
Right, compilers often handle that by just pushing rbp anyway. Will do.
I need to actually push and pop to restore rsp too, for the final jump.
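The alignment arithmetic behind this thread can be checked with a small sketch. It only models the value of rsp under the System V x86-64 rule that rsp is 16-byte aligned immediately before a `call`; the helper names and the example address are hypothetical and nothing here is taken from the PR's code.

```rust
/// System V x86-64: rsp must be a multiple of 16 right before a `call`.
const STACK_ALIGN: u64 = 16;
/// A `call`, `push`, or `pop` moves rsp by one 8-byte word.
const WORD: u64 = 8;

/// True if a `call` may be issued with this rsp value.
fn is_call_aligned(rsp: u64) -> bool {
    rsp % STACK_ALIGN == 0
}

/// rsp on entry to a function: the `call` pushed the return address.
fn rsp_on_entry(rsp_at_call_site: u64) -> u64 {
    rsp_at_call_site - WORD
}

/// rsp after `push reg` (or the equivalent `sub rsp, 8`).
fn after_push(rsp: u64) -> u64 {
    rsp - WORD
}

/// rsp after `pop reg` (or the equivalent `add rsp, 8`).
fn after_pop(rsp: u64) -> u64 {
    rsp + WORD
}
```

Starting from an aligned call site, the trampoline is entered with rsp off by 8, one push makes it aligned again for the inner `call`, and the matching pop brings rsp back to its entry value so the final jump sees the stack it would have seen without the trampoline.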
a3c79dc to 078acef
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
e4930fe to 2604b22
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
2604b22 to 6307860
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
a246940 to bcebfb0
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
b0995bc to d9d3f43
FLF functions don't check EG(vm_interrupt) by design, so we handle this manually.
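In plain Rust, the behaviour the trampoline is meant to add might look roughly like the sketch below. Everything in it is a stand-in: `VM_INTERRUPT` models `EG(vm_interrupt)`, and `original_handler`, `handle_interrupt`, and `wrapped_handler` are hypothetical names. Whether the real trampoline tests the flag itself or leaves that to the function it calls is not visible in this conversation, so treat the check here as an assumption.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Stand-in for EG(vm_interrupt); a timer or signal handler would set it.
static VM_INTERRUPT: AtomicBool = AtomicBool::new(false);
/// Counts how often the interrupt path ran, for demonstration only.
static INTERRUPTS_HANDLED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the original handler, which returns nothing.
fn original_handler() {}

/// Stand-in for the code that would take a profiling sample.
fn handle_interrupt() {
    INTERRUPTS_HANDLED.fetch_add(1, Ordering::SeqCst);
}

/// What the generated trampoline does, expressed as a normal function:
/// run the original handler, then service a pending interrupt, if any.
fn wrapped_handler() {
    original_handler();
    // swap(false) reads and clears the flag in one atomic step (assumed policy).
    if VM_INTERRUPT.swap(false, Ordering::SeqCst) {
        handle_interrupt();
    }
}
```

Calling `wrapped_handler()` with the flag clear only runs the original handler; setting the flag first makes the next call also run `handle_interrupt()` once and clear the flag.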