Simple but significant speed improvement in textprinter.cpp #1613

fredfrance-oss · 2025-11-24T14:19:37Z

Currently the textprinter plugin does a full get_prog_point for all callbacks.
However a simple speedup is to verify that any tap point has the current PC before continuing.

For my test case this gave a very big improvement as you can see below.

Without optimization:

        test.rec:  2535950507 (  1.00%) instrs. 1422.76 sec.  2.06 GB ram.
        test.rec:  5071900973 (  2.00%) instrs. 2593.45 sec.  2.59 GB ram.
        test.rec:  7607851447 (  3.00%) instrs. 4124.41 sec.  2.80 GB ram.

With opt:

        test.rec:  2535950507 (  1.00%) instrs.  528.39 sec.  1.96 GB ram.
        test.rec:  5071900973 (  2.00%) instrs.  922.36 sec.  2.14 GB ram.
        test.rec:  7607851447 (  3.00%) instrs. 1660.56 sec.  2.39 GB ram.

Copilot

Pull request overview

This PR introduces a performance optimization to the textprinter plugin by adding an early PC (program counter) check before the expensive get_prog_point call. The optimization uses an unordered_set to quickly filter callbacks where the PC is not in the tap points, resulting in significant speedup (2.5-3x improvement based on the benchmark data).

Key changes:

Added fast-path PC lookup using unordered_set<target_ulong> for O(1) filtering
Modified mem_callback to check PC membership before expensive get_prog_point operation
Populated the PC set alongside existing tap_points in init_plugin

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Copilot · 2025-11-24T19:41:59Z

panda/plugins/textprinter/textprinter.cpp

+        mem_counter++;
+        return;
+    }


The mem_counter is incremented twice for tap point matches: once in the early return path (line 64 for non-matches) and once at line 87 (after processing matches). This creates an inconsistency where non-matching PCs increment the counter at the start of the function, while matching PCs increment it at the end. The increment on line 64 should be removed to maintain the original behavior where mem_counter is always incremented at line 87, ensuring consistent counting regardless of whether the PC matches.

Suggested change

mem_counter++;

return;

}

return;

}

As we only increase the counter before a return from the function there is no way the counter would increase twice for a single call. But a cleaner solution might be to move the counter increase to the top of the function instead of having it in two locations.

lacraig2 · 2025-11-24T19:50:38Z

I think this is a reasonable optimization. Ultimately, the comparison:

panda/panda/plugins/callstack_instr/prog_point.h

Lines 52 to 61 in 840acd9

    
           bool operator ==(const prog_point &p) const { 
        
               bool sids_match = true; 
        
               if ((this->sidFirst != p.sidFirst) || 
        
                   (this->sidSecond != p.sidSecond) || 
        
                   (this->isKernelMode != p.isKernelMode) || 
        
                   (this->stackKind != p.stackKind)) { 
        
                   sids_match = false; 
        
               } 
        
               return ((this->pc == p.pc) && (this->caller == p.caller) && sids_match); 
        
           }

will be applied, but given that PC must match one of them (and this is a significant differentiator) this should help.

The copilot suggestion about the count is worthwhile, but otherwise this looks good.

fredfrance-oss · 2025-11-25T07:40:20Z

As we only increase the counter before a return from the function there is no way the counter would increase twice for a single call. But a cleaner solution might be to move the counter increase to the top of the function instead of having it in two locations.

Something like this:

void mem_callback(CPUState *env, target_ulong pc, target_ulong addr,
                  size_t size, uint8_t *buf, gzFile f) {
    mem_counter++;
    if (tap_pc.find(pc) == tap_pc.end()) {
        return;
    }
    prog_point p = {};
    get_prog_point(env, &p);
    if (tap_points.find(p) == tap_points.end()) {
        return;
    }
    
    target_ulong callers[16] = {0};
    int nret = get_callers(callers, 16, env);
    unsigned char *buf_uc = static_cast<unsigned char *>(buf);
    for (unsigned int i = 0; i < size; i++) {
        for (int j = nret-1; j > 0; j--) {
            gzprintf(f, TARGET_FMT_lx " ", callers[j]);
        }
        gzprintf(f,
                 TARGET_FMT_lx " " TARGET_FMT_lx " %d " TARGET_FMT_lx
                               " " TARGET_FMT_lx " %s " TARGET_FMT_lx
                               " %ld %02x\n",
                 p.caller, p.pc, p.stackKind, p.sidFirst, p.sidSecond,
                 p.isKernelMode ? "kernel" : "user", addr + i, mem_counter,
                 buf_uc[i]);
    }
    
    return;
}

fredfrance-oss · 2025-12-04T09:43:01Z

Is there any action required from my side on this or will this be merged in due time?
(Same question for the other PR)

lacraig2 · 2025-12-04T15:56:00Z

You had suggested a cleaner approach in the comment above. I wasn't sure if you were you going to implement that in the PR.

Refactor memory callback to avoid duplicate code and improve readability.

fredfrance-oss · 2025-12-08T09:24:55Z

I understand i should have made it a question. I was unsure if a refactoring change combined with a new-functionality change was welcome.

Pull request updated.

Simple but significant speed improvement in textprinter.cpp

3ec0352

lacraig2 requested a review from Copilot November 24, 2025 19:40

Copilot started reviewing on behalf of lacraig2 November 24, 2025 19:40 View session

Copilot finished reviewing on behalf of lacraig2 November 24, 2025 19:41

Copilot AI reviewed Nov 24, 2025

View reviewed changes

Make code more readable

2323c8f

Refactor memory callback to avoid duplicate code and improve readability.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Simple but significant speed improvement in textprinter.cpp #1613

Simple but significant speed improvement in textprinter.cpp #1613

Uh oh!

fredfrance-oss commented Nov 24, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Nov 24, 2025

Uh oh!

fredfrance-oss Nov 25, 2025

Uh oh!

lacraig2 commented Nov 24, 2025

Uh oh!

fredfrance-oss commented Nov 25, 2025 •

edited

Loading

Uh oh!

fredfrance-oss commented Dec 4, 2025

Uh oh!

lacraig2 commented Dec 4, 2025

Uh oh!

fredfrance-oss commented Dec 8, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Simple but significant speed improvement in textprinter.cpp #1613

Are you sure you want to change the base?

Simple but significant speed improvement in textprinter.cpp #1613

Uh oh!

Conversation

fredfrance-oss commented Nov 24, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Nov 24, 2025

Choose a reason for hiding this comment

Uh oh!

fredfrance-oss Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

lacraig2 commented Nov 24, 2025

Uh oh!

fredfrance-oss commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fredfrance-oss commented Dec 4, 2025

Uh oh!

lacraig2 commented Dec 4, 2025

Uh oh!

fredfrance-oss commented Dec 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fredfrance-oss commented Nov 25, 2025 •

edited

Loading

fredfrance-oss commented Dec 8, 2025 •

edited

Loading