[Enhancement] remove high-cost log location patterns to fix FE latency #69594

Open
yohengyang wants to merge 1 commit into StarRocks:main from yohengyang:log4j-boost

yohengyang commented Feb 27, 2026

Why I'm doing:

FE (Frontend) monitoring showed significant latency spikes on HTTP/JDBC requests, often reaching 10 s to 14 s. Profiling with Arthas and stack traces traced the bottleneck to Log4j2's location lookup, which the %C, %M, and %L pattern converters trigger on every log call:

---[5354.485427ms] com.starrocks.mysql.nio.AcceptListener:handleEvent()
    +---[3103.027647ms] org.apache.logging.log4j.Logger:info()
        ---[2643.006046ms] org.apache.logging.log4j.util.StackLocator:calcLocation()
            ---[2622.302338ms] java.lang.StackWalker:walk() 

Each log entry required a StackWalker.walk() call to resolve the class name, method name, and line number. The walk is expensive and runs synchronously on the logging path, so it blocked the calling thread. This delayed query responses and also severely degraded write concurrency, because data load transactions (e.g., Stream Load) were frequently blocked by the audit and profile logging logic.
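To make the cost difference concrete, here is a minimal sketch that contrasts resolving a caller location through `StackWalker` on every call with returning a name that was computed once. It uses only the JDK and is not Log4j2's actual implementation; the class `LocationCostSketch` and its method names are hypothetical, and the timings it prints will vary by JVM and stack depth.

```java
import java.util.Optional;

// Illustrative only: mimics the kind of work a location lookup does.
// This is NOT Log4j2's code; all names here are hypothetical.
final class LocationCostSketch {
    // Computed once, in the same spirit as a logger name fixed at logger creation.
    private static final String LOGGER_NAME = LocationCostSketch.class.getName();

    /** Resolves the caller's class, method, and line by walking the stack on every call. */
    static String locationByStackWalk() {
        Optional<StackWalker.StackFrame> caller = StackWalker.getInstance()
                .walk(frames -> frames.skip(1).findFirst()); // skip this helper's own frame
        return caller
                .map(f -> f.getClassName() + "." + f.getMethodName() + "():" + f.getLineNumber())
                .orElse("unknown");
    }

    /** Returns the precomputed name without inspecting the stack. */
    static String locationByLoggerName() {
        return LOGGER_NAME;
    }

    public static void main(String[] args) {
        int iterations = 100_000;
        long sink = 0; // accumulate lengths so the loops are not optimized away

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += locationByStackWalk().length();
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += locationByLoggerName().length();
        }
        long t2 = System.nanoTime();

        System.out.printf("stack walk:  %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("cached name: %d ms%n", (t2 - t1) / 1_000_000);
        System.out.println("(sink=" + sink + ")");
    }
}
```

This is a rough loop timing rather than a proper benchmark, so treat the numbers as indicative only. In a real FE thread the stack is much deeper than in this toy program, which would tend to make the walk more expensive.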

What I'm doing:

Optimized logging performance by removing high-cost stack-walking patterns:

  • Plaintext Layout: Removed %C{1}.%M():%L from the log pattern.

  • JSON Layout: Removed $resolver: source (line, file, method) and set locationInfoEnabled="false".

Alternative: Added the logger name resolver (%c) to provide class identification. This uses the pre-cached Logger name instead of walking the stack, providing similar visibility with near-zero overhead.
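As a rough way to check a layout pattern for the converters removed here, the sketch below flags Log4j2 PatternLayout converters that need caller location (%C/%class, %F/%file, %l/%location, %L/%line, %M/%method) while leaving %c, %m, %logger, and %level alone. The class `LocationPatternCheck` and the two sample patterns are hypothetical and only resemble the change described above; they are not copied from the StarRocks configuration. The regex also ignores `%%` escapes, so treat it as a lint aid rather than a full parser.

```java
import java.util.regex.Pattern;

// Hypothetical helper: flags layout patterns that would force a location lookup.
final class LocationPatternCheck {
    // Optional format modifier (e.g. %-5, %20.30), then a location converter,
    // not followed by another letter (so %logger and %level do not match %l).
    private static final Pattern LOCATION_CONVERTER = Pattern.compile(
            "%-?\\d*(?:\\.-?\\d+)?(?:class|file|location|line|method|[CFlLM])(?![A-Za-z])");

    /** Returns true if the pattern contains a converter that needs caller location. */
    static boolean usesLocation(String layoutPattern) {
        return LOCATION_CONVERTER.matcher(layoutPattern).find();
    }

    public static void main(String[] args) {
        // Illustrative patterns only, not the actual StarRocks FE patterns.
        String before = "%d %-5p (%t) [%C{1}.%M():%L] %m%n";
        String after = "%d %-5p (%t) [%c{1}] %m%n";

        System.out.println("before uses location: " + usesLocation(before));
        System.out.println("after uses location:  " + usesLocation(after));
    }
}
```

A check like this could run in a unit test over the configured patterns so that a location converter is not reintroduced by accident.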

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4

mergify bot commented Feb 27, 2026

⚠️ The sha of the head commit of this PR conflicts with #54845. Mergify cannot evaluate rules on this PR. Once #54845 is merged or closed, Mergify will resume processing this PR. ⚠️

github-actions bot commented Feb 27, 2026

🌎 Translation Required?

All translation files are up to date.
No translation actions are required for this PR.

🕒 Last updated: Fri, 27 Feb 2026 12:30:38 GMT

yohengyang changed the title from "[Enhancement] remove high-cost log location patterns to fix FE latency and write concurrency bottlenecks" to "[Enhancement] remove high-cost log location patterns to fix FE latency" on Mar 2, 2026