[Propeller] Exploration and Integration into CachyOS (Blocked by LLVM 18) #343

Open
ptr1337 opened this issue Dec 2, 2024 · 0 comments
ptr1337 commented Dec 2, 2024

Propeller is a profile-guided, post-link optimization framework developed by Google to enhance the performance of large-scale applications. It operates by relinking binaries based on precise runtime profiles, enabling optimizations that are challenging to achieve during the initial compilation phase.

Propeller performs the following optimizations:

  • Basic Block Reordering
  • Function Reordering
  • Function Splitting

So it is basically similar to BOLT.

Propeller should be applied after the final AutoFDO compilation; the resulting kernel then needs to be profiled and compiled once more. The workflow would look like the following:

  1. Compile the kernel with AUTOFDO_CLANG enabled
  2. Boot into the kernel built with AUTOFDO_CLANG and profile it
  3. Convert the AutoFDO profile
  4. Compile the kernel with the AutoFDO profile passed and with AUTOFDO_CLANG and PROPELLER_CLANG enabled
  5. Boot into the AutoFDO-optimized kernel with PROPELLER_CLANG enabled and profile it
  6. Convert the profile with the following command:
create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
     --format=propeller --propeller_output_module_name \
     --out=<propeller_profile_prefix>_cc_profile.txt \
     --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt
  7. Compile the kernel with both the AutoFDO and Propeller profiles passed:
CLANG_AUTOFDO_PROFILE=<autofdo_profile> CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>
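
To make the last two steps easier to script, here is a rough dry-run sketch that only prints the conversion command and the final build variables instead of running anything. The function names and the example file names (vmlinux, perf.data, autofdo.afdo, the "propeller" prefix) are placeholders of mine and not part of any existing CachyOS tooling; the flags and variables are the ones quoted above.

```shell
#!/bin/sh
# Dry-run sketch: prints commands, does not execute create_llvm_prof or a build.
# All file names passed below are placeholder values.

# Assemble the create_llvm_prof call that converts a perf profile
# into the two Propeller profile files (<prefix>_cc_profile.txt and
# <prefix>_ld_profile.txt).
propeller_convert_cmd() {
    binary=$1
    perf_file=$2
    prefix=$3
    printf '%s\n' "create_llvm_prof --binary=${binary} --profile=${perf_file} --format=propeller --propeller_output_module_name --out=${prefix}_cc_profile.txt --propeller_symorder=${prefix}_ld_profile.txt"
}

# Assemble the variables for the final build that consumes both profiles.
propeller_build_vars() {
    autofdo_profile=$1
    prefix=$2
    printf '%s\n' "CLANG_AUTOFDO_PROFILE=${autofdo_profile} CLANG_PROPELLER_PROFILE_PREFIX=${prefix}"
}

propeller_convert_cmd vmlinux perf.data propeller
propeller_build_vars autofdo.afdo propeller
```

Note that the same prefix has to be used in both calls, since the build looks up the `_cc_profile.txt` and `_ld_profile.txt` files under that prefix.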

Adding Propeller support on top of AutoFDO requires one additional compilation, for a total of three compilations per architecture.
We need to check whether the Propeller profile can be reused for the v4 architecture, since it is by far the less-used architecture.
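
As a back-of-the-envelope check of the build cost, the sketch below counts compilations with and without profile reuse. The architecture list is hypothetical (only v4 is named here), and it assumes that reusing the Propeller profile saves exactly one compilation for that architecture while the AutoFDO profile is still collected per architecture. Whether such reuse actually works is the open question.

```shell
#!/bin/sh
# Hypothetical architecture list; only "v4" is mentioned in this issue.
ARCHES="v3 v4"
BUILDS_PER_ARCH=3   # AutoFDO profiling build, Propeller profiling build, final build
BUILDS_WITH_REUSE=2 # assumption: the Propeller profiling build is skipped

# $1 = space-separated architecture list
# $2 = architecture that reuses a Propeller profile ("" for none)
count_builds() {
    total=0
    for arch in $1; do
        if [ "$arch" = "$2" ]; then
            total=$((total + BUILDS_WITH_REUSE))
        else
            total=$((total + BUILDS_PER_ARCH))
        fi
    done
    echo "$total"
}

count_builds "$ARCHES" ""     # prints 6: every architecture profiled separately
count_builds "$ARCHES" "v4"   # prints 5: v4 reuses the Propeller profile
```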

According to Google's paper, the performance benefit for common applications on top of ThinLTO and PGO varies between roughly 1% and 8%:

Propeller has been deployed in production environments at Google, with tens of millions of cores executing Propeller-optimized code. Evaluations on internal warehouse-scale applications have demonstrated performance improvements ranging from 1.1% to 8% beyond existing optimizations like Profile-Guided Optimization (PGO) and ThinLTO. For instance, compiler tools such as Clang have shown a 7% performance increase, while MySQL has improved by 1%.

Paper: https://research.google/pubs/propeller-a-profile-guided-relinking-optimizer-for-warehouse-scale-applications/?utm_source=chatgpt.com

1Naim changed the title from "[Propeller] Exploration and Integration into CachyOS (Blocked by LLVM 19 currently)" to "[Propeller] Exploration and Integration into CachyOS (Blocked by LLVM 18)" on Dec 2, 2024