Fix internal shfl check #4282

Merged
fbusato merged 1 commit into NVIDIA:main from fbusato:fix_shfl_check
Mar 27, 2025

@fbusato
Contributor

@fbusato fbusato commented Mar 27, 2025

No description provided.

@fbusato fbusato added the 3.0 label Mar 27, 2025
@fbusato fbusato self-assigned this Mar 27, 2025
@fbusato fbusato requested a review from a team as a code owner March 27, 2025 00:23
@fbusato fbusato requested a review from wmaxey March 27, 2025 00:23
@fbusato fbusato added this to CCCL Mar 27, 2025
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 27, 2025
@fbusato fbusato enabled auto-merge (squash) March 27, 2025 00:23
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Mar 27, 2025
Contributor

@miscco miscco left a comment

This suggests a lack of testing, can you add a test that verifies this works correctly?

@bernhardmgruber
Contributor

> This suggests a lack of testing, can you add a test that verifies this works correctly?

Agreed. When fixing a bug, it's best to first add a test that exposes the bug and then add the fix. This way, we can avoid regressions.

@github-actions
Contributor

🟩 CI finished in 17h 15m: Pass: 100%/162 | Total: 3d 10h | Avg: 30m 39s | Max: 1h 21m | Hits: 66%/254022
  • 🟩 cub: Pass: 100%/45 | Total: 1d 21h | Avg: 1h 00m | Max: 1h 21m | Hits: 46%/53899

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 18h | Avg: 59m 54s | Max:  1h 21m | Hits:  46%/51449 
      🟩 arm64              Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 11m | Hits:  34%/2450  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  5h 53m | Avg:  1h 10m | Max:  1h 12m | Hits:  34%/5953  
      🟩 12.6               Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  31%/2262  
      🟩 12.8               Pass: 100%/38  | Total:  1d 12h | Avg: 58m 10s | Max:  1h 21m | Hits:  48%/45684 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 12m | Hits:  35%/2110  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 53m | Avg:  1h 10m | Max:  1h 12m | Hits:  34%/5953  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  31%/2262  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 10h | Avg: 57m 37s | Max:  1h 21m | Hits:  49%/43574 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 12m | Hits:  35%/2110  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 18h | Avg: 59m 59s | Max:  1h 21m | Hits:  46%/51789 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 26m | Avg:  1h 06m | Max:  1h 10m | Hits:  34%/4908  
      🟩 Clang15            Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 09m | Hits:  34%/2450  
      🟩 Clang16            Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m | Hits:  34%/2450  
      🟩 Clang17            Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 09m | Hits:  34%/2450  
      🟩 Clang18            Pass: 100%/7   | Total:  6h 25m | Avg: 55m 00s | Max:  1h 12m | Hits:  54%/8235  
      🟩 GCC7               Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 11m | Hits:  34%/2454  
      🟩 GCC8               Pass: 100%/1   | Total:  1h 11m | Avg:  1h 11m | Max:  1h 11m | Hits:  34%/1227  
      🟩 GCC9               Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 12m | Hits:  34%/2454  
      🟩 GCC10              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits:  34%/2454  
      🟩 GCC11              Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m | Hits:  34%/2450  
      🟩 GCC12              Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m | Hits:  34%/2450  
      🟩 GCC13              Pass: 100%/11  | Total:  7h 42m | Avg: 42m 00s | Max:  1h 17m | Hits:  69%/13475 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 31m | Avg:  1h 15m | Max:  1h 19m | Hits:  37%/2090  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 39m | Avg:  1h 19m | Max:  1h 21m | Hits:  37%/2090  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  31%/2262  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 17h 26m | Avg:  1h 01m | Max:  1h 12m | Hits:  42%/20493 
      🟩 GCC                Pass: 100%/22  | Total: 20h 05m | Avg: 54m 48s | Max:  1h 17m | Hits:  52%/26964 
      🟩 MSVC               Pass: 100%/4   | Total:  5h 10m | Avg:  1h 17m | Max:  1h 21m | Hits:  37%/4180  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 32m | Avg:  1h 16m | Max:  1h 17m | Hits:  31%/2262  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 19m | Avg: 26m 37s | Max: 28m 32s | Hits:  77%/3675  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 15h | Avg:  1h 09m | Max:  1h 21m | Hits:  34%/40424 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 40m | Avg: 35m 05s | Max:  1h 10m | Hits:  83%/9800  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 18h | Avg:  1h 08m | Max:  1h 21m | Hits:  34%/44099 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 27m 21s | Avg: 27m 21s | Max: 27m 21s | Hits:  99%/1225  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 53s | Avg: 19m 53s | Max: 19m 53s | Hits:  99%/1225  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 20m | Avg: 26m 48s | Max: 27m 18s | Hits:  99%/3675  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 08m | Avg: 22m 44s | Max: 24m 01s | Hits:  99%/3675  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 19m | Avg: 26m 37s | Max: 28m 32s | Hits:  77%/3675  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 17m | Avg:  1h 17m | Max:  1h 17m | Hits:  34%/1225  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 23h 18m | Avg:  1h 09m | Max:  1h 19m | Hits:  34%/23712 
      🟩 20                 Pass: 100%/25  | Total: 21h 57m | Avg: 52m 42s | Max:  1h 21m | Hits:  55%/30187 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 22h 02m | Avg: 29m 23s | Max: 51m 21s | Hits: 80%/79911

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 38m 30s | Avg: 19m 15s | Max: 27m 00s | Hits:  88%/3554  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 07m | Avg: 29m 29s | Max: 51m 21s | Hits:  80%/76358 
      🟩 arm64              Pass: 100%/2   | Total: 54m 28s | Avg: 27m 14s | Max: 28m 52s | Hits:  77%/3553  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 49m | Avg: 33m 48s | Max: 47m 10s | Hits:  78%/8876  
      🟩 12.6               Pass: 100%/2   | Total:  1h 42m | Avg: 51m 18s | Max: 51m 21s | Hits:  65%/3552  
      🟩 12.8               Pass: 100%/38  | Total: 17h 30m | Avg: 27m 39s | Max: 45m 16s | Hits:  81%/67483 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 51m 09s | Avg: 25m 34s | Max: 27m 09s | Hits:  77%/3552  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 49m | Avg: 33m 48s | Max: 47m 10s | Hits:  78%/8876  
      🟩 nvcc12.6           Pass: 100%/2   | Total:  1h 42m | Avg: 51m 18s | Max: 51m 21s | Hits:  65%/3552  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 39m | Avg: 27m 46s | Max: 45m 16s | Hits:  81%/63931 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 51m 09s | Avg: 25m 34s | Max: 27m 09s | Hits:  77%/3552  
      🟩 nvcc               Pass: 100%/43  | Total: 21h 11m | Avg: 29m 33s | Max: 51m 21s | Hits:  80%/76359 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 58m | Avg: 29m 40s | Max: 32m 36s | Hits:  77%/7104  
      🟩 Clang15            Pass: 100%/2   | Total: 57m 38s | Avg: 28m 49s | Max: 29m 41s | Hits:  77%/3552  
      🟩 Clang16            Pass: 100%/2   | Total: 57m 44s | Avg: 28m 52s | Max: 29m 24s | Hits:  77%/3552  
      🟩 Clang17            Pass: 100%/2   | Total: 58m 25s | Avg: 29m 12s | Max: 29m 16s | Hits:  77%/3552  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 34m | Avg: 22m 07s | Max: 31m 43s | Hits:  83%/12432 
      🟩 GCC7               Pass: 100%/2   | Total:  1h 03m | Avg: 31m 35s | Max: 33m 00s | Hits:  77%/3554  
      🟩 GCC8               Pass: 100%/1   | Total: 31m 16s | Avg: 31m 16s | Max: 31m 16s | Hits:  77%/1777  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 03m | Avg: 31m 54s | Max: 33m 33s | Hits:  77%/3554  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 28s | Max: 34m 48s | Hits:  77%/3554  
      🟩 GCC11              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 09s | Max: 31m 19s | Hits:  77%/3554  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 04m | Avg: 32m 10s | Max: 33m 39s | Hits:  77%/3554  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 34m | Avg: 21m 24s | Max: 35m 29s | Hits:  86%/17770 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 30m | Avg: 45m 20s | Max: 47m 10s | Hits:  81%/3540  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 55m | Avg: 38m 35s | Max: 45m 16s | Hits:  87%/5310  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  1h 42m | Avg: 51m 18s | Max: 51m 21s | Hits:  65%/3552  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 27m | Avg: 26m 18s | Max: 32m 36s | Hits:  80%/30192 
      🟩 GCC                Pass: 100%/21  | Total:  9h 25m | Avg: 26m 57s | Max: 35m 29s | Hits:  81%/37317 
      🟩 MSVC               Pass: 100%/5   | Total:  3h 26m | Avg: 41m 17s | Max: 47m 10s | Hits:  84%/8850  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 42m | Avg: 51m 18s | Max: 51m 21s | Hits:  65%/3552  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 28m 44s | Avg: 14m 22s | Max: 16m 50s | Hits:  88%/3554  
      🟩 rtx2080            Pass: 100%/33  | Total: 18h 04m | Avg: 32m 52s | Max: 51m 21s | Hits:  76%/58604 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 29m | Avg: 20m 54s | Max: 45m 16s | Hits:  91%/17753 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 20h 33m | Avg: 32m 27s | Max: 51m 21s | Hits:  77%/67481 
      🟩 TestCPU            Pass: 100%/3   | Total: 43m 10s | Avg: 14m 23s | Max: 26m 49s | Hits:  99%/5323  
      🟩 TestGPU            Pass: 100%/4   | Total: 45m 55s | Avg: 11m 28s | Max: 11m 57s | Hits:  99%/7107  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 28m 44s | Avg: 14m 22s | Max: 16m 50s | Hits:  88%/3554  
      🟩 90;90a;100         Pass: 100%/1   | Total: 35m 29s | Avg: 35m 29s | Max: 35m 29s | Hits:  77%/1777  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 22m | Avg: 34m 08s | Max: 51m 21s | Hits:  77%/35511 
      🟩 20                 Pass: 100%/23  | Total: 10h 00m | Avg: 26m 07s | Max: 51m 16s | Hits:  82%/40846 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 11h 18m | Avg: 15m 46s | Max: 36m 48s | Hits: 62%/107640

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 11h 08m | Avg: 16m 18s | Max: 36m 48s | Hits:  60%/101753
      🟩 arm64              Pass: 100%/2   | Total:  9m 36s | Avg:  4m 48s | Max:  4m 51s | Hits:  95%/5887  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 34m | Avg: 18m 48s | Max: 21m 34s | Hits:  41%/14339 
      🟩 12.6               Pass: 100%/2   | Total:  1h 13m | Avg: 36m 43s | Max: 36m 48s | Hits:  29%/5834  
      🟩 12.8               Pass: 100%/36  | Total:  8h 30m | Avg: 14m 11s | Max: 31m 39s | Hits:  68%/87467 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 45m 51s | Avg: 22m 55s | Max: 25m 07s | Hits:  27%/5848  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 34m | Avg: 18m 48s | Max: 21m 34s | Hits:  41%/14339 
      🟩 nvcc12.6           Pass: 100%/2   | Total:  1h 13m | Avg: 36m 43s | Max: 36m 48s | Hits:  29%/5834  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  7h 45m | Avg: 13m 40s | Max: 31m 39s | Hits:  71%/81619 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 45m 51s | Avg: 22m 55s | Max: 25m 07s | Hits:  27%/5848  
      🟩 nvcc               Pass: 100%/41  | Total: 10h 32m | Avg: 15m 25s | Max: 36m 48s | Hits:  64%/101792
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 12m | Avg: 18m 02s | Max: 24m 44s | Hits:  48%/11664 
      🟩 Clang15            Pass: 100%/2   | Total: 44m 57s | Avg: 22m 28s | Max: 23m 34s | Hits:  42%/5844  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 57s | Avg:  5m 28s | Max:  5m 38s | Hits:  96%/5844  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 56s | Avg:  4m 58s | Max:  5m 08s | Hits:  97%/5844  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 29m | Avg: 14m 53s | Max: 25m 07s | Hits:  56%/14635 
      🟩 GCC7               Pass: 100%/2   | Total: 30m 42s | Avg: 15m 21s | Max: 21m 16s | Hits:  57%/5781  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 16s | Avg:  5m 16s | Max:  5m 16s | Hits:  95%/2901  
      🟩 GCC9               Pass: 100%/2   | Total: 45m 10s | Avg: 22m 35s | Max: 24m 29s | Hits:  31%/5793  
      🟩 GCC10              Pass: 100%/2   | Total: 28m 16s | Avg: 14m 08s | Max: 23m 55s | Hits:  64%/5850  
      🟩 GCC11              Pass: 100%/2   | Total:  9m 29s | Avg:  4m 44s | Max:  4m 53s | Hits:  97%/5846  
      🟩 GCC12              Pass: 100%/2   | Total: 32m 56s | Avg: 16m 28s | Max: 28m 18s | Hits:  64%/5846  
      🟩 GCC13              Pass: 100%/10  | Total:  2h 14m | Avg: 13m 27s | Max: 31m 39s | Hits:  70%/14896 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 46m 16s | Avg: 23m 08s | Max: 24m 42s | Hits:  32%/5495  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 44m 59s | Avg: 22m 29s | Max: 23m 04s | Hits:  83%/5567  
      🟩 NVHPC25.1          Pass: 100%/2   | Total:  1h 13m | Avg: 36m 43s | Max: 36m 48s | Hits:  29%/5834  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  3h 47m | Avg: 14m 12s | Max: 25m 07s | Hits:  63%/43831 
      🟩 GCC                Pass: 100%/21  | Total:  4h 46m | Avg: 13m 38s | Max: 31m 39s | Hits:  67%/46913 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 31m | Avg: 22m 48s | Max: 24m 42s | Hits:  58%/11062 
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 13m | Avg: 36m 43s | Max: 36m 48s | Hits:  29%/5834  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 21m 37s | Avg: 10m 48s | Max: 13m 45s | Hits:  95%/3033  
      🟩 rtx2080            Pass: 100%/41  | Total: 10h 56m | Avg: 16m 01s | Max: 36m 48s | Hits:  61%/104607
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 10h 08m | Avg: 16m 26s | Max: 36m 48s | Hits:  62%/107600
      🟩 NVRTC              Pass: 100%/2   | Total: 33m 53s | Avg: 16m 56s | Max: 17m 09s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 33m 47s | Avg: 11m 15s | Max: 13m 45s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 20s | Avg:  2m 20s | Max:  2m 20s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 33m 53s | Avg: 16m 56s | Max: 17m 09s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 21m 37s | Avg: 10m 48s | Max: 13m 45s | Hits:  95%/3033  
      🟩 90;90a;100         Pass: 100%/1   | Total: 31m 39s | Avg: 31m 39s | Max: 31m 39s | Hits:  30%/3033  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  6h 06m | Avg: 17m 26s | Max: 36m 48s | Hits:  55%/57547 
      🟩 20                 Pass: 100%/21  | Total:  5h 09m | Avg: 14m 45s | Max: 36m 39s | Hits:  70%/50093 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 24m | Avg: 6m 34s | Max: 14m 16s | Hits: 95%/12244

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  2h 08m | Avg:  7m 09s | Max: 14m 16s | Hits:  95%/9908  
      🟩 arm64              Pass: 100%/4   | Total: 15m 41s | Avg:  3m 55s | Max:  4m 10s | Hits:  95%/2336  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 11m 15s | Avg: 11m 15s | Max: 11m 15s | Hits:  89%/282   
      🟩 12.6               Pass: 100%/2   | Total: 17m 47s | Avg:  8m 53s | Max:  8m 54s | Hits:  93%/1164  
      🟩 12.8               Pass: 100%/19  | Total:  1h 55m | Avg:  6m 04s | Max: 14m 16s | Hits:  95%/10798 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 11m 15s | Avg: 11m 15s | Max: 11m 15s | Hits:  89%/282   
      🟩 nvcc12.6           Pass: 100%/2   | Total: 17m 47s | Avg:  8m 53s | Max:  8m 54s | Hits:  93%/1164  
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 55m | Avg:  6m 04s | Max: 14m 16s | Hits:  95%/10798 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 24m | Avg:  6m 34s | Max: 14m 16s | Hits:  95%/12244 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s | Hits:  95%/586   
      🟩 Clang15            Pass: 100%/1   | Total:  4m 56s | Avg:  4m 56s | Max:  4m 56s | Hits:  95%/584   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 38s | Avg:  4m 38s | Max:  4m 38s | Hits:  95%/584   
      🟩 Clang17            Pass: 100%/1   | Total:  4m 31s | Avg:  4m 31s | Max:  4m 31s | Hits:  95%/584   
      🟩 Clang18            Pass: 100%/4   | Total: 24m 47s | Avg:  6m 11s | Max: 12m 42s | Hits:  96%/2336  
      🟩 GCC10              Pass: 100%/1   | Total:  4m 53s | Avg:  4m 53s | Max:  4m 53s | Hits:  95%/586   
      🟩 GCC11              Pass: 100%/1   | Total:  4m 55s | Avg:  4m 55s | Max:  4m 55s | Hits:  95%/584   
      🟩 GCC12              Pass: 100%/2   | Total: 18m 00s | Avg:  9m 00s | Max: 13m 12s | Hits:  97%/1168  
      🟩 GCC13              Pass: 100%/6   | Total: 34m 07s | Avg:  5m 41s | Max: 14m 16s | Hits:  95%/3504  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 15s | Avg: 11m 15s | Max: 11m 15s | Hits:  89%/282   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 24s | Avg: 10m 24s | Max: 10m 24s | Hits:  89%/282   
      🟩 NVHPC25.1          Pass: 100%/2   | Total: 17m 47s | Avg:  8m 53s | Max:  8m 54s | Hits:  93%/1164  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 43m 10s | Avg:  5m 23s | Max: 12m 42s | Hits:  96%/4674  
      🟩 GCC                Pass: 100%/10  | Total:  1h 01m | Avg:  6m 11s | Max: 14m 16s | Hits:  96%/5842  
      🟩 MSVC               Pass: 100%/2   | Total: 21m 39s | Avg: 10m 49s | Max: 11m 15s | Hits:  89%/564   
      🟩 NVHPC              Pass: 100%/2   | Total: 17m 47s | Avg:  8m 53s | Max:  8m 54s | Hits:  93%/1164  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 18m 07s | Avg:  9m 03s | Max: 14m 16s | Hits:  97%/1168  
      🟩 rtx2080            Pass: 100%/20  | Total:  2h 06m | Avg:  6m 19s | Max: 13m 12s | Hits:  95%/11076 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 44m | Avg:  5m 29s | Max: 11m 15s | Hits:  94%/10492 
      🟩 Test               Pass: 100%/3   | Total: 40m 10s | Avg: 13m 23s | Max: 14m 16s | Hits:  99%/1752  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 21m 49s | Avg:  7m 16s | Max: 14m 16s | Hits:  96%/1752  
      🟩 90a                Pass: 100%/1   | Total:  4m 06s | Avg:  4m 06s | Max:  4m 06s | Hits:  95%/584   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 20m 24s | Avg:  5m 06s | Max:  8m 54s | Hits:  94%/2334  
      🟩 20                 Pass: 100%/18  | Total:  2h 04m | Avg:  6m 53s | Max: 14m 16s | Hits:  95%/9910  
    
  • 🟩 stdpar: Pass: 100%/4 | Total: 16m 20s | Avg: 4m 05s | Max: 4m 50s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 35s | Avg:  4m 47s | Max:  4m 50s
      🟩 arm64              Pass: 100%/2   | Total:  6m 45s | Avg:  3m 22s | Max:  3m 23s
    🟩 ctk
      🟩 12.6               Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 cxx
      🟩 NVHPC25.1          Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 cxx_family
      🟩 NVHPC              Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 jobs
      🟩 Build              Pass: 100%/4   | Total: 16m 20s | Avg:  4m 05s | Max:  4m 50s
    🟩 std
      🟩 17                 Pass: 100%/2   | Total:  8m 07s | Avg:  4m 03s | Max:  4m 45s
      🟩 20                 Pass: 100%/2   | Total:  8m 13s | Avg:  4m 06s | Max:  4m 50s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits: 96%/328

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 18m 02s | Hits:  96%/328   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 34s | Avg:  2m 34s | Max:  2m 34s | Hits:  94%/164   
      🟩 Test               Pass: 100%/1   | Total: 18m 02s | Avg: 18m 02s | Max: 18m 02s | Hits:  98%/164   
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 08m | Avg: 1h 08m | Max: 1h 08m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 08m | Avg:  1h 08m | Max:  1h 08m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- stdpar
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 162)

# Runner
113 linux-amd64-cpu16
15 windows-amd64-cpu16
12 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@fbusato fbusato merged commit fcfcc81 into NVIDIA:main Mar 27, 2025
178 of 179 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Mar 27, 2025
andralex pushed a commit to andralex/cccl that referenced this pull request Apr 1, 2025
davebayer pushed a commit to davebayer/cccl that referenced this pull request Apr 7, 2025
caugonnet added a commit that referenced this pull request Apr 15, 2025
* clang-format

* remove unnecessary headers

* Remove few remaining qualifiers _CCCL_NODISCARD

This fixes the build, after the macro definition was removed in #4265

* fix headers

* Cleanup libcu++ `force_include.h` test file (#4262)

* there are no more tests in this header

* Simplify and reduce data copying in algorithm.cuh

* Much less tupling and untupling

* A few additional simplifications and a few eliminations of copies

* Fix ratio plot (#4099)

* Fix ratio plot

* [pre-commit.ci] auto code formatting

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bernhard Manfred Gruber <bernhardmgruber@gmail.com>

* Drop `_CCCL_NORETURN` (#4268)

Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>

* fix clang portability issue in `__rcvr_with_env_t` and remove dead code (#4277)

* change version check in `type_list.h` so that *NO* clang-19.X compilers try to use pack indexing (#4278)

* fix shfl check (#4282)

* Add necessary headers

* tweak the cccl compiler version check macros to better agree with intuition (#4279)

* tweak the cccl compiler version check macros to better agree with intuition

prior to this commit, a compiler check such as:

```c++
```

would fail if the compiler was actually v19.1. that is because 19.1 is
greater than 19. what the author of this code probably intended was to
check only the compiler's major version number, in which case the check
would have succeeded.

this commit changes the behavior of the following macros when only a
major version number is specified:

* `_CCCL_COMPILER`
* `_CCCL_CUDA_COMPILER`
* `_CCCL_CUDACC_BELOW`
* `_CCCL_CUDACC_AT_LEAST`

* guard `_CCCL_COMPILER(FOO)` with an extra set of parens
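
The difference between a strict and a major-only version comparison, as described in the #4279 message above, can be sketched as follows. This is only an illustration under stated assumptions: `compiler_version`, `at_most_strict`, and `at_most_major` are hypothetical names invented here, not CCCL macros or types, and the sketch does not reproduce how the real macros are implemented.

```cpp
// Hypothetical stand-in for a compiler version; not a CCCL type.
struct compiler_version
{
  int ver_major;
  int ver_minor;
};

// Strict check (assumed old behavior): a bare major "19" is read as 19.0,
// so a 19.1 compiler is NOT considered <= 19.
constexpr bool at_most_strict(compiler_version v, int ver_major, int ver_minor = 0)
{
  return v.ver_major < ver_major || (v.ver_major == ver_major && v.ver_minor <= ver_minor);
}

// Major-only check (assumed new behavior when no minor is given):
// the minor version is ignored entirely.
constexpr bool at_most_major(compiler_version v, int ver_major)
{
  return v.ver_major <= ver_major;
}

// A 19.1 compiler fails the strict "<= 19" check ...
static_assert(!at_most_strict(compiler_version{19, 1}, 19), "19.1 is greater than 19.0");
// ... but passes once only the major version is compared.
static_assert(at_most_major(compiler_version{19, 1}, 19), "major 19 <= 19");
// Spelling out the minor version keeps the strict semantics available.
static_assert(at_most_strict(compiler_version{19, 1}, 19, 1), "19.1 <= 19.1");
```

The checks are written as `static_assert`s so that the sketch is verified at compile time, which mirrors the fact that compiler version checks are themselves resolved before run time.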

* Implement `ranges::single_view` (#4255)

* Implement fp overflow handlers (#4261)

* Implement fp overflow handlers

* I hate nvfp types

* use `[[nodiscard]]`

---------

Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>

* Fix CI failure

* Drop `_LIBCUDACXX_HAS_NO_UNICODE_CHARS` (#4295)

* [Version] Update main to v3.1.0 (#4175)

* Bump main to 3.1.0.

* Update ci/update_version.sh to edit the docs VERSION.md file

* Rerun ci/version_update.sh

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Wesley Maxey <wesley.maxey@gmail.com>

* WIP: Refactor allocators

* Make metadata direct member instead of pointer

* Add cached_block_allocator_fifo from PR #2674

* use _CCCL_ASSERT instead of assert

* add missing ctors

---------

Co-authored-by: Cedric Augonnet <caugonnet@nvidia.com>
Co-authored-by: Oleksandr Pavlyk <21087696+oleksandr-pavlyk@users.noreply.github.com>
Co-authored-by: David Bayer <48736217+davebayer@users.noreply.github.com>
Co-authored-by: Georgii Evtushenko <evtushenko.georgy@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bernhard Manfred Gruber <bernhardmgruber@gmail.com>
Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
Co-authored-by: Eric Niebler <eniebler@nvidia.com>
Co-authored-by: Federico Busato <50413820+fbusato@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Wesley Maxey <wesley.maxey@gmail.com>
Co-authored-by: Cédric Augonnet <158148890+caugonnet@users.noreply.github.com>
3 participants