
[Feature] Auto-batching inference server: multiprocessing transport #3494

Open
vmoens wants to merge 4 commits into gh/vmoens/236/base from gh/vmoens/236/head


[ghstack-poisoned]

pytorch-bot bot commented Feb 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3494

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 0bb4078 with merge base 266e4aa:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Contributor

github-actions bot commented Feb 11, 2026

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: 13. Worsened: 7.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 80.0465μs 79.0509μs 12.6501 KOps/s 12.5437 KOps/s $\color{#35bf28}+0.85\%$
test_tensor_to_bytestream_speed[torch.save] 0.1384ms 0.1377ms 7.2627 KOps/s 7.3317 KOps/s $\color{#d91a1a}-0.94\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1086s 0.1080s 9.2633 Ops/s 9.3772 Ops/s $\color{#d91a1a}-1.21\%$
test_tensor_to_bytestream_speed[numpy] 2.4784μs 2.4739μs 404.2226 KOps/s 400.4643 KOps/s $\color{#35bf28}+0.94\%$
test_tensor_to_bytestream_speed[safetensors] 38.4859μs 38.2371μs 26.1526 KOps/s 27.2764 KOps/s $\color{#d91a1a}-4.12\%$
test_simple 0.7849s 0.7818s 1.2790 Ops/s 1.2390 Ops/s $\color{#35bf28}+3.23\%$
test_transformed 1.4775s 1.3845s 0.7223 Ops/s 0.7241 Ops/s $\color{#d91a1a}-0.25\%$
test_serial 2.2708s 2.2664s 0.4412 Ops/s 0.4376 Ops/s $\color{#35bf28}+0.83\%$
test_parallel 1.8957s 1.8029s 0.5547 Ops/s 0.5662 Ops/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[True-True-True-True-True] 0.3924ms 40.6429μs 24.6045 KOps/s 24.5754 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-True-True-False] 47.0510μs 22.9825μs 43.5114 KOps/s 43.1648 KOps/s $\color{#35bf28}+0.80\%$
test_step_mdp_speed[True-True-True-False-True] 49.3210μs 22.8997μs 43.6687 KOps/s 45.2443 KOps/s $\color{#d91a1a}-3.48\%$
test_step_mdp_speed[True-True-True-False-False] 44.3810μs 12.7912μs 78.1787 KOps/s 81.5440 KOps/s $\color{#d91a1a}-4.13\%$
test_step_mdp_speed[True-True-False-True-True] 66.3120μs 43.6992μs 22.8837 KOps/s 23.0898 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[True-True-False-True-False] 58.0720μs 25.0890μs 39.8581 KOps/s 40.4153 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[True-True-False-False-True] 52.5510μs 25.5701μs 39.1082 KOps/s 40.4489 KOps/s $\color{#d91a1a}-3.31\%$
test_step_mdp_speed[True-True-False-False-False] 46.4110μs 15.2366μs 65.6314 KOps/s 66.8419 KOps/s $\color{#d91a1a}-1.81\%$
test_step_mdp_speed[True-False-True-True-True] 0.1012ms 46.0114μs 21.7337 KOps/s 22.2376 KOps/s $\color{#d91a1a}-2.27\%$
test_step_mdp_speed[True-False-True-True-False] 58.8420μs 28.3261μs 35.3032 KOps/s 36.0095 KOps/s $\color{#d91a1a}-1.96\%$
test_step_mdp_speed[True-False-True-False-True] 62.0820μs 25.3958μs 39.3765 KOps/s 40.1929 KOps/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[True-False-True-False-False] 51.7210μs 15.1958μs 65.8078 KOps/s 67.3679 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[True-False-False-True-True] 92.0330μs 47.6206μs 20.9993 KOps/s 21.0029 KOps/s $\color{#d91a1a}-0.02\%$
test_step_mdp_speed[True-False-False-True-False] 61.1820μs 30.1845μs 33.1296 KOps/s 33.3825 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[True-False-False-False-True] 75.3820μs 27.5933μs 36.2406 KOps/s 36.8074 KOps/s $\color{#d91a1a}-1.54\%$
test_step_mdp_speed[True-False-False-False-False] 46.5610μs 17.5220μs 57.0710 KOps/s 56.5925 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-True-True-True-True] 99.5220μs 45.9875μs 21.7450 KOps/s 21.8866 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[False-True-True-True-False] 67.1620μs 27.5535μs 36.2930 KOps/s 36.3618 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[False-True-True-False-True] 2.5886ms 29.4050μs 34.0078 KOps/s 34.2862 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-True-True-False-False] 61.4910μs 16.6978μs 59.8882 KOps/s 61.3296 KOps/s $\color{#d91a1a}-2.35\%$
test_step_mdp_speed[False-True-False-True-True] 74.7820μs 48.0452μs 20.8137 KOps/s 20.9414 KOps/s $\color{#d91a1a}-0.61\%$
test_step_mdp_speed[False-True-False-True-False] 63.9720μs 29.8331μs 33.5198 KOps/s 32.9609 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-True-False-False-True] 60.3910μs 30.9036μs 32.3587 KOps/s 32.4372 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-True-False-False-False] 48.1510μs 18.9802μs 52.6864 KOps/s 52.5968 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[False-False-True-True-True] 82.4120μs 50.5925μs 19.7658 KOps/s 20.4973 KOps/s $\color{#d91a1a}-3.57\%$
test_step_mdp_speed[False-False-True-True-False] 67.8510μs 32.4669μs 30.8006 KOps/s 30.8182 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[False-False-True-False-True] 65.2520μs 31.1535μs 32.0991 KOps/s 31.8850 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[False-False-True-False-False] 51.0820μs 18.9444μs 52.7860 KOps/s 52.2724 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[False-False-False-True-True] 0.1063ms 52.0985μs 19.1944 KOps/s 18.8220 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[False-False-False-True-False] 81.0720μs 34.7487μs 28.7780 KOps/s 28.4263 KOps/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[False-False-False-False-True] 75.5720μs 33.0842μs 30.2259 KOps/s 30.2997 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-False-False-False-False] 54.2120μs 20.9120μs 47.8195 KOps/s 46.7264 KOps/s $\color{#35bf28}+2.34\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8149s 0.7202s 1.3885 Ops/s 1.3823 Ops/s $\color{#35bf28}+0.45\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.6813s 0.5898s 1.6956 Ops/s 1.6991 Ops/s $\color{#d91a1a}-0.21\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.6591s 1.5802s 0.6328 Ops/s 0.6105 Ops/s $\color{#35bf28}+3.66\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4546s 1.3745s 0.7275 Ops/s 0.7105 Ops/s $\color{#35bf28}+2.40\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9024s 1.8217s 0.5489 Ops/s 0.5315 Ops/s $\color{#35bf28}+3.28\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.6940s 1.6100s 0.6211 Ops/s 0.6137 Ops/s $\color{#35bf28}+1.20\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7155s 4.6327s 0.2159 Ops/s 0.2168 Ops/s $\color{#d91a1a}-0.43\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.4586s 4.3764s 0.2285 Ops/s 0.2284 Ops/s $\color{#35bf28}+0.05\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.8706s 1.8049s 0.5541 Ops/s 0.5515 Ops/s $\color{#35bf28}+0.46\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6261s 1.5383s 0.6501 Ops/s 0.6510 Ops/s $\color{#d91a1a}-0.15\%$
test_values[generalized_advantage_estimate-True-True] 21.4706ms 21.0990ms 47.3956 Ops/s 47.5885 Ops/s $\color{#d91a1a}-0.41\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1495s 3.9295ms 254.4878 Ops/s 282.0292 Ops/s $\textbf{\color{#d91a1a}-9.77\%}$
test_values[td0_return_estimate-False-False] 0.1091ms 85.1079μs 11.7498 KOps/s 12.0000 KOps/s $\color{#d91a1a}-2.08\%$
test_values[td1_return_estimate-False-False] 51.9588ms 50.5420ms 19.7855 Ops/s 19.8614 Ops/s $\color{#d91a1a}-0.38\%$
test_values[vec_td1_return_estimate-False-False] 1.3257ms 1.1095ms 901.2930 Ops/s 905.5990 Ops/s $\color{#d91a1a}-0.48\%$
test_values[td_lambda_return_estimate-True-False] 84.1606ms 81.8221ms 12.2216 Ops/s 11.8658 Ops/s $\color{#35bf28}+3.00\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3069ms 1.1042ms 905.5945 Ops/s 905.3728 Ops/s $\color{#35bf28}+0.02\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 21.5749ms 21.3911ms 46.7483 Ops/s 46.9184 Ops/s $\color{#d91a1a}-0.36\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0512ms 0.7790ms 1.2837 KOps/s 1.3068 KOps/s $\color{#d91a1a}-1.77\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7326ms 0.6961ms 1.4366 KOps/s 1.4578 KOps/s $\color{#d91a1a}-1.46\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5888ms 1.5132ms 660.8370 Ops/s 667.4010 Ops/s $\color{#d91a1a}-0.98\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7844ms 0.7249ms 1.3796 KOps/s 1.3871 KOps/s $\color{#d91a1a}-0.55\%$
test_dqn_speed[False-None] 1.6485ms 1.5450ms 647.2374 Ops/s 655.5322 Ops/s $\color{#d91a1a}-1.27\%$
test_dqn_speed[False-backward] 2.4940ms 2.1996ms 454.6326 Ops/s 459.9261 Ops/s $\color{#d91a1a}-1.15\%$
test_dqn_speed[True-None] 1.1503ms 0.5606ms 1.7838 KOps/s 1.6971 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_dqn_speed[True-backward] 1.1681ms 1.0962ms 912.2587 Ops/s 828.3440 Ops/s $\textbf{\color{#35bf28}+10.13\%}$
test_dqn_speed[reduce-overhead-None] 0.6546ms 0.5859ms 1.7068 KOps/s 1.6038 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_ddpg_speed[False-None] 3.2774ms 2.8858ms 346.5248 Ops/s 350.8803 Ops/s $\color{#d91a1a}-1.24\%$
test_ddpg_speed[False-backward] 4.4018ms 4.1933ms 238.4770 Ops/s 232.4205 Ops/s $\color{#35bf28}+2.61\%$
test_ddpg_speed[True-None] 1.4514ms 1.3154ms 760.2340 Ops/s 739.4022 Ops/s $\color{#35bf28}+2.82\%$
test_ddpg_speed[True-backward] 2.4629ms 2.3889ms 418.5979 Ops/s 392.9109 Ops/s $\textbf{\color{#35bf28}+6.54\%}$
test_ddpg_speed[reduce-overhead-None] 1.5385ms 1.3466ms 742.6158 Ops/s 727.8334 Ops/s $\color{#35bf28}+2.03\%$
test_sac_speed[False-None] 8.7417ms 8.2745ms 120.8532 Ops/s 92.9425 Ops/s $\textbf{\color{#35bf28}+30.03\%}$
test_sac_speed[False-backward] 11.8366ms 11.3480ms 88.1210 Ops/s 87.0755 Ops/s $\color{#35bf28}+1.20\%$
test_sac_speed[True-None] 1.8891ms 1.8123ms 551.7983 Ops/s 529.4109 Ops/s $\color{#35bf28}+4.23\%$
test_sac_speed[True-backward] 3.5801ms 3.4399ms 290.7085 Ops/s 278.1655 Ops/s $\color{#35bf28}+4.51\%$
test_sac_speed[reduce-overhead-None] 18.4648ms 10.5279ms 94.9853 Ops/s 82.5732 Ops/s $\textbf{\color{#35bf28}+15.03\%}$
test_redq_deprec_speed[False-None] 9.9602ms 9.3923ms 106.4706 Ops/s 107.0933 Ops/s $\color{#d91a1a}-0.58\%$
test_redq_deprec_speed[False-backward] 13.1962ms 12.5509ms 79.6753 Ops/s 80.2984 Ops/s $\color{#d91a1a}-0.78\%$
test_redq_deprec_speed[True-None] 2.7732ms 2.5530ms 391.6916 Ops/s 396.5702 Ops/s $\color{#d91a1a}-1.23\%$
test_redq_deprec_speed[True-backward] 4.6317ms 4.1642ms 240.1449 Ops/s 229.4618 Ops/s $\color{#35bf28}+4.66\%$
test_redq_deprec_speed[reduce-overhead-None] 15.5357ms 9.6285ms 103.8584 Ops/s 88.2462 Ops/s $\textbf{\color{#35bf28}+17.69\%}$
test_td3_speed[False-None] 8.5121ms 8.2583ms 121.0908 Ops/s 121.3355 Ops/s $\color{#d91a1a}-0.20\%$
test_td3_speed[False-backward] 11.1620ms 10.7941ms 92.6430 Ops/s 91.1501 Ops/s $\color{#35bf28}+1.64\%$
test_td3_speed[True-None] 1.7455ms 1.7086ms 585.2642 Ops/s 612.6355 Ops/s $\color{#d91a1a}-4.47\%$
test_td3_speed[True-backward] 3.2022ms 3.1043ms 322.1359 Ops/s 320.5120 Ops/s $\color{#35bf28}+0.51\%$
test_td3_speed[reduce-overhead-None] 86.5635ms 24.1390ms 41.4268 Ops/s 41.6584 Ops/s $\color{#d91a1a}-0.56\%$
test_cql_speed[False-None] 17.6317ms 17.3097ms 57.7712 Ops/s 58.5164 Ops/s $\color{#d91a1a}-1.27\%$
test_cql_speed[False-backward] 23.2463ms 22.7947ms 43.8698 Ops/s 44.5875 Ops/s $\color{#d91a1a}-1.61\%$
test_cql_speed[True-None] 3.3603ms 3.2559ms 307.1304 Ops/s 307.1494 Ops/s $-0.01\%$
test_cql_speed[True-backward] 5.8046ms 5.3963ms 185.3119 Ops/s 181.3374 Ops/s $\color{#35bf28}+2.19\%$
test_cql_speed[reduce-overhead-None] 18.5368ms 11.7005ms 85.4661 Ops/s 84.4400 Ops/s $\color{#35bf28}+1.22\%$
test_a2c_speed[False-None] 4.0715ms 3.2507ms 307.6249 Ops/s 311.1934 Ops/s $\color{#d91a1a}-1.15\%$
test_a2c_speed[False-backward] 6.6958ms 6.2509ms 159.9772 Ops/s 155.4772 Ops/s $\color{#35bf28}+2.89\%$
test_a2c_speed[True-None] 1.4594ms 1.3267ms 753.7540 Ops/s 756.6057 Ops/s $\color{#d91a1a}-0.38\%$
test_a2c_speed[True-backward] 3.1989ms 2.9628ms 337.5237 Ops/s 325.1484 Ops/s $\color{#35bf28}+3.81\%$
test_a2c_speed[reduce-overhead-None] 1.0402ms 0.9577ms 1.0442 KOps/s 1.0462 KOps/s $\color{#d91a1a}-0.19\%$
test_ppo_speed[False-None] 4.0043ms 3.8701ms 258.3879 Ops/s 261.4797 Ops/s $\color{#d91a1a}-1.18\%$
test_ppo_speed[False-backward] 7.5003ms 7.0719ms 141.4038 Ops/s 137.0640 Ops/s $\color{#35bf28}+3.17\%$
test_ppo_speed[True-None] 1.5164ms 1.4048ms 711.8407 Ops/s 716.7506 Ops/s $\color{#d91a1a}-0.69\%$
test_ppo_speed[True-backward] 3.5443ms 3.1027ms 322.2992 Ops/s 321.3502 Ops/s $\color{#35bf28}+0.30\%$
test_ppo_speed[reduce-overhead-None] 1.0849ms 1.0277ms 973.0810 Ops/s 961.9119 Ops/s $\color{#35bf28}+1.16\%$
test_reinforce_speed[False-None] 2.4048ms 2.2849ms 437.6482 Ops/s 443.9464 Ops/s $\color{#d91a1a}-1.42\%$
test_reinforce_speed[False-backward] 3.8448ms 3.3737ms 296.4120 Ops/s 292.9496 Ops/s $\color{#35bf28}+1.18\%$
test_reinforce_speed[True-None] 1.3673ms 1.2733ms 785.3377 Ops/s 778.5498 Ops/s $\color{#35bf28}+0.87\%$
test_reinforce_speed[True-backward] 2.9962ms 2.8995ms 344.8834 Ops/s 327.8315 Ops/s $\textbf{\color{#35bf28}+5.20\%}$
test_reinforce_speed[reduce-overhead-None] 0.4413s 10.2930ms 97.1538 Ops/s 92.7350 Ops/s $\color{#35bf28}+4.76\%$
test_iql_speed[False-None] 10.1481ms 9.4754ms 105.5365 Ops/s 106.9399 Ops/s $\color{#d91a1a}-1.31\%$
test_iql_speed[False-backward] 13.7009ms 13.2893ms 75.2488 Ops/s 75.2716 Ops/s $\color{#d91a1a}-0.03\%$
test_iql_speed[True-None] 2.2501ms 2.1633ms 462.2626 Ops/s 458.2017 Ops/s $\color{#35bf28}+0.89\%$
test_iql_speed[True-backward] 4.8897ms 4.7301ms 211.4137 Ops/s 205.9722 Ops/s $\color{#35bf28}+2.64\%$
test_iql_speed[reduce-overhead-None] 17.4912ms 10.3787ms 96.3509 Ops/s 94.7384 Ops/s $\color{#35bf28}+1.70\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2242ms 5.8203ms 171.8116 Ops/s 173.3090 Ops/s $\color{#d91a1a}-0.86\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0583ms 0.3281ms 3.0479 KOps/s 2.9388 KOps/s $\color{#35bf28}+3.71\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7228ms 0.3183ms 3.1419 KOps/s 2.8841 KOps/s $\textbf{\color{#35bf28}+8.94\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9213ms 5.5681ms 179.5933 Ops/s 181.6757 Ops/s $\color{#d91a1a}-1.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7100ms 0.3229ms 3.0969 KOps/s 3.1020 KOps/s $\color{#d91a1a}-0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5846ms 0.2895ms 3.4546 KOps/s 3.1984 KOps/s $\textbf{\color{#35bf28}+8.01\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5893ms 1.2853ms 778.0454 Ops/s 688.5569 Ops/s $\textbf{\color{#35bf28}+13.00\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4828ms 1.2006ms 832.9201 Ops/s 816.4226 Ops/s $\color{#35bf28}+2.02\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.9713ms 5.7291ms 174.5468 Ops/s 176.9446 Ops/s $\color{#d91a1a}-1.36\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7827ms 0.4408ms 2.2685 KOps/s 2.3065 KOps/s $\color{#d91a1a}-1.65\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6738ms 0.4530ms 2.2074 KOps/s 2.4289 KOps/s $\textbf{\color{#d91a1a}-9.12\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8931ms 5.5878ms 178.9621 Ops/s 177.5362 Ops/s $\color{#35bf28}+0.80\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5738ms 0.3585ms 2.7894 KOps/s 3.5302 KOps/s $\textbf{\color{#d91a1a}-20.98\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4820ms 0.2766ms 3.6147 KOps/s 3.7859 KOps/s $\color{#d91a1a}-4.52\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9384ms 5.6074ms 178.3365 Ops/s 175.9945 Ops/s $\color{#35bf28}+1.33\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7655ms 0.2896ms 3.4530 KOps/s 3.5507 KOps/s $\color{#d91a1a}-2.75\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4705ms 0.2659ms 3.7606 KOps/s 3.4715 KOps/s $\textbf{\color{#35bf28}+8.33\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0853ms 5.8716ms 170.3103 Ops/s 171.1141 Ops/s $\color{#d91a1a}-0.47\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0701ms 0.4437ms 2.2536 KOps/s 2.1678 KOps/s $\color{#35bf28}+3.96\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8639ms 0.4387ms 2.2792 KOps/s 2.2727 KOps/s $\color{#35bf28}+0.29\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.3806ms 4.9293ms 202.8690 Ops/s 201.2745 Ops/s $\color{#35bf28}+0.79\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.2284ms 2.2517ms 444.1150 Ops/s 466.9942 Ops/s $\color{#d91a1a}-4.90\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 3.3117ms 1.0073ms 992.7840 Ops/s 1.0338 KOps/s $\color{#d91a1a}-3.97\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5856s 16.6580ms 60.0312 Ops/s 198.1282 Ops/s $\textbf{\color{#d91a1a}-69.70\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.5311ms 2.1032ms 475.4621 Ops/s 484.4350 Ops/s $\color{#d91a1a}-1.85\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.1455ms 1.2572ms 795.4454 Ops/s 1.0559 KOps/s $\textbf{\color{#d91a1a}-24.67\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.7622ms 5.2576ms 190.2023 Ops/s 184.9945 Ops/s $\color{#35bf28}+2.82\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 14.3128ms 2.1708ms 460.6642 Ops/s 513.4033 Ops/s $\textbf{\color{#d91a1a}-10.27\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.4859ms 1.1137ms 897.9232 Ops/s 902.1676 Ops/s $\color{#d91a1a}-0.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.6989ms 35.3156ms 28.3161 Ops/s 27.8866 Ops/s $\color{#35bf28}+1.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.4993ms 18.3751ms 54.4214 Ops/s 54.9313 Ops/s $\color{#d91a1a}-0.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.7278ms 37.2623ms 26.8368 Ops/s 26.7597 Ops/s $\color{#35bf28}+0.29\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 21.0452ms 19.2215ms 52.0252 Ops/s 52.6997 Ops/s $\color{#d91a1a}-1.28\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 41.9782ms 39.2919ms 25.4505 Ops/s 25.6124 Ops/s $\color{#d91a1a}-0.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.0129ms 20.5386ms 48.6889 Ops/s 49.6844 Ops/s $\color{#d91a1a}-2.00\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8769ms 0.2191ms 4.5648 KOps/s 4.6330 KOps/s $\color{#d91a1a}-1.47\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.6641ms 1.3376ms 747.6225 Ops/s 730.0499 Ops/s $\color{#35bf28}+2.41\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7083ms 2.2925ms 436.2019 Ops/s 449.4189 Ops/s $\color{#d91a1a}-2.94\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0280ms 2.8584ms 349.8495 Ops/s 342.6350 Ops/s $\color{#35bf28}+2.11\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2381ms 0.1630ms 6.1353 KOps/s 6.1925 KOps/s $\color{#d91a1a}-0.92\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3834ms 0.2268ms 4.4101 KOps/s 4.5840 KOps/s $\color{#d91a1a}-3.79\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9354ms 1.8246ms 548.0591 Ops/s 553.7513 Ops/s $\color{#d91a1a}-1.03\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5804ms 1.3363ms 748.3191 Ops/s 735.0166 Ops/s $\color{#35bf28}+1.81\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2392ms 1.1149ms 896.9246 Ops/s 897.9166 Ops/s $\color{#d91a1a}-0.11\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7166ms 3.4713ms 288.0751 Ops/s 278.6665 Ops/s $\color{#35bf28}+3.38\%$
test_collector_stack_then_write[100-img_shape2-large_img] 5.8399ms 5.7277ms 174.5905 Ops/s 174.5488 Ops/s $\color{#35bf28}+0.02\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.4787ms 7.1833ms 139.2109 Ops/s 137.6841 Ops/s $\color{#35bf28}+1.11\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4320ms 0.2649ms 3.7756 KOps/s 3.7175 KOps/s $\color{#35bf28}+1.56\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6952ms 1.4868ms 672.5798 Ops/s 675.4051 Ops/s $\color{#d91a1a}-0.42\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.5368ms 2.3797ms 420.2251 Ops/s 421.6989 Ops/s $\color{#d91a1a}-0.35\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.1842ms 3.0602ms 326.7717 Ops/s 320.8028 Ops/s $\color{#35bf28}+1.86\%$
test_collector_without_rb[100-img_shape0-atari] 33.4851ms 32.3946ms 30.8693 Ops/s 30.6789 Ops/s $\color{#35bf28}+0.62\%$
test_collector_without_rb[200-img_shape1-large_batch] 0.6084s 98.0618ms 10.1977 Ops/s 15.6238 Ops/s $\textbf{\color{#d91a1a}-34.73\%}$
test_collector_with_rb[100-img_shape0-atari] 39.8343ms 37.7548ms 26.4867 Ops/s 27.1320 Ops/s $\color{#d91a1a}-2.38\%$
test_collector_with_rb[200-img_shape1-large_batch] 73.4022ms 72.7559ms 13.7446 Ops/s 7.7687 Ops/s $\textbf{\color{#35bf28}+76.92\%}$
test_collector_without_rb_cuda[100-img_shape0-atari] 55.7703ms 55.3239ms 18.0754 Ops/s 18.0977 Ops/s $\color{#d91a1a}-0.12\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1136s 0.1109s 9.0152 Ops/s 9.0118 Ops/s $\color{#35bf28}+0.04\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 57.7464ms 57.3367ms 17.4408 Ops/s 17.4376 Ops/s $\color{#35bf28}+0.02\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1190s 0.1149s 8.7006 Ops/s 8.7669 Ops/s $\color{#d91a1a}-0.76\%$
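
For readers checking the table by hand, the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the bolded rows all move by more than 5% in either direction. In the GPU table, the 13 bolded gains and 7 bolded losses match the Improved and Worsened totals. The sketch below reproduces that arithmetic for a few rows. The function names (`parse_ops`, `percent_change`, `classify`) are hypothetical, and the 5% threshold is inferred from the rows above, not taken from the benchmark workflow itself.

```python
# Sketch: reproduce the "Change" column from the two throughput columns.
# Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD, in percent.
# Assumption: a row counts as improved/worsened only beyond +/-5% (inferred).

UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1_000.0}  # only the units seen in the table


def parse_ops(cell: str) -> float:
    """Convert a cell such as '12.6501 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops: str, ops_head: str) -> float:
    """Relative throughput change of this PR versus repo HEAD, in percent."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change using the inferred +/-threshold band."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# Rows copied from the GPU table: (name, Ops, Ops on Repo HEAD).
rows = [
    ("test_tensor_to_bytestream_speed[pickle]",
     "12.6501 KOps/s", "12.5437 KOps/s"),
    ("test_dqn_speed[True-None]",
     "1.7838 KOps/s", "1.6971 KOps/s"),
    ("test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]",
     "992.7840 Ops/s", "1.0338 KOps/s"),
    ("test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]",
     "60.0312 Ops/s", "198.1282 Ops/s"),
]

for name, ops, head in rows:
    change = percent_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows that mix units (Ops/s against KOps/s) only line up with the reported percentage after both cells are converted to the same scale, which is why the sketch normalizes before dividing.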

Contributor

github-actions bot commented Feb 11, 2026

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: 14. Worsened: 15.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 80.7648μs 79.7672μs 12.5365 KOps/s 12.2455 KOps/s $\color{#35bf28}+2.38\%$
test_tensor_to_bytestream_speed[torch.save] 0.1389ms 0.1385ms 7.2221 KOps/s 6.9149 KOps/s $\color{#35bf28}+4.44\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1148s 0.1140s 8.7686 Ops/s 8.4911 Ops/s $\color{#35bf28}+3.27\%$
test_tensor_to_bytestream_speed[numpy] 2.7016μs 2.6950μs 371.0515 KOps/s 394.0890 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_tensor_to_bytestream_speed[safetensors] 40.0582μs 39.7887μs 25.1328 KOps/s 26.6898 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_simple 0.5624s 0.5568s 1.7960 Ops/s 1.7499 Ops/s $\color{#35bf28}+2.64\%$
test_transformed 1.0984s 1.0939s 0.9142 Ops/s 0.8918 Ops/s $\color{#35bf28}+2.51\%$
test_serial 1.7038s 1.6910s 0.5914 Ops/s 0.5904 Ops/s $\color{#35bf28}+0.17\%$
test_parallel 1.1343s 1.0397s 0.9619 Ops/s 0.9613 Ops/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-True-True-True-True] 0.2531ms 42.1801μs 23.7078 KOps/s 23.6618 KOps/s $\color{#35bf28}+0.19\%$
test_step_mdp_speed[True-True-True-True-False] 58.1600μs 23.8695μs 41.8945 KOps/s 42.1174 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-True-True-False-True] 71.6310μs 23.6684μs 42.2505 KOps/s 41.0404 KOps/s $\color{#35bf28}+2.95\%$
test_step_mdp_speed[True-True-True-False-False] 48.8210μs 13.0085μs 76.8731 KOps/s 75.5706 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-True-False-True-True] 82.3410μs 45.4773μs 21.9890 KOps/s 21.6918 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[True-True-False-True-False] 54.8410μs 26.4159μs 37.8560 KOps/s 37.8717 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-True-False-False-True] 72.4310μs 26.9690μs 37.0796 KOps/s 37.4886 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-False-False-False] 45.2200μs 16.0879μs 62.1585 KOps/s 61.9527 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-True-True-True] 91.8710μs 49.4680μs 20.2151 KOps/s 20.1119 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-False-True-True-False] 86.4710μs 29.3687μs 34.0498 KOps/s 33.8895 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[True-False-True-False-True] 60.2410μs 27.3924μs 36.5065 KOps/s 37.1416 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[True-False-True-False-False] 53.2310μs 16.3782μs 61.0568 KOps/s 62.7557 KOps/s $\color{#d91a1a}-2.71\%$
test_step_mdp_speed[True-False-False-True-True] 81.5110μs 51.7588μs 19.3204 KOps/s 19.6426 KOps/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[True-False-False-True-False] 64.6610μs 31.9420μs 31.3068 KOps/s 31.6877 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-False-False-False-True] 62.1410μs 29.4686μs 33.9344 KOps/s 34.4378 KOps/s $\color{#d91a1a}-1.46\%$
test_step_mdp_speed[True-False-False-False-False] 46.9500μs 18.4065μs 54.3287 KOps/s 54.3881 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-True-True-True-True] 87.4110μs 49.1719μs 20.3368 KOps/s 20.8894 KOps/s $\color{#d91a1a}-2.65\%$
test_step_mdp_speed[False-True-True-True-False] 61.0410μs 29.3689μs 34.0496 KOps/s 34.2240 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-True-False-True] 2.4143ms 31.9551μs 31.2939 KOps/s 32.0154 KOps/s $\color{#d91a1a}-2.25\%$
test_step_mdp_speed[False-True-True-False-False] 67.6710μs 17.9994μs 55.5573 KOps/s 56.6513 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[False-True-False-True-True] 92.5220μs 51.9752μs 19.2400 KOps/s 19.8220 KOps/s $\color{#d91a1a}-2.94\%$
test_step_mdp_speed[False-True-False-True-False] 62.5710μs 32.2182μs 31.0384 KOps/s 31.6962 KOps/s $\color{#d91a1a}-2.08\%$
test_step_mdp_speed[False-True-False-False-True] 69.0810μs 33.0884μs 30.2221 KOps/s 30.7968 KOps/s $\color{#d91a1a}-1.87\%$
test_step_mdp_speed[False-True-False-False-False] 49.0110μs 20.5577μs 48.6436 KOps/s 49.8116 KOps/s $\color{#d91a1a}-2.34\%$
test_step_mdp_speed[False-False-True-True-True] 0.1154ms 55.0558μs 18.1634 KOps/s 18.6505 KOps/s $\color{#d91a1a}-2.61\%$
test_step_mdp_speed[False-False-True-True-False] 73.8410μs 34.4653μs 29.0147 KOps/s 29.0872 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[False-False-True-False-True] 69.7110μs 33.8672μs 29.5271 KOps/s 30.0033 KOps/s $\color{#d91a1a}-1.59\%$
test_step_mdp_speed[False-False-True-False-False] 54.7810μs 20.6271μs 48.4798 KOps/s 49.6762 KOps/s $\color{#d91a1a}-2.41\%$
test_step_mdp_speed[False-False-False-True-True] 0.1035ms 56.4679μs 17.7092 KOps/s 18.1223 KOps/s $\color{#d91a1a}-2.28\%$
test_step_mdp_speed[False-False-False-True-False] 72.4010μs 36.8585μs 27.1308 KOps/s 27.0801 KOps/s $\color{#35bf28}+0.19\%$
test_step_mdp_speed[False-False-False-False-True] 83.8510μs 35.5088μs 28.1621 KOps/s 28.2825 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-False-False-False-False] 67.5310μs 22.6665μs 44.1180 KOps/s 44.7237 KOps/s $\color{#d91a1a}-1.35\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8435s 0.7466s 1.3394 Ops/s 1.3424 Ops/s $\color{#d91a1a}-0.22\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7034s 0.6101s 1.6391 Ops/s 1.6431 Ops/s $\color{#d91a1a}-0.24\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7291s 1.6510s 0.6057 Ops/s 0.6089 Ops/s $\color{#d91a1a}-0.53\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5047s 1.4263s 0.7011 Ops/s 0.7094 Ops/s $\color{#d91a1a}-1.17\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9813s 1.8983s 0.5268 Ops/s 0.5199 Ops/s $\color{#35bf28}+1.32\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7555s 1.6768s 0.5964 Ops/s 0.6012 Ops/s $\color{#d91a1a}-0.80\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.8463s 4.6994s 0.2128 Ops/s 0.2150 Ops/s $\color{#d91a1a}-1.04\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5258s 4.4630s 0.2241 Ops/s 0.2254 Ops/s $\color{#d91a1a}-0.58\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9610s 1.8729s 0.5339 Ops/s 0.5288 Ops/s $\color{#35bf28}+0.97\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7147s 1.6071s 0.6222 Ops/s 0.6304 Ops/s $\color{#d91a1a}-1.30\%$
test_values[generalized_advantage_estimate-True-True] 11.4316ms 10.7283ms 93.2114 Ops/s 98.3198 Ops/s $\textbf{\color{#d91a1a}-5.20\%}$
test_values[vec_generalized_advantage_estimate-True-True] 21.7761ms 17.7060ms 56.4782 Ops/s 56.3456 Ops/s $\color{#35bf28}+0.24\%$
test_values[td0_return_estimate-False-False] 0.2214ms 0.1338ms 7.4712 KOps/s 7.6531 KOps/s $\color{#d91a1a}-2.38\%$
test_values[td1_return_estimate-False-False] 30.8871ms 29.4160ms 33.9951 Ops/s 35.9886 Ops/s $\textbf{\color{#d91a1a}-5.54\%}$
test_values[vec_td1_return_estimate-False-False] 18.2993ms 17.7441ms 56.3568 Ops/s 56.2526 Ops/s $\color{#35bf28}+0.19\%$
test_values[td_lambda_return_estimate-True-False] 46.0175ms 43.5317ms 22.9717 Ops/s 24.0956 Ops/s $\color{#d91a1a}-4.66\%$
test_values[vec_td_lambda_return_estimate-True-False] 18.8552ms 17.6982ms 56.5030 Ops/s 56.3596 Ops/s $\color{#35bf28}+0.25\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.7568ms 9.4981ms 105.2846 Ops/s 110.3327 Ops/s $\color{#d91a1a}-4.58\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.7578ms 1.5433ms 647.9614 Ops/s 665.4066 Ops/s $\color{#d91a1a}-2.62\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5750ms 0.4480ms 2.2322 KOps/s 2.3444 KOps/s $\color{#d91a1a}-4.79\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 38.4257ms 34.9309ms 28.6279 Ops/s 33.9575 Ops/s $\textbf{\color{#d91a1a}-15.69\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.1490ms 1.7823ms 561.0841 Ops/s 572.6199 Ops/s $\color{#d91a1a}-2.01\%$
test_dqn_speed[False-None] 1.5760ms 1.4166ms 705.9064 Ops/s 707.6627 Ops/s $\color{#d91a1a}-0.25\%$
test_dqn_speed[False-backward] 2.0508ms 1.9337ms 517.1446 Ops/s 520.5166 Ops/s $\color{#d91a1a}-0.65\%$
test_dqn_speed[True-None] 0.7256ms 0.5497ms 1.8193 KOps/s 1.7381 KOps/s $\color{#35bf28}+4.67\%$
test_dqn_speed[True-backward] 1.0547ms 1.0157ms 984.5182 Ops/s 837.6888 Ops/s $\textbf{\color{#35bf28}+17.53\%}$
test_dqn_speed[reduce-overhead-None] 0.6081ms 0.5393ms 1.8543 KOps/s 1.8000 KOps/s $\color{#35bf28}+3.01\%$
test_ddpg_speed[False-None] 3.2715ms 2.8841ms 346.7279 Ops/s 353.1849 Ops/s $\color{#d91a1a}-1.83\%$
test_ddpg_speed[False-backward] 4.2628ms 4.1104ms 243.2825 Ops/s 245.0491 Ops/s $\color{#d91a1a}-0.72\%$
test_ddpg_speed[True-None] 1.5549ms 1.4280ms 700.2811 Ops/s 694.9222 Ops/s $\color{#35bf28}+0.77\%$
test_ddpg_speed[True-backward] 2.5722ms 2.4337ms 410.8992 Ops/s 413.3534 Ops/s $\color{#d91a1a}-0.59\%$
test_ddpg_speed[reduce-overhead-None] 1.5829ms 1.4233ms 702.5714 Ops/s 700.9549 Ops/s $\color{#35bf28}+0.23\%$
test_sac_speed[False-None] 8.6950ms 8.0711ms 123.8988 Ops/s 125.6149 Ops/s $\color{#d91a1a}-1.37\%$
test_sac_speed[False-backward] 11.7879ms 11.3196ms 88.3422 Ops/s 88.7830 Ops/s $\color{#d91a1a}-0.50\%$
test_sac_speed[True-None] 2.2928ms 2.1788ms 458.9769 Ops/s 459.1446 Ops/s $\color{#d91a1a}-0.04\%$
test_sac_speed[True-backward] 4.2342ms 4.0720ms 245.5780 Ops/s 207.4691 Ops/s $\textbf{\color{#35bf28}+18.37\%}$
test_sac_speed[reduce-overhead-None] 3.2161ms 2.2153ms 451.4043 Ops/s 460.5556 Ops/s $\color{#d91a1a}-1.99\%$
test_redq_speed[False-None] 15.6709ms 10.5491ms 94.7948 Ops/s 88.2508 Ops/s $\textbf{\color{#35bf28}+7.42\%}$
test_redq_speed[False-backward] 18.6331ms 17.8163ms 56.1284 Ops/s 56.2872 Ops/s $\color{#d91a1a}-0.28\%$
test_redq_speed[True-None] 4.6354ms 4.4001ms 227.2688 Ops/s 231.9145 Ops/s $\color{#d91a1a}-2.00\%$
test_redq_speed[True-backward] 10.1460ms 9.8509ms 101.5132 Ops/s 102.3105 Ops/s $\color{#d91a1a}-0.78\%$
test_redq_speed[reduce-overhead-None] 4.6074ms 4.3864ms 227.9796 Ops/s 225.4717 Ops/s $\color{#35bf28}+1.11\%$
test_redq_deprec_speed[False-None] 11.5056ms 11.0361ms 90.6119 Ops/s 91.0329 Ops/s $\color{#d91a1a}-0.46\%$
test_redq_deprec_speed[False-backward] 16.0772ms 15.7274ms 63.5834 Ops/s 63.2402 Ops/s $\color{#35bf28}+0.54\%$
test_redq_deprec_speed[True-None] 5.4887ms 3.7608ms 265.8991 Ops/s 261.3976 Ops/s $\color{#35bf28}+1.72\%$
test_redq_deprec_speed[True-backward] 7.6719ms 7.4837ms 133.6235 Ops/s 128.1887 Ops/s $\color{#35bf28}+4.24\%$
test_redq_deprec_speed[reduce-overhead-None] 3.7844ms 3.5911ms 278.4656 Ops/s 271.0969 Ops/s $\color{#35bf28}+2.72\%$
test_td3_speed[False-None] 8.3323ms 8.0662ms 123.9747 Ops/s 123.6668 Ops/s $\color{#35bf28}+0.25\%$
test_td3_speed[False-backward] 11.2814ms 10.8596ms 92.0845 Ops/s 90.6132 Ops/s $\color{#35bf28}+1.62\%$
test_td3_speed[True-None] 1.8783ms 1.8577ms 538.2996 Ops/s 534.5276 Ops/s $\color{#35bf28}+0.71\%$
test_td3_speed[True-backward] 3.8283ms 3.6578ms 273.3907 Ops/s 238.3249 Ops/s $\textbf{\color{#35bf28}+14.71\%}$
test_td3_speed[reduce-overhead-None] 1.8273ms 1.8003ms 555.4687 Ops/s 544.6195 Ops/s $\color{#35bf28}+1.99\%$
test_cql_speed[False-None] 26.8205ms 25.9793ms 38.4922 Ops/s 38.7782 Ops/s $\color{#d91a1a}-0.74\%$
test_cql_speed[False-backward] 35.5837ms 35.0284ms 28.5482 Ops/s 28.3862 Ops/s $\color{#35bf28}+0.57\%$
test_cql_speed[True-None] 12.6889ms 12.4249ms 80.4833 Ops/s 81.6910 Ops/s $\color{#d91a1a}-1.48\%$
test_cql_speed[True-backward] 18.8451ms 18.3792ms 54.4094 Ops/s 55.7057 Ops/s $\color{#d91a1a}-2.33\%$
test_cql_speed[reduce-overhead-None] 12.6660ms 12.3859ms 80.7369 Ops/s 76.9807 Ops/s $\color{#35bf28}+4.88\%$
test_a2c_speed[False-None] 5.6352ms 5.4372ms 183.9197 Ops/s 187.2650 Ops/s $\color{#d91a1a}-1.79\%$
test_a2c_speed[False-backward] 12.1863ms 11.8131ms 84.6517 Ops/s 85.6364 Ops/s $\color{#d91a1a}-1.15\%$
test_a2c_speed[True-None] 3.9127ms 3.7285ms 268.2014 Ops/s 262.5990 Ops/s $\color{#35bf28}+2.13\%$
test_a2c_speed[True-backward] 9.0289ms 8.6347ms 115.8118 Ops/s 117.2265 Ops/s $\color{#d91a1a}-1.21\%$
test_a2c_speed[reduce-overhead-None] 4.0930ms 3.7458ms 266.9652 Ops/s 271.5522 Ops/s $\color{#d91a1a}-1.69\%$
test_ppo_speed[False-None] 6.2866ms 5.9884ms 166.9891 Ops/s 167.6308 Ops/s $\color{#d91a1a}-0.38\%$
test_ppo_speed[False-backward] 12.7740ms 12.4867ms 80.0850 Ops/s 80.1526 Ops/s $\color{#d91a1a}-0.08\%$
test_ppo_speed[True-None] 4.0449ms 3.6605ms 273.1859 Ops/s 275.7352 Ops/s $\color{#d91a1a}-0.92\%$
test_ppo_speed[True-backward] 8.6284ms 8.4118ms 118.8803 Ops/s 118.0431 Ops/s $\color{#35bf28}+0.71\%$
test_ppo_speed[reduce-overhead-None] 4.4177ms 3.6454ms 274.3209 Ops/s 277.0846 Ops/s $\color{#d91a1a}-1.00\%$
test_reinforce_speed[False-None] 4.9096ms 4.5146ms 221.5031 Ops/s 219.0418 Ops/s $\color{#35bf28}+1.12\%$
test_reinforce_speed[False-backward] 7.5203ms 7.3531ms 135.9970 Ops/s 136.1116 Ops/s $\color{#d91a1a}-0.08\%$
test_reinforce_speed[True-None] 3.6746ms 2.9452ms 339.5400 Ops/s 352.1255 Ops/s $\color{#d91a1a}-3.57\%$
test_reinforce_speed[True-backward] 7.9800ms 7.7895ms 128.3777 Ops/s 129.2464 Ops/s $\color{#d91a1a}-0.67\%$
test_reinforce_speed[reduce-overhead-None] 3.2233ms 2.8634ms 349.2405 Ops/s 353.0516 Ops/s $\color{#d91a1a}-1.08\%$
test_iql_speed[False-None] 20.6146ms 19.4800ms 51.3346 Ops/s 50.0947 Ops/s $\color{#35bf28}+2.48\%$
test_iql_speed[False-backward] 30.6635ms 30.0098ms 33.3224 Ops/s 33.3766 Ops/s $\color{#d91a1a}-0.16\%$
test_iql_speed[True-None] 9.0611ms 8.5481ms 116.9853 Ops/s 115.1083 Ops/s $\color{#35bf28}+1.63\%$
test_iql_speed[True-backward] 17.8161ms 16.7762ms 59.6083 Ops/s 60.1993 Ops/s $\color{#d91a1a}-0.98\%$
test_iql_speed[reduce-overhead-None] 9.0664ms 8.5570ms 116.8631 Ops/s 116.2157 Ops/s $\color{#35bf28}+0.56\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3518ms 6.0663ms 164.8456 Ops/s 164.1557 Ops/s $\color{#35bf28}+0.42\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1551ms 0.3712ms 2.6937 KOps/s 3.0746 KOps/s $\textbf{\color{#d91a1a}-12.39\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6125ms 0.3369ms 2.9684 KOps/s 3.3428 KOps/s $\textbf{\color{#d91a1a}-11.20\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1314ms 5.8801ms 170.0656 Ops/s 171.6909 Ops/s $\color{#d91a1a}-0.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0815ms 0.3356ms 2.9799 KOps/s 3.6182 KOps/s $\textbf{\color{#d91a1a}-17.64\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5744ms 0.2984ms 3.3508 KOps/s 3.8684 KOps/s $\textbf{\color{#d91a1a}-13.38\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5423ms 1.3120ms 762.1832 Ops/s 729.4511 Ops/s $\color{#35bf28}+4.49\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5194ms 1.2317ms 811.8753 Ops/s 805.9708 Ops/s $\color{#35bf28}+0.73\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.0323ms 6.1943ms 161.4400 Ops/s 167.5990 Ops/s $\color{#d91a1a}-3.67\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0254ms 0.4768ms 2.0975 KOps/s 2.2132 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6751ms 0.4589ms 2.1790 KOps/s 2.4252 KOps/s $\textbf{\color{#d91a1a}-10.15\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0216ms 5.8838ms 169.9585 Ops/s 171.4107 Ops/s $\color{#d91a1a}-0.85\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1435ms 0.3251ms 3.0759 KOps/s 3.5836 KOps/s $\textbf{\color{#d91a1a}-14.17\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4801ms 0.2667ms 3.7490 KOps/s 2.8120 KOps/s $\textbf{\color{#35bf28}+33.32\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9898ms 5.8060ms 172.2370 Ops/s 172.1459 Ops/s $\color{#35bf28}+0.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9101ms 0.2772ms 3.6071 KOps/s 2.6779 KOps/s $\textbf{\color{#35bf28}+34.70\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4591ms 0.2596ms 3.8519 KOps/s 2.8505 KOps/s $\textbf{\color{#35bf28}+35.13\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1548ms 6.0337ms 165.7348 Ops/s 166.4477 Ops/s $\color{#d91a1a}-0.43\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7423ms 0.4510ms 2.2172 KOps/s 1.8871 KOps/s $\textbf{\color{#35bf28}+17.49\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6308ms 0.4154ms 2.4076 KOps/s 1.9471 KOps/s $\textbf{\color{#35bf28}+23.65\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.6355ms 5.1212ms 195.2658 Ops/s 200.1343 Ops/s $\color{#d91a1a}-2.43\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 5.1881ms 1.8991ms 526.5517 Ops/s 492.9328 Ops/s $\textbf{\color{#35bf28}+6.82\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.1483ms 1.1114ms 899.7673 Ops/s 1.1247 KOps/s $\textbf{\color{#d91a1a}-20.00\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.6161ms 5.1866ms 192.8041 Ops/s 57.6574 Ops/s $\textbf{\color{#35bf28}+234.40\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9562ms 1.7932ms 557.6511 Ops/s 506.0194 Ops/s $\textbf{\color{#35bf28}+10.20\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.0008ms 1.0702ms 934.4374 Ops/s 800.6701 Ops/s $\textbf{\color{#35bf28}+16.71\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.4612ms 5.3969ms 185.2922 Ops/s 189.4068 Ops/s $\color{#d91a1a}-2.17\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.0890ms 2.0688ms 483.3712 Ops/s 531.9539 Ops/s $\textbf{\color{#d91a1a}-9.13\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.2523ms 1.0471ms 955.0100 Ops/s 984.1769 Ops/s $\color{#d91a1a}-2.96\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.4316ms 36.2929ms 27.5536 Ops/s 27.5525 Ops/s $+0.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.0931ms 18.5488ms 53.9120 Ops/s 54.2316 Ops/s $\color{#d91a1a}-0.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 42.6144ms 37.4939ms 26.6710 Ops/s 25.8083 Ops/s $\color{#35bf28}+3.34\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.0740ms 18.6772ms 53.5412 Ops/s 51.7031 Ops/s $\color{#35bf28}+3.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.8174ms 39.0670ms 25.5970 Ops/s 25.1917 Ops/s $\color{#35bf28}+1.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.9088ms 20.4629ms 48.8689 Ops/s 49.1413 Ops/s $\color{#d91a1a}-0.55\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8615ms 0.2156ms 4.6378 KOps/s 4.3740 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_storage_write_lazystack[100-img_shape1-atari] 1.6867ms 1.3773ms 726.0738 Ops/s 710.0735 Ops/s $\color{#35bf28}+2.25\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7401ms 2.3237ms 430.3543 Ops/s 423.6480 Ops/s $\color{#35bf28}+1.58\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1073ms 2.8639ms 349.1724 Ops/s 339.4056 Ops/s $\color{#35bf28}+2.88\%$
test_storage_write_contiguous[50-img_shape0-small] 0.4847ms 0.1355ms 7.3825 KOps/s 7.0400 KOps/s $\color{#35bf28}+4.86\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3998ms 0.2209ms 4.5269 KOps/s 5.4248 KOps/s $\textbf{\color{#d91a1a}-16.55\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9239ms 1.7564ms 569.3420 Ops/s 568.8127 Ops/s $\color{#35bf28}+0.09\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.4802ms 1.2661ms 789.8225 Ops/s 774.0650 Ops/s $\color{#35bf28}+2.04\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2494ms 1.1166ms 895.5968 Ops/s 887.0384 Ops/s $\color{#35bf28}+0.96\%$
test_collector_stack_then_write[100-img_shape1-atari] 7.5117ms 3.6151ms 276.6204 Ops/s 275.8438 Ops/s $\color{#35bf28}+0.28\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.5686ms 5.7786ms 173.0513 Ops/s 175.2208 Ops/s $\color{#d91a1a}-1.24\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 8.3421ms 7.2051ms 138.7913 Ops/s 142.4327 Ops/s $\color{#d91a1a}-2.56\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4279ms 0.2699ms 3.7049 KOps/s 3.6725 KOps/s $\color{#35bf28}+0.88\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7001ms 1.5003ms 666.5390 Ops/s 656.6097 Ops/s $\color{#35bf28}+1.51\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8537ms 2.4252ms 412.3367 Ops/s 410.2767 Ops/s $\color{#35bf28}+0.50\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4102ms 3.0891ms 323.7203 Ops/s 315.0799 Ops/s $\color{#35bf28}+2.74\%$
test_collector_without_rb[100-img_shape0-atari] 33.7958ms 33.3038ms 30.0266 Ops/s 29.8991 Ops/s $\color{#35bf28}+0.43\%$
test_collector_without_rb[200-img_shape1-large_batch] 65.3885ms 65.1740ms 15.3435 Ops/s 15.3008 Ops/s $\color{#35bf28}+0.28\%$
test_collector_with_rb[100-img_shape0-atari] 38.9209ms 38.1240ms 26.2302 Ops/s 26.4603 Ops/s $\color{#d91a1a}-0.87\%$
test_collector_with_rb[200-img_shape1-large_batch] 75.4241ms 74.6793ms 13.3906 Ops/s 13.5012 Ops/s $\color{#d91a1a}-0.82\%$


Labels

CLA Signed, Feature, Modules
