[Feature] Auto-batching inference server: weight sync integration #3497

Open
vmoens wants to merge 4 commits into gh/vmoens/239/base from gh/vmoens/239/head

[ghstack-poisoned]

pytorch-bot bot commented Feb 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3497

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Pending

As of commit 7924241 with merge base 266e4aa:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Feb 11, 2026
Wires WeightSyncScheme into the server loop:
- init_on_receiver + connect at startup
- Non-blocking receive() poll between inference batches
- threading.Lock protects model during weight updates
- End-to-end tests and updated Sphinx docs with usage tutorial

Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: 50480f5
Pull-Request: #3497
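
The pattern described in the commit message (a non-blocking weight poll between inference batches, with a lock guarding the model during updates) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: `ToyModel`, `ToyServer`, `poll_weights`, and the `queue.Queue` transport are hypothetical stand-ins and do not reflect the actual `WeightSyncScheme` API.

```python
import queue
import threading


class ToyModel:
    """Hypothetical stand-in for a policy: scales each input by one weight."""

    def __init__(self, weight: float = 1.0) -> None:
        self.weight = weight

    def __call__(self, batch):
        return [x * self.weight for x in batch]


class ToyServer:
    """Hypothetical server that polls for new weights between batches."""

    def __init__(self, model: ToyModel, weight_queue: queue.Queue) -> None:
        self.model = model
        self.weight_queue = weight_queue  # stand-in for the weight sync transport
        self.model_lock = threading.Lock()  # guards the model during updates

    def poll_weights(self) -> bool:
        """Drain pending updates without blocking; return True if any applied."""
        updated = False
        while True:
            try:
                new_weight = self.weight_queue.get_nowait()
            except queue.Empty:
                return updated
            with self.model_lock:
                self.model.weight = new_weight
            updated = True

    def run_batch(self, batch):
        # Inference takes the same lock, so it never sees a half-applied update.
        with self.model_lock:
            return self.model(batch)

    def serve(self, batches):
        outputs = []
        for batch in batches:
            self.poll_weights()  # cheap check between inference batches
            outputs.append(self.run_batch(batch))
        return outputs


weights = queue.Queue()
server = ToyServer(ToyModel(1.0), weights)
first = server.serve([[1, 2]])   # uses the initial weight
weights.put(3.0)                 # a trainer would push new weights here
second = server.serve([[1, 2]])  # picks up the update before the batch
```

Because the poll never blocks, an empty queue adds only a failed `get_nowait()` per batch. The lock matters once updates or inference run on more than one thread.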
github-actions bot added the Documentation and Modules labels Feb 11, 2026
github-actions bot added the Feature label Feb 11, 2026
meta-cla bot added the CLA Signed label Feb 11, 2026

github-actions bot commented Feb 11, 2026

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 84.1930μs 81.4168μs 12.2825 KOps/s 12.3507 KOps/s $\color{#d91a1a}-0.55\%$
test_tensor_to_bytestream_speed[torch.save] 0.1424ms 0.1384ms 7.2249 KOps/s 7.2941 KOps/s $\color{#d91a1a}-0.95\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1055s 0.1052s 9.5031 Ops/s 9.4152 Ops/s $\color{#35bf28}+0.93\%$
test_tensor_to_bytestream_speed[numpy] 2.5381μs 2.5122μs 398.0560 KOps/s 398.9778 KOps/s $\color{#d91a1a}-0.23\%$
test_tensor_to_bytestream_speed[safetensors] 36.0887μs 35.8825μs 27.8687 KOps/s 27.8848 KOps/s $\color{#d91a1a}-0.06\%$
test_simple 0.5348s 0.5328s 1.8768 Ops/s 1.7963 Ops/s $\color{#35bf28}+4.48\%$
test_transformed 1.0648s 1.0632s 0.9405 Ops/s 0.9166 Ops/s $\color{#35bf28}+2.61\%$
test_serial 1.6238s 1.6223s 0.6164 Ops/s 0.6068 Ops/s $\color{#35bf28}+1.58\%$
test_parallel 1.1146s 1.0176s 0.9827 Ops/s 0.9858 Ops/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-True-True-True] 0.3194ms 40.9129μs 24.4422 KOps/s 24.9439 KOps/s $\color{#d91a1a}-2.01\%$
test_step_mdp_speed[True-True-True-True-False] 72.1310μs 22.7853μs 43.8880 KOps/s 43.8423 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-True-True-False-True] 68.2510μs 22.8787μs 43.7088 KOps/s 43.6008 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-True-True-False-False] 37.6200μs 12.6536μs 79.0286 KOps/s 79.9097 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[True-True-False-True-True] 83.5720μs 43.4219μs 23.0299 KOps/s 23.1041 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-False-True-False] 58.1710μs 25.3190μs 39.4960 KOps/s 39.8457 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[True-True-False-False-True] 58.1210μs 25.3723μs 39.4130 KOps/s 39.2710 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-True-False-False-False] 43.6010μs 15.2128μs 65.7339 KOps/s 66.7151 KOps/s $\color{#d91a1a}-1.47\%$
test_step_mdp_speed[True-False-True-True-True] 92.9010μs 47.1337μs 21.2162 KOps/s 21.7339 KOps/s $\color{#d91a1a}-2.38\%$
test_step_mdp_speed[True-False-True-True-False] 57.2610μs 27.9147μs 35.8235 KOps/s 36.1508 KOps/s $\color{#d91a1a}-0.91\%$
test_step_mdp_speed[True-False-True-False-True] 59.5610μs 24.9675μs 40.0521 KOps/s 39.8568 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-False-True-False-False] 46.9910μs 15.1509μs 66.0026 KOps/s 67.0084 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[True-False-False-True-True] 85.3220μs 47.9217μs 20.8674 KOps/s 20.6009 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[True-False-False-True-False] 67.7410μs 30.2974μs 33.0061 KOps/s 32.8132 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[True-False-False-False-True] 64.5710μs 27.3596μs 36.5503 KOps/s 35.7132 KOps/s $\color{#35bf28}+2.34\%$
test_step_mdp_speed[True-False-False-False-False] 45.1910μs 17.4850μs 57.1920 KOps/s 57.2529 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-True-True-True-True] 88.7810μs 45.9252μs 21.7745 KOps/s 21.5845 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-True-True-True-False] 61.9210μs 27.8489μs 35.9080 KOps/s 35.6482 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[False-True-True-False-True] 2.4898ms 29.5172μs 33.8786 KOps/s 33.5922 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[False-True-True-False-False] 51.0200μs 16.9722μs 58.9198 KOps/s 58.8602 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[False-True-False-True-True] 91.5210μs 48.8778μs 20.4592 KOps/s 20.5203 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[False-True-False-True-False] 63.5620μs 30.4862μs 32.8018 KOps/s 32.8320 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-False-False-True] 62.1310μs 30.7313μs 32.5401 KOps/s 31.6427 KOps/s $\color{#35bf28}+2.84\%$
test_step_mdp_speed[False-True-False-False-False] 61.5810μs 19.5054μs 51.2679 KOps/s 51.0375 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[False-False-True-True-True] 80.8310μs 51.0617μs 19.5842 KOps/s 19.5124 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[False-False-True-True-False] 64.8010μs 33.0949μs 30.2161 KOps/s 30.2712 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[False-False-True-False-True] 61.0110μs 31.1545μs 32.0981 KOps/s 31.9008 KOps/s $\color{#35bf28}+0.62\%$
test_step_mdp_speed[False-False-True-False-False] 64.2910μs 19.3211μs 51.7570 KOps/s 52.5619 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[False-False-False-True-True] 83.8410μs 53.3908μs 18.7298 KOps/s 19.2558 KOps/s $\color{#d91a1a}-2.73\%$
test_step_mdp_speed[False-False-False-True-False] 77.7720μs 35.2856μs 28.3401 KOps/s 29.1148 KOps/s $\color{#d91a1a}-2.66\%$
test_step_mdp_speed[False-False-False-False-True] 0.1055ms 32.3884μs 30.8753 KOps/s 30.0611 KOps/s $\color{#35bf28}+2.71\%$
test_step_mdp_speed[False-False-False-False-False] 61.9410μs 21.4945μs 46.5234 KOps/s 47.4753 KOps/s $\color{#d91a1a}-2.01\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8197s 0.7231s 1.3829 Ops/s 1.3631 Ops/s $\color{#35bf28}+1.46\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.6854s 0.5890s 1.6978 Ops/s 1.6723 Ops/s $\color{#35bf28}+1.52\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.6799s 1.6053s 0.6229 Ops/s 0.6192 Ops/s $\color{#35bf28}+0.60\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4626s 1.3835s 0.7228 Ops/s 0.7181 Ops/s $\color{#35bf28}+0.66\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9201s 1.8441s 0.5423 Ops/s 0.5400 Ops/s $\color{#35bf28}+0.42\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7043s 1.6228s 0.6162 Ops/s 0.6114 Ops/s $\color{#35bf28}+0.80\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6558s 4.5459s 0.2200 Ops/s 0.2175 Ops/s $\color{#35bf28}+1.12\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6125s 4.4443s 0.2250 Ops/s 0.2273 Ops/s $\color{#d91a1a}-1.03\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9336s 1.8220s 0.5488 Ops/s 0.5429 Ops/s $\color{#35bf28}+1.10\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6753s 1.5537s 0.6436 Ops/s 0.6442 Ops/s $\color{#d91a1a}-0.09\%$
test_values[generalized_advantage_estimate-True-True] 11.4627ms 10.6186ms 94.1743 Ops/s 96.9688 Ops/s $\color{#d91a1a}-2.88\%$
test_values[vec_generalized_advantage_estimate-True-True] 19.6166ms 17.7183ms 56.4387 Ops/s 56.9422 Ops/s $\color{#d91a1a}-0.88\%$
test_values[td0_return_estimate-False-False] 0.2426ms 0.1358ms 7.3647 KOps/s 8.1220 KOps/s $\textbf{\color{#d91a1a}-9.32\%}$
test_values[td1_return_estimate-False-False] 28.5856ms 28.0501ms 35.6505 Ops/s 35.6186 Ops/s $\color{#35bf28}+0.09\%$
test_values[vec_td1_return_estimate-False-False] 18.5495ms 17.8361ms 56.0660 Ops/s 56.8153 Ops/s $\color{#d91a1a}-1.32\%$
test_values[td_lambda_return_estimate-True-False] 42.2704ms 41.4795ms 24.1083 Ops/s 24.1203 Ops/s $\color{#d91a1a}-0.05\%$
test_values[vec_td_lambda_return_estimate-True-False] 18.4813ms 17.7928ms 56.2024 Ops/s 57.1267 Ops/s $\color{#d91a1a}-1.62\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.2956ms 9.0781ms 110.1553 Ops/s 110.6471 Ops/s $\color{#d91a1a}-0.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.7476ms 1.4960ms 668.4402 Ops/s 661.0089 Ops/s $\color{#35bf28}+1.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4885ms 0.4235ms 2.3612 KOps/s 2.3548 KOps/s $\color{#35bf28}+0.27\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.4988ms 35.1571ms 28.4438 Ops/s 29.0452 Ops/s $\color{#d91a1a}-2.07\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.0533ms 1.7381ms 575.3348 Ops/s 572.5541 Ops/s $\color{#35bf28}+0.49\%$
test_dqn_speed[False-None] 1.8029ms 1.3911ms 718.8648 Ops/s 722.8617 Ops/s $\color{#d91a1a}-0.55\%$
test_dqn_speed[False-backward] 1.9605ms 1.8992ms 526.5505 Ops/s 527.1985 Ops/s $\color{#d91a1a}-0.12\%$
test_dqn_speed[True-None] 0.9340ms 0.5537ms 1.8059 KOps/s 1.8164 KOps/s $\color{#d91a1a}-0.58\%$
test_dqn_speed[True-backward] 1.0403ms 1.0065ms 993.5248 Ops/s 986.7688 Ops/s $\color{#35bf28}+0.68\%$
test_dqn_speed[reduce-overhead-None] 0.8957ms 0.5490ms 1.8216 KOps/s 1.8415 KOps/s $\color{#d91a1a}-1.08\%$
test_ddpg_speed[False-None] 3.1381ms 2.8286ms 353.5316 Ops/s 358.4682 Ops/s $\color{#d91a1a}-1.38\%$
test_ddpg_speed[False-backward] 4.1880ms 4.0401ms 247.5208 Ops/s 250.7766 Ops/s $\color{#d91a1a}-1.30\%$
test_ddpg_speed[True-None] 1.6409ms 1.4059ms 711.2964 Ops/s 716.0189 Ops/s $\color{#d91a1a}-0.66\%$
test_ddpg_speed[True-backward] 2.4395ms 2.3961ms 417.3382 Ops/s 414.5910 Ops/s $\color{#35bf28}+0.66\%$
test_ddpg_speed[reduce-overhead-None] 1.8072ms 1.4155ms 706.4730 Ops/s 715.1478 Ops/s $\color{#d91a1a}-1.21\%$
test_sac_speed[False-None] 8.5198ms 7.8937ms 126.6826 Ops/s 125.4195 Ops/s $\color{#35bf28}+1.01\%$
test_sac_speed[False-backward] 11.6180ms 11.1747ms 89.4878 Ops/s 89.1447 Ops/s $\color{#35bf28}+0.38\%$
test_sac_speed[True-None] 2.5030ms 2.1911ms 456.3864 Ops/s 460.0033 Ops/s $\color{#d91a1a}-0.79\%$
test_sac_speed[True-backward] 4.5303ms 4.1365ms 241.7480 Ops/s 217.6358 Ops/s $\textbf{\color{#35bf28}+11.08\%}$
test_sac_speed[reduce-overhead-None] 2.3804ms 2.1778ms 459.1766 Ops/s 460.5377 Ops/s $\color{#d91a1a}-0.30\%$
test_redq_speed[False-None] 10.8846ms 10.1374ms 98.6447 Ops/s 90.2028 Ops/s $\textbf{\color{#35bf28}+9.36\%}$
test_redq_speed[False-backward] 18.5950ms 17.6235ms 56.7423 Ops/s 55.9243 Ops/s $\color{#35bf28}+1.46\%$
test_redq_speed[True-None] 4.9501ms 4.4351ms 225.4755 Ops/s 226.2799 Ops/s $\color{#d91a1a}-0.36\%$
test_redq_speed[True-backward] 9.9585ms 9.6695ms 103.4185 Ops/s 97.5172 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_redq_speed[reduce-overhead-None] 4.7059ms 4.4625ms 224.0883 Ops/s 223.4742 Ops/s $\color{#35bf28}+0.27\%$
test_redq_deprec_speed[False-None] 11.5585ms 10.9050ms 91.7007 Ops/s 91.6821 Ops/s $\color{#35bf28}+0.02\%$
test_redq_deprec_speed[False-backward] 18.8593ms 15.9106ms 62.8511 Ops/s 63.4241 Ops/s $\color{#d91a1a}-0.90\%$
test_redq_deprec_speed[True-None] 4.1233ms 3.6467ms 274.2234 Ops/s 269.2828 Ops/s $\color{#35bf28}+1.83\%$
test_redq_deprec_speed[True-backward] 7.9145ms 7.4992ms 133.3473 Ops/s 129.4521 Ops/s $\color{#35bf28}+3.01\%$
test_redq_deprec_speed[reduce-overhead-None] 4.0269ms 3.6025ms 277.5876 Ops/s 262.0361 Ops/s $\textbf{\color{#35bf28}+5.93\%}$
test_td3_speed[False-None] 8.1674ms 7.9019ms 126.5523 Ops/s 124.8913 Ops/s $\color{#35bf28}+1.33\%$
test_td3_speed[False-backward] 11.0762ms 10.7266ms 93.2261 Ops/s 92.1635 Ops/s $\color{#35bf28}+1.15\%$
test_td3_speed[True-None] 2.1551ms 1.8892ms 529.3254 Ops/s 535.2250 Ops/s $\color{#d91a1a}-1.10\%$
test_td3_speed[True-backward] 3.8267ms 3.7023ms 270.1058 Ops/s 269.6535 Ops/s $\color{#35bf28}+0.17\%$
test_td3_speed[reduce-overhead-None] 1.8765ms 1.8320ms 545.8494 Ops/s 538.2279 Ops/s $\color{#35bf28}+1.42\%$
test_cql_speed[False-None] 26.6324ms 25.8506ms 38.6838 Ops/s 38.4913 Ops/s $\color{#35bf28}+0.50\%$
test_cql_speed[False-backward] 36.1000ms 35.2482ms 28.3702 Ops/s 28.1810 Ops/s $\color{#35bf28}+0.67\%$
test_cql_speed[True-None] 12.7569ms 12.3903ms 80.7082 Ops/s 77.6605 Ops/s $\color{#35bf28}+3.92\%$
test_cql_speed[True-backward] 18.9361ms 18.1939ms 54.9636 Ops/s 47.3120 Ops/s $\textbf{\color{#35bf28}+16.17\%}$
test_cql_speed[reduce-overhead-None] 12.7271ms 12.4692ms 80.1975 Ops/s 76.5429 Ops/s $\color{#35bf28}+4.77\%$
test_a2c_speed[False-None] 5.8032ms 5.3798ms 185.8821 Ops/s 183.1244 Ops/s $\color{#35bf28}+1.51\%$
test_a2c_speed[False-backward] 12.2337ms 11.8478ms 84.4039 Ops/s 84.9520 Ops/s $\color{#d91a1a}-0.65\%$
test_a2c_speed[True-None] 4.1712ms 3.7174ms 269.0077 Ops/s 262.2881 Ops/s $\color{#35bf28}+2.56\%$
test_a2c_speed[True-backward] 8.9641ms 8.6165ms 116.0565 Ops/s 117.4247 Ops/s $\color{#d91a1a}-1.17\%$
test_a2c_speed[reduce-overhead-None] 4.7271ms 3.7881ms 263.9869 Ops/s 268.4570 Ops/s $\color{#d91a1a}-1.67\%$
test_ppo_speed[False-None] 6.1655ms 5.9328ms 168.5538 Ops/s 170.5563 Ops/s $\color{#d91a1a}-1.17\%$
test_ppo_speed[False-backward] 12.8232ms 12.5019ms 79.9879 Ops/s 82.4975 Ops/s $\color{#d91a1a}-3.04\%$
test_ppo_speed[True-None] 4.0175ms 3.6726ms 272.2900 Ops/s 265.2299 Ops/s $\color{#35bf28}+2.66\%$
test_ppo_speed[True-backward] 8.7139ms 8.4024ms 119.0141 Ops/s 113.3839 Ops/s $\color{#35bf28}+4.97\%$
test_ppo_speed[reduce-overhead-None] 4.9701ms 3.6282ms 275.6184 Ops/s 270.8672 Ops/s $\color{#35bf28}+1.75\%$
test_reinforce_speed[False-None] 5.0773ms 4.5394ms 220.2939 Ops/s 220.7390 Ops/s $\color{#d91a1a}-0.20\%$
test_reinforce_speed[False-backward] 7.6522ms 7.4283ms 134.6199 Ops/s 136.5742 Ops/s $\color{#d91a1a}-1.43\%$
test_reinforce_speed[True-None] 3.8996ms 2.8887ms 346.1788 Ops/s 341.9281 Ops/s $\color{#35bf28}+1.24\%$
test_reinforce_speed[True-backward] 8.2901ms 7.7357ms 129.2705 Ops/s 132.3836 Ops/s $\color{#d91a1a}-2.35\%$
test_reinforce_speed[reduce-overhead-None] 3.2838ms 2.8250ms 353.9880 Ops/s 353.3658 Ops/s $\color{#35bf28}+0.18\%$
test_iql_speed[False-None] 20.4113ms 19.7176ms 50.7160 Ops/s 49.9203 Ops/s $\color{#35bf28}+1.59\%$
test_iql_speed[False-backward] 30.7750ms 29.9818ms 33.3536 Ops/s 32.6277 Ops/s $\color{#35bf28}+2.22\%$
test_iql_speed[True-None] 8.7933ms 8.5429ms 117.0561 Ops/s 112.7560 Ops/s $\color{#35bf28}+3.81\%$
test_iql_speed[True-backward] 17.1640ms 16.6038ms 60.2273 Ops/s 58.8235 Ops/s $\color{#35bf28}+2.39\%$
test_iql_speed[reduce-overhead-None] 8.9587ms 8.5775ms 116.5838 Ops/s 116.0406 Ops/s $\color{#35bf28}+0.47\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0954ms 5.9385ms 168.3931 Ops/s 168.0453 Ops/s $\color{#35bf28}+0.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1430ms 0.3197ms 3.1278 KOps/s 2.6038 KOps/s $\textbf{\color{#35bf28}+20.12\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7052ms 0.2987ms 3.3481 KOps/s 2.7375 KOps/s $\textbf{\color{#35bf28}+22.31\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0832ms 5.7478ms 173.9794 Ops/s 174.8511 Ops/s $\color{#d91a1a}-0.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9229ms 0.3362ms 2.9745 KOps/s 3.0044 KOps/s $\color{#d91a1a}-1.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7041ms 0.3569ms 2.8023 KOps/s 3.0651 KOps/s $\textbf{\color{#d91a1a}-8.57\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7947ms 1.4621ms 683.9338 Ops/s 767.9746 Ops/s $\textbf{\color{#d91a1a}-10.94\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6355ms 1.3750ms 727.2586 Ops/s 825.3859 Ops/s $\textbf{\color{#d91a1a}-11.89\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.4143ms 6.0354ms 165.6904 Ops/s 173.1830 Ops/s $\color{#d91a1a}-4.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3232ms 0.4705ms 2.1255 KOps/s 2.1853 KOps/s $\color{#d91a1a}-2.74\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7804ms 0.4514ms 2.2155 KOps/s 2.4414 KOps/s $\textbf{\color{#d91a1a}-9.25\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9092ms 5.7180ms 174.8876 Ops/s 176.2883 Ops/s $\color{#d91a1a}-0.79\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5661ms 0.2811ms 3.5576 KOps/s 3.1028 KOps/s $\textbf{\color{#35bf28}+14.66\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5232ms 0.2649ms 3.7744 KOps/s 3.4964 KOps/s $\textbf{\color{#35bf28}+7.95\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9958ms 5.7281ms 174.5792 Ops/s 175.8524 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6358ms 0.3219ms 3.1064 KOps/s 3.1143 KOps/s $\color{#d91a1a}-0.25\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6786ms 0.3653ms 2.7371 KOps/s 3.4371 KOps/s $\textbf{\color{#d91a1a}-20.36\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0311ms 5.8858ms 169.8992 Ops/s 169.5475 Ops/s $\color{#35bf28}+0.21\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1626ms 0.4660ms 2.1458 KOps/s 2.1355 KOps/s $\color{#35bf28}+0.48\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7128ms 0.4544ms 2.2006 KOps/s 2.1962 KOps/s $\color{#35bf28}+0.20\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.5474ms 4.9925ms 200.3005 Ops/s 201.0903 Ops/s $\color{#d91a1a}-0.39\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.0746ms 1.8832ms 531.0063 Ops/s 521.1498 Ops/s $\color{#35bf28}+1.89\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 10.6340ms 1.2298ms 813.1532 Ops/s 902.2416 Ops/s $\textbf{\color{#d91a1a}-9.87\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5349s 15.6680ms 63.8244 Ops/s 58.2633 Ops/s $\textbf{\color{#35bf28}+9.54\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.7215ms 1.9948ms 501.2956 Ops/s 534.1279 Ops/s $\textbf{\color{#d91a1a}-6.15\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.6887ms 1.2170ms 821.7207 Ops/s 803.1565 Ops/s $\color{#35bf28}+2.31\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.3082ms 5.3230ms 187.8626 Ops/s 190.2538 Ops/s $\color{#d91a1a}-1.26\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.7077ms 2.0804ms 480.6869 Ops/s 488.1682 Ops/s $\color{#d91a1a}-1.53\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.2101ms 1.0472ms 954.8903 Ops/s 930.3854 Ops/s $\color{#35bf28}+2.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 40.6882ms 36.1494ms 27.6629 Ops/s 27.9975 Ops/s $\color{#d91a1a}-1.19\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.3928ms 17.8680ms 55.9659 Ops/s 55.1904 Ops/s $\color{#35bf28}+1.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.6443ms 37.0550ms 26.9869 Ops/s 27.1148 Ops/s $\color{#d91a1a}-0.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.2503ms 18.5524ms 53.9014 Ops/s 54.5082 Ops/s $\color{#d91a1a}-1.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 41.3142ms 38.7906ms 25.7794 Ops/s 25.9264 Ops/s $\color{#d91a1a}-0.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3952ms 19.8123ms 50.4738 Ops/s 33.0003 Ops/s $\textbf{\color{#35bf28}+52.95\%}$
test_storage_write_lazystack[50-img_shape0-small] 0.9156ms 0.2131ms 4.6934 KOps/s 4.6814 KOps/s $\color{#35bf28}+0.26\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7261ms 1.3923ms 718.2298 Ops/s 720.2592 Ops/s $\color{#d91a1a}-0.28\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.6722ms 2.3826ms 419.7162 Ops/s 431.9957 Ops/s $\color{#d91a1a}-2.84\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0533ms 2.8665ms 348.8539 Ops/s 338.8483 Ops/s $\color{#35bf28}+2.95\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2188ms 0.1315ms 7.6036 KOps/s 7.5941 KOps/s $\color{#35bf28}+0.13\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.5233ms 0.1777ms 5.6286 KOps/s 5.3145 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9051ms 1.7641ms 566.8507 Ops/s 585.0383 Ops/s $\color{#d91a1a}-3.11\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5448ms 1.2904ms 774.9755 Ops/s 791.5807 Ops/s $\color{#d91a1a}-2.10\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2103ms 1.0857ms 921.1068 Ops/s 911.9148 Ops/s $\color{#35bf28}+1.01\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.6875ms 3.5302ms 283.2679 Ops/s 285.7139 Ops/s $\color{#d91a1a}-0.86\%$
test_collector_stack_then_write[100-img_shape2-large_img] 5.7627ms 5.5100ms 181.4872 Ops/s 180.1507 Ops/s $\color{#35bf28}+0.74\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.1902ms 7.0080ms 142.6944 Ops/s 146.0443 Ops/s $\color{#d91a1a}-2.29\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4353ms 0.2692ms 3.7146 KOps/s 3.6990 KOps/s $\color{#35bf28}+0.42\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6473ms 1.5107ms 661.9589 Ops/s 655.9318 Ops/s $\color{#35bf28}+0.92\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.9164ms 2.4989ms 400.1757 Ops/s 412.8716 Ops/s $\color{#d91a1a}-3.08\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.2114ms 3.0629ms 326.4863 Ops/s 315.5443 Ops/s $\color{#35bf28}+3.47\%$
test_collector_without_rb[100-img_shape0-atari] 33.3072ms 32.3489ms 30.9129 Ops/s 30.3602 Ops/s $\color{#35bf28}+1.82\%$
test_collector_without_rb[200-img_shape1-large_batch] 65.0291ms 64.0949ms 15.6019 Ops/s 15.3892 Ops/s $\color{#35bf28}+1.38\%$
test_collector_with_rb[100-img_shape0-atari] 0.6098s 57.9120ms 17.2676 Ops/s 26.9360 Ops/s $\textbf{\color{#d91a1a}-35.89\%}$
test_collector_with_rb[200-img_shape1-large_batch] 75.3917ms 73.3798ms 13.6277 Ops/s 13.8093 Ops/s $\color{#d91a1a}-1.31\%$
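
The Change column above appears to be the relative difference between the PR's Ops and the Ops on repo HEAD. This is inferred from the rows, not stated by the bot. A small sketch of that arithmetic, with `relative_change` as a hypothetical helper name and both inputs assumed to be in the same unit:

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the table rows above (both columns in Ops/s).
simple = round(relative_change(1.8768, 1.7963), 2)            # test_simple
collector_rb = round(relative_change(17.2676, 26.9360), 2)    # test_collector_with_rb[100-img_shape0-atari]
print(simple)        # 4.48
print(collector_rb)  # -35.89
```

Both results match the reported `+4.48%` and `-35.89%` for those rows.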

github-actions bot commented Feb 11, 2026

Result of GPU Benchmark Tests

Name Max Mean Ops
test_tensor_to_bytestream_speed[pickle] 81.7025μs 80.9050μs 12.3602 KOps/s
test_tensor_to_bytestream_speed[torch.save] 0.1463ms 0.1454ms 6.8775 KOps/s
test_tensor_to_bytestream_speed[untyped_storage] 0.1178s 0.1175s 8.5119 Ops/s
test_tensor_to_bytestream_speed[numpy] 2.5998μs 2.5929μs 385.6674 KOps/s
test_tensor_to_bytestream_speed[safetensors] 39.4718μs 37.5517μs 26.6300 KOps/s
test_simple 0.7903s 0.7898s 1.2662 Ops/s
test_transformed 1.3802s 1.3787s 0.7253 Ops/s
test_serial 2.3308s 2.3026s 0.4343 Ops/s
test_parallel 1.9113s 1.8224s 0.5487 Ops/s
test_step_mdp_speed[True-True-True-True-True] 0.2122ms 41.0699μs 24.3487 KOps/s
test_step_mdp_speed[True-True-True-True-False] 59.9810μs 23.1327μs 43.2289 KOps/s
test_step_mdp_speed[True-True-True-False-True] 60.1610μs 23.2404μs 43.0286 KOps/s
test_step_mdp_speed[True-True-True-False-False] 40.9200μs 12.9081μs 77.4706 KOps/s
test_step_mdp_speed[True-True-False-True-True] 86.6310μs 44.1005μs 22.6755 KOps/s
test_step_mdp_speed[True-True-False-True-False] 66.0810μs 25.7468μs 38.8397 KOps/s
test_step_mdp_speed[True-True-False-False-True] 72.7610μs 26.1241μs 38.2788 KOps/s
test_step_mdp_speed[True-True-False-False-False] 42.9610μs 15.6513μs 63.8925 KOps/s
test_step_mdp_speed[True-False-True-True-True] 84.1710μs 47.4125μs 21.0915 KOps/s
test_step_mdp_speed[True-False-True-True-False] 66.1410μs 28.6938μs 34.8507 KOps/s
test_step_mdp_speed[True-False-True-False-True] 64.9510μs 26.6322μs 37.5485 KOps/s
test_step_mdp_speed[True-False-True-False-False] 47.4710μs 15.6519μs 63.8900 KOps/s
test_step_mdp_speed[True-False-False-True-True] 88.1510μs 49.6596μs 20.1371 KOps/s
test_step_mdp_speed[True-False-False-True-False] 80.3710μs 31.1646μs 32.0877 KOps/s
test_step_mdp_speed[True-False-False-False-True] 60.1910μs 28.9770μs 34.5101 KOps/s
test_step_mdp_speed[True-False-False-False-False] 49.9210μs 18.2390μs 54.8275 KOps/s
test_step_mdp_speed[False-True-True-True-True] 87.0210μs 47.4989μs 21.0531 KOps/s
test_step_mdp_speed[False-True-True-True-False] 63.9010μs 28.7128μs 34.8276 KOps/s
test_step_mdp_speed[False-True-True-False-True] 2.4820ms 30.3314μs 32.9691 KOps/s
test_step_mdp_speed[False-True-True-False-False] 48.4910μs 17.3106μs 57.7682 KOps/s
test_step_mdp_speed[False-True-False-True-True] 86.7010μs 50.1311μs 19.9477 KOps/s
test_step_mdp_speed[False-True-False-True-False] 64.3510μs 30.7482μs 32.5223 KOps/s
test_step_mdp_speed[False-True-False-False-True] 68.4410μs 31.8995μs 31.3485 KOps/s
test_step_mdp_speed[False-True-False-False-False] 54.4510μs 19.8066μs 50.4883 KOps/s
test_step_mdp_speed[False-False-True-True-True] 0.1057ms 51.2799μs 19.5008 KOps/s
test_step_mdp_speed[False-False-True-True-False] 60.0910μs 33.6972μs 29.6760 KOps/s
test_step_mdp_speed[False-False-True-False-True] 70.8710μs 32.8542μs 30.4375 KOps/s
test_step_mdp_speed[False-False-True-False-False] 56.9610μs 20.1212μs 49.6988 KOps/s
test_step_mdp_speed[False-False-False-True-True] 97.9810μs 54.3598μs 18.3959 KOps/s
test_step_mdp_speed[False-False-False-True-False] 70.8510μs 36.2645μs 27.5752 KOps/s
test_step_mdp_speed[False-False-False-False-True] 82.0510μs 35.0020μs 28.5698 KOps/s
test_step_mdp_speed[False-False-False-False-False] 57.6810μs 22.0449μs 45.3619 KOps/s
test_non_tensor_env_rollout_speed[1000-single-True] 0.7193s 0.7164s 1.3959 Ops/s
test_non_tensor_env_rollout_speed[1000-single-False] 0.7032s 0.6066s 1.6486 Ops/s
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7156s 1.6366s 0.6110 Ops/s
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4842s 1.4033s 0.7126 Ops/s
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9581s 1.8773s 0.5327 Ops/s
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7325s 1.6523s 0.6052 Ops/s
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6765s 4.6208s 0.2164 Ops/s
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5091s 4.4299s 0.2257 Ops/s
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0236s 1.9085s 0.5240 Ops/s
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6892s 1.6040s 0.6234 Ops/s
test_values[generalized_advantage_estimate-True-True] 22.8930ms 21.5254ms 46.4567 Ops/s
test_values[vec_generalized_advantage_estimate-True-True] 0.1398s 3.7499ms 266.6746 Ops/s
test_values[td0_return_estimate-False-False] 0.1174ms 87.5971μs 11.4159 KOps/s
test_values[td1_return_estimate-False-False] 54.4242ms 52.4299ms 19.0731 Ops/s
test_values[vec_td1_return_estimate-False-False] 1.3895ms 1.1167ms 895.5057 Ops/s
test_values[td_lambda_return_estimate-True-False] 88.7627ms 86.8798ms 11.5102 Ops/s
test_values[vec_td_lambda_return_estimate-True-False] 1.3205ms 1.1098ms 901.0299 Ops/s
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.5729ms 23.1668ms 43.1652 Ops/s
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0619ms 0.7877ms 1.2696 KOps/s
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7922ms 0.7321ms 1.3658 KOps/s
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6079ms 1.5402ms 649.2477 Ops/s
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8034ms 0.7534ms 1.3273 KOps/s
test_dqn_speed[False-None] 1.6948ms 1.5981ms 625.7284 Ops/s
test_dqn_speed[False-backward] 2.2914ms 2.2142ms 451.6359 Ops/s
test_dqn_speed[True-None] 0.7222ms 0.5697ms 1.7553 KOps/s
test_dqn_speed[True-backward] 1.3122ms 1.2271ms 814.9177 Ops/s
test_dqn_speed[reduce-overhead-None] 0.6548ms 0.6016ms 1.6621 KOps/s
test_ddpg_speed[False-None] 3.2772ms 3.0007ms 333.2537 Ops/s
test_ddpg_speed[False-backward] 4.7278ms 4.3253ms 231.1973 Ops/s
test_ddpg_speed[True-None] 1.5047ms 1.3498ms 740.8463 Ops/s
test_ddpg_speed[True-backward] 2.5829ms 2.5398ms 393.7256 Ops/s
test_ddpg_speed[reduce-overhead-None] 2.0264ms 1.3705ms 729.6573 Ops/s
test_sac_speed[False-None] 8.7833ms 8.3548ms 119.6914 Ops/s
test_sac_speed[False-backward] 12.0925ms 11.6926ms 85.5241 Ops/s
test_sac_speed[True-None] 2.0042ms 1.8355ms 544.8060 Ops/s
test_sac_speed[True-backward] 3.7916ms 3.6473ms 274.1729 Ops/s
test_sac_speed[reduce-overhead-None] 19.8947ms 11.0374ms 90.6012 Ops/s
test_redq_deprec_speed[False-None] 10.3326ms 9.4649ms 105.6537 Ops/s
test_redq_deprec_speed[False-backward] 13.2348ms 12.8614ms 77.7523 Ops/s
test_redq_deprec_speed[True-None] 2.8575ms 2.5931ms 385.6439 Ops/s
test_redq_deprec_speed[True-backward] 4.7983ms 4.4023ms 227.1559 Ops/s
test_redq_deprec_speed[reduce-overhead-None] 16.0121ms 9.8786ms 101.2284 Ops/s
test_td3_speed[False-None] 8.6642ms 8.4683ms 118.0879 Ops/s
test_td3_speed[False-backward] 11.4476ms 10.9723ms 91.1388 Ops/s
test_td3_speed[True-None] 1.7827ms 1.7648ms 566.6420 Ops/s
test_td3_speed[True-backward] 3.3741ms 3.3009ms 302.9495 Ops/s
test_td3_speed[reduce-overhead-None] 87.7050ms 25.0227ms 39.9637 Ops/s
test_cql_speed[False-None] 17.8109ms 17.4740ms 57.2278 Ops/s
test_cql_speed[False-backward] 23.4610ms 22.8227ms 43.8160 Ops/s
test_cql_speed[True-None] 3.4030ms 3.2707ms 305.7453 Ops/s
test_cql_speed[True-backward] 5.8616ms 5.4382ms 183.8845 Ops/s
test_cql_speed[reduce-overhead-None] 18.9813ms 11.9154ms 83.9247 Ops/s
test_a2c_speed[False-None] 4.1161ms 3.2998ms 303.0470 Ops/s
test_a2c_speed[False-backward] 6.3283ms 6.2215ms 160.7332 Ops/s
test_a2c_speed[True-None] 1.4051ms 1.3424ms 744.9472 Ops/s
test_a2c_speed[True-backward] 3.3832ms 2.9914ms 334.2935 Ops/s
test_a2c_speed[reduce-overhead-None] 1.0615ms 0.9750ms 1.0257 KOps/s
test_ppo_speed[False-None] 4.0500ms 3.9281ms 254.5738 Ops/s
test_ppo_speed[False-backward] 7.3895ms 7.0537ms 141.7691 Ops/s
test_ppo_speed[True-None] 1.5135ms 1.4230ms 702.7336 Ops/s
test_ppo_speed[True-backward] 3.2485ms 3.1267ms 319.8271 Ops/s
test_ppo_speed[reduce-overhead-None] 1.2107ms 1.0508ms 951.6636 Ops/s
test_reinforce_speed[False-None] 2.4751ms 2.3117ms 432.5840 Ops/s
test_reinforce_speed[False-backward] 3.8463ms 3.4431ms 290.4319 Ops/s
test_reinforce_speed[True-None] 1.4884ms 1.3014ms 768.4233 Ops/s
test_reinforce_speed[True-backward] 3.1057ms 3.0417ms 328.7669 Ops/s
test_reinforce_speed[reduce-overhead-None] 0.4453s 10.4492ms 95.7008 Ops/s
test_iql_speed[False-None] 9.9913ms 9.5278ms 104.9566 Ops/s
test_iql_speed[False-backward] 13.6225ms 13.3410ms 74.9568 Ops/s
test_iql_speed[True-None] 2.6565ms 2.1986ms 454.8317 Ops/s
test_iql_speed[True-backward] 4.9818ms 4.8570ms 205.8871 Ops/s
test_iql_speed[reduce-overhead-None] 18.3815ms 10.6192ms 94.1694 Ops/s
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4158ms 5.9560ms 167.8982 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9065ms 0.3338ms 2.9954 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8300ms 0.3163ms 3.1617 KOps/s
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3363ms 5.7081ms 175.1898 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6243ms 0.3277ms 3.0511 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8305ms 0.3274ms 3.0547 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5510ms 1.3043ms 766.7153 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6888ms 1.2360ms 809.0420 Ops/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2870ms 5.8805ms 170.0537 Ops/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4602ms 0.4461ms 2.2415 KOps/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6495ms 0.4212ms 2.3744 KOps/s
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8143ms 5.7116ms 175.0811 Ops/s
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1481ms 0.2881ms 3.4712 KOps/s
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5800ms 0.3114ms 3.2113 KOps/s
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3115ms 5.7742ms 173.1855 Ops/s
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3831ms 0.2906ms 3.4407 KOps/s
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4832ms 0.2664ms 3.7543 KOps/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4279ms 6.0037ms 166.5643 Ops/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3328ms 0.4460ms 2.2419 KOps/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8586ms 0.4233ms 2.3624 KOps/s
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.4769ms 5.0033ms 199.8675 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 11.3156ms 2.2098ms 452.5373 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.3531ms 0.9775ms 1.0230 KOps/s
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5926s 16.8220ms 59.4461 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.9133ms 1.9798ms 505.0890 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.3827ms 1.1763ms 850.1541 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.8117ms 5.3969ms 185.2929 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.3432ms 2.0325ms 492.0124 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 12.8131ms 1.5725ms 635.9390 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.9148ms 36.0277ms 27.7564 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.3396ms 18.5427ms 53.9296 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 41.6391ms 38.5021ms 25.9726 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 21.2109ms 19.4657ms 51.3725 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 42.5084ms 40.6276ms 24.6138 Ops/s
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.3019ms 21.1307ms 47.3245 Ops/s
test_storage_write_lazystack[50-img_shape0-small] 0.7736ms 0.2301ms 4.3469 KOps/s
test_storage_write_lazystack[100-img_shape1-atari] 2.0091ms 1.4807ms 675.3750 Ops/s
test_storage_write_lazystack[100-img_shape2-large_img] 2.8363ms 2.4352ms 410.6390 Ops/s
test_storage_write_lazystack[200-img_shape3-large_batch] 3.4157ms 2.9883ms 334.6357 Ops/s
test_storage_write_contiguous[50-img_shape0-small] 0.2426ms 0.1714ms 5.8348 KOps/s
test_storage_write_contiguous[100-img_shape1-atari] 0.3919ms 0.2335ms 4.2826 KOps/s
test_storage_write_contiguous[100-img_shape2-large_img] 2.2434ms 1.8944ms 527.8726 Ops/s
test_storage_write_contiguous[200-img_shape3-large_batch] 1.6194ms 1.4342ms 697.2458 Ops/s
test_collector_stack_then_write[50-img_shape0-small] 1.3664ms 1.1543ms 866.3241 Ops/s
test_collector_stack_then_write[100-img_shape1-atari] 3.9832ms 3.6944ms 270.6827 Ops/s
test_collector_stack_then_write[100-img_shape2-large_img] 11.4338ms 6.1223ms 163.3379 Ops/s
test_collector_stack_then_write[200-img_shape3-large_batch] 7.8712ms 7.2389ms 138.1428 Ops/s
test_collector_lazystack_then_write[50-img_shape0-small] 0.7464ms 0.2838ms 3.5239 KOps/s
test_collector_lazystack_then_write[100-img_shape1-atari] 2.1895ms 1.5291ms 653.9894 Ops/s
test_collector_lazystack_then_write[100-img_shape2-large_img] 3.0201ms 2.5611ms 390.4524 Ops/s
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.6581ms 3.2124ms 311.2932 Ops/s
test_collector_without_rb[100-img_shape0-atari] 35.6326ms 33.9749ms 29.4335 Ops/s
test_collector_without_rb[200-img_shape1-large_batch] 0.6143s 0.1028s 9.7316 Ops/s
test_collector_with_rb[100-img_shape0-atari] 38.4892ms 37.7734ms 26.4737 Ops/s
test_collector_with_rb[200-img_shape1-large_batch] 82.4132ms 75.1804ms 13.3013 Ops/s
test_collector_without_rb_cuda[100-img_shape0-atari] 59.4518ms 58.6831ms 17.0407 Ops/s
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1192s 0.1170s 8.5493 Ops/s
test_collector_with_rb_cuda[100-img_shape0-atari] 61.8073ms 60.4086ms 16.5539 Ops/s
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1244s 0.1199s 8.3373 Ops/s

[ghstack-poisoned]

Labels

CLA Signed, Documentation, Feature, Modules
