[Feature] Auto-batching inference server: weight sync integration #3497
Open
vmoens wants to merge 4 commits into gh/vmoens/239/base from
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3497
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures, 1 Pending
As of commit 7924241 with merge base 266e4aa.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
vmoens added a commit that referenced this pull request on Feb 11, 2026
Wires WeightSyncScheme into the server loop:
- init_on_receiver + connect at startup
- Non-blocking receive() poll between inference batches
- threading.Lock protects model during weight updates
- End-to-end tests and updated Sphinx docs with usage tutorial

Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: 50480f5
Pull-Request: #3497
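For readers unfamiliar with the pattern, the loop described in the commit message can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from this PR: `ToyModel`, `ToyInferenceServer`, `submit`, `step`, and the plain `queue.Queue` standing in for the weight-sync receiver are all hypothetical names chosen for the example. The real server polls `receive()` on a WeightSyncScheme, whose exact signature is not shown here.

```python
import queue
import threading


class ToyModel:
    """Stand-in for a policy: multiplies each input by a single weight."""

    def __init__(self, weight=1.0):
        self.weight = weight

    def __call__(self, batch):
        return [self.weight * x for x in batch]


class ToyInferenceServer:
    """Hypothetical auto-batching loop; not the class added by this PR."""

    def __init__(self, model, weight_queue, max_batch_size=4):
        self.model = model
        # Assumption: a plain queue plays the role of the weight-sync receiver.
        self.weight_queue = weight_queue
        self.max_batch_size = max_batch_size
        self.requests = queue.Queue()
        self.model_lock = threading.Lock()
        self._stop = threading.Event()

    def submit(self, x):
        """Queue one input; return a single-slot queue that will hold the result."""
        reply = queue.Queue(maxsize=1)
        self.requests.put((x, reply))
        return reply

    def _poll_weights(self):
        """Non-blocking poll: apply one pending weight update, if there is one."""
        try:
            new_weight = self.weight_queue.get_nowait()
        except queue.Empty:
            return False
        with self.model_lock:  # no forward pass can run during the update
            self.model.weight = new_weight
        return True

    def _next_batch(self, timeout=0.01):
        """Wait briefly for one request, then drain up to max_batch_size."""
        try:
            items = [self.requests.get(timeout=timeout)]
        except queue.Empty:
            return []
        while len(items) < self.max_batch_size:
            try:
                items.append(self.requests.get_nowait())
            except queue.Empty:
                break
        return items

    def step(self):
        """One loop iteration: poll for weights, then serve at most one batch."""
        self._poll_weights()
        items = self._next_batch()
        if not items:
            return 0
        with self.model_lock:  # weights cannot change mid-batch
            outputs = self.model([x for x, _ in items])
        for (_, reply), out in zip(items, outputs):
            reply.put(out)
        return len(items)

    def serve(self):
        """Run step() until shutdown() is called (typically in a worker thread)."""
        while not self._stop.is_set():
            self.step()

    def shutdown(self):
        self._stop.set()
```

In this sketch the weight poll happens once per iteration, so an update pushed while a batch is being processed takes effect on the next batch at the earliest. The lock only matters when weights can also be written from another thread; it is kept here to mirror the commit message.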
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 84.1930μs | 81.4168μs | 12.2825 KOps/s | 12.3507 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1424ms | 0.1384ms | 7.2249 KOps/s | 7.2941 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1055s | 0.1052s | 9.5031 Ops/s | 9.4152 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5381μs | 2.5122μs | 398.0560 KOps/s | 398.9778 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 36.0887μs | 35.8825μs | 27.8687 KOps/s | 27.8848 KOps/s | |
| test_simple | 0.5348s | 0.5328s | 1.8768 Ops/s | 1.7963 Ops/s | |
| test_transformed | 1.0648s | 1.0632s | 0.9405 Ops/s | 0.9166 Ops/s | |
| test_serial | 1.6238s | 1.6223s | 0.6164 Ops/s | 0.6068 Ops/s | |
| test_parallel | 1.1146s | 1.0176s | 0.9827 Ops/s | 0.9858 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3194ms | 40.9129μs | 24.4422 KOps/s | 24.9439 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 72.1310μs | 22.7853μs | 43.8880 KOps/s | 43.8423 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 68.2510μs | 22.8787μs | 43.7088 KOps/s | 43.6008 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 37.6200μs | 12.6536μs | 79.0286 KOps/s | 79.9097 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 83.5720μs | 43.4219μs | 23.0299 KOps/s | 23.1041 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 58.1710μs | 25.3190μs | 39.4960 KOps/s | 39.8457 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 58.1210μs | 25.3723μs | 39.4130 KOps/s | 39.2710 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 43.6010μs | 15.2128μs | 65.7339 KOps/s | 66.7151 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 92.9010μs | 47.1337μs | 21.2162 KOps/s | 21.7339 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 57.2610μs | 27.9147μs | 35.8235 KOps/s | 36.1508 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 59.5610μs | 24.9675μs | 40.0521 KOps/s | 39.8568 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 46.9910μs | 15.1509μs | 66.0026 KOps/s | 67.0084 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 85.3220μs | 47.9217μs | 20.8674 KOps/s | 20.6009 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 67.7410μs | 30.2974μs | 33.0061 KOps/s | 32.8132 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 64.5710μs | 27.3596μs | 36.5503 KOps/s | 35.7132 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 45.1910μs | 17.4850μs | 57.1920 KOps/s | 57.2529 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 88.7810μs | 45.9252μs | 21.7745 KOps/s | 21.5845 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 61.9210μs | 27.8489μs | 35.9080 KOps/s | 35.6482 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4898ms | 29.5172μs | 33.8786 KOps/s | 33.5922 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 51.0200μs | 16.9722μs | 58.9198 KOps/s | 58.8602 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 91.5210μs | 48.8778μs | 20.4592 KOps/s | 20.5203 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 63.5620μs | 30.4862μs | 32.8018 KOps/s | 32.8320 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 62.1310μs | 30.7313μs | 32.5401 KOps/s | 31.6427 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 61.5810μs | 19.5054μs | 51.2679 KOps/s | 51.0375 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 80.8310μs | 51.0617μs | 19.5842 KOps/s | 19.5124 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 64.8010μs | 33.0949μs | 30.2161 KOps/s | 30.2712 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 61.0110μs | 31.1545μs | 32.0981 KOps/s | 31.9008 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 64.2910μs | 19.3211μs | 51.7570 KOps/s | 52.5619 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 83.8410μs | 53.3908μs | 18.7298 KOps/s | 19.2558 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 77.7720μs | 35.2856μs | 28.3401 KOps/s | 29.1148 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.1055ms | 32.3884μs | 30.8753 KOps/s | 30.0611 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 61.9410μs | 21.4945μs | 46.5234 KOps/s | 47.4753 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8197s | 0.7231s | 1.3829 Ops/s | 1.3631 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.6854s | 0.5890s | 1.6978 Ops/s | 1.6723 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.6799s | 1.6053s | 0.6229 Ops/s | 0.6192 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4626s | 1.3835s | 0.7228 Ops/s | 0.7181 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9201s | 1.8441s | 0.5423 Ops/s | 0.5400 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7043s | 1.6228s | 0.6162 Ops/s | 0.6114 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6558s | 4.5459s | 0.2200 Ops/s | 0.2175 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.6125s | 4.4443s | 0.2250 Ops/s | 0.2273 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9336s | 1.8220s | 0.5488 Ops/s | 0.5429 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6753s | 1.5537s | 0.6436 Ops/s | 0.6442 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 11.4627ms | 10.6186ms | 94.1743 Ops/s | 96.9688 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 19.6166ms | 17.7183ms | 56.4387 Ops/s | 56.9422 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2426ms | 0.1358ms | 7.3647 KOps/s | 8.1220 KOps/s | |
| test_values[td1_return_estimate-False-False] | 28.5856ms | 28.0501ms | 35.6505 Ops/s | 35.6186 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.5495ms | 17.8361ms | 56.0660 Ops/s | 56.8153 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 42.2704ms | 41.4795ms | 24.1083 Ops/s | 24.1203 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.4813ms | 17.7928ms | 56.2024 Ops/s | 57.1267 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.2956ms | 9.0781ms | 110.1553 Ops/s | 110.6471 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7476ms | 1.4960ms | 668.4402 Ops/s | 661.0089 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4885ms | 0.4235ms | 2.3612 KOps/s | 2.3548 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 35.4988ms | 35.1571ms | 28.4438 Ops/s | 29.0452 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.0533ms | 1.7381ms | 575.3348 Ops/s | 572.5541 Ops/s | |
| test_dqn_speed[False-None] | 1.8029ms | 1.3911ms | 718.8648 Ops/s | 722.8617 Ops/s | |
| test_dqn_speed[False-backward] | 1.9605ms | 1.8992ms | 526.5505 Ops/s | 527.1985 Ops/s | |
| test_dqn_speed[True-None] | 0.9340ms | 0.5537ms | 1.8059 KOps/s | 1.8164 KOps/s | |
| test_dqn_speed[True-backward] | 1.0403ms | 1.0065ms | 993.5248 Ops/s | 986.7688 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.8957ms | 0.5490ms | 1.8216 KOps/s | 1.8415 KOps/s | |
| test_ddpg_speed[False-None] | 3.1381ms | 2.8286ms | 353.5316 Ops/s | 358.4682 Ops/s | |
| test_ddpg_speed[False-backward] | 4.1880ms | 4.0401ms | 247.5208 Ops/s | 250.7766 Ops/s | |
| test_ddpg_speed[True-None] | 1.6409ms | 1.4059ms | 711.2964 Ops/s | 716.0189 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4395ms | 2.3961ms | 417.3382 Ops/s | 414.5910 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.8072ms | 1.4155ms | 706.4730 Ops/s | 715.1478 Ops/s | |
| test_sac_speed[False-None] | 8.5198ms | 7.8937ms | 126.6826 Ops/s | 125.4195 Ops/s | |
| test_sac_speed[False-backward] | 11.6180ms | 11.1747ms | 89.4878 Ops/s | 89.1447 Ops/s | |
| test_sac_speed[True-None] | 2.5030ms | 2.1911ms | 456.3864 Ops/s | 460.0033 Ops/s | |
| test_sac_speed[True-backward] | 4.5303ms | 4.1365ms | 241.7480 Ops/s | 217.6358 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.3804ms | 2.1778ms | 459.1766 Ops/s | 460.5377 Ops/s | |
| test_redq_speed[False-None] | 10.8846ms | 10.1374ms | 98.6447 Ops/s | 90.2028 Ops/s | |
| test_redq_speed[False-backward] | 18.5950ms | 17.6235ms | 56.7423 Ops/s | 55.9243 Ops/s | |
| test_redq_speed[True-None] | 4.9501ms | 4.4351ms | 225.4755 Ops/s | 226.2799 Ops/s | |
| test_redq_speed[True-backward] | 9.9585ms | 9.6695ms | 103.4185 Ops/s | 97.5172 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.7059ms | 4.4625ms | 224.0883 Ops/s | 223.4742 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.5585ms | 10.9050ms | 91.7007 Ops/s | 91.6821 Ops/s | |
| test_redq_deprec_speed[False-backward] | 18.8593ms | 15.9106ms | 62.8511 Ops/s | 63.4241 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.1233ms | 3.6467ms | 274.2234 Ops/s | 269.2828 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.9145ms | 7.4992ms | 133.3473 Ops/s | 129.4521 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.0269ms | 3.6025ms | 277.5876 Ops/s | 262.0361 Ops/s | |
| test_td3_speed[False-None] | 8.1674ms | 7.9019ms | 126.5523 Ops/s | 124.8913 Ops/s | |
| test_td3_speed[False-backward] | 11.0762ms | 10.7266ms | 93.2261 Ops/s | 92.1635 Ops/s | |
| test_td3_speed[True-None] | 2.1551ms | 1.8892ms | 529.3254 Ops/s | 535.2250 Ops/s | |
| test_td3_speed[True-backward] | 3.8267ms | 3.7023ms | 270.1058 Ops/s | 269.6535 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8765ms | 1.8320ms | 545.8494 Ops/s | 538.2279 Ops/s | |
| test_cql_speed[False-None] | 26.6324ms | 25.8506ms | 38.6838 Ops/s | 38.4913 Ops/s | |
| test_cql_speed[False-backward] | 36.1000ms | 35.2482ms | 28.3702 Ops/s | 28.1810 Ops/s | |
| test_cql_speed[True-None] | 12.7569ms | 12.3903ms | 80.7082 Ops/s | 77.6605 Ops/s | |
| test_cql_speed[True-backward] | 18.9361ms | 18.1939ms | 54.9636 Ops/s | 47.3120 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 12.7271ms | 12.4692ms | 80.1975 Ops/s | 76.5429 Ops/s | |
| test_a2c_speed[False-None] | 5.8032ms | 5.3798ms | 185.8821 Ops/s | 183.1244 Ops/s | |
| test_a2c_speed[False-backward] | 12.2337ms | 11.8478ms | 84.4039 Ops/s | 84.9520 Ops/s | |
| test_a2c_speed[True-None] | 4.1712ms | 3.7174ms | 269.0077 Ops/s | 262.2881 Ops/s | |
| test_a2c_speed[True-backward] | 8.9641ms | 8.6165ms | 116.0565 Ops/s | 117.4247 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.7271ms | 3.7881ms | 263.9869 Ops/s | 268.4570 Ops/s | |
| test_ppo_speed[False-None] | 6.1655ms | 5.9328ms | 168.5538 Ops/s | 170.5563 Ops/s | |
| test_ppo_speed[False-backward] | 12.8232ms | 12.5019ms | 79.9879 Ops/s | 82.4975 Ops/s | |
| test_ppo_speed[True-None] | 4.0175ms | 3.6726ms | 272.2900 Ops/s | 265.2299 Ops/s | |
| test_ppo_speed[True-backward] | 8.7139ms | 8.4024ms | 119.0141 Ops/s | 113.3839 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.9701ms | 3.6282ms | 275.6184 Ops/s | 270.8672 Ops/s | |
| test_reinforce_speed[False-None] | 5.0773ms | 4.5394ms | 220.2939 Ops/s | 220.7390 Ops/s | |
| test_reinforce_speed[False-backward] | 7.6522ms | 7.4283ms | 134.6199 Ops/s | 136.5742 Ops/s | |
| test_reinforce_speed[True-None] | 3.8996ms | 2.8887ms | 346.1788 Ops/s | 341.9281 Ops/s | |
| test_reinforce_speed[True-backward] | 8.2901ms | 7.7357ms | 129.2705 Ops/s | 132.3836 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.2838ms | 2.8250ms | 353.9880 Ops/s | 353.3658 Ops/s | |
| test_iql_speed[False-None] | 20.4113ms | 19.7176ms | 50.7160 Ops/s | 49.9203 Ops/s | |
| test_iql_speed[False-backward] | 30.7750ms | 29.9818ms | 33.3536 Ops/s | 32.6277 Ops/s | |
| test_iql_speed[True-None] | 8.7933ms | 8.5429ms | 117.0561 Ops/s | 112.7560 Ops/s | |
| test_iql_speed[True-backward] | 17.1640ms | 16.6038ms | 60.2273 Ops/s | 58.8235 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 8.9587ms | 8.5775ms | 116.5838 Ops/s | 116.0406 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0954ms | 5.9385ms | 168.3931 Ops/s | 168.0453 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.1430ms | 0.3197ms | 3.1278 KOps/s | 2.6038 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7052ms | 0.2987ms | 3.3481 KOps/s | 2.7375 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0832ms | 5.7478ms | 173.9794 Ops/s | 174.8511 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9229ms | 0.3362ms | 2.9745 KOps/s | 3.0044 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7041ms | 0.3569ms | 2.8023 KOps/s | 3.0651 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.7947ms | 1.4621ms | 683.9338 Ops/s | 767.9746 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6355ms | 1.3750ms | 727.2586 Ops/s | 825.3859 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.4143ms | 6.0354ms | 165.6904 Ops/s | 173.1830 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3232ms | 0.4705ms | 2.1255 KOps/s | 2.1853 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7804ms | 0.4514ms | 2.2155 KOps/s | 2.4414 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9092ms | 5.7180ms | 174.8876 Ops/s | 176.2883 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5661ms | 0.2811ms | 3.5576 KOps/s | 3.1028 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5232ms | 0.2649ms | 3.7744 KOps/s | 3.4964 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9958ms | 5.7281ms | 174.5792 Ops/s | 175.8524 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6358ms | 0.3219ms | 3.1064 KOps/s | 3.1143 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6786ms | 0.3653ms | 2.7371 KOps/s | 3.4371 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.0311ms | 5.8858ms | 169.8992 Ops/s | 169.5475 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.1626ms | 0.4660ms | 2.1458 KOps/s | 2.1355 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7128ms | 0.4544ms | 2.2006 KOps/s | 2.1962 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.5474ms | 4.9925ms | 200.3005 Ops/s | 201.0903 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 8.0746ms | 1.8832ms | 531.0063 Ops/s | 521.1498 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 10.6340ms | 1.2298ms | 813.1532 Ops/s | 902.2416 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.5349s | 15.6680ms | 63.8244 Ops/s | 58.2633 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 9.7215ms | 1.9948ms | 501.2956 Ops/s | 534.1279 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 8.6887ms | 1.2170ms | 821.7207 Ops/s | 803.1565 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 7.3082ms | 5.3230ms | 187.8626 Ops/s | 190.2538 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 11.7077ms | 2.0804ms | 480.6869 Ops/s | 488.1682 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.2101ms | 1.0472ms | 954.8903 Ops/s | 930.3854 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 40.6882ms | 36.1494ms | 27.6629 Ops/s | 27.9975 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.3928ms | 17.8680ms | 55.9659 Ops/s | 55.1904 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 40.6443ms | 37.0550ms | 26.9869 Ops/s | 27.1148 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.2503ms | 18.5524ms | 53.9014 Ops/s | 54.5082 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.3142ms | 38.7906ms | 25.7794 Ops/s | 25.9264 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.3952ms | 19.8123ms | 50.4738 Ops/s | 33.0003 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.9156ms | 0.2131ms | 4.6934 KOps/s | 4.6814 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7261ms | 1.3923ms | 718.2298 Ops/s | 720.2592 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.6722ms | 2.3826ms | 419.7162 Ops/s | 431.9957 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.0533ms | 2.8665ms | 348.8539 Ops/s | 338.8483 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2188ms | 0.1315ms | 7.6036 KOps/s | 7.5941 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.5233ms | 0.1777ms | 5.6286 KOps/s | 5.3145 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9051ms | 1.7641ms | 566.8507 Ops/s | 585.0383 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5448ms | 1.2904ms | 774.9755 Ops/s | 791.5807 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2103ms | 1.0857ms | 921.1068 Ops/s | 911.9148 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.6875ms | 3.5302ms | 283.2679 Ops/s | 285.7139 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 5.7627ms | 5.5100ms | 181.4872 Ops/s | 180.1507 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.1902ms | 7.0080ms | 142.6944 Ops/s | 146.0443 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4353ms | 0.2692ms | 3.7146 KOps/s | 3.6990 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6473ms | 1.5107ms | 661.9589 Ops/s | 655.9318 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.9164ms | 2.4989ms | 400.1757 Ops/s | 412.8716 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.2114ms | 3.0629ms | 326.4863 Ops/s | 315.5443 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 33.3072ms | 32.3489ms | 30.9129 Ops/s | 30.3602 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 65.0291ms | 64.0949ms | 15.6019 Ops/s | 15.3892 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 0.6098s | 57.9120ms | 17.2676 Ops/s | 26.9360 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 75.3917ms | 73.3798ms | 13.6277 Ops/s | 13.8093 Ops/s |
Result of GPU Benchmark Tests
This was referenced Feb 12, 2026
Stack from ghstack (oldest at bottom):
Wires WeightSyncScheme into the server loop:
Co-authored-by: Cursor <cursoragent@cursor.com>