[Feature] AsyncBatchedCollector: backend params and performance optimizations #3511
Open
vmoens wants to merge 3 commits into gh/vmoens/242/base
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3511
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures
As of commit 8c2309c with merge base 266e4aa, 4 jobs have new failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
vmoens added a commit that referenced this pull request on Feb 16, 2026:

…izations

- Three-tier backend system: `backend` (global default), `env_backend` (env pool override), `policy_backend` (transport override), mirroring the device parameter pattern.
- Lock-free SlotTransport: per-env slots with no shared lock, replacing ThreadingTransport as the default for in-process threading.
- `min_batch_size` parameter for InferenceServer to accumulate requests.
- Batch drain from result queue (`get_nowait` after first blocking `get`).
- Remove redundant `.copy()` in `ProcessorAsyncEnvPool._env_exec`.

Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: 58cc17b
Pull-Request: #3511
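The "batch drain" item in the commit message above describes a common queue pattern: block once for the first result, then take whatever else is already queued without blocking again. The following is a minimal sketch of that pattern using only the standard library `queue` module. The function name `drain_batch` and its `max_items` parameter are hypothetical and are not taken from the PR's code.

```python
import queue


def drain_batch(result_queue, max_items=None, timeout=None):
    """Block for the first item, then drain items that are already queued.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    # The first get blocks (optionally with a timeout) until one result exists.
    items = [result_queue.get(timeout=timeout)]
    # Later gets never block: stop as soon as the queue is momentarily empty.
    while max_items is None or len(items) < max_items:
        try:
            items.append(result_queue.get_nowait())
        except queue.Empty:
            break
    return items


q = queue.Queue()
for i in range(5):
    q.put(i)
batch = drain_batch(q)  # one blocking get followed by four non-blocking gets
```

The point of the pattern is to amortize wake-ups: a consumer that is already awake handles every result that has accumulated instead of paying for one blocking call per item.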
This was referenced Feb 12, 2026
Contributor

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 79.2115μs | 77.9842μs | 12.8231 KOps/s | 12.6970 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1417ms | 0.1397ms | 7.1597 KOps/s | 7.3139 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1058s | 0.1052s | 9.5066 Ops/s | 9.3113 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5454μs | 2.5362μs | 394.2973 KOps/s | 407.6992 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.9635μs | 36.5686μs | 27.3459 KOps/s | 28.0423 KOps/s | |
| test_simple | 0.5350s | 0.5332s | 1.8754 Ops/s | 1.7823 Ops/s | |
| test_transformed | 1.0622s | 1.0598s | 0.9436 Ops/s | 0.9137 Ops/s | |
| test_serial | 1.6421s | 1.6239s | 0.6158 Ops/s | 0.6057 Ops/s | |
| test_parallel | 1.0119s | 0.9965s | 1.0036 Ops/s | 0.9681 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2386ms | 40.8660μs | 24.4702 KOps/s | 23.6145 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 52.2700μs | 22.8716μs | 43.7223 KOps/s | 43.4124 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 71.6910μs | 23.0151μs | 43.4497 KOps/s | 43.3024 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 44.0310μs | 12.7381μs | 78.5045 KOps/s | 79.2769 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.1316ms | 43.8525μs | 22.8037 KOps/s | 22.6741 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 56.2210μs | 25.7860μs | 38.7807 KOps/s | 39.8193 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 56.1910μs | 25.3932μs | 39.3806 KOps/s | 38.9824 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 45.6210μs | 15.1624μs | 65.9526 KOps/s | 65.1300 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 81.6410μs | 46.0547μs | 21.7133 KOps/s | 21.5023 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 86.7920μs | 27.8061μs | 35.9633 KOps/s | 35.4423 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 52.6510μs | 25.5630μs | 39.1190 KOps/s | 39.0787 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 46.7900μs | 15.5207μs | 64.4299 KOps/s | 65.1665 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 87.0010μs | 49.2978μs | 20.2849 KOps/s | 20.4517 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 55.9810μs | 30.5996μs | 32.6801 KOps/s | 32.1767 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 59.9310μs | 27.8750μs | 35.8744 KOps/s | 35.8139 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 50.8510μs | 17.2330μs | 58.0283 KOps/s | 56.5989 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 93.9320μs | 47.1912μs | 21.1904 KOps/s | 21.1856 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 66.9710μs | 28.0472μs | 35.6542 KOps/s | 35.3780 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.6123ms | 30.0248μs | 33.3058 KOps/s | 33.7948 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 48.5010μs | 17.2549μs | 57.9546 KOps/s | 59.0429 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 86.6920μs | 48.9473μs | 20.4301 KOps/s | 20.1016 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 65.0410μs | 30.2472μs | 33.0610 KOps/s | 32.8396 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 59.7410μs | 31.1280μs | 32.1254 KOps/s | 31.6438 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 56.6410μs | 19.3735μs | 51.6169 KOps/s | 51.9633 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 88.1310μs | 52.3796μs | 19.0914 KOps/s | 19.1589 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 63.0410μs | 33.6985μs | 29.6749 KOps/s | 30.4206 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 73.1120μs | 30.6136μs | 32.6652 KOps/s | 31.3877 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 46.7510μs | 18.7581μs | 53.3104 KOps/s | 52.0909 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 92.6520μs | 52.6655μs | 18.9878 KOps/s | 18.7532 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 66.5010μs | 35.4081μs | 28.2421 KOps/s | 28.3480 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 64.4220μs | 33.1148μs | 30.1980 KOps/s | 29.6989 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 56.7210μs | 21.3462μs | 46.8467 KOps/s | 45.8121 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8150s | 0.7150s | 1.3985 Ops/s | 1.3765 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.6816s | 0.5872s | 1.7029 Ops/s | 1.6759 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.6684s | 1.5912s | 0.6285 Ops/s | 0.6251 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4537s | 1.3731s | 0.7283 Ops/s | 0.7243 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9133s | 1.8269s | 0.5474 Ops/s | 0.5393 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.6917s | 1.6106s | 0.6209 Ops/s | 0.6117 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6539s | 4.5567s | 0.2195 Ops/s | 0.2183 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5173s | 4.3675s | 0.2290 Ops/s | 0.2269 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9734s | 1.8490s | 0.5408 Ops/s | 0.5517 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6682s | 1.5619s | 0.6402 Ops/s | 0.6376 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.5528ms | 10.3080ms | 97.0124 Ops/s | 95.0322 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 20.0319ms | 17.6528ms | 56.6482 Ops/s | 56.0074 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2113ms | 0.1269ms | 7.8810 KOps/s | 7.5935 KOps/s | |
| test_values[td1_return_estimate-False-False] | 29.3529ms | 28.2736ms | 35.3687 Ops/s | 35.4054 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.5514ms | 17.5995ms | 56.8197 Ops/s | 56.4711 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 43.9023ms | 41.6660ms | 24.0004 Ops/s | 23.6950 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.0395ms | 17.6234ms | 56.7426 Ops/s | 56.1170 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.2100ms | 9.0736ms | 110.2093 Ops/s | 109.3946 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7756ms | 1.5356ms | 651.2064 Ops/s | 642.4394 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5951ms | 0.4380ms | 2.2830 KOps/s | 2.3727 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 34.9711ms | 34.5446ms | 28.9481 Ops/s | 28.9266 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.1365ms | 1.7329ms | 577.0600 Ops/s | 561.4934 Ops/s | |
| test_dqn_speed[False-None] | 1.4976ms | 1.3878ms | 720.5796 Ops/s | 709.1614 Ops/s | |
| test_dqn_speed[False-backward] | 2.2536ms | 1.9320ms | 517.5889 Ops/s | 525.6402 Ops/s | |
| test_dqn_speed[True-None] | 0.8083ms | 0.5421ms | 1.8448 KOps/s | 1.7947 KOps/s | |
| test_dqn_speed[True-backward] | 1.1132ms | 1.0193ms | 981.0974 Ops/s | 861.4567 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.9566ms | 0.5375ms | 1.8605 KOps/s | 1.8266 KOps/s | |
| test_ddpg_speed[False-None] | 3.2024ms | 2.8472ms | 351.2276 Ops/s | 347.1918 Ops/s | |
| test_ddpg_speed[False-backward] | 4.1309ms | 4.0313ms | 248.0600 Ops/s | 243.0394 Ops/s | |
| test_ddpg_speed[True-None] | 1.8496ms | 1.4045ms | 712.0172 Ops/s | 711.3732 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4387ms | 2.3912ms | 418.2045 Ops/s | 377.8507 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.8233ms | 1.4041ms | 712.1968 Ops/s | 718.6938 Ops/s | |
| test_sac_speed[False-None] | 8.4713ms | 7.9113ms | 126.4020 Ops/s | 127.8629 Ops/s | |
| test_sac_speed[False-backward] | 11.6910ms | 11.2047ms | 89.2481 Ops/s | 90.3094 Ops/s | |
| test_sac_speed[True-None] | 2.5556ms | 2.1620ms | 462.5416 Ops/s | 455.9840 Ops/s | |
| test_sac_speed[True-backward] | 4.1895ms | 4.0512ms | 246.8405 Ops/s | 221.9256 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.3358ms | 2.1446ms | 466.2837 Ops/s | 461.0807 Ops/s | |
| test_redq_speed[False-None] | 15.1476ms | 10.4778ms | 95.4397 Ops/s | 94.3723 Ops/s | |
| test_redq_speed[False-backward] | 21.0729ms | 17.9142ms | 55.8217 Ops/s | 56.4207 Ops/s | |
| test_redq_speed[True-None] | 4.9637ms | 4.4154ms | 226.4807 Ops/s | 221.3040 Ops/s | |
| test_redq_speed[True-backward] | 10.3449ms | 9.8422ms | 101.6029 Ops/s | 100.4487 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 5.0072ms | 4.4036ms | 227.0884 Ops/s | 222.0870 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.3217ms | 10.9335ms | 91.4619 Ops/s | 91.3156 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.1288ms | 15.8352ms | 63.1504 Ops/s | 63.2218 Ops/s | |
| test_redq_deprec_speed[True-None] | 3.8720ms | 3.6684ms | 272.5957 Ops/s | 265.1238 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.7662ms | 7.5250ms | 132.8911 Ops/s | 124.9672 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.8687ms | 3.6183ms | 276.3732 Ops/s | 279.4834 Ops/s | |
| test_td3_speed[False-None] | 8.2300ms | 7.8917ms | 126.7158 Ops/s | 126.1692 Ops/s | |
| test_td3_speed[False-backward] | 11.1024ms | 10.7600ms | 92.9366 Ops/s | 92.7507 Ops/s | |
| test_td3_speed[True-None] | 1.9113ms | 1.8660ms | 535.9102 Ops/s | 539.8611 Ops/s | |
| test_td3_speed[True-backward] | 4.1885ms | 3.7073ms | 269.7396 Ops/s | 272.0306 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8716ms | 1.8095ms | 552.6360 Ops/s | 555.7492 Ops/s | |
| test_cql_speed[False-None] | 31.0065ms | 26.3166ms | 37.9988 Ops/s | 39.7532 Ops/s | |
| test_cql_speed[False-backward] | 39.0827ms | 35.5040ms | 28.1658 Ops/s | 28.8957 Ops/s | |
| test_cql_speed[True-None] | 15.3783ms | 12.4995ms | 80.0030 Ops/s | 85.5663 Ops/s | |
| test_cql_speed[True-backward] | 19.1658ms | 18.4686ms | 54.1460 Ops/s | 55.8333 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 15.4089ms | 12.6739ms | 78.9026 Ops/s | 81.7164 Ops/s | |
| test_a2c_speed[False-None] | 5.6665ms | 5.4073ms | 184.9355 Ops/s | 194.9802 Ops/s | |
| test_a2c_speed[False-backward] | 12.6022ms | 11.8898ms | 84.1054 Ops/s | 85.0148 Ops/s | |
| test_a2c_speed[True-None] | 3.9387ms | 3.7063ms | 269.8082 Ops/s | 274.4100 Ops/s | |
| test_a2c_speed[True-backward] | 8.8021ms | 8.5701ms | 116.6846 Ops/s | 120.4066 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 3.8838ms | 3.7023ms | 270.1051 Ops/s | 288.9657 Ops/s | |
| test_ppo_speed[False-None] | 6.1152ms | 5.9470ms | 168.1532 Ops/s | 177.0032 Ops/s | |
| test_ppo_speed[False-backward] | 12.9161ms | 12.5942ms | 79.4014 Ops/s | 80.6236 Ops/s | |
| test_ppo_speed[True-None] | 3.7681ms | 3.6244ms | 275.9091 Ops/s | 296.4822 Ops/s | |
| test_ppo_speed[True-backward] | 8.7813ms | 8.4597ms | 118.2074 Ops/s | 122.2512 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 3.8114ms | 3.6005ms | 277.7405 Ops/s | 296.1050 Ops/s | |
| test_reinforce_speed[False-None] | 4.8005ms | 4.5501ms | 219.7737 Ops/s | 234.8828 Ops/s | |
| test_reinforce_speed[False-backward] | 7.5830ms | 7.3459ms | 136.1308 Ops/s | 139.7494 Ops/s | |
| test_reinforce_speed[True-None] | 3.1354ms | 2.8782ms | 347.4382 Ops/s | 342.9707 Ops/s | |
| test_reinforce_speed[True-backward] | 8.1254ms | 7.7760ms | 128.6009 Ops/s | 122.4248 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.1484ms | 2.8325ms | 353.0394 Ops/s | 349.9170 Ops/s | |
| test_iql_speed[False-None] | 25.0005ms | 19.9892ms | 50.0270 Ops/s | 49.4987 Ops/s | |
| test_iql_speed[False-backward] | 34.0581ms | 30.1861ms | 33.1279 Ops/s | 33.2723 Ops/s | |
| test_iql_speed[True-None] | 8.7556ms | 8.4950ms | 117.7157 Ops/s | 117.1216 Ops/s | |
| test_iql_speed[True-backward] | 17.1770ms | 16.7382ms | 59.7435 Ops/s | 60.0208 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 8.8852ms | 8.5616ms | 116.8001 Ops/s | 116.0667 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0859ms | 5.9348ms | 168.4982 Ops/s | 167.6082 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.8784ms | 0.2804ms | 3.5664 KOps/s | 3.1916 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5435ms | 0.2617ms | 3.8210 KOps/s | 3.4048 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9824ms | 5.6520ms | 176.9297 Ops/s | 176.1030 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.6171ms | 0.3420ms | 2.9242 KOps/s | 2.7795 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5989ms | 0.3054ms | 3.2746 KOps/s | 2.9292 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6427ms | 1.3810ms | 724.1134 Ops/s | 700.6756 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4832ms | 1.2707ms | 786.9951 Ops/s | 737.0010 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 12.3340ms | 5.9457ms | 168.1885 Ops/s | 171.2893 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8759ms | 0.4836ms | 2.0677 KOps/s | 2.0699 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7864ms | 0.4673ms | 2.1401 KOps/s | 2.1303 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.8902ms | 5.6357ms | 177.4405 Ops/s | 176.1011 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7006ms | 0.3770ms | 2.6529 KOps/s | 2.7532 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6304ms | 0.3690ms | 2.7102 KOps/s | 2.8651 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.9219ms | 5.6817ms | 176.0034 Ops/s | 175.3644 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.2956ms | 0.3772ms | 2.6513 KOps/s | 2.7451 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5507ms | 0.3640ms | 2.7474 KOps/s | 2.8922 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.0110ms | 5.8575ms | 170.7222 Ops/s | 168.5743 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.0415ms | 0.5244ms | 1.9069 KOps/s | 1.9641 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7411ms | 0.5098ms | 1.9617 KOps/s | 2.0251 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.3268ms | 4.9186ms | 203.3114 Ops/s | 199.7045 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 9.5305ms | 2.1537ms | 464.3206 Ops/s | 499.1528 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.2687ms | 0.9123ms | 1.0961 KOps/s | 1.1313 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.5538s | 16.0104ms | 62.4595 Ops/s | 57.5349 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.8743ms | 1.7683ms | 565.4996 Ops/s | 528.9605 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.5418ms | 1.1998ms | 833.4818 Ops/s | 1.0994 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.5948ms | 5.1791ms | 193.0853 Ops/s | 190.0320 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 12.9018ms | 2.0386ms | 490.5397 Ops/s | 521.8464 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.4184ms | 1.0555ms | 947.3914 Ops/s | 935.7781 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 37.7627ms | 35.4955ms | 28.1726 Ops/s | 27.1692 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.7715ms | 18.1132ms | 55.2084 Ops/s | 54.7486 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 39.5102ms | 36.6806ms | 27.2624 Ops/s | 26.3786 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.8729ms | 18.3694ms | 54.4382 Ops/s | 52.4847 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 40.2743ms | 38.1797ms | 26.1920 Ops/s | 25.6063 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.0135ms | 19.9076ms | 50.2320 Ops/s | 49.4764 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8642ms | 0.2141ms | 4.6711 KOps/s | 4.5396 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7362ms | 1.3758ms | 726.8560 Ops/s | 716.6511 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7244ms | 2.3070ms | 433.4598 Ops/s | 430.2206 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.2964ms | 2.9335ms | 340.8896 Ops/s | 341.4891 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.4269ms | 0.1309ms | 7.6366 KOps/s | 7.4973 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3454ms | 0.1886ms | 5.3027 KOps/s | 5.1524 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9850ms | 1.7382ms | 575.2968 Ops/s | 579.8405 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5352ms | 1.3046ms | 766.5144 Ops/s | 779.6783 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2389ms | 1.0937ms | 914.3609 Ops/s | 917.7867 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.7143ms | 3.4673ms | 288.4123 Ops/s | 286.8234 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 10.0676ms | 5.5834ms | 179.1018 Ops/s | 175.9225 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 14.8340ms | 6.9004ms | 144.9199 Ops/s | 139.9969 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4536ms | 0.2738ms | 3.6524 KOps/s | 3.6406 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6523ms | 1.4839ms | 673.9178 Ops/s | 665.2571 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.8315ms | 2.4118ms | 414.6199 Ops/s | 409.8002 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3199ms | 3.1283ms | 319.6604 Ops/s | 321.7141 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 32.8951ms | 32.4435ms | 30.8228 Ops/s | 20.2428 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 64.0047ms | 63.7554ms | 15.6850 Ops/s | 15.4532 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 37.6856ms | 36.9439ms | 27.0681 Ops/s | 26.9864 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 72.3349ms | 71.8573ms | 13.9165 Ops/s | 13.8487 Ops/s |
Contributor
Result of GPU Benchmark Tests
vmoens added a commit that referenced this pull request on Feb 16, 2026:

…izations

- Three-tier backend system: `backend` (global default), `env_backend` (env pool override), `policy_backend` (transport override), mirroring the device parameter pattern.
- Lock-free SlotTransport: per-env slots with no shared lock, replacing ThreadingTransport as the default for in-process threading.
- `min_batch_size` parameter for InferenceServer to accumulate requests.
- Batch drain from result queue (`get_nowait` after first blocking `get`).
- Remove redundant `.copy()` in `ProcessorAsyncEnvPool._env_exec`.

Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: 5b0282d
Pull-Request: #3511
Co-authored-by: Cursor <cursoragent@cursor.com>
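To illustrate what "per-env slots with no shared lock" can look like, here is a rough sketch in which every environment thread owns one slot and the server scans the slots to build a batch. It is not the PR's SlotTransport: the class `PerEnvSlots`, its methods, and the use of one `threading.Event` pair per slot are assumptions made for the example. Each `Event` has its own internal lock, so the sketch only avoids a lock shared across environments, and it relies on each environment having at most one request in flight.

```python
import threading
import time


class PerEnvSlots:
    """Hypothetical per-env slot transport; illustration only."""

    def __init__(self, num_envs):
        self.requests = [None] * num_envs
        self.results = [None] * num_envs
        # One event pair per env, so envs never contend on a shared lock.
        self.ready = [threading.Event() for _ in range(num_envs)]
        self.done = [threading.Event() for _ in range(num_envs)]

    def submit(self, env_id, obs, timeout=None):
        """Called by env thread `env_id`: publish a request, wait for its result."""
        self.requests[env_id] = obs
        self.ready[env_id].set()
        if not self.done[env_id].wait(timeout):
            raise TimeoutError(f"no result for env {env_id}")
        self.done[env_id].clear()
        return self.results[env_id]

    def serve_once(self, policy):
        """Called by the server thread: batch every slot that is ready."""
        ids = [i for i, event in enumerate(self.ready) if event.is_set()]
        if not ids:
            return 0
        batch = [self.requests[i] for i in ids]
        for i in ids:
            self.ready[i].clear()
        outputs = policy(batch)
        for i, out in zip(ids, outputs):
            self.results[i] = out
            self.done[i].set()
        return len(ids)


slots = PerEnvSlots(4)
outputs = {}


def env_worker(env_id):
    outputs[env_id] = slots.submit(env_id, env_id, timeout=5.0)


threads = [threading.Thread(target=env_worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
served = 0
while served < 4:
    n = slots.serve_once(lambda batch: [x * 10 for x in batch])
    served += n
    if n == 0:
        time.sleep(0.001)  # nothing ready yet; yield to the env threads
for t in threads:
    t.join()
```

The design trade-off in a layout like this is that the server polls slots rather than waiting on a single queue, in exchange for environment threads never serializing on one lock.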
vmoens added a commit that referenced this pull request on Feb 17, 2026:

…izations

- Three-tier backend system: `backend` (global default), `env_backend` (env pool override), `policy_backend` (transport override), mirroring the device parameter pattern.
- Lock-free SlotTransport: per-env slots with no shared lock, replacing ThreadingTransport as the default for in-process threading.
- `min_batch_size` parameter for InferenceServer to accumulate requests.
- Batch drain from result queue (`get_nowait` after first blocking `get`).
- Remove redundant `.copy()` in `ProcessorAsyncEnvPool._env_exec`.

Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: d7fc567
Pull-Request: #3511
Co-authored-by: Cursor <cursoragent@cursor.com>
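The commit message says only that `min_batch_size` lets InferenceServer "accumulate requests"; it does not spell out the waiting policy. One plausible reading, sketched below with the standard library, is to keep waiting until at least `min_batch_size` requests have arrived or a deadline passes. The function `accumulate_requests` and its `timeout` parameter are assumptions for the example and do not reflect the actual InferenceServer signature.

```python
import queue
import time


def accumulate_requests(request_queue, min_batch_size=1, timeout=0.01):
    """Wait for up to `min_batch_size` requests, bounded by `timeout` seconds.

    Hypothetical helper; the real InferenceServer policy may differ.
    """
    batch = [request_queue.get()]  # always wait for at least one request
    deadline = time.monotonic() + timeout
    while len(batch) < min_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline reached: serve a smaller batch rather than stall
        try:
            batch.append(request_queue.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


requests = queue.Queue()
for i in range(3):
    requests.put(i)
full = accumulate_requests(requests, min_batch_size=2)     # enough requests queued
partial = accumulate_requests(requests, min_batch_size=4)  # only one left, times out
```

A deadline of this kind is one way to keep a high `min_batch_size` from stalling inference when few environments are active.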
Stack from ghstack (oldest at bottom):
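- Three-tier backend system: `backend` (global default), `env_backend` (env pool override), `policy_backend` (transport override), mirroring the device parameter pattern.
- Lock-free SlotTransport: per-env slots with no shared lock, replacing ThreadingTransport as the default for in-process threading.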
Co-authored-by: Cursor <cursoragent@cursor.com>
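The three-tier backend description amounts to a fallback rule: each specific parameter overrides the global default only when it is set. The sketch below shows that rule in isolation. The function `resolve_backends` is hypothetical, and the backend names `"threading"` and `"multiprocessing"` are placeholder values rather than a list of what the collector accepts.

```python
def resolve_backends(backend, env_backend=None, policy_backend=None):
    """Resolve per-component backends against a global default.

    Hypothetical helper; illustrates the override pattern only.
    """
    return {
        # The env pool uses its own override when given, else the global default.
        "env_backend": env_backend if env_backend is not None else backend,
        # The policy transport follows the same rule.
        "policy_backend": policy_backend if policy_backend is not None else backend,
    }


# Placeholder backend names, assumed for the example.
default_only = resolve_backends("threading")
mixed = resolve_backends("threading", env_backend="multiprocessing")
```

Here `mixed` runs the env pool on the override while the policy transport still follows the global default, which is the same shape as a global device argument with per-component device overrides.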