[Feature] Add StoreStorage for Redis/Dragonfly-backed replay buffers #3516

Merged
vmoens merged 3 commits into main from feature/store-storage on Feb 19, 2026

vmoens (Collaborator) commented Feb 18, 2026

Summary

  • Adds StoreStorage, a new Storage subclass backed by tensordict.store.TensorDictStore for out-of-core replay buffer storage via Redis/Dragonfly/KeyDB
  • Supports tensors, non-tensor data (strings, Python objects), nested TensorDicts, and TensorClass types with transparent re-wrapping on retrieval
  • Lazily initialized on first write; multiprocess-safe length tracking via mp.Value
  • Includes pytest-benchmark tests comparing StoreStorage vs LazyTensorStorage for fill and sample throughput
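
To make the "lazily initialized on first write" and "length tracking via mp.Value" points concrete, here is a minimal standard-library sketch. `ToyLazyStorage` is a hypothetical stand-in and not the StoreStorage implementation: it keeps items in a plain Python list instead of a Redis-compatible store, and only illustrates the two ideas.

```python
import multiprocessing as mp


class ToyLazyStorage:
    """Hypothetical stand-in for a lazily initialized storage; not StoreStorage itself."""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self._slots = None  # backing store is created on the first write
        # Shared signed int ("i") with its own lock, so processes that
        # inherit this object read and update a single length counter.
        self._len = mp.Value("i", 0)

    @property
    def initialized(self) -> bool:
        return self._slots is not None

    def set(self, index: int, item) -> None:
        if self._slots is None:
            # Lazy initialization: allocate only once the first item arrives.
            self._slots = [None] * self.max_size
        self._slots[index] = item
        with self._len.get_lock():
            # Length is the highest written index + 1, capped at capacity.
            self._len.value = min(max(self._len.value, index + 1), self.max_size)

    def __len__(self) -> int:
        with self._len.get_lock():
            return self._len.value


storage = ToyLazyStorage(max_size=4)
assert not storage.initialized and len(storage) == 0
storage.set(0, {"obs": 1.0})
storage.set(1, {"obs": 2.0})
assert storage.initialized and len(storage) == 2
```

In this toy version only the length is shared across processes; in the PR the data itself lives in the external key-value store, which is what lets several processes see the same contents.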

Benchmark results (buffer_size=10K, obs_dim=64, batch_size=256)

| Benchmark | LazyTensorStorage | StoreStorage | Ratio |
|---|---|---|---|
| Fill (5K elements) | 3.4 ms | 38.9 ms | ~11x |
| Sample (batch of 256) | 79 µs | 1,729 µs | ~22x |
| Sample w/ prefetch=4 | 101 µs | 1,758 µs | ~17x |
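
As a quick check of the Ratio column, the snippet below recomputes each ratio from the timings reported above. It assumes the ratio is simply StoreStorage time divided by LazyTensorStorage time; the variable names are illustrative only.

```python
# Timings copied from the benchmark results above, converted to microseconds
# as (LazyTensorStorage, StoreStorage) pairs.
timings_us = {
    "Fill (5K elements)": (3.4e3, 38.9e3),
    "Sample (batch of 256)": (79.0, 1729.0),
    "Sample w/ prefetch=4": (101.0, 1758.0),
}

# Assumed definition: how many times longer StoreStorage takes.
ratios = {name: store / lazy for name, (lazy, store) in timings_us.items()}

for name, ratio in ratios.items():
    print(f"{name}: StoreStorage takes ~{ratio:.0f}x as long")
```

The rounded values agree with the reported ~11x, ~22x and ~17x.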

Dependencies

Requires tensordict with the tensordict.store module, specifically the following fixes (not yet merged):

  • Per-element non-tensor indexing in TensorDictStore
  • TensorClass classification fix (is_tensor_collection instead of isinstance(val, TensorDictBase))
  • NonTensorStack handling in __setitem__
  • torch.Tensor index support in _aset_non_tensor_at
  • SETRANGE-based pre-allocation instead of full zero tensor on first write
  • Batched _abatch_set_non_tensor_at for pipeline efficiency

CI will fail until the tensordict changes are merged.
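
The SETRANGE-based pre-allocation item can be illustrated without a Redis server. The sketch below uses a hypothetical in-memory `FakeByteStore` that mimics the documented Redis `SETRANGE` behaviour (writing past the end zero-pads the string and the new length is returned). Reserving the buffer by writing a single byte at the last offset is an assumption made for illustration; the PR does not spell out how tensordict does it.

```python
class FakeByteStore:
    """In-memory stand-in that mimics Redis SETRANGE semantics (no real Redis involved)."""

    def __init__(self):
        self._data = {}

    def setrange(self, key: str, offset: int, value: bytes) -> int:
        buf = self._data.setdefault(key, bytearray())
        end = offset + len(value)
        if len(buf) < end:
            # Like Redis, pad with zero bytes when writing past the current end.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[offset:end] = value
        return len(buf)

    def get(self, key: str) -> bytes:
        return bytes(self._data.get(key, b""))


def preallocate(store: FakeByteStore, key: str, n_elements: int, itemsize: int) -> int:
    """Reserve n_elements * itemsize bytes by writing one zero byte at the last offset."""
    total = n_elements * itemsize
    return store.setrange(key, total - 1, b"\x00")


store = FakeByteStore()
row_bytes = 64 * 4  # assumed layout: 64 float32 values per row
size = preallocate(store, "obs", n_elements=10_000, itemsize=row_bytes)
assert size == 10_000 * row_bytes

# A later write to row 3 only touches that row's byte range.
store.setrange("obs", 3 * row_bytes, b"\x01" * row_bytes)
```

The point of this pattern is that the client sends one byte rather than a full zero-filled buffer over the wire on first write.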

Test plan

  • Tensor tests: flat, nested, round-robin overwrite, extend overflow
  • Non-tensor tests: strings, TensorClass, History, ChatHistory, mixed extend
  • pytest-benchmark: fill and sample throughput for both storage types
  • CI (blocked on tensordict changes)
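
The "round-robin overwrite" and "extend overflow" cases in the test plan both come down to index arithmetic on a fixed-size buffer. A minimal sketch of that arithmetic follows; `round_robin_indices` is a hypothetical helper and not the writer used in TorchRL.

```python
def round_robin_indices(cursor: int, n_items: int, max_size: int):
    """Return (slots, new_cursor) for writing n_items starting at cursor, wrapping at max_size."""
    indices = [(cursor + i) % max_size for i in range(n_items)]
    return indices, (cursor + n_items) % max_size


# A buffer with 5 slots whose cursor is at 3 receives a batch of 4 items:
# the write overflows the end and wraps onto the oldest slots.
indices, cursor = round_robin_indices(cursor=3, n_items=4, max_size=5)
print(indices, cursor)  # [3, 4, 0, 1] 2
```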

Made with Cursor

Add a new Storage subclass that delegates to tensordict's TensorDictStore
for out-of-core replay buffer storage via Redis-compatible key-value stores.

Supports tensors, non-tensor data (strings, Python objects), nested TensorDicts,
and TensorClass types. The storage is lazily initialized on first write and
tracks TensorClass types for transparent re-wrapping on retrieval.

Requires tensordict with the tensordict.store module (per-element non-tensor
indexing, TensorClass classification fix, NonTensorStack handling, and
tensor index support in _aset_non_tensor_at).

Co-authored-by: Cursor <cursoragent@cursor.com>
pytorch-bot bot commented Feb 18, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3516

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Cancelled Job

As of commit 55a3aab with merge base 83c2101:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Feb 18, 2026
github-actions bot (Contributor) commented Feb 18, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 86.3225μs 85.2634μs 11.7284 KOps/s 12.3535 KOps/s $\textbf{\color{#d91a1a}-5.06\%}$
test_tensor_to_bytestream_speed[torch.save] 0.1443ms 0.1416ms 7.0611 KOps/s 7.0357 KOps/s $\color{#35bf28}+0.36\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1020s 0.1009s 9.9123 Ops/s 9.3455 Ops/s $\textbf{\color{#35bf28}+6.06\%}$
test_tensor_to_bytestream_speed[numpy] 2.5283μs 2.5227μs 396.4021 KOps/s 410.9704 KOps/s $\color{#d91a1a}-3.54\%$
test_tensor_to_bytestream_speed[safetensors] 36.9398μs 36.7234μs 27.2306 KOps/s 26.1106 KOps/s $\color{#35bf28}+4.29\%$
test_simple 0.7766s 0.7710s 1.2969 Ops/s 1.2371 Ops/s $\color{#35bf28}+4.84\%$
test_transformed 1.3536s 1.3528s 0.7392 Ops/s 0.7304 Ops/s $\color{#35bf28}+1.20\%$
test_serial 2.2978s 2.2602s 0.4424 Ops/s 0.4422 Ops/s $\color{#35bf28}+0.04\%$
test_parallel 1.9004s 1.8044s 0.5542 Ops/s 0.5619 Ops/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-True-True-True-True] 0.3351ms 43.1662μs 23.1663 KOps/s 22.5737 KOps/s $\color{#35bf28}+2.62\%$
test_step_mdp_speed[True-True-True-True-False] 0.1433ms 23.4769μs 42.5950 KOps/s 41.7493 KOps/s $\color{#35bf28}+2.03\%$
test_step_mdp_speed[True-True-True-False-True] 0.4613ms 24.5375μs 40.7540 KOps/s 40.7038 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-True-False-False] 44.2310μs 13.4860μs 74.1509 KOps/s 73.0664 KOps/s $\color{#35bf28}+1.48\%$
test_step_mdp_speed[True-True-False-True-True] 0.4631ms 46.6829μs 21.4211 KOps/s 21.5591 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-True-False-True-False] 0.4416ms 26.3485μs 37.9529 KOps/s 38.3268 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-True-False-False-True] 0.4481ms 26.7646μs 37.3628 KOps/s 36.3818 KOps/s $\color{#35bf28}+2.70\%$
test_step_mdp_speed[True-True-False-False-False] 47.2100μs 16.0251μs 62.4021 KOps/s 64.0149 KOps/s $\color{#d91a1a}-2.52\%$
test_step_mdp_speed[True-False-True-True-True] 0.4707ms 49.3377μs 20.2685 KOps/s 20.4393 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[True-False-True-True-False] 0.4469ms 29.6305μs 33.7490 KOps/s 34.6635 KOps/s $\color{#d91a1a}-2.64\%$
test_step_mdp_speed[True-False-True-False-True] 58.5610μs 27.5149μs 36.3439 KOps/s 36.3649 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[True-False-True-False-False] 0.4324ms 16.1025μs 62.1021 KOps/s 62.4998 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-False-True-True] 0.4711ms 51.9760μs 19.2396 KOps/s 19.8755 KOps/s $\color{#d91a1a}-3.20\%$
test_step_mdp_speed[True-False-False-True-False] 0.4489ms 32.0558μs 31.1956 KOps/s 32.1354 KOps/s $\color{#d91a1a}-2.92\%$
test_step_mdp_speed[True-False-False-False-True] 94.9010μs 29.4065μs 34.0061 KOps/s 34.0696 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-False-False-False-False] 0.4370ms 18.8122μs 53.1571 KOps/s 55.0313 KOps/s $\color{#d91a1a}-3.41\%$
test_step_mdp_speed[False-True-True-True-True] 0.4699ms 49.0443μs 20.3897 KOps/s 20.7264 KOps/s $\color{#d91a1a}-1.62\%$
test_step_mdp_speed[False-True-True-True-False] 0.4396ms 29.0591μs 34.4126 KOps/s 34.6809 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[False-True-True-False-True] 2.5020ms 31.0626μs 32.1930 KOps/s 32.0669 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-True-True-False-False] 0.4400ms 17.7102μs 56.4646 KOps/s 57.2473 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[False-True-False-True-True] 0.4951ms 50.8123μs 19.6803 KOps/s 19.5420 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[False-True-False-True-False] 0.4527ms 31.6778μs 31.5678 KOps/s 31.6556 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-True-False-False-True] 68.3510μs 33.1382μs 30.1766 KOps/s 30.4518 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-True-False-False-False] 0.4356ms 20.3451μs 49.1519 KOps/s 50.5315 KOps/s $\color{#d91a1a}-2.73\%$
test_step_mdp_speed[False-False-True-True-True] 0.4680ms 54.4203μs 18.3755 KOps/s 18.5108 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[False-False-True-True-False] 0.4585ms 34.8717μs 28.6766 KOps/s 28.6109 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-False-True-False-True] 69.7810μs 33.5813μs 29.7785 KOps/s 29.4209 KOps/s $\color{#35bf28}+1.22\%$
test_step_mdp_speed[False-False-True-False-False] 0.4368ms 20.0972μs 49.7582 KOps/s 49.7346 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-False-False-True-True] 0.4832ms 56.6102μs 17.6647 KOps/s 18.0660 KOps/s $\color{#d91a1a}-2.22\%$
test_step_mdp_speed[False-False-False-True-False] 0.4600ms 36.8232μs 27.1568 KOps/s 27.3752 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[False-False-False-False-True] 0.4501ms 35.7294μs 27.9882 KOps/s 28.3573 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-False-False-False-False] 76.7610μs 22.5884μs 44.2705 KOps/s 44.3369 KOps/s $\color{#d91a1a}-0.15\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8356s 0.7357s 1.3592 Ops/s 1.3365 Ops/s $\color{#35bf28}+1.69\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7072s 0.6108s 1.6373 Ops/s 1.5798 Ops/s $\color{#35bf28}+3.64\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7294s 1.6505s 0.6059 Ops/s 0.5869 Ops/s $\color{#35bf28}+3.23\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5020s 1.4270s 0.7008 Ops/s 0.6777 Ops/s $\color{#35bf28}+3.41\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9364s 1.8549s 0.5391 Ops/s 0.5150 Ops/s $\color{#35bf28}+4.68\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7121s 1.6331s 0.6123 Ops/s 0.6004 Ops/s $\color{#35bf28}+1.99\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6957s 4.5663s 0.2190 Ops/s 0.2189 Ops/s $\color{#35bf28}+0.04\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5375s 4.3957s 0.2275 Ops/s 0.2249 Ops/s $\color{#35bf28}+1.14\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9529s 1.8889s 0.5294 Ops/s 0.5321 Ops/s $\color{#d91a1a}-0.51\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6877s 1.5811s 0.6325 Ops/s 0.6338 Ops/s $\color{#d91a1a}-0.21\%$
test_values[generalized_advantage_estimate-True-True] 22.3490ms 21.4628ms 46.5923 Ops/s 49.4749 Ops/s $\textbf{\color{#d91a1a}-5.83\%}$
test_values[vec_generalized_advantage_estimate-True-True] 0.1324s 3.5674ms 280.3141 Ops/s 286.0080 Ops/s $\color{#d91a1a}-1.99\%$
test_values[td0_return_estimate-False-False] 0.1074ms 81.5336μs 12.2649 KOps/s 12.1591 KOps/s $\color{#35bf28}+0.87\%$
test_values[td1_return_estimate-False-False] 53.2741ms 51.6502ms 19.3610 Ops/s 20.6634 Ops/s $\textbf{\color{#d91a1a}-6.30\%}$
test_values[vec_td1_return_estimate-False-False] 1.3324ms 1.0790ms 926.7482 Ops/s 930.2239 Ops/s $\color{#d91a1a}-0.37\%$
test_values[td_lambda_return_estimate-True-False] 86.7358ms 77.8371ms 12.8474 Ops/s 12.4958 Ops/s $\color{#35bf28}+2.81\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3451ms 1.0787ms 927.0168 Ops/s 931.9045 Ops/s $\color{#d91a1a}-0.52\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 20.3682ms 20.0165ms 49.9588 Ops/s 46.5555 Ops/s $\textbf{\color{#35bf28}+7.31\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0193ms 0.7452ms 1.3419 KOps/s 1.3381 KOps/s $\color{#35bf28}+0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7211ms 0.6657ms 1.5022 KOps/s 1.4571 KOps/s $\color{#35bf28}+3.10\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5356ms 1.4801ms 675.6517 Ops/s 670.4766 Ops/s $\color{#35bf28}+0.77\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7520ms 0.6827ms 1.4649 KOps/s 1.4182 KOps/s $\color{#35bf28}+3.29\%$
test_dqn_speed[False-None] 1.6006ms 1.5087ms 662.8357 Ops/s 652.6933 Ops/s $\color{#35bf28}+1.55\%$
test_dqn_speed[False-backward] 2.2649ms 2.1446ms 466.2817 Ops/s 464.5936 Ops/s $\color{#35bf28}+0.36\%$
test_dqn_speed[True-None] 1.2178ms 0.5497ms 1.8191 KOps/s 1.7813 KOps/s $\color{#35bf28}+2.12\%$
test_dqn_speed[True-backward] 1.1174ms 1.0768ms 928.6573 Ops/s 919.2257 Ops/s $\color{#35bf28}+1.03\%$
test_dqn_speed[reduce-overhead-None] 0.6293ms 0.5751ms 1.7388 KOps/s 1.6897 KOps/s $\color{#35bf28}+2.90\%$
test_ddpg_speed[False-None] 3.1855ms 2.8267ms 353.7725 Ops/s 351.3948 Ops/s $\color{#35bf28}+0.68\%$
test_ddpg_speed[False-backward] 4.4101ms 4.0864ms 244.7154 Ops/s 242.2564 Ops/s $\color{#35bf28}+1.02\%$
test_ddpg_speed[True-None] 1.3803ms 1.2942ms 772.6811 Ops/s 767.0925 Ops/s $\color{#35bf28}+0.73\%$
test_ddpg_speed[True-backward] 2.3864ms 2.3360ms 428.0811 Ops/s 401.1329 Ops/s $\textbf{\color{#35bf28}+6.72\%}$
test_ddpg_speed[reduce-overhead-None] 1.4360ms 1.3168ms 759.4334 Ops/s 750.6934 Ops/s $\color{#35bf28}+1.16\%$
test_sac_speed[False-None] 12.3916ms 8.3049ms 120.4102 Ops/s 121.2914 Ops/s $\color{#d91a1a}-0.73\%$
test_sac_speed[False-backward] 11.6136ms 11.1362ms 89.7968 Ops/s 87.0563 Ops/s $\color{#35bf28}+3.15\%$
test_sac_speed[True-None] 1.9445ms 1.7910ms 558.3483 Ops/s 553.5592 Ops/s $\color{#35bf28}+0.87\%$
test_sac_speed[True-backward] 3.4610ms 3.3780ms 296.0311 Ops/s 293.7187 Ops/s $\color{#35bf28}+0.79\%$
test_sac_speed[reduce-overhead-None] 19.5349ms 11.0752ms 90.2917 Ops/s 90.3633 Ops/s $\color{#d91a1a}-0.08\%$
test_redq_deprec_speed[False-None] 9.8532ms 9.2292ms 108.3519 Ops/s 77.0714 Ops/s $\textbf{\color{#35bf28}+40.59\%}$
test_redq_deprec_speed[False-backward] 12.8930ms 12.2871ms 81.3862 Ops/s 80.3949 Ops/s $\color{#35bf28}+1.23\%$
test_redq_deprec_speed[True-None] 2.6935ms 2.4751ms 404.0166 Ops/s 400.7956 Ops/s $\color{#35bf28}+0.80\%$
test_redq_deprec_speed[True-backward] 4.5432ms 4.2238ms 236.7544 Ops/s 233.9607 Ops/s $\color{#35bf28}+1.19\%$
test_redq_deprec_speed[reduce-overhead-None] 16.3089ms 9.9270ms 100.7358 Ops/s 100.3240 Ops/s $\color{#35bf28}+0.41\%$
test_td3_speed[False-None] 8.1714ms 8.0645ms 124.0000 Ops/s 123.0720 Ops/s $\color{#35bf28}+0.75\%$
test_td3_speed[False-backward] 11.1307ms 10.6818ms 93.6170 Ops/s 92.4097 Ops/s $\color{#35bf28}+1.31\%$
test_td3_speed[True-None] 1.6310ms 1.6093ms 621.3995 Ops/s 625.1742 Ops/s $\color{#d91a1a}-0.60\%$
test_td3_speed[True-backward] 3.2579ms 3.1886ms 313.6166 Ops/s 307.0352 Ops/s $\color{#35bf28}+2.14\%$
test_td3_speed[reduce-overhead-None] 83.2163ms 24.5238ms 40.7768 Ops/s 40.0680 Ops/s $\color{#35bf28}+1.77\%$
test_cql_speed[False-None] 17.2221ms 16.9705ms 58.9257 Ops/s 58.3616 Ops/s $\color{#35bf28}+0.97\%$
test_cql_speed[False-backward] 23.0350ms 22.5769ms 44.2931 Ops/s 44.0481 Ops/s $\color{#35bf28}+0.56\%$
test_cql_speed[True-None] 3.3051ms 3.1954ms 312.9493 Ops/s 309.9052 Ops/s $\color{#35bf28}+0.98\%$
test_cql_speed[True-backward] 5.7199ms 5.4149ms 184.6759 Ops/s 182.2359 Ops/s $\color{#35bf28}+1.34\%$
test_cql_speed[reduce-overhead-None] 19.2668ms 11.9802ms 83.4711 Ops/s 82.9005 Ops/s $\color{#35bf28}+0.69\%$
test_a2c_speed[False-None] 4.0251ms 3.2199ms 310.5722 Ops/s 310.2726 Ops/s $\color{#35bf28}+0.10\%$
test_a2c_speed[False-backward] 6.7083ms 6.3055ms 158.5917 Ops/s 156.8652 Ops/s $\color{#35bf28}+1.10\%$
test_a2c_speed[True-None] 1.4237ms 1.3314ms 751.0778 Ops/s 760.5399 Ops/s $\color{#d91a1a}-1.24\%$
test_a2c_speed[True-backward] 3.0829ms 3.0462ms 328.2811 Ops/s 321.3408 Ops/s $\color{#35bf28}+2.16\%$
test_a2c_speed[reduce-overhead-None] 1.0597ms 0.9719ms 1.0289 KOps/s 1.0124 KOps/s $\color{#35bf28}+1.63\%$
test_ppo_speed[False-None] 3.9091ms 3.7989ms 263.2317 Ops/s 253.8532 Ops/s $\color{#35bf28}+3.69\%$
test_ppo_speed[False-backward] 7.5736ms 7.1340ms 140.1729 Ops/s 137.6003 Ops/s $\color{#35bf28}+1.87\%$
test_ppo_speed[True-None] 1.4989ms 1.4126ms 707.8979 Ops/s 692.3756 Ops/s $\color{#35bf28}+2.24\%$
test_ppo_speed[True-backward] 3.2574ms 3.1999ms 312.5072 Ops/s 321.2587 Ops/s $\color{#d91a1a}-2.72\%$
test_ppo_speed[reduce-overhead-None] 1.1084ms 1.0469ms 955.1717 Ops/s 917.0066 Ops/s $\color{#35bf28}+4.16\%$
test_reinforce_speed[False-None] 2.3611ms 2.2629ms 441.9128 Ops/s 441.7925 Ops/s $\color{#35bf28}+0.03\%$
test_reinforce_speed[False-backward] 3.7259ms 3.2871ms 304.2236 Ops/s 294.4070 Ops/s $\color{#35bf28}+3.33\%$
test_reinforce_speed[True-None] 1.3878ms 1.2695ms 787.6843 Ops/s 787.0332 Ops/s $\color{#35bf28}+0.08\%$
test_reinforce_speed[True-backward] 2.9144ms 2.8626ms 349.3331 Ops/s 329.5477 Ops/s $\textbf{\color{#35bf28}+6.00\%}$
test_reinforce_speed[reduce-overhead-None] 17.4165ms 9.4665ms 105.6360 Ops/s 105.2378 Ops/s $\color{#35bf28}+0.38\%$
test_iql_speed[False-None] 9.8633ms 9.2870ms 107.6772 Ops/s 106.1544 Ops/s $\color{#35bf28}+1.43\%$
test_iql_speed[False-backward] 13.5709ms 12.9958ms 76.9480 Ops/s 74.5375 Ops/s $\color{#35bf28}+3.23\%$
test_iql_speed[True-None] 2.2299ms 2.1392ms 467.4727 Ops/s 462.6428 Ops/s $\color{#35bf28}+1.04\%$
test_iql_speed[True-backward] 4.7030ms 4.6144ms 216.7150 Ops/s 207.0177 Ops/s $\color{#35bf28}+4.68\%$
test_iql_speed[reduce-overhead-None] 18.0740ms 10.5785ms 94.5313 Ops/s 94.5017 Ops/s $\color{#35bf28}+0.03\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0746ms 5.9020ms 169.4327 Ops/s 166.4604 Ops/s $\color{#35bf28}+1.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0581ms 0.3372ms 2.9653 KOps/s 3.1640 KOps/s $\textbf{\color{#d91a1a}-6.28\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5485ms 0.3215ms 3.1107 KOps/s 3.3797 KOps/s $\textbf{\color{#d91a1a}-7.96\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1469ms 5.7085ms 175.1771 Ops/s 168.7552 Ops/s $\color{#35bf28}+3.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5648ms 0.2930ms 3.4127 KOps/s 3.5726 KOps/s $\color{#d91a1a}-4.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5541ms 0.3105ms 3.2205 KOps/s 3.8289 KOps/s $\textbf{\color{#d91a1a}-15.89\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4101ms 1.2282ms 814.2190 Ops/s 754.8068 Ops/s $\textbf{\color{#35bf28}+7.87\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5663ms 1.1604ms 861.7755 Ops/s 816.6006 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2158ms 5.8480ms 170.9995 Ops/s 163.9260 Ops/s $\color{#35bf28}+4.32\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8862ms 0.5021ms 1.9914 KOps/s 2.0285 KOps/s $\color{#d91a1a}-1.83\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7795s 0.9689ms 1.0321 KOps/s 1.9573 KOps/s $\textbf{\color{#d91a1a}-47.27\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0413ms 5.8647ms 170.5116 Ops/s 166.7181 Ops/s $\color{#35bf28}+2.28\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0973ms 0.3407ms 2.9347 KOps/s 2.9447 KOps/s $\color{#d91a1a}-0.34\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7190ms 0.3086ms 3.2404 KOps/s 3.2689 KOps/s $\color{#d91a1a}-0.87\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1131ms 5.8213ms 171.7833 Ops/s 169.0638 Ops/s $\color{#35bf28}+1.61\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3547ms 0.3111ms 3.2148 KOps/s 3.2173 KOps/s $\color{#d91a1a}-0.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5059ms 0.2922ms 3.4228 KOps/s 2.9210 KOps/s $\textbf{\color{#35bf28}+17.18\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1457ms 5.9990ms 166.6940 Ops/s 164.6261 Ops/s $\color{#35bf28}+1.26\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3346ms 0.4689ms 2.1325 KOps/s 1.9110 KOps/s $\textbf{\color{#35bf28}+11.59\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7034ms 0.4315ms 2.3175 KOps/s 2.1134 KOps/s $\textbf{\color{#35bf28}+9.66\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.4398ms 5.0165ms 199.3427 Ops/s 48.4103 Ops/s $\textbf{\color{#35bf28}+311.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.2908ms 2.1825ms 458.1826 Ops/s 509.4192 Ops/s $\textbf{\color{#d91a1a}-10.06\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.2309ms 0.9294ms 1.0760 KOps/s 836.6008 Ops/s $\textbf{\color{#35bf28}+28.61\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5909s 16.7584ms 59.6714 Ops/s 192.6870 Ops/s $\textbf{\color{#d91a1a}-69.03\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.1308ms 1.8930ms 528.2752 Ops/s 531.9087 Ops/s $\color{#d91a1a}-0.68\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.1492ms 1.1691ms 855.3456 Ops/s 706.6405 Ops/s $\textbf{\color{#35bf28}+21.04\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 9.6479ms 5.2914ms 188.9863 Ops/s 187.3876 Ops/s $\color{#35bf28}+0.85\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.2412ms 1.9703ms 507.5341 Ops/s 464.3003 Ops/s $\textbf{\color{#35bf28}+9.31\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.1980ms 1.0553ms 947.6388 Ops/s 797.7349 Ops/s $\textbf{\color{#35bf28}+18.79\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 39.6169ms 35.6915ms 28.0179 Ops/s 27.2762 Ops/s $\color{#35bf28}+2.72\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.8785ms 17.9685ms 55.6528 Ops/s 54.4580 Ops/s $\color{#35bf28}+2.19\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 0.5782s 47.7658ms 20.9355 Ops/s 26.5200 Ops/s $\textbf{\color{#d91a1a}-21.06\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6376ms 18.2034ms 54.9348 Ops/s 53.8230 Ops/s $\color{#35bf28}+2.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.1358ms 38.3402ms 26.0823 Ops/s 25.2978 Ops/s $\color{#35bf28}+3.10\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.4668ms 19.9854ms 50.0366 Ops/s 48.7729 Ops/s $\color{#35bf28}+2.59\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8283ms 0.2162ms 4.6249 KOps/s 4.4562 KOps/s $\color{#35bf28}+3.79\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.7480ms 1.4005ms 714.0140 Ops/s 711.6426 Ops/s $\color{#35bf28}+0.33\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.6828ms 2.2950ms 435.7344 Ops/s 425.1025 Ops/s $\color{#35bf28}+2.50\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1166ms 2.9138ms 343.1971 Ops/s 332.9278 Ops/s $\color{#35bf28}+3.08\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2562ms 0.1624ms 6.1559 KOps/s 5.9345 KOps/s $\color{#35bf28}+3.73\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.6240ms 0.2115ms 4.7287 KOps/s 4.7042 KOps/s $\color{#35bf28}+0.52\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9110ms 1.7657ms 566.3572 Ops/s 544.3368 Ops/s $\color{#35bf28}+4.05\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5656ms 1.3913ms 718.7472 Ops/s 787.8196 Ops/s $\textbf{\color{#d91a1a}-8.77\%}$
test_collector_stack_then_write[50-img_shape0-small] 1.3076ms 1.1533ms 867.0694 Ops/s 869.6070 Ops/s $\color{#d91a1a}-0.29\%$
test_collector_stack_then_write[100-img_shape1-atari] 7.6041ms 3.6880ms 271.1501 Ops/s 272.7089 Ops/s $\color{#d91a1a}-0.57\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.2973ms 5.7506ms 173.8963 Ops/s 170.4980 Ops/s $\color{#35bf28}+1.99\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 15.1275ms 7.1647ms 139.5741 Ops/s 134.7593 Ops/s $\color{#35bf28}+3.57\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4423ms 0.2710ms 3.6901 KOps/s 3.6265 KOps/s $\color{#35bf28}+1.76\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.5997ms 1.4640ms 683.0829 Ops/s 677.5828 Ops/s $\color{#35bf28}+0.81\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8551ms 2.4490ms 408.3270 Ops/s 403.7778 Ops/s $\color{#35bf28}+1.13\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4372ms 3.1286ms 319.6360 Ops/s 314.0951 Ops/s $\color{#35bf28}+1.76\%$
test_collector_without_rb[100-img_shape0-atari] 33.1255ms 32.5193ms 30.7510 Ops/s 30.1335 Ops/s $\color{#35bf28}+2.05\%$
test_collector_without_rb[200-img_shape1-large_batch] 63.7388ms 63.5259ms 15.7416 Ops/s 15.3551 Ops/s $\color{#35bf28}+2.52\%$
test_collector_with_rb[100-img_shape0-atari] 37.4403ms 36.8977ms 27.1020 Ops/s 26.6297 Ops/s $\color{#35bf28}+1.77\%$
test_collector_with_rb[200-img_shape1-large_batch] 95.8153ms 79.3001ms 12.6103 Ops/s 13.5335 Ops/s $\textbf{\color{#d91a1a}-6.82\%}$
test_collector_without_rb_cuda[100-img_shape0-atari] 54.7344ms 54.5629ms 18.3275 Ops/s 18.1326 Ops/s $\color{#35bf28}+1.07\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1091s 0.1088s 9.1945 Ops/s 9.0857 Ops/s $\color{#35bf28}+1.20\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 56.8767ms 56.6378ms 17.6561 Ops/s 17.4718 Ops/s $\color{#35bf28}+1.05\%$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1132s 0.1126s 8.8796 Ops/s 8.7751 Ops/s $\color{#35bf28}+1.19\%$
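
The Change column in the table above appears to be the relative difference between "Ops" and "Ops on Repo HEAD". The short sketch below recomputes it for two rows under that assumption; the formula is inferred from the numbers and not taken from the benchmark tooling.

```python
def percent_change(ops: float, ops_on_head: float) -> float:
    """Relative change of this PR's throughput versus repo HEAD, in percent."""
    return (ops / ops_on_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the GPU benchmark table.
rows = {
    "test_tensor_to_bytestream_speed[pickle]": (11.7284, 12.3535),  # KOps/s
    "test_redq_deprec_speed[False-None]": (108.3519, 77.0714),  # Ops/s
}

for name, (ops, head) in rows.items():
    print(f"{name}: {percent_change(ops, head):+.2f}%")
```

Both rows reproduce the reported values (-5.06% and +40.59%).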

vmoens and others added 2 commits February 19, 2026 10:18
…lure

The test_benchmark_store_storage.py module imported redis at the top level,
causing ModuleNotFoundError during pytest collection in CI environments
where redis is not installed. This blocked all ~26k tests from running.

Co-authored-by: Cursor <cursoragent@cursor.com>
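
One common way to keep an optional dependency from breaking test collection is to probe for it without importing it and to defer the real import. The sketch below shows that pattern with the standard library; it is not necessarily what this commit does, and `has_module` and `connect` are hypothetical helpers.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` is importable, without importing it."""
    return importlib.util.find_spec(name) is not None


# Evaluated when the module is imported; never raises if redis is missing.
HAS_REDIS = has_module("redis")


def connect(host: str = "localhost", port: int = 6379):
    """Hypothetical helper: import redis only when a connection is actually requested."""
    if not HAS_REDIS:
        raise RuntimeError("redis is not installed; skip the benchmark instead")
    import redis  # local import keeps module import safe without redis

    return redis.Redis(host=host, port=port)
```

In a pytest module, calling `pytest.importorskip("redis")` near the top has a similar effect: the module is skipped at collection time instead of failing with ModuleNotFoundError.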
github-actions bot (Contributor) commented

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 81.7556μs 80.8702μs 12.3655 KOps/s 11.8203 KOps/s $\color{#35bf28}+4.61\%$
test_tensor_to_bytestream_speed[torch.save] 0.1399ms 0.1394ms 7.1757 KOps/s 7.0626 KOps/s $\color{#35bf28}+1.60\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1101s 0.1099s 9.0962 Ops/s 9.3856 Ops/s $\color{#d91a1a}-3.08\%$
test_tensor_to_bytestream_speed[numpy] 2.6494μs 2.6412μs 378.6187 KOps/s 377.7656 KOps/s $\color{#35bf28}+0.23\%$
test_tensor_to_bytestream_speed[safetensors] 37.3803μs 37.2654μs 26.8345 KOps/s 25.0530 KOps/s $\textbf{\color{#35bf28}+7.11\%}$
test_simple 0.5573s 0.5512s 1.8141 Ops/s 1.7633 Ops/s $\color{#35bf28}+2.88\%$
test_transformed 1.0912s 1.0900s 0.9174 Ops/s 0.9010 Ops/s $\color{#35bf28}+1.82\%$
test_serial 1.6845s 1.6734s 0.5976 Ops/s 0.5960 Ops/s $\color{#35bf28}+0.27\%$
test_parallel 1.0183s 1.0160s 0.9842 Ops/s 0.9879 Ops/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[True-True-True-True-True] 0.1319ms 41.5393μs 24.0736 KOps/s 23.9472 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[True-True-True-True-False] 47.6910μs 23.7065μs 42.1826 KOps/s 42.0781 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-True-True-False-True] 51.1510μs 24.2344μs 41.2637 KOps/s 41.9393 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[True-True-True-False-False] 71.2720μs 13.0074μs 76.8793 KOps/s 76.5496 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-True-False-True-True] 73.3110μs 44.8447μs 22.2992 KOps/s 21.9665 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[True-True-False-True-False] 53.9710μs 26.2139μs 38.1477 KOps/s 37.3680 KOps/s $\color{#35bf28}+2.09\%$
test_step_mdp_speed[True-True-False-False-True] 57.5520μs 26.3605μs 37.9356 KOps/s 37.2396 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[True-True-False-False-False] 43.6500μs 15.9621μs 62.6484 KOps/s 62.5411 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[True-False-True-True-True] 78.1120μs 48.6312μs 20.5629 KOps/s 21.0324 KOps/s $\color{#d91a1a}-2.23\%$
test_step_mdp_speed[True-False-True-True-False] 55.3410μs 29.0848μs 34.3823 KOps/s 34.3100 KOps/s $\color{#35bf28}+0.21\%$
test_step_mdp_speed[True-False-True-False-True] 54.9520μs 26.6467μs 37.5281 KOps/s 37.5407 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[True-False-True-False-False] 41.7110μs 15.9029μs 62.8814 KOps/s 63.2548 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[True-False-False-True-True] 87.7120μs 50.3019μs 19.8800 KOps/s 19.5593 KOps/s $\color{#35bf28}+1.64\%$
test_step_mdp_speed[True-False-False-True-False] 58.4310μs 31.3611μs 31.8867 KOps/s 31.3917 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[True-False-False-False-True] 57.6110μs 29.2142μs 34.2300 KOps/s 34.3812 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[True-False-False-False-False] 55.8010μs 18.4689μs 54.1450 KOps/s 54.1005 KOps/s $\color{#35bf28}+0.08\%$
test_step_mdp_speed[False-True-True-True-True] 80.8820μs 48.1612μs 20.7636 KOps/s 20.3849 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[False-True-True-True-False] 58.6620μs 28.9666μs 34.5225 KOps/s 34.2276 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-True-True-False-True] 2.4960ms 30.5678μs 32.7141 KOps/s 32.8392 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-True-True-False-False] 47.1810μs 17.6263μs 56.7333 KOps/s 57.3238 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[False-True-False-True-True] 78.2720μs 50.7770μs 19.6939 KOps/s 19.8695 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[False-True-False-True-False] 60.8520μs 31.2338μs 32.0166 KOps/s 31.4545 KOps/s $\color{#35bf28}+1.79\%$
test_step_mdp_speed[False-True-False-False-True] 77.0720μs 33.0799μs 30.2299 KOps/s 30.8034 KOps/s $\color{#d91a1a}-1.86\%$
test_step_mdp_speed[False-True-False-False-False] 52.8110μs 20.9658μs 47.6968 KOps/s 50.0876 KOps/s $\color{#d91a1a}-4.77\%$
test_step_mdp_speed[False-False-True-True-True] 89.8520μs 52.4924μs 19.0504 KOps/s 18.5876 KOps/s $\color{#35bf28}+2.49\%$
test_step_mdp_speed[False-False-True-True-False] 71.5610μs 34.1069μs 29.3195 KOps/s 28.7035 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[False-False-True-False-True] 64.0010μs 33.1158μs 30.1971 KOps/s 30.0351 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[False-False-True-False-False] 48.0510μs 20.0800μs 49.8008 KOps/s 49.4008 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-False-False-True-True] 91.3520μs 53.8591μs 18.5670 KOps/s 17.9151 KOps/s $\color{#35bf28}+3.64\%$
test_step_mdp_speed[False-False-False-True-False] 73.8510μs 36.6076μs 27.3167 KOps/s 26.9604 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[False-False-False-False-True] 66.2010μs 35.2099μs 28.4011 KOps/s 28.3167 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[False-False-False-False-False] 60.0720μs 22.5115μs 44.4218 KOps/s 44.8790 KOps/s $\color{#d91a1a}-1.02\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8407s 0.7423s 1.3472 Ops/s 1.3477 Ops/s $\color{#d91a1a}-0.04\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7075s 0.6076s 1.6458 Ops/s 1.6518 Ops/s $\color{#d91a1a}-0.36\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7128s 1.6380s 0.6105 Ops/s 0.6148 Ops/s $\color{#d91a1a}-0.71\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.4832s 1.4042s 0.7121 Ops/s 0.7105 Ops/s $\color{#35bf28}+0.23\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9539s 1.8710s 0.5345 Ops/s 0.5361 Ops/s $\color{#d91a1a}-0.30\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7301s 1.6484s 0.6067 Ops/s 0.6048 Ops/s $\color{#35bf28}+0.31\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7021s 4.6601s 0.2146 Ops/s 0.2140 Ops/s $\color{#35bf28}+0.28\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5759s 4.4919s 0.2226 Ops/s 0.2258 Ops/s $\color{#d91a1a}-1.39\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9413s 1.8602s 0.5376 Ops/s 0.5326 Ops/s $\color{#35bf28}+0.93\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.6864s 1.5670s 0.6382 Ops/s 0.6339 Ops/s $\color{#35bf28}+0.68\%$
test_values[generalized_advantage_estimate-True-True] 10.7181ms 10.5597ms 94.6996 Ops/s 93.4870 Ops/s $\color{#35bf28}+1.30\%$
test_values[vec_generalized_advantage_estimate-True-True] 21.1510ms 17.7220ms 56.4271 Ops/s 55.6314 Ops/s $\color{#35bf28}+1.43\%$
test_values[td0_return_estimate-False-False] 0.2221ms 0.1261ms 7.9327 KOps/s 7.4882 KOps/s $\textbf{\color{#35bf28}+5.94\%}$
test_values[td1_return_estimate-False-False] 30.3406ms 29.2170ms 34.2267 Ops/s 34.3776 Ops/s $\color{#d91a1a}-0.44\%$
test_values[vec_td1_return_estimate-False-False] 18.6000ms 17.6112ms 56.7822 Ops/s 57.0242 Ops/s $\color{#d91a1a}-0.42\%$
test_values[td_lambda_return_estimate-True-False] 45.3877ms 43.2465ms 23.1232 Ops/s 23.1211 Ops/s $+0.01\%$
test_values[vec_td_lambda_return_estimate-True-False] 20.5827ms 17.6187ms 56.7578 Ops/s 57.0766 Ops/s $\color{#d91a1a}-0.56\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.5070ms 9.3451ms 107.0083 Ops/s 106.4969 Ops/s $\color{#35bf28}+0.48\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.7134ms 1.5378ms 650.2927 Ops/s 664.2054 Ops/s $\color{#d91a1a}-2.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4744ms 0.4224ms 2.3673 KOps/s 2.3367 KOps/s $\color{#35bf28}+1.31\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.6988ms 35.3144ms 28.3171 Ops/s 28.8211 Ops/s $\color{#d91a1a}-1.75\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.8326ms 1.7007ms 587.9766 Ops/s 582.1578 Ops/s $\color{#35bf28}+1.00\%$
test_dqn_speed[False-None] 2.0755ms 1.4229ms 702.8027 Ops/s 720.9054 Ops/s $\color{#d91a1a}-2.51\%$
test_dqn_speed[False-backward] 1.9595ms 1.8986ms 526.6995 Ops/s 511.5227 Ops/s $\color{#35bf28}+2.97\%$
test_dqn_speed[True-None] 0.7575ms 0.5487ms 1.8224 KOps/s 1.8002 KOps/s $\color{#35bf28}+1.23\%$
test_dqn_speed[True-backward] 1.0733ms 1.0103ms 989.7719 Ops/s 827.7222 Ops/s $\textbf{\color{#35bf28}+19.58\%}$
test_dqn_speed[reduce-overhead-None] 0.6796ms 0.5363ms 1.8646 KOps/s 1.8081 KOps/s $\color{#35bf28}+3.12\%$
test_ddpg_speed[False-None] 3.2010ms 2.8265ms 353.7932 Ops/s 354.7235 Ops/s $\color{#d91a1a}-0.26\%$
test_ddpg_speed[False-backward] 4.2099ms 4.0348ms 247.8448 Ops/s 250.8012 Ops/s $\color{#d91a1a}-1.18\%$
test_ddpg_speed[True-None] 1.5035ms 1.3972ms 715.6981 Ops/s 710.1263 Ops/s $\color{#35bf28}+0.78\%$
test_ddpg_speed[True-backward] 2.4390ms 2.3828ms 419.6805 Ops/s 348.9104 Ops/s $\textbf{\color{#35bf28}+20.28\%}$
test_ddpg_speed[reduce-overhead-None] 1.5258ms 1.3940ms 717.3356 Ops/s 718.1841 Ops/s $\color{#d91a1a}-0.12\%$
test_sac_speed[False-None] 8.4924ms 7.9582ms 125.6562 Ops/s 126.8294 Ops/s $\color{#d91a1a}-0.92\%$
test_sac_speed[False-backward] 11.6273ms 11.1682ms 89.5397 Ops/s 90.0472 Ops/s $\color{#d91a1a}-0.56\%$
test_sac_speed[True-None] 2.3125ms 2.1602ms 462.9112 Ops/s 460.1367 Ops/s $\color{#35bf28}+0.60\%$
test_sac_speed[True-backward] 4.2021ms 4.0583ms 246.4097 Ops/s 246.4019 Ops/s $+0.00\%$
test_sac_speed[reduce-overhead-None] 2.4865ms 2.1426ms 466.7289 Ops/s 464.8997 Ops/s $\color{#35bf28}+0.39\%$
test_redq_speed[False-None] 15.6261ms 10.7252ms 93.2383 Ops/s 93.1672 Ops/s $\color{#35bf28}+0.08\%$
test_redq_speed[False-backward] 21.1190ms 17.7772ms 56.2519 Ops/s 57.6692 Ops/s $\color{#d91a1a}-2.46\%$
test_redq_speed[True-None] 4.9845ms 4.3378ms 230.5314 Ops/s 231.4843 Ops/s $\color{#d91a1a}-0.41\%$
test_redq_speed[True-backward] 10.2441ms 9.8143ms 101.8927 Ops/s 99.1286 Ops/s $\color{#35bf28}+2.79\%$
test_redq_speed[reduce-overhead-None] 4.6007ms 4.3477ms 230.0046 Ops/s 234.0355 Ops/s $\color{#d91a1a}-1.72\%$
test_redq_deprec_speed[False-None] 11.5456ms 11.0398ms 90.5814 Ops/s 91.1053 Ops/s $\color{#d91a1a}-0.57\%$
test_redq_deprec_speed[False-backward] 16.1072ms 15.7926ms 63.3207 Ops/s 63.1019 Ops/s $\color{#35bf28}+0.35\%$
test_redq_deprec_speed[True-None] 4.1315ms 3.7144ms 269.2241 Ops/s 261.3095 Ops/s $\color{#35bf28}+3.03\%$
test_redq_deprec_speed[True-backward] 7.8891ms 7.6960ms 129.9368 Ops/s 123.8785 Ops/s $\color{#35bf28}+4.89\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9464ms 3.6227ms 276.0353 Ops/s 239.9520 Ops/s $\textbf{\color{#35bf28}+15.04\%}$
test_td3_speed[False-None] 8.2591ms 8.0042ms 124.9338 Ops/s 126.3469 Ops/s $\color{#d91a1a}-1.12\%$
test_td3_speed[False-backward] 11.2570ms 10.8192ms 92.4286 Ops/s 92.7149 Ops/s $\color{#d91a1a}-0.31\%$
test_td3_speed[True-None] 1.8879ms 1.8384ms 543.9519 Ops/s 537.8350 Ops/s $\color{#35bf28}+1.14\%$
test_td3_speed[True-backward] 3.7131ms 3.6185ms 276.3581 Ops/s 252.7986 Ops/s $\textbf{\color{#35bf28}+9.32\%}$
test_td3_speed[reduce-overhead-None] 1.8423ms 1.7955ms 556.9397 Ops/s 548.4525 Ops/s $\color{#35bf28}+1.55\%$
test_cql_speed[False-None] 28.7003ms 25.7824ms 38.7861 Ops/s 38.8473 Ops/s $\color{#d91a1a}-0.16\%$
test_cql_speed[False-backward] 38.8415ms 35.3382ms 28.2980 Ops/s 28.4466 Ops/s $\color{#d91a1a}-0.52\%$
test_cql_speed[True-None] 12.8896ms 12.2580ms 81.5795 Ops/s 80.9699 Ops/s $\color{#35bf28}+0.75\%$
test_cql_speed[True-backward] 18.2251ms 17.5948ms 56.8348 Ops/s 54.7398 Ops/s $\color{#35bf28}+3.83\%$
test_cql_speed[reduce-overhead-None] 12.4710ms 12.1429ms 82.3528 Ops/s 81.5078 Ops/s $\color{#35bf28}+1.04\%$
test_a2c_speed[False-None] 5.6520ms 5.3195ms 187.9873 Ops/s 186.6844 Ops/s $\color{#35bf28}+0.70\%$
test_a2c_speed[False-backward] 11.9848ms 11.6167ms 86.0831 Ops/s 84.8815 Ops/s $\color{#35bf28}+1.42\%$
test_a2c_speed[True-None] 4.0020ms 3.6992ms 270.3301 Ops/s 271.6813 Ops/s $\color{#d91a1a}-0.50\%$
test_a2c_speed[True-backward] 8.7526ms 8.2994ms 120.4904 Ops/s 115.5889 Ops/s $\color{#35bf28}+4.24\%$
test_a2c_speed[reduce-overhead-None] 4.0647ms 3.7502ms 266.6557 Ops/s 270.2823 Ops/s $\color{#d91a1a}-1.34\%$
test_ppo_speed[False-None] 6.1015ms 5.9148ms 169.0677 Ops/s 169.9035 Ops/s $\color{#d91a1a}-0.49\%$
test_ppo_speed[False-backward] 12.9099ms 12.3948ms 80.6787 Ops/s 80.4799 Ops/s $\color{#35bf28}+0.25\%$
test_ppo_speed[True-None] 4.0561ms 3.6617ms 273.0967 Ops/s 267.9639 Ops/s $\color{#35bf28}+1.92\%$
test_ppo_speed[True-backward] 8.6132ms 8.3853ms 119.2569 Ops/s 113.4099 Ops/s $\textbf{\color{#35bf28}+5.16\%}$
test_ppo_speed[reduce-overhead-None] 3.8903ms 3.6405ms 274.6862 Ops/s 275.2426 Ops/s $\color{#d91a1a}-0.20\%$
test_reinforce_speed[False-None] 5.1011ms 4.4915ms 222.6422 Ops/s 222.3385 Ops/s $\color{#35bf28}+0.14\%$
test_reinforce_speed[False-backward] 7.5374ms 7.3049ms 136.8941 Ops/s 137.4468 Ops/s $\color{#d91a1a}-0.40\%$
test_reinforce_speed[True-None] 3.2403ms 2.8916ms 345.8271 Ops/s 348.9339 Ops/s $\color{#d91a1a}-0.89\%$
test_reinforce_speed[True-backward] 8.0124ms 7.6573ms 130.5943 Ops/s 115.7534 Ops/s $\textbf{\color{#35bf28}+12.82\%}$
test_reinforce_speed[reduce-overhead-None] 3.1065ms 2.8645ms 349.1028 Ops/s 351.7712 Ops/s $\color{#d91a1a}-0.76\%$
test_iql_speed[False-None] 25.4709ms 20.3582ms 49.1202 Ops/s 49.5448 Ops/s $\color{#d91a1a}-0.86\%$
test_iql_speed[False-backward] 30.9975ms 30.1988ms 33.1139 Ops/s 33.1537 Ops/s $\color{#d91a1a}-0.12\%$
test_iql_speed[True-None] 8.9739ms 8.5012ms 117.6307 Ops/s 118.6304 Ops/s $\color{#d91a1a}-0.84\%$
test_iql_speed[True-backward] 16.9671ms 16.5687ms 60.3547 Ops/s 57.2204 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_iql_speed[reduce-overhead-None] 9.2706ms 8.5813ms 116.5325 Ops/s 115.7335 Ops/s $\color{#35bf28}+0.69\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1439ms 6.0160ms 166.2220 Ops/s 164.4920 Ops/s $\color{#35bf28}+1.05\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1831ms 0.3684ms 2.7148 KOps/s 3.2880 KOps/s $\textbf{\color{#d91a1a}-17.44\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6898ms 0.2902ms 3.4459 KOps/s 3.5055 KOps/s $\color{#d91a1a}-1.70\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0239ms 5.7969ms 172.5059 Ops/s 171.4221 Ops/s $\color{#35bf28}+0.63\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8624ms 0.3592ms 2.7839 KOps/s 2.8085 KOps/s $\color{#d91a1a}-0.88\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5923ms 0.3426ms 2.9186 KOps/s 2.9024 KOps/s $\color{#35bf28}+0.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6911ms 1.4292ms 699.6891 Ops/s 697.9438 Ops/s $\color{#35bf28}+0.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6927ms 1.3465ms 742.6755 Ops/s 738.2268 Ops/s $\color{#35bf28}+0.60\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.2976ms 6.0481ms 165.3399 Ops/s 167.1487 Ops/s $\color{#d91a1a}-1.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7654ms 0.4445ms 2.2497 KOps/s 2.1105 KOps/s $\textbf{\color{#35bf28}+6.60\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7167ms 0.4443ms 2.2507 KOps/s 2.2630 KOps/s $\color{#d91a1a}-0.55\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.8923ms 5.7941ms 172.5881 Ops/s 168.6423 Ops/s $\color{#35bf28}+2.34\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9052ms 0.2816ms 3.5511 KOps/s 3.5129 KOps/s $\color{#35bf28}+1.09\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4124ms 0.2612ms 3.8284 KOps/s 3.8198 KOps/s $\color{#35bf28}+0.23\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9477ms 5.7337ms 174.4079 Ops/s 170.6506 Ops/s $\color{#35bf28}+2.20\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9856ms 0.2785ms 3.5907 KOps/s 3.1080 KOps/s $\textbf{\color{#35bf28}+15.53\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6385ms 0.3722ms 2.6866 KOps/s 3.8379 KOps/s $\textbf{\color{#d91a1a}-30.00\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 9.2138ms 5.9109ms 169.1804 Ops/s 164.0253 Ops/s $\color{#35bf28}+3.14\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.5546s 1.5633ms 639.6763 Ops/s 2.1019 KOps/s $\textbf{\color{#d91a1a}-69.57\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6503ms 0.4412ms 2.2666 KOps/s 2.1156 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.5754ms 5.0359ms 198.5727 Ops/s 195.7988 Ops/s $\color{#35bf28}+1.42\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9240ms 1.8076ms 553.2080 Ops/s 457.6039 Ops/s $\textbf{\color{#35bf28}+20.89\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.1554ms 0.8713ms 1.1478 KOps/s 1.0923 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.4923ms 4.9899ms 200.4051 Ops/s 56.6008 Ops/s $\textbf{\color{#35bf28}+254.07\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 13.5996ms 1.9763ms 505.9943 Ops/s 496.8948 Ops/s $\color{#35bf28}+1.83\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.8439ms 1.1209ms 892.1778 Ops/s 845.0759 Ops/s $\textbf{\color{#35bf28}+5.57\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5158s 15.5221ms 64.4244 Ops/s 190.7157 Ops/s $\textbf{\color{#d91a1a}-66.22\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.9952ms 1.8991ms 526.5551 Ops/s 485.6570 Ops/s $\textbf{\color{#35bf28}+8.42\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 12.3768ms 1.4500ms 689.6544 Ops/s 978.9569 Ops/s $\textbf{\color{#d91a1a}-29.55\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 39.3252ms 36.0149ms 27.7663 Ops/s 27.6111 Ops/s $\color{#35bf28}+0.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.6062ms 18.0768ms 55.3194 Ops/s 54.9745 Ops/s $\color{#35bf28}+0.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.6209ms 37.0740ms 26.9731 Ops/s 26.5679 Ops/s $\color{#35bf28}+1.53\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.1705ms 18.4497ms 54.2016 Ops/s 53.6321 Ops/s $\color{#35bf28}+1.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.4466ms 38.8768ms 25.7223 Ops/s 25.7374 Ops/s $\color{#d91a1a}-0.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.2317ms 19.8847ms 50.2900 Ops/s 49.8541 Ops/s $\color{#35bf28}+0.87\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8595ms 0.2186ms 4.5736 KOps/s 4.5625 KOps/s $\color{#35bf28}+0.24\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.9189ms 1.3976ms 715.5106 Ops/s 707.0462 Ops/s $\color{#35bf28}+1.20\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.6191ms 2.4090ms 415.1152 Ops/s 432.3712 Ops/s $\color{#d91a1a}-3.99\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.1531ms 2.8876ms 346.3043 Ops/s 337.6948 Ops/s $\color{#35bf28}+2.55\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2623ms 0.1340ms 7.4638 KOps/s 7.5190 KOps/s $\color{#d91a1a}-0.73\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3436ms 0.1793ms 5.5778 KOps/s 5.3488 KOps/s $\color{#35bf28}+4.28\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.9524ms 1.7229ms 580.4137 Ops/s 572.4960 Ops/s $\color{#35bf28}+1.38\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.4334ms 1.2751ms 784.2236 Ops/s 776.6984 Ops/s $\color{#35bf28}+0.97\%$
test_collector_stack_then_write[50-img_shape0-small] 1.1678ms 1.1181ms 894.3680 Ops/s 892.8781 Ops/s $\color{#35bf28}+0.17\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.8869ms 3.6762ms 272.0187 Ops/s 278.7760 Ops/s $\color{#d91a1a}-2.42\%$
test_collector_stack_then_write[100-img_shape2-large_img] 6.0269ms 5.6286ms 177.6636 Ops/s 176.9204 Ops/s $\color{#35bf28}+0.42\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.4550ms 7.1407ms 140.0425 Ops/s 141.3941 Ops/s $\color{#d91a1a}-0.96\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4151ms 0.2762ms 3.6208 KOps/s 3.6423 KOps/s $\color{#d91a1a}-0.59\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6743ms 1.5158ms 659.7120 Ops/s 651.3499 Ops/s $\color{#35bf28}+1.28\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8833ms 2.5163ms 397.4022 Ops/s 409.5255 Ops/s $\color{#d91a1a}-2.96\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3811ms 3.0806ms 324.6146 Ops/s 314.9789 Ops/s $\color{#35bf28}+3.06\%$
test_collector_without_rb[100-img_shape0-atari] 32.7456ms 32.3137ms 30.9466 Ops/s 30.2147 Ops/s $\color{#35bf28}+2.42\%$
test_collector_without_rb[200-img_shape1-large_batch] 64.0914ms 63.6770ms 15.7043 Ops/s 15.4156 Ops/s $\color{#35bf28}+1.87\%$
test_collector_with_rb[100-img_shape0-atari] 37.8538ms 36.9203ms 27.0854 Ops/s 26.7053 Ops/s $\color{#35bf28}+1.42\%$
test_collector_with_rb[200-img_shape1-large_batch] 0.6545s 0.1145s 8.7331 Ops/s 13.6389 Ops/s $\textbf{\color{#d91a1a}-35.97\%}$
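
For readers checking the table by hand: the last column appears to be the relative change between the two throughput columns (fourth divided by fifth, minus one). The units must be normalized first, because some rows mix `Ops/s` and `KOps/s`. The sketch below reproduces that arithmetic for two rows copied from the table. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling, and the column interpretation is an assumption inferred from the numbers.

```python
# Assumption: the 4th and 5th columns are throughputs for the two compared runs,
# and the last column is (4th / 5th - 1) * 100 after unit normalization.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a cell such as '2.1019 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_a: str, ops_b: str) -> float:
    """Percentage change of ops_a relative to ops_b."""
    return (parse_ops(ops_a) / parse_ops(ops_b) - 1.0) * 100.0


# Two rows copied from the table above: (4th column, 5th column).
rows = {
    "test_dqn_speed[True-backward]": ("989.7719 Ops/s", "827.7222 Ops/s"),
    "test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]": (
        "639.6763 Ops/s",
        "2.1019 KOps/s",
    ),
}

for name, (ops_a, ops_b) in rows.items():
    print(f"{name}: {relative_change(ops_a, ops_b):+.2f}%")
```

Rows whose maximum time is far above the mean (for example a 0.5546s max against a 1.5633ms mean) are likely dominated by a few outlier iterations, so a large percentage in those rows is worth re-running before treating it as a regression.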

@vmoens merged commit 74fcb21 into main on Feb 19, 2026
117 of 120 checks passed
Labels: CLA Signed, Feature, ReplayBuffers