@AmitMY AmitMY commented Dec 7, 2025

Summary

When using group_images=True with small group_max_seq_len, the image_ids tensor creation becomes a significant bottleneck. This PR adds LRU caching for image_ids tensors, similar to the existing posemb_grid cache.

The Problem

With group_images=True and group_max_seq_len=5, processing 512 images creates ~440 groups. Each group calls:

image_ids = torch.repeat_interleave(
    torch.arange(len(images), device=device),
    torch.tensor(patch_counts, device=device),
)

The torch.tensor(patch_counts, device=device) call has significant overhead when called 440 times per forward pass.
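The overhead is easy to reproduce in isolation (a standalone sketch, not the PR's benchmark; the pattern `(2, 3)` and the 440-call count mirror the scenario above):

```python
import timeit
import torch

patch_counts = (2, 3)  # a typical small per-group patch-count pattern

def build_ids():
    # The same construction as in the forward pass: a fresh
    # torch.tensor + repeat_interleave on every call.
    return torch.repeat_interleave(
        torch.arange(len(patch_counts)),
        torch.tensor(patch_counts),
    )

# ~440 groups per forward pass in the scenario above
per_call_us = timeit.timeit(build_ids, number=440) / 440 * 1e6
print(f"{per_call_us:.1f} us per call")
```

Even on CPU, each call pays fixed tensor-construction and dispatch costs that dwarf the actual work for sequences this short.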

The Solution

Cache the image_ids tensors by patch count pattern using lru_cache. Since there are only a few unique patch count patterns (e.g., (3,), (2, 3), (5,)), the cache is highly effective.

Benchmark Results

512 variable-width images (16px tall, 32-80px wide), group_max_seq_len=5:

| Config | Before | After | Speedup |
| --- | --- | --- | --- |
| NaViT (group_images=True) | 97.9 ms | 24.1 ms | ~4x faster |

🤖 Generated with Claude Code

@AmitMY AmitMY force-pushed the optimize-group-images branch from 88714a6 to 43726b7 Compare December 7, 2025 14:07
@AmitMY AmitMY marked this pull request as draft December 7, 2025 14:08
@AmitMY AmitMY force-pushed the optimize-group-images branch from 43726b7 to 417d571 Compare December 7, 2025 14:13

AmitMY commented Dec 7, 2025

This would not really work in real scenarios, but it does show that repeat_interleave is very costly for short sequences.

@AmitMY AmitMY closed this Dec 9, 2025