[rollou] fix: forward max_tokens from rollout config to vLLM backends #5027

psyloy · 2026-01-23T03:57:33Z

What does this PR do?

Forward max_tokens/max_new_tokens from the rollout config to the vLLM/sglang backends.

gemini-code-assist

Code Review

This pull request aims to forward max_tokens and max_new_tokens from the rollout configuration to the vLLM/sglang backends. I've identified a few critical issues that prevent this from working as intended. Specifically, there's a bug in how sampling parameters are updated in verl/experimental/agent_loop/agent_loop.py, an incorrect type hint in verl/workers/config/rollout.py that would cause a runtime error, and the new logic is missing entirely from verl/experimental/fully_async_policy/agent_loop/agent_loop.py. I have provided detailed comments and suggestions to address these problems.

gemini-code-assist · 2026-01-23T03:58:56Z

verl/experimental/agent_loop/agent_loop.py

+        for param_name in ["max_tokens", "max_new_tokens"]:
+            param_value = getattr(config, param_name, None)
+            if param_value is not None:
+                sampling_params[param_value] = param_value


There's a bug in how the sampling parameters are being updated. You're using the parameter's value (param_value) as the dictionary key, but it should be the parameter's name (param_name). This will cause a TypeError if the value is not hashable, or will add an incorrect key to the sampling_params dictionary, preventing the setting from being applied.

Suggested change

-                sampling_params[param_value] = param_value
+                sampling_params[param_name] = param_value
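
To make the effect of this one-word change concrete, here is a minimal sketch. It assumes sampling_params is a plain dict and uses a hypothetical SimpleNamespace as a stand-in for the rollout config; neither function below is the actual verl code.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the rollout config; only max_tokens is set.
config = SimpleNamespace(max_tokens=512, max_new_tokens=None)


def forward_buggy(config):
    """Mirrors the loop from the diff: the value is used as the dict key."""
    sampling_params = {}
    for param_name in ["max_tokens", "max_new_tokens"]:
        param_value = getattr(config, param_name, None)
        if param_value is not None:
            sampling_params[param_value] = param_value  # bug: key is the value
    return sampling_params


def forward_fixed(config):
    """Same loop with the suggested fix: the parameter name is the dict key."""
    sampling_params = {}
    for param_name in ["max_tokens", "max_new_tokens"]:
        param_value = getattr(config, param_name, None)
        if param_value is not None:
            sampling_params[param_name] = param_value
    return sampling_params


buggy = forward_buggy(config)  # {512: 512}
fixed = forward_fixed(config)  # {"max_tokens": 512}
```

With an integer value the buggy version does not raise; it silently stores the limit under the key 512, so a backend looking up "max_tokens" never sees it.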

gemini-code-assist · 2026-01-23T03:58:56Z

verl/workers/config/rollout.py

+    max_tokens: Optional[list] = None
+    max_new_tokens: Optional[list] = None


The type hints for max_tokens and max_new_tokens are incorrectly defined as Optional[list]. These parameters should be integers representing the maximum number of tokens. Combined with the key bug in verl/experimental/agent_loop/agent_loop.py, where the value is used as the dictionary key, a list would raise a TypeError at runtime because lists are not hashable. Even with the key fixed, the backends expect an integer token limit, not a list.

Suggested change

-    max_tokens: Optional[list] = None
-    max_new_tokens: Optional[list] = None
+    max_tokens: Optional[int] = None
+    max_new_tokens: Optional[int] = None
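
As a rough illustration of the point about hashability, the sketch below uses a hypothetical dataclass named RolloutLimits (not the real config class in verl/workers/config/rollout.py) with the corrected Optional[int] hints, and shows that a list cannot be used as a dictionary key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RolloutLimits:
    """Hypothetical, trimmed-down stand-in for the rollout config fields."""

    max_tokens: Optional[int] = None
    max_new_tokens: Optional[int] = None


def list_key_raises() -> bool:
    """Return True if using a list as a dict key raises TypeError."""
    params = {}
    try:
        params[[1024]] = [1024]  # lists are unhashable
    except TypeError:
        return True
    return False


limits = RolloutLimits(max_tokens=1024)
params = {"max_tokens": limits.max_tokens}  # an int value under a string key
raised = list_key_raises()
```

Note that a plain dataclass does not enforce its type hints at runtime, so the hint mainly documents intent and helps static checkers; whether the real config class adds validation is not shown in this PR.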

gemini-code-assist · 2026-01-23T03:58:56Z

verl/experimental/fully_async_policy/agent_loop/agent_loop.py

            repetition_penalty=1.0,
            logprobs=config.calculate_log_probs,
        )



This file is missing the logic to forward max_tokens and max_new_tokens from the rollout configuration to sampling_params. This was the main goal of the PR and its omission here will lead to inconsistent behavior between the two agent loop implementations. Please add the forwarding logic here as you did in verl/experimental/agent_loop/agent_loop.py.

Suggested change

+        # configure max generation tokens for vllm/sglang
+        for param_name in ["max_tokens", "max_new_tokens"]:
+            param_value = getattr(config, param_name, None)
+            if param_value is not None:
+                sampling_params[param_name] = param_value
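
One way to keep the two agent loop implementations from drifting apart would be a small shared helper that both call. The sketch below is only an assumption about how that could look: forward_max_token_limits and MAX_TOKEN_PARAMS are hypothetical names, sampling_params is assumed to be a plain dict, and the config is again a SimpleNamespace stand-in.

```python
from types import SimpleNamespace
from typing import Any

# Names of the limits to forward; vLLM and sglang use different spellings.
MAX_TOKEN_PARAMS = ("max_tokens", "max_new_tokens")


def forward_max_token_limits(config: Any, sampling_params: dict[str, Any]) -> dict[str, Any]:
    """Copy any max-token limit set on config into sampling_params (hypothetical helper)."""
    for param_name in MAX_TOKEN_PARAMS:
        param_value = getattr(config, param_name, None)
        if param_value is not None:
            sampling_params[param_name] = param_value
    return sampling_params


# Stand-in config: only max_new_tokens is set, max_tokens is absent entirely.
config = SimpleNamespace(calculate_log_probs=False, max_new_tokens=256)

sampling_params = dict(
    repetition_penalty=1.0,
    logprobs=config.calculate_log_probs,
)
forward_max_token_limits(config, sampling_params)
```

Because the helper uses getattr with a None default, a config that lacks either attribute is left alone rather than raising AttributeError.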

psyloy requested review from PeterSH6, chenhaiq, eric-haibin-lin, tongyx361, vermouth1992 and zhaochenyang20 as code owners January 23, 2026 03:57

gemini-code-assist bot reviewed Jan 23, 2026


configure max generation tokens for vllm/sglang

57a8384

psyloy force-pushed the main_1 branch from 9268115 to 57a8384 Compare January 23, 2026 04:13

psyloy closed this Jan 23, 2026

psyloy changed the title ~~[rollout, vllm, sglang] fix: forward max_tokens/max_new_tokens from rollout config to vLLM/sglang backends~~ [rollou] fix: forward max_tokens from rollout config to vLLM backends Jan 23, 2026
