feat(iorails): Create IORails hierarchy with content-safety support #1638

tgasser-nv wants to merge 7 commits into develop
Conversation
@greptile review
Greptile Summary

This PR introduces the IORails engine — a new optimized orchestration path for input/output content-safety guardrails that bypasses the heavier LLMRails pipeline. The architecture is cleanly layered:
| Filename | Overview |
|---|---|
| nemoguardrails/guardrails/rails_manager.py | New RailsManager orchestrating input/output safety checks. Critical bug: _parse_content_safety_output_response uses the wrong parser (nemoguard_parse_prompt_safety instead of nemoguard_parse_response_safety), causing output safety checks to evaluate user input safety instead of bot response safety. Also has a misleading `str \| None` return annotation on `_render_prompt`. |
| nemoguardrails/guardrails/guardrails.py | Top-level Guardrails class updated with IORails routing logic. Correctly implements _has_only_iorails_flows() to detect compatible configs and routes to IORails or LLMRails accordingly. Both engines are always instantiated regardless of which is used. |
| nemoguardrails/guardrails/iorails.py | New IORails engine implementing a clean input-check → generate → output-check pipeline. Well-structured with clear separation of concerns delegating to ModelManager and RailsManager. |
| nemoguardrails/guardrails/model_engine.py | New ModelEngine wrapping aiohttp with retry support for OpenAI-compatible endpoints. Clean error handling, configurable timeouts, and proper API key resolution with graceful fallback for local models. |
| nemoguardrails/guardrails/model_manager.py | New ModelManager managing a collection of ModelEngine instances keyed by model type. Clean lifecycle management with async context manager support. |
| nemoguardrails/guardrails/guardrails_types.py | New shared types module with LLMMessage, LLMMessages type aliases and RailResult frozen dataclass. Clean, minimal type definitions. |
| tests/guardrails/test_rails_manager.py | Comprehensive tests for RailsManager. Output safety tests (lines 286-301) mask the parser bug by only checking the "User Safety" JSON field rather than the "Response Safety" field. |
| pyproject.toml | Adds aiohttp-retry dependency and reformats indentation from 2 to 4 spaces. No functional concerns. |
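As a caller-level illustration of the routing described above, here is a minimal sketch. It assumes the `Guardrails` constructor argument and `generate_async` signature described in this PR; exact names may differ in the merged code.

```python
# Sketch only: assumes the new Guardrails API from this PR (use_iorails flag,
# generate_async taking OpenAI-style message dicts). Not the final implementation.
import asyncio

from nemoguardrails import RailsConfig
from nemoguardrails.guardrails.guardrails import Guardrails

async def main():
    # A config containing only input/output content-safety flows that IORails can accelerate.
    config = RailsConfig.from_path("./config")
    rails = Guardrails(config, use_iorails=True)

    # Routed to IORails when _has_only_iorails_flows() is True; otherwise falls back to LLMRails.
    response = await rails.generate_async(
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response)

asyncio.run(main())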
Sequence Diagram
sequenceDiagram
participant User
participant Guardrails
participant IORails
participant RailsManager
participant ModelManager
participant ModelEngine
User->>Guardrails: generate_async(messages)
Guardrails->>Guardrails: _convert_to_messages()
alt _use_iorails_engine == true
Guardrails->>IORails: generate_async(messages)
IORails->>RailsManager: is_input_safe(messages)
RailsManager->>RailsManager: _render_prompt(input_template)
RailsManager->>ModelManager: generate_async("content_safety", prompt)
ModelManager->>ModelEngine: call(messages)
ModelEngine-->>ModelManager: API response
ModelManager-->>RailsManager: response text
RailsManager->>RailsManager: _parse_content_safety_input_response()
RailsManager-->>IORails: RailResult(is_safe)
alt Input is safe
IORails->>ModelManager: generate_async("main", messages)
ModelManager->>ModelEngine: call(messages)
ModelEngine-->>ModelManager: LLM response
ModelManager-->>IORails: response text
IORails->>RailsManager: is_output_safe(messages, response)
RailsManager->>RailsManager: _render_prompt(output_template)
RailsManager->>ModelManager: generate_async("content_safety", prompt)
ModelManager-->>RailsManager: response text
RailsManager->>RailsManager: _parse_content_safety_output_response()
RailsManager-->>IORails: RailResult(is_safe)
IORails-->>Guardrails: {role: assistant, content: response}
else Input is unsafe
IORails-->>Guardrails: {role: assistant, content: REFUSAL_MESSAGE}
end
else _use_iorails_engine == false
Guardrails->>Guardrails: delegate to LLMRails
end
Guardrails-->>User: response
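Read as code, the IORails branch of the diagram corresponds roughly to the following sketch; the method names are taken from the diagram, and the actual `iorails.py` implementation may differ.

```python
# Illustrative reconstruction of the IORails.generate_async flow shown above.
# REFUSAL_MESSAGE, rails_manager, and model_manager follow the diagram's naming.
async def generate_async(self, messages):
    input_result = await self.rails_manager.is_input_safe(messages)
    if not input_result.is_safe:
        return {"role": "assistant", "content": REFUSAL_MESSAGE}

    response = await self.model_manager.generate_async("main", messages)

    output_result = await self.rails_manager.is_output_safe(messages, response)
    if not output_result.is_safe:
        return {"role": "assistant", "content": REFUSAL_MESSAGE}

    return {"role": "assistant", "content": response}
```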
Last reviewed commit: 7a0a2e6
In nemoguardrails/guardrails/rails_manager.py (lines 88-94):

return RailResult(is_safe=True)

# TODO: Running these sequentially now, need to update to parallel and early-out if any rail fails
for flow in self.output_flows:
    result = await self._run_output_rail(flow, messages, response)
    if not result.is_safe:
        return result
Same TODO about parallel execution applies here. The `OutputRails` config also has a `parallel` field that's not being used.
@greptile see comment on the input flows above, same applies here
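For reference, one possible shape of the parallel early-out version requested in the TODO, written as a sketch only; it assumes each output rail check is independent, the method name is illustrative, and `RailResult` comes from `guardrails_types`.

```python
# Sketch of a parallel early-out variant of the sequential loop above.
# Not part of this PR; flow/result names mirror the existing code.
import asyncio

async def _run_output_rails(self, messages, response) -> RailResult:
    tasks = [
        asyncio.create_task(self._run_output_rail(flow, messages, response))
        for flow in self.output_flows
    ]
    try:
        # Consume results in completion order and stop at the first unsafe one.
        for finished in asyncio.as_completed(tasks):
            result = await finished
            if not result.is_safe:
                return result
    finally:
        # Cancel any checks still in flight (no-op for tasks that already completed).
        for task in tasks:
            task.cancel()
    return RailResult(is_safe=True)
```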
Force-pushed from f96039f to 2fc0e2c
Codecov Report: ✅ All modified and coverable lines are covered by tests.
@greptile re-check on latest commit SHA 179b34e2
…by_role, remove unused guardrails_models.py
@greptile RE-check on latest commit.
In nemoguardrails/guardrails/rails_manager.py (lines 226-229):

def _parse_content_safety_output_response(self, response: str) -> RailResult:
    """Use the existing `nemoguard_parse_response_safety` method and convert to RailResult."""

    result = nemoguard_parse_prompt_safety(response)
**Wrong parser used for output safety**

`_parse_content_safety_output_response` calls `nemoguard_parse_prompt_safety` (line 229) instead of `nemoguard_parse_response_safety`. The docstring on line 227 correctly states it should use `nemoguard_parse_response_safety`, but the implementation doesn't match.

`nemoguard_parse_prompt_safety` checks the `"User Safety"` JSON field, while `nemoguard_parse_response_safety` checks the `"Response Safety"` field. This means output rail checks will evaluate whether the **user's input** was safe rather than whether the **bot's response** was safe. An unsafe bot response will pass through the output rail as long as the user's original message was safe.

Note that the tests at lines 286-301 (`test_output_safe_json` and `test_output_unsafe_json`) also only check the `"User Safety"` field, which masks this bug — they pass because they're testing against the wrong parser.
Suggested fix:

```suggestion
    def _parse_content_safety_output_response(self, response: str) -> RailResult:
        """Use the existing `nemoguard_parse_response_safety` method and convert to RailResult."""
        result = nemoguard_parse_response_safety(response)
        rail_result = self._parse_content_safety_result(result)
        return rail_result
```

This also requires adding `nemoguard_parse_response_safety` to the import on line 28:

```python
from nemoguardrails.llm.output_parsers import nemoguard_parse_prompt_safety, nemoguard_parse_response_safety
```
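Separately, a regression test that asserts on the `"Response Safety"` field would catch this class of bug. A hedged sketch, assuming a `rails_manager` fixture like the existing tests and the NemoGuard JSON shape quoted in this comment (the test name and exact JSON values are illustrative): with the current parser this assertion fails, and it passes once the suggested fix is applied.

```python
import json

def test_output_unsafe_response_safety_field(rails_manager):
    # Bot response flagged unsafe even though the user's input was safe.
    llm_output = json.dumps(
        {"User Safety": "safe", "Response Safety": "unsafe", "Safety Categories": "Violence"}
    )
    result = rails_manager._parse_content_safety_output_response(llm_output)
    assert result.is_safe is False
```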
In nemoguardrails/guardrails/rails_manager.py (lines 153-158):

def _render_prompt(
    self,
    prompt_key: str,
    user_input: str = "",
    bot_response: str = "",
) -> str | None:
Return type annotation includes None but None is never returned
_render_prompt is annotated as returning str | None, but it either raises RuntimeError (line 165) or returns a str (line 170) — it never returns None. The callers at lines 121 and 140 pass the result directly into a message dict without None checks, which would cause a runtime error if None were ever actually returned.
Suggested fix:

```suggestion
    def _render_prompt(
        self,
        prompt_key: str,
        user_input: str = "",
        bot_response: str = "",
    ) -> str:
```
In nemoguardrails/guardrails/rails_manager.py (line 25):

import logging
from typing import Sequence, cast

from nemoguardrails.guardrails.guardrails import LLMMessages
Import LLMMessages from guardrails_types instead
LLMMessages is defined in guardrails_types.py and re-exported from guardrails.py. Importing it from guardrails.py creates an unnecessary coupling to the top-level module. Since guardrails_types.py is the canonical location for this type alias (and RailResult is already imported from there on line 26), this import should come from the same place.
Suggested fix:

```suggestion
from nemoguardrails.guardrails.guardrails_types import LLMMessages
```
Description
This PR contains the key pieces of IORails (the new orchestration engine to speed up input and output rails). These are:

- `Guardrails` routing logic that selects between IORails and LLMRails
- the `IORails` engine implementing the input-check → generate → output-check pipeline
- `RailsManager`, which runs the content-safety input/output checks
- `ModelManager` and `ModelEngine`, which call OpenAI-compatible model endpoints
- shared type definitions in `guardrails_types.py`
The IORails object co-exists with LLMRails for backwards compatibility, as follows:
The `Guardrails` `__init__()` method has a new argument `use_iorails`. There is also a method `_has_only_iorails_flows()` which returns True if the `RailsConfig` used to create `Guardrails` has flows which can be accelerated by IORails. These are used in combination as follows (see the sketch below):

- `use_iorails == False`: LLMRails (existing production Guardrails) is used.
- `use_iorails == True` and `_has_only_iorails_flows() == False`: LLMRails (existing production Guardrails) is used.
- `use_iorails == True` and `_has_only_iorails_flows() == True`: IORails is used. Unimplemented top-level methods raise `NotImplementedError`s.
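A short illustrative sketch of that decision logic (not the actual implementation; the helper and attribute names are hypothetical):

```python
# Hypothetical routing helper; the real Guardrails code may structure this differently.
def _select_engine(self):
    if self.use_iorails and self._has_only_iorails_flows():
        return self._iorails    # optimized input/output content-safety path
    return self._llmrails       # existing production LLMRails path
```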
Still todo (further PRs)

- `generate()` synchronous generation in IORails.

Test Plan
Pre-commit
Unit-tests
Local integration chat-test (using NVCF)
OpenAI-compatible Server
Server-side: Note this still instantiates LLMRails, hence the "Registered Actions" log line. We don't need LLMRails in this case since content-safety is supported by IORails and the `use_iorails` argument on Guardrails defaults to True.

Client-side
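As a hedged illustration only, a client-side request to an OpenAI-compatible guardrails server could look like the following; the endpoint path, port, and config/model name are assumptions and should be adjusted to the actual server setup.

```python
# Illustrative client request against an OpenAI-compatible chat endpoint.
# URL, port, and "model" value are placeholders for the local guardrails server.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "content_safety_config",  # hypothetical rails config name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
print(resp.json())
```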
Related Issue(s)
Checklist