[FIX] Add LLMCompat bridge class to fix retriever LLM compatibility with llama-index#1788

Open
pk-zipstack wants to merge 5 commits into main from fix/retriever-llm-bridge-class

Conversation

@pk-zipstack
Contributor

@pk-zipstack pk-zipstack commented Feb 12, 2026

What

  • Added LLMCompat class in SDK1 that emulates the llama-index LLM interface without requiring llama-index as a dependency
  • Added RetrieverLLM class in prompt-service that inherits from llama-index's LLM base class and delegates calls to LLMCompat
  • Updated all affected prompt-service retrievers to use RetrieverLLM when passing LLM to llama-index components
  • Fixed SubQuestionQueryEngine failing due to missing llama-index-question-gen-openai package

Why

  • After the SDK1 migration, unstract.sdk1.llm.LLM uses litellm.completion() directly and does not inherit from llama_index.core.llms.llm.LLM
  • Prompt-service retrievers (KeywordTable, Subquestion, Fusion, Router) pass this LLM to llama-index components that expect the llama-index base class
  • This causes 'LLM' object has no attribute 'predict' errors at runtime for all retrieval strategies except Simple
  • Additionally, llama-index's resolve_llm() asserts isinstance(llm, LLM) which fails for plain classes — breaking Fusion, Router, and Subquestion retrievers
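The failure mode described above can be reproduced with a minimal stand-in (the class and function names here are illustrative, not the real SDK1 or llama-index code): a plain object passed where llama-index expects its own LLM base class blows up the moment `.predict()` is called.

```python
class Sdk1StyleLLM:
    """Hypothetical stand-in for unstract.sdk1.llm.LLM: it calls the
    provider directly and does not inherit llama-index's LLM base."""

    def complete_text(self, prompt: str) -> str:
        return "result"


def consumer_expecting_llama_index(llm):
    # llama-index components (e.g. KeywordTableIndex) call .predict(),
    # which the SDK1-style class does not provide
    return llm.predict("some template")


llm = Sdk1StyleLLM()
try:
    consumer_expecting_llama_index(llm)
except AttributeError as exc:
    error = str(exc)
```

This is exactly the "'LLM' object has no attribute 'predict'" error seen at runtime for every retrieval strategy except Simple.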

How

Two-layer architecture:

  1. LLMCompat (unstract/sdk1/src/unstract/sdk1/llm.py): A plain Python class (no llama-index dependency) that emulates the llama-index LLM interface. Uses local dataclass emulations of llama-index types (ChatMessage, ChatResponse, CompletionResponse, LLMMetadata, MessageRole). Follows the existing EmbeddingCompat initialization pattern — takes raw adapter params and creates an SDK1 LLM internally. Implements chat(), complete(), predict(), achat(), acomplete(), apredict().

  2. RetrieverLLM (prompt-service/.../core/retrievers/retriever_llm.py): Inherits from llama_index.core.llms.llm.LLM (passing resolve_llm() isinstance checks) and delegates all LLM calls to an internal LLMCompat instance. Converts emulated return types back to real llama-index types.

Supporting changes:

  • llm_helper.py: Helper to convert an SDK1 LLM into a RetrieverLLM instance
  • base_retriever.py: Added llama_index_llm property that lazily creates and caches the RetrieverLLM wrapper
  • Retriever updates: Replaced self.llm with self.llama_index_llm where passed to llama-index components (keyword_table, subquestion, fusion, router)
  • SubQuestion fix: Explicitly pass LLMQuestionGenerator to SubQuestionQueryEngine.from_defaults() to avoid import of llama-index-question-gen-openai
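The lazy, cached `llama_index_llm` property described above can be sketched as follows (`_make_bridge` is a hypothetical stand-in for `get_llama_index_llm`):

```python
class BaseRetriever:
    """Condensed sketch of the base_retriever.py change: the bridge is
    created on first access and cached for later calls."""

    def __init__(self, llm):
        self.llm = llm  # the SDK1 LLM
        self._llama_index_llm = None

    @property
    def llama_index_llm(self):
        # Lazily build the RetrieverLLM wrapper once, then reuse it
        if self._llama_index_llm is None:
            self._llama_index_llm = self._make_bridge(self.llm)
        return self._llama_index_llm

    def _make_bridge(self, llm):
        # Stand-in for llm_helper.get_llama_index_llm(llm)
        return ("bridge", llm)


retriever = BaseRetriever("sdk1-llm")
first = retriever.llama_index_llm
```

Retrievers that hand an LLM to llama-index components then reference `self.llama_index_llm` instead of `self.llm`; the Simple, Automerging, and Recursive retrievers never trigger the property, so they pay no cost.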

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

  • No. LLMCompat and RetrieverLLM are only used when retrievers pass LLM to llama-index components. The Simple, Automerging, and Recursive retrievers don't pass LLM to llama-index and remain unchanged. The bridge delegates all calls to the existing SDK1 LLM, so no behavior changes.

Database Migrations

  • None

Env Config

  • None

Relevant Docs

  • N/A

Related Issues or PRs

Dependencies Versions

  • No new dependencies added. LLMCompat uses emulated types (no llama-index imports in SDK1). RetrieverLLM uses llama-index core classes already present in the prompt-service.

Notes on Testing

  • Set retrieval strategy to Keyword Table, Subquestion, Fusion, or Router in Prompt Studio with chunk_size > 0 and run extraction — should no longer error with 'LLM' object has no attribute 'predict'
  • Verify Simple, Automerging, Recursive strategies still work (regression check)

Screenshots

  • N/A

Checklist

I have read and understood the Contribution Guidelines.

@coderabbitai
Contributor

coderabbitai bot commented Feb 12, 2026

Summary by CodeRabbit

Bug Fixes

  • Fixed runtime failures affecting four critical retriever components: Keyword Table Retriever, Sub-question Retriever, Fusion Retriever, and Router Retriever. These retrievers were experiencing errors due to LLM compatibility issues when using the new SDK1. They are now fully operational and seamlessly compatible with the updated system, requiring no code changes from users.

Walkthrough

The changes introduce a compatibility bridge to resolve the "'LLM' object has no attribute 'predict'" error by creating a wrapper layer that adapts SDK1 LLM instances to conform to llama-index's expected LLM interface. The solution adds llama-index type emulations and compatibility wrappers in the SDK1 module and uses them throughout the retriever system.

Changes

Cohort / File(s) Summary
Investigation Documentation
llm-predict-error-investigation.md
New document detailing root cause analysis of the LLM AttributeError, affected retrievers, and three proposed solutions with implementation recommendations.
SDK1 LLM Core Types & Compatibility
unstract/sdk1/src/unstract/sdk1/llm.py
Added emulated llama-index types (MessageRole, ChatMessage, ChatResponse, CompletionResponse, LLMMetadata) and new LLMCompat wrapper class that delegates SDK1 LLM calls to expose chat/complete/predict methods compatible with llama-index interfaces.
Prompt Service Bridge & Helper
prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py, prompt-service/src/unstract/prompt_service/utils/llm_helper.py
New RetrieverLLM bridge class adapting LLMCompat to llama-index's LLM subclass with synchronous/asynchronous methods; helper function get_llama_index_llm to instantiate the bridge.
Prompt Service Retriever Infrastructure
prompt-service/src/unstract/prompt_service/core/retrievers/base_retriever.py
Added lazy-initialized llama_index_llm property to BaseRetriever that bridges SDK1 LLM via get_llama_index_llm on first access.
Prompt Service Retrievers Using Bridge
prompt-service/src/unstract/prompt_service/core/retrievers/fusion.py, keyword_table.py, router.py, subquestion.py
Updated four affected retrievers to use self.llama_index_llm instead of self.llm when passing to llama-index components (QueryFusionRetriever, KeywordTableIndex, RouterQueryEngine, SubQuestionQueryEngine).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 72.50%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: adding an LLMCompat bridge class to resolve retriever LLM compatibility issues with llama-index. |
| Description check | ✅ Passed | The description is comprehensive and follows the template structure with all critical sections filled: What, Why, How, breaking changes analysis, database migrations, env config, related issues, dependencies, and testing notes. |


pre-commit-ci bot and others added 3 commits February 12, 2026 19:55
…endency (#1793)

* [FEAT] Rewrite LLMCompat to emulate llama-index interface without dependency

- Add emulated llama-index types (MessageRole, ChatMessage, ChatResponse,
  CompletionResponse, LLMMetadata) as dataclasses
- Rewrite LLMCompat to match llama-index LLM interface without inheritance
- Implement chat(), complete(), achat(), acomplete() for retriever compatibility
- Follow EmbeddingCompat initialization pattern (takes adapter params directly)
- Remove llama-index imports from llm.py
- Raise NotImplementedError for streaming methods not needed by retrievers

This allows SubQuestionQueryEngine, QueryFusionRetriever, and other
llama-index components to use LLMCompat without requiring llama-index
as a dependency.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Added RetrieverLLM class to handle issues with retrievers

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: pk-zipstack <praveen@zipstack.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

@pk-zipstack pk-zipstack marked this pull request as ready for review February 19, 2026 04:28
@github-actions
Contributor

Test Results

Summary
  • Runner Tests: 11 passed, 0 failed (11 total)
  • SDK1 Tests: 66 passed, 0 failed (66 total)

Runner Tests - Full Report

| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| runner/src/unstract/runner/clients/test_docker.py | test_logs | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_cleanup | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_cleanup_skip | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_client_init | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image_exists | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_container_run_config | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_container_run_config_without_mount | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_run_container | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_get_image_for_sidecar | 1 | 1 |
| runner/src/unstract/runner/clients/test_docker.py | test_sidecar_container | 1 | 1 |
| TOTAL | | 11 | 11 |
SDK1 Tests - Full Report

| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| tests/test_platform.py | TestPlatformHelperRetry.test_success_on_first_attempt | 2 | 2 |
| tests/test_platform.py | TestPlatformHelperRetry.test_retry_on_connection_error | 2 | 2 |
| tests/test_platform.py | TestPlatformHelperRetry.test_non_retryable_http_error | 1 | 1 |
| tests/test_platform.py | TestPlatformHelperRetry.test_retryable_http_errors | 3 | 3 |
| tests/test_platform.py | TestPlatformHelperRetry.test_post_method_retry | 1 | 1 |
| tests/test_platform.py | TestPlatformHelperRetry.test_retry_logging | 1 | 1 |
| tests/test_prompt.py | TestPromptToolRetry.test_success_on_first_attempt | 1 | 1 |
| tests/test_prompt.py | TestPromptToolRetry.test_retry_on_errors | 2 | 2 |
| tests/test_prompt.py | TestPromptToolRetry.test_wrapper_methods_retry | 4 | 4 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_connection_error_is_retryable | 1 | 1 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_timeout_is_retryable | 1 | 1 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_http_error_retryable_status_codes | 3 | 3 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_http_error_non_retryable_status_codes | 5 | 5 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_http_error_without_response | 1 | 1 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_os_error_retryable_errno | 5 | 5 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_os_error_non_retryable_errno | 1 | 1 |
| tests/utils/test_retry_utils.py | TestIsRetryableError.test_other_exception_not_retryable | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCalculateDelay.test_exponential_backoff_without_jitter | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCalculateDelay.test_exponential_backoff_with_jitter | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCalculateDelay.test_max_delay_cap | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCalculateDelay.test_max_delay_cap_with_jitter | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_successful_call_first_attempt | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_retry_after_transient_failure | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_max_retries_exceeded | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_max_time_exceeded | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_retry_with_custom_predicate | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_no_retry_with_predicate_false | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_exception_not_in_tuple_not_retried | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryWithExponentialBackoff.test_delay_would_exceed_max_time | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_default_configuration | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_environment_variable_configuration | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_invalid_max_retries | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_invalid_max_time | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_invalid_base_delay | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_invalid_multiplier | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_jitter_values | 2 | 2 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_custom_exceptions_only | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_custom_predicate_only | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_both_exceptions_and_predicate | 1 | 1 |
| tests/utils/test_retry_utils.py | TestCreateRetryDecorator.test_exceptions_match_but_predicate_false | 1 | 1 |
| tests/utils/test_retry_utils.py | TestPreconfiguredDecorators.test_retry_platform_service_call_exists | 1 | 1 |
| tests/utils/test_retry_utils.py | TestPreconfiguredDecorators.test_retry_prompt_service_call_exists | 1 | 1 |
| tests/utils/test_retry_utils.py | TestPreconfiguredDecorators.test_platform_service_decorator_retries_on_connection_error | 1 | 1 |
| tests/utils/test_retry_utils.py | TestPreconfiguredDecorators.test_prompt_service_decorator_retries_on_timeout | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryLogging.test_warning_logged_on_retry | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryLogging.test_info_logged_on_success_after_retry | 1 | 1 |
| tests/utils/test_retry_utils.py | TestRetryLogging.test_exception_logged_on_giving_up | 1 | 1 |
| TOTAL | | 66 | 66 |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (6)
unstract/sdk1/src/unstract/sdk1/llm.py (4)

36-43: Prefer enum.StrEnum over (str, Enum) dual inheritance.

Python 3.11+ provides StrEnum which is the idiomatic replacement. Since the codebase already uses | union types (Python 3.10+), StrEnum should be available.

♻️ Suggested change
-from enum import Enum
+from enum import StrEnum

-class MessageRole(str, Enum):
+class MessageRole(StrEnum):
     """Emulates llama_index.core.base.llms.types.MessageRole."""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 36 - 43, Replace the
MessageRole class to inherit from enum.StrEnum instead of (str, Enum): import
StrEnum from enum and change the class definition for MessageRole to subclass
StrEnum; keep the same member names and values (SYSTEM, USER, ASSISTANT,
FUNCTION, TOOL) so behavior and comparisons remain identical.

659-676: predict and apredict assume a PromptTemplate-like interface on prompt.

These methods call prompt.format_messages(llm=self, ...) and prompt.format(llm=self, ...), expecting a llama-index PromptTemplate object. The type is annotated as Any, which hides this contract. Since llama-index's LLM.predict receives a BasePromptTemplate, this should work — but if ever called with a plain string, it'll raise AttributeError.

This is acceptable for the bridge use case, but documenting the expected type (even as a comment) would help maintainability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 659 - 676, predict and
apredict assume a PromptTemplate-like object (they call
prompt.format_messages(...) and prompt.format(...)) but are annotated as Any;
update the contract by changing the type annotation from Any to the appropriate
llama-index type (e.g. BasePromptTemplate or PromptTemplate) or at minimum add a
clear inline comment/docstring stating the expected type, and add a simple
runtime check in predict and apredict that raises a descriptive TypeError if the
prompt lacks format/format_messages (e.g. if not hasattr(prompt, "format") and
not hasattr(prompt, "format_messages")), referencing the predict/apredict
methods and self.metadata.is_chat_model, so callers get a clear error instead of
an AttributeError.

574-628: LLMCompat creates a second LLM instance — consider accepting an existing one.

RetrieverLLM.__init__ (in retriever_llm.py, Line 33-43) reads private attributes (llm._adapter_id, llm._adapter_metadata, etc.) from the SDK1 LLM to construct LLMCompat, which then creates another LLM internally (Line 613). This means every retriever bridge creates a duplicate LLM with duplicate adapter validation, platform config fetching, and callback setup.

Consider allowing LLMCompat to accept an existing LLM instance directly rather than always constructing a new one, to avoid this duplication and the reliance on private attributes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 574 - 628, LLMCompat
currently always constructs a new LLM (in LLMCompat.__init__) causing duplicate
LLMs; change the constructor to accept an optional existing LLM instance
parameter (e.g., llm_instance: LLM | None = None) and if provided set
self._llm_instance = llm_instance and skip creating a new LLM, while still
setting self._tool, self._adapter_instance_id, self.model_name =
self._llm_instance.get_model_name(), and self.callback_manager =
self._llm_instance.callback_manager (or None) to preserve behavior; retain the
original parameter list and only construct a new LLM when llm_instance is None
so RetrieverLLM can pass the SDK1 LLM directly and avoid duplicated adapter
validation and config fetching.

804-815: Add comment documenting the duck-typing contract for ChatMessage compatibility.

_to_litellm_messages relies on duck typing to accept both the emulated ChatMessage (defined in this file) and real llama_index.core.base.llms.types.ChatMessage objects, since achat() and acomplete() may receive either. Both provide .role and .content attributes, with the getattr(m.role, "value", ...) pattern handling both enum and string variants.

While the .content attribute is maintained in llama-index as a backward-compatibility property (returning str | None from the underlying block-based structure), a brief comment here clarifying the intentional duck-typing contract would improve maintainability and signal to future maintainers why both types are accepted.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 804 - 815, Add a brief
doc-comment above the _to_litellm_messages function explaining the duck-typing
contract: that the function accepts both the emulated ChatMessage defined in
this module and llama_index.core.base.llms.types.ChatMessage objects because
both expose .role and .content, and that getattr(m.role, "value", str(m.role))
handles enum or string roles while m.content may be str | None; mention this is
intentional to support inputs from achat() and acomplete().
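A documented version of that duck-typed converter might look like this (the `_Msg` dataclass is a hypothetical stand-in for either ChatMessage variant; the function body mirrors the pattern described in the comment, not the file verbatim):

```python
from dataclasses import dataclass


def _to_litellm_messages(messages):
    """Convert ChatMessage-like objects to litellm's dict format.

    Duck-typed on purpose: both the emulated ChatMessage in this module
    and llama_index.core.base.llms.types.ChatMessage expose .role and
    .content, so inputs from achat() and acomplete() may be either.
    getattr(m.role, "value", ...) handles enum and plain-string roles;
    m.content may be str | None, so None is coerced to "".
    """
    return [
        {
            "role": getattr(m.role, "value", str(m.role)),
            "content": m.content or "",
        }
        for m in messages
    ]


@dataclass
class _Msg:
    # Hypothetical stand-in for either ChatMessage variant
    role: object
    content: object


converted = _to_litellm_messages([_Msg("user", "hi"), _Msg("assistant", None)])
```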
prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py (1)

15-15: Remove unused noqa directives flagged by Ruff.

Static analysis (Ruff RUF100) reports 10 unused noqa directives across this file (N811, ANN401). These should be removed to keep the codebase clean.

Also applies to: 33-33, 57-57, 72-72, 80-80, 88-88, 97-97, 112-112, 120-120, 128-128

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py`
at line 15, Remove the redundant noqa directives reported by Ruff (RUF100) in
retriever_llm.py: delete the unused "# noqa: N811" on the import "from
llama_index.core.llms.llm import LLM as LlamaIndexBaseLLM" and likewise remove
the other unused "# noqa" tags on the affected lines (the import/annotation
lines referencing LlamaIndexBaseLLM and any annotated functions/classes flagged
with ANN401). Ensure you only remove the "# noqa" comments (not the imports or
annotations themselves), run Ruff/pytest to verify no new linter errors, and
keep the symbols like LlamaIndexBaseLLM unchanged so references in the file
remain valid.
llm-predict-error-investigation.md (1)

1-221: Debug artifact should not be committed to main.

This file reads as a transient investigation note, not stable documentation. Committing it to main introduces noise with no durable value: it already contains stale content (see below) and will continue diverging from the codebase over time.

If a record of the design decision is needed, consider:

  • Converting it to an Architecture Decision Record (ADR) in a docs/adr/ folder, keeping only the decision, rationale, and final chosen approach.
  • Moving the content into the PR description (where it already lives as context) and dropping the file entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llm-predict-error-investigation.md` around lines 1 - 221, This investigation
file is a debug artifact and should not be committed to main; remove
llm-predict-error-investigation.md from the branch (or revert the commit) and
either convert its essential outcome into a concise ADR under docs/adr/
(summarizing the chosen approach such as the SDK1LLMBridge, use of LiteLLM, or
SimpleKeywordTableIndex and the BaseRetriever/KeywordTableIndex impact) or move
the full investigative content into the PR description/temporary notes, keeping
only the final decision and rationale in the repo.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@llm-predict-error-investigation.md`:
- Around line 150-184: Update the document's Option 1 example to match the
actual two-layer implementation: show LLMCompat (from
unstract/sdk1/src/unstract/sdk1/llm.py) that emulates llama-index types and
RetrieverLLM (from prompt-service/.../core/retrievers/retriever_llm.py) which
inherits from llama_index.core.llms.llm.LLM and delegates to LLMCompat; replace
the single-class SDK1LLMBridge/CustomLLM example with a short description and
class names matching LLMCompat -> RetrieverLLM delegation and note that
RetrieverLLM inherits from llama_index.core.llms.llm.LLM (not CustomLLM).
- Line 5: Add explicit language specifiers to the two fenced code blocks to
satisfy MD040: change the traceback block that starts with "ERROR:
unstract.prompt_service.core.retrievers.keyword_table:" to use ```text and
change the block containing "BaseLLM -> LLM (has predict()) ->
CustomLLM/FunctionCallingLLM -> Provider" to use ```python so markdownlint
recognizes their languages.

In `@prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py`:
- Around line 33-43: Declare _compat as a Pydantic PrivateAttr at the class
level for v2 compatibility: import PrivateAttr from pydantic and add a class
attribute like "_compat: Any = PrivateAttr()" (or similar typed PrivateAttr) on
the retriever class, then keep the existing assignment to self._compat in
__init__ where LLMCompat(...) is created (referencing __init__, LLMCompat, and
the _compat attribute).

In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 698-728: LLMCompat.chat() and LLMCompat.complete() call
litellm.completion() directly, skipping the SDK’s usage recording and structured
error wrapping used elsewhere; update these methods to (1) call into the
internal LLM instance’s usage recorder (e.g., invoke
self._llm_instance._record_usage(...) or the same _record_usage flow used by
LLM.complete) after getting the response so usage/audit data is emitted, (2)
wrap litellm.completion() in a try/except and re-raise errors as LLMError to
match callers' expectations, and (3) if metrics capture is required, apply the
same `@capture_metrics` behavior or add a TODO comment documenting intentional
omission; locate changes in the LLMCompat.chat and LLMCompat.complete methods
around the litellm.completion(...) calls and the response handling to implement
these fixes.

---

Nitpick comments:
In `@llm-predict-error-investigation.md`:
- Around line 1-221: This investigation file is a debug artifact and should not
be committed to main; remove llm-predict-error-investigation.md from the branch
(or revert the commit) and either convert its essential outcome into a concise
ADR under docs/adr/ (summarizing the chosen approach such as the SDK1LLMBridge,
use of LiteLLM, or SimpleKeywordTableIndex and the
BaseRetriever/KeywordTableIndex impact) or move the full investigative content
into the PR description/temporary notes, keeping only the final decision and
rationale in the repo.

In `@prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py`:
- Line 15: Remove the redundant noqa directives reported by Ruff (RUF100) in
retriever_llm.py: delete the unused "# noqa: N811" on the import "from
llama_index.core.llms.llm import LLM as LlamaIndexBaseLLM" and likewise remove
the other unused "# noqa" tags on the affected lines (the import/annotation
lines referencing LlamaIndexBaseLLM and any annotated functions/classes flagged
with ANN401). Ensure you only remove the "# noqa" comments (not the imports or
annotations themselves), run Ruff/pytest to verify no new linter errors, and
keep the symbols like LlamaIndexBaseLLM unchanged so references in the file
remain valid.

In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 36-43: Replace the MessageRole class to inherit from enum.StrEnum
instead of (str, Enum): import StrEnum from enum and change the class definition
for MessageRole to subclass StrEnum; keep the same member names and values
(SYSTEM, USER, ASSISTANT, FUNCTION, TOOL) so behavior and comparisons remain
identical.
- Around line 659-676: predict and apredict assume a PromptTemplate-like object
(they call prompt.format_messages(...) and prompt.format(...)) but are annotated
as Any; update the contract by changing the type annotation from Any to the
appropriate llama-index type (e.g. BasePromptTemplate or PromptTemplate) or at
minimum add a clear inline comment/docstring stating the expected type, and add
a simple runtime check in predict and apredict that raises a descriptive
TypeError if the prompt lacks format/format_messages (e.g. if not
hasattr(prompt, "format") and not hasattr(prompt, "format_messages")),
referencing the predict/apredict methods and self.metadata.is_chat_model, so
callers get a clear error instead of an AttributeError.
- Around line 574-628: LLMCompat currently always constructs a new LLM (in
LLMCompat.__init__) causing duplicate LLMs; change the constructor to accept an
optional existing LLM instance parameter (e.g., llm_instance: LLM | None = None)
and if provided set self._llm_instance = llm_instance and skip creating a new
LLM, while still setting self._tool, self._adapter_instance_id, self.model_name
= self._llm_instance.get_model_name(), and self.callback_manager =
self._llm_instance.callback_manager (or None) to preserve behavior; retain the
original parameter list and only construct a new LLM when llm_instance is None
so RetrieverLLM can pass the SDK1 LLM directly and avoid duplicated adapter
validation and config fetching.
- Around line 804-815: Add a brief doc-comment above the _to_litellm_messages
function explaining the duck-typing contract: that the function accepts both the
emulated ChatMessage defined in this module and
llama_index.core.base.llms.types.ChatMessage objects because both expose .role
and .content, and that getattr(m.role, "value", str(m.role)) handles enum or
string roles while m.content may be str | None; mention this is intentional to
support inputs from achat() and acomplete().


⚠️ Potential issue | 🟡 Minor

Missing language specifiers on fenced code blocks (MD040).

The traceback block on Line 5 and the Python block on Line 52 both lack a language identifier, triggering markdownlint MD040 warnings.

📝 Proposed fix
````diff
-```
+```text
 ERROR: unstract.prompt_service.core.retrievers.keyword_table:
-```
+```python
 BaseLLM -> LLM (has predict()) -> CustomLLM/FunctionCallingLLM -> Provider
````

Also applies to: 52-52

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 5-5: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llm-predict-error-investigation.md` at line 5, Add explicit language
specifiers to the two fenced code blocks to satisfy MD040: change the traceback
block that starts with "ERROR:
unstract.prompt_service.core.retrievers.keyword_table:" to use ```text and
change the block containing "BaseLLM -> LLM (has predict()) ->
CustomLLM/FunctionCallingLLM -> Provider" to use ```python so markdownlint
recognizes their languages.

Comment on lines +150 to +184
### Option 1: Bridge Adapter (Recommended)

Create a class extending `llama_index.core.llms.CustomLLM` that delegates to SDK1's `LLM.complete()` internally. This makes it compatible with all llama-index components.

```python
from llama_index.core.llms import CustomLLM, CompletionResponse, LLMMetadata

class SDK1LLMBridge(CustomLLM):
    """Bridge between unstract.sdk1.llm.LLM and llama-index's LLM interface."""

    def __init__(self, sdk1_llm):
        super().__init__()
        self._sdk1_llm = sdk1_llm

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(model_name=self._sdk1_llm.get_model_name())

    def complete(self, prompt, **kwargs) -> CompletionResponse:
        result = self._sdk1_llm.complete(prompt, **kwargs)
        return CompletionResponse(text=result["response"].text)

    def stream_complete(self, prompt, **kwargs):
        for chunk in self._sdk1_llm.stream_complete(prompt, **kwargs):
            yield CompletionResponse(text=chunk.text, delta=chunk.delta)
```

Then in retrievers, wrap before passing to llama-index:

```python
from .llm_bridge import SDK1LLMBridge

llama_llm = SDK1LLMBridge(self.llm)
keyword_index = KeywordTableIndex(nodes=..., llm=llama_llm)
```

⚠️ Potential issue | 🟡 Minor

Option 1 code example does not match the actual implementation.

The example shows a single-layer bridge extending llama_index.core.llms.CustomLLM and names it SDK1LLMBridge. The actual implementation is a two-layer approach:

  • LLMCompat in unstract/sdk1/src/unstract/sdk1/llm.py — emulates llama-index types without importing llama-index.
  • RetrieverLLM in prompt-service/.../core/retrievers/retriever_llm.py — inherits from llama_index.core.llms.llm.LLM (not CustomLLM) and delegates to LLMCompat.

If this document is retained, the code example should be updated to reflect the actual class hierarchy and naming so it doesn't mislead future readers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llm-predict-error-investigation.md` around lines 150 - 184, Update the
document's Option 1 example to match the actual two-layer implementation: show
LLMCompat (from unstract/sdk1/src/unstract/sdk1/llm.py) that emulates
llama-index types and RetrieverLLM (from
prompt-service/.../core/retrievers/retriever_llm.py) which inherits from
llama_index.core.llms.llm.LLM and delegates to LLMCompat; replace the
single-class SDK1LLMBridge/CustomLLM example with a short description and class
names matching LLMCompat -> RetrieverLLM delegation and note that RetrieverLLM
inherits from llama_index.core.llms.llm.LLM (not CustomLLM).

Comment on lines +33 to +43
```python
    def __init__(self, llm: LLM, **kwargs: Any) -> None:  # noqa: ANN401
        """Initialize with an SDK1 LLM instance."""
        super().__init__(**kwargs)
        self._compat = LLMCompat(
            adapter_id=llm._adapter_id,
            adapter_metadata=llm._adapter_metadata,
            adapter_instance_id=llm._adapter_instance_id,
            tool=llm._tool,
            usage_kwargs=llm._usage_kwargs,
            capture_metrics=llm._capture_metrics,
        )
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```shell
cat -n prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
```

Repository: Zipstack/unstract

Length of output: 5083


🏁 Script executed:

```shell
# Check what LlamaIndexBaseLLM is and where it comes from
rg -n "class RetrieverLLM" -A 20 prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
```

Repository: Zipstack/unstract

Length of output: 1024


🏁 Script executed:

```shell
# Check imports to see what LlamaIndexBaseLLM is
rg -n "LlamaIndexBaseLLM\|from.*LLM\|import.*LLM" prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check if _compat is declared elsewhere in the class
rg -n "_compat" prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py
```

Repository: Zipstack/unstract

Length of output: 447


🏁 Script executed:

```shell
# Find llama-index LLM base class to see how it's configured
find . -path "*/site-packages/llama_index*" -name "*.py" -exec grep -l "class LLM" {} \; 2>/dev/null | head -5
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check if there's a pyproject.toml or requirements to see llama-index version
fd -n "pyproject.toml|requirements.txt" prompt-service/ | head -5
```

Repository: Zipstack/unstract

Length of output: 291


🏁 Script executed:

```shell
# Check if _compat attribute assignment works by looking for similar patterns in the codebase
rg -n "self\._[a-z_]+ = " --type=py | grep -v "test" | head -20
```

Repository: Zipstack/unstract

Length of output: 1970


🏁 Script executed:

```shell
# Check if there are other Pydantic models in the codebase that set private attributes
rg -n "PrivateAttr\|model_config\|ConfigDict" --type=py | head -30
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Look for how llama-index is used in tests to understand if this pattern works
fd -n "test.*retriever" prompt-service/ | xargs cat 2>/dev/null | head -50
```

Repository: Zipstack/unstract

Length of output: 291


🏁 Script executed:

```shell
# Check the requirements/pyproject to understand llama-index version
find prompt-service -name "pyproject.toml" -o -name "requirements*.txt" | xargs grep -i "llama" 2>/dev/null
```

Repository: Zipstack/unstract

Length of output: 87


🏁 Script executed:

```shell
# Look for how llama-index BaseModel is used elsewhere
rg "LlamaIndexBaseLLM|from llama_index" --type py -A 3 prompt-service/ | head -60
```

Repository: Zipstack/unstract

Length of output: 6123


🏁 Script executed:

```shell
# Check if there are test files for RetrieverLLM
find prompt-service -type f -name "*test*retriever*" -o -name "*retriever*test*" | head -5
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Get the actual imports and check llama-index version in any lock files
find . -name "poetry.lock" -o -name "Pipfile.lock" -o -name "uv.lock" | xargs grep -A 5 'llama-index' 2>/dev/null | head -80
```

Repository: Zipstack/unstract

Length of output: 5650


🏁 Script executed:

```shell
# Verify Pydantic handling by checking if llama-index LLM uses ConfigDict
python3 << 'EOF'
try:
    from llama_index.core.llms.llm import LLM
    import inspect

    # Check if model_config exists
    if hasattr(LLM, 'model_config'):
        print("LLM has model_config:", LLM.model_config)

    # Check base classes
    print("LLM bases:", LLM.__bases__)

    # Try to understand if it's a Pydantic model
    if hasattr(LLM, '__fields__'):
        print("Has __fields__ (Pydantic v1)")
    if hasattr(LLM, 'model_fields'):
        print("Has model_fields (Pydantic v2)")

    # Check if we can set private attributes
    test_instance = LLM()
    test_instance._test_attr = "test"
    print("Successfully set _test_attr on LLM instance")
    print("_test_attr value:", test_instance._test_attr)
except Exception as e:
    print(f"Error: {type(e).__name__}: {e}")
EOF
```

Repository: Zipstack/unstract

Length of output: 118


🏁 Script executed:

```shell
# Check if the code actually runs by looking for any test files that use RetrieverLLM
find prompt-service -type f -name "*.py" -path "*/test*" -exec grep -l "RetrieverLLM" {} \;
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check if there are integration tests or examples showing RetrieverLLM usage
rg "RetrieverLLM" --type py -B 2 -A 10
```

Repository: Zipstack/unstract

Length of output: 7975


🏁 Script executed:

```shell
# Search for documentation about llama-index LLM Pydantic setup in comments/docstrings
rg -i "pydantic|privateattr|model_config" prompt-service/ --type py -B 2 -A 2
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check the SDK1 LLM class to understand the private attributes being accessed
find . -path "*/unstract/sdk1/llm*" -type f | head -10
```

Repository: Zipstack/unstract

Length of output: 101


🏁 Script executed:

```shell
# Look at what LLMCompat expects
rg "class LLMCompat" -A 30 --type py
```

Repository: Zipstack/unstract

Length of output: 2505


🏁 Script executed:

```shell
# Check if the noqa directives are actually suppressing violations
# Let's look for what ANN401 is
cat -n prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py | grep -E "(noqa|ANN401)" | head -20
```

Repository: Zipstack/unstract

Length of output: 592


🏁 Script executed:

```shell
# Search for Pydantic PrivateAttr usage in the entire codebase to understand the pattern
rg "PrivateAttr" --type py -B 3 -A 3 | head -80
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check the SDK1 LLM class definition to understand the private attributes
cat unstract/sdk1/src/unstract/sdk1/llm.py | head -100
```

Repository: Zipstack/unstract

Length of output: 2967


🏁 Script executed:

```shell
# Look for any documentation or tests that verify RetrieverLLM works correctly
find prompt-service -type f -name "*.py" | xargs grep -l "_compat" 2>/dev/null
```

Repository: Zipstack/unstract

Length of output: 136


🏁 Script executed:

```shell
# Check if there's any test infrastructure that would validate RetrieverLLM works
find . -type f -name "*.py" -path "*/test*" -exec grep -l "llama_index\|RetrieverLLM" {} \; 2>/dev/null | head -10
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check the actual SDK1 LLM class to see if those private attributes are really private
cat unstract/sdk1/src/unstract/sdk1/llm.py | grep -E "self\._adapter_id|self\._adapter_metadata|self\._adapter_instance_id|self\._tool|self\._usage_kwargs|self\._capture_metrics" | head -20
```

Repository: Zipstack/unstract

Length of output: 1199


🏁 Script executed:

```shell
# Check if there are any __init__ definitions in SDK1 LLM that set these private attributes
rg "def __init__" unstract/sdk1/src/unstract/sdk1/llm.py -A 30 | head -60
```

Repository: Zipstack/unstract

Length of output: 2445


🏁 Script executed:

# Verify: are the noqa directives in RetrieverLLM actually suppressing flake8-annotations violations?
# Check if removing them would trigger ANN401 warnings
python3 << 'EOF'
# ANN401 is from flake8-annotations and means: "Dynamically typed expressions (Any) are disallowed"
# When you use **kwargs: Any, you're using a dynamic type (Any) without specifying what the type should be
# So the noqa: ANN401 is suppressing this warning

# In the code:
# Line 33: def __init__(self, llm: LLM, **kwargs: Any) -> None:  # noqa: ANN401
# The **kwargs: Any is the dynamic type expression being suppressed

# Lines 57, 72, 80, 88, 97, 112, 120, 128: All have **kwargs: Any  # noqa: ANN401
# These are all legitimate uses of the noqa directive

# So the claim about "unused noqa directives" appears to be FALSE
# All 9 noqa: ANN401 directives are suppressing legitimate ANN401 violations

print("Analysis: All noqa: ANN401 directives are legitimate and not unused")
print("They are suppressing the ANN401 violation for '**kwargs: Any' type hints")
EOF

Repository: Zipstack/unstract

Length of output: 201


🏁 Script executed:

```shell
# Let's check what the actual Pydantic configuration is in llama-index 0.13.2
# by looking for any Python documentation or checking if it's documented
web_search "llama-index 0.13.2 LLM Pydantic configuration private attributes"
```

Repository: Zipstack/unstract

Length of output: 111


🏁 Script executed:

```shell
# Let me search online for llama-index 0.13.2 LLM implementation
curl -s "https://raw.githubusercontent.com/run-llama/llama_index/main/llama-index-core/llama_index/core/llms/llm.py" | head -150
```

Repository: Zipstack/unstract

Length of output: 4006


🏁 Script executed:

```shell
# Alternative: check if there's any CI configuration or test runner that would validate this works
find . -type f \( -name "pytest.ini" -o -name "tox.ini" -o -name ".github" -o -name "conftest.py" \) | head -5
```

Repository: Zipstack/unstract

Length of output: 208


🏁 Script executed:

```shell
# Let's look for any actual test files or CI config that tests RetrieverLLM
find . -path "*/.github/workflows/*.yml" -o -path "*/tests/*retriever*.py" 2>/dev/null | head -10
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check if there are any runtime validation tests
find prompt-service -type f -name "*.py" | xargs grep -l "test.*retriever\|retriever.*test" 2>/dev/null | head -5
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Check if there's any evidence in tests or CI that validates this works
cat prompt-service/src/unstract/prompt_service/tests/conftest.py | head -50
```

Repository: Zipstack/unstract

Length of output: 817


🏁 Script executed:

```shell
# Look for any actual instantiation or usage tests of RetrieverLLM
rg "RetrieverLLM\|get_llama_index_llm" prompt-service --type py -B 2 -A 5 | head -80
```

Repository: Zipstack/unstract

Length of output: 43


🏁 Script executed:

```shell
# Verify the Pydantic configuration in llama-index by checking bridge
curl -s "https://raw.githubusercontent.com/run-llama/llama_index/main/llama-index-core/llama_index/core/bridge/pydantic.py" 2>/dev/null | head -50
```

Repository: Zipstack/unstract

Length of output: 1032


Declare _compat as a Pydantic PrivateAttr for proper v2 compatibility.

While the current code functions in practice, _compat should be declared as a class-level PrivateAttr() to follow Pydantic v2 conventions. This ensures the attribute is properly managed and future-proofs the code.

🔧 Suggested improvement
```diff
+from pydantic import PrivateAttr
+
 class RetrieverLLM(LlamaIndexBaseLLM):
+    _compat: LLMCompat = PrivateAttr()
+
     def __init__(self, llm: LLM, **kwargs: Any) -> None:  # noqa: ANN401
         """Initialize with an SDK1 LLM instance."""
         super().__init__(**kwargs)
         self._compat = LLMCompat(
```
🧰 Tools
🪛 Ruff (0.15.1)

[warning] 33-33: Unused noqa directive (non-enabled: ANN401)

Remove unused noqa directive

(RUF100)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@prompt-service/src/unstract/prompt_service/core/retrievers/retriever_llm.py`
around lines 33 - 43, Declare _compat as a Pydantic PrivateAttr at the class
level for v2 compatibility: import PrivateAttr from pydantic and add a class
attribute like "_compat: Any = PrivateAttr()" (or similar typed PrivateAttr) on
the retriever class, then keep the existing assignment to self._compat in
__init__ where LLMCompat(...) is created (referencing __init__, LLMCompat, and
the _compat attribute).

Comment on lines +698 to +728
```python
    def chat(
        self,
        messages: Sequence[ChatMessage],
        **kwargs: Any,  # noqa: ANN401
    ) -> ChatResponse:
        """Synchronous chat completion."""
        litellm.drop_params = True
        response = litellm.completion(
            messages=self._to_litellm_messages(messages),
            **self._get_completion_kwargs(),
        )
        content = response["choices"][0]["message"]["content"]
        return ChatResponse(
            message=ChatMessage(role=MessageRole.ASSISTANT, content=content),
            raw=response,
        )

    def complete(
        self,
        prompt: str,
        formatted: bool = False,
        **kwargs: Any,  # noqa: ANN401
    ) -> CompletionResponse:
        """Synchronous completion."""
        litellm.drop_params = True
        response = litellm.completion(
            messages=[{"role": "user", "content": prompt}],
            **self._get_completion_kwargs(),
        )
        content = response["choices"][0]["message"]["content"]
        return CompletionResponse(text=content, raw=response)
```

⚠️ Potential issue | 🟠 Major

LLMCompat.chat() and complete() bypass usage recording and error handling.

The SDK1 LLM.complete() method (Lines 240-329) includes _record_usage() calls for auditing, structured error wrapping into LLMError, and the @capture_metrics decorator. LLMCompat.chat() and complete() call litellm.completion() directly, skipping all of that. This means:

  1. No usage audit trail for retriever LLM calls (affects billing/observability).
  2. Bare litellm exceptions propagate without the LLMError wrapping callers may expect.

If usage tracking is intentionally deferred, consider adding a TODO comment. Otherwise, consider at minimum wrapping exceptions and calling _record_usage via the internal _llm_instance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 698 - 728,
LLMCompat.chat() and LLMCompat.complete() call litellm.completion() directly,
skipping the SDK’s usage recording and structured error wrapping used elsewhere;
update these methods to (1) call into the internal LLM instance’s usage recorder
(e.g., invoke self._llm_instance._record_usage(...) or the same _record_usage
flow used by LLM.complete) after getting the response so usage/audit data is
emitted, (2) wrap litellm.completion() in a try/except and re-raise errors as
LLMError to match callers' expectations, and (3) if metrics capture is required,
apply the same `@capture_metrics` behavior or add a TODO comment documenting
intentional omission; locate changes in the LLMCompat.chat and
LLMCompat.complete methods around the litellm.completion(...) calls and the
response handling to implement these fixes.
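One way to apply the requested fix, abstracted away from litellm so the pattern is visible (the `LLMError` stand-in and `record_usage` hook are illustrative; the real SDK names may differ):

```python
from typing import Any, Callable


class LLMError(Exception):
    """Stand-in for the SDK's structured LLM error type."""


def completion_with_accounting(
    completion_fn: Callable[..., dict],
    record_usage: Callable[[dict], None],
    **kwargs: Any,
) -> dict:
    """Call a raw completion function with error wrapping and usage recording.

    Mirrors the flow requested for LLMCompat.chat()/complete(): provider
    exceptions are re-raised as LLMError, and usage is recorded only after
    a successful response.
    """
    try:
        response = completion_fn(**kwargs)
    except Exception as exc:  # litellm raises provider-specific exceptions
        raise LLMError(str(exc)) from exc
    record_usage(response)
    return response
```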
