feat: Add LangExtract tool for structured information extraction #91

@its-animay

Description

🔴 Required Information

Is your feature request related to a specific problem?

ADK community lacks a native integration for LangExtract, Google's own library for extracting structured information from unstructured text using LLMs with precise source grounding. Users who want to use LangExtract within ADK agents currently have to manually wrap `lx.extract()` in a custom tool class with significant boilerplate.

Describe the Solution You'd Like

Add a `LangExtractTool` to `google.adk_community.tools` that:

  • Extends `BaseTool` with a clean function declaration exposing `text` and `prompt_description` as LLM-visible parameters
  • Pre-configures extraction settings (examples, model_id, extraction_passes, etc.) at construction time
  • Runs `lx.extract()` via `asyncio.to_thread()` to avoid blocking the event loop
  • Includes a companion `LangExtractToolConfig` for easy programmatic configuration
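To illustrate the pattern described in the list above, here is a minimal sketch, not the proposed implementation. It uses only the standard library: `fake_extract` is a hypothetical stand-in for a blocking call such as `lx.extract()`, and `ExtractToolSketch` is a hypothetical class that does not extend ADK's `BaseTool`. It shows settings fixed at construction time, only `text` and `prompt_description` taken from the model's call, and the blocking work offloaded with `asyncio.to_thread()`.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


def fake_extract(text: str, prompt_description: str, **settings: Any) -> Dict[str, Any]:
    """Hypothetical blocking stand-in for a call such as lx.extract()."""
    time.sleep(0.05)  # simulate slow, synchronous work
    return {"text": text, "prompt": prompt_description, "settings": settings}


@dataclass
class ExtractToolSketch:
    """Illustrative only; not the proposed LangExtractTool or ADK's BaseTool."""

    name: str
    description: str
    extract_fn: Callable[..., Dict[str, Any]]
    # Fixed at construction time, never exposed to the LLM.
    settings: Dict[str, Any] = field(default_factory=dict)

    async def run_async(self, *, args: Dict[str, Any]) -> Dict[str, Any]:
        # Only `text` and `prompt_description` come from the model's call.
        # asyncio.to_thread runs the blocking function in a worker thread,
        # so the event loop stays free for other coroutines.
        return await asyncio.to_thread(
            self.extract_fn,
            args["text"],
            args["prompt_description"],
            **self.settings,
        )


async def demo() -> Dict[str, Any]:
    tool = ExtractToolSketch(
        name="extract_entities",
        description="Extract named entities from text.",
        extract_fn=fake_extract,
        settings={"extraction_passes": 2},
    )
    return await tool.run_async(
        args={
            "text": "John is a software engineer at Google.",
            "prompt_description": "Extract people and their roles.",
        }
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Passing the extraction function in as a field keeps the sketch testable without network access; a real tool would presumably call `lx.extract()` directly.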

Usage:

```python
import langextract as lx
from google.adk.agents import Agent
from google.adk_community.tools import LangExtractTool  # proposed in this issue

# Few-shot examples and other settings are fixed when the tool is constructed.
tool = LangExtractTool(
    name='extract_entities',
    description='Extract named entities from text.',
    examples=[
        lx.data.ExampleData(
            text='John is a software engineer at Google.',
            extractions=[
                lx.data.Extraction(
                    extraction_class='person',
                    extraction_text='John',
                    attributes={'role': 'software engineer', 'company': 'Google'},
                )
            ],
        )
    ],
)

# Register the tool with an agent like any other ADK tool.
agent = Agent(model='gemini-2.5-flash', name='extraction_agent', tools=[tool])
```

Impact on your work

Enables ADK agents to perform structured extraction (entities, attributes, relationships) from documents out of the box, a common use case for enterprise AI workflows. Since LangExtract is a Google library, having native ADK community support is a natural fit.

Willingness to contribute

Yes, I have an implementation ready to submit as a PR.


🟑 Recommended Information

Additional Context
