SearchQnA Application

Search Question and Answering (SearchQnA) combines search engines, such as Google Search, with large language models (LLMs) to improve answer quality. LLMs handle general knowledge well, but because they rely on previously collected training data, they cannot reach real-time or highly specific information. Integrating a search engine lets SearchQnA close this gap.

Operating within the LangChain framework, the Google Search QnA chatbot mimics human behavior by iteratively searching, selecting, and synthesizing information. Here's how it works:

  • Diverse Search Queries: The system employs an LLM to generate multiple search queries from a single prompt, ensuring a wide range of query terms essential for comprehensive results.

  • Parallel Search Execution: Queries are executed concurrently, which speeds up data collection and lets the bot 'read' multiple pages at once.

  • Top Link Prioritization: Algorithms identify the top K links for each query, and the bot scrapes their full page content in parallel. This prioritization ensures the extraction of the most relevant information.

  • Efficient Data Indexing: Extracted data is indexed into a dedicated vector store (Chroma DB), optimizing retrieval and comparison in subsequent steps.

  • Contextual Result Matching: The bot matches original search queries with relevant documents stored in the vector store, presenting users with accurate and contextually appropriate results.

By integrating search capabilities with LLMs within the LangChain framework, this Google Search QnA chatbot delivers comprehensive and precise answers in a way that resembles how a person searches.
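
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SearchQnA implementation: `FAKE_WEB`, `generate_queries`, `search`, and `answer_context` are hypothetical names, the LLM query generator is replaced by a keyword splitter, the search engine by an in-memory dictionary, and the Chroma vector store by bag-of-words vectors compared with cosine similarity.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the web: query term -> list of (url, page text).
FAKE_WEB = {
    "gaudi": [
        ("https://example.com/a", "Gaudi2 is an AI accelerator from Intel"),
        ("https://example.com/b", "Gaudi accelerators target deep learning training"),
    ],
    "xeon": [
        ("https://example.com/c", "Xeon Scalable processors are server CPUs"),
        ("https://example.com/a", "Gaudi2 is an AI accelerator from Intel"),
    ],
}


def tokenize(text):
    """Lowercase whitespace tokens with trailing punctuation removed."""
    return [w.strip(".,?!").lower() for w in text.split()]


def generate_queries(prompt):
    """Stand-in for the LLM step: one query per distinct word longer than three letters."""
    queries = []
    for word in tokenize(prompt):
        if len(word) > 3 and word not in queries:
            queries.append(word)
    return queries


def search(query, k=2):
    """Stand-in for the search engine: return the top K (url, text) hits."""
    return FAKE_WEB.get(query, [])[:k]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def answer_context(prompt, k=2, top_n=1):
    queries = generate_queries(prompt)                      # 1. diverse queries
    with ThreadPoolExecutor(max_workers=4) as pool:         # 2. parallel search
        results = list(pool.map(lambda q: search(q, k), queries))
    pages = {}
    for hits in results:                                    # 3. top K links, deduplicated by URL
        for url, text in hits:
            pages.setdefault(url, text)
    index = {url: Counter(tokenize(text)) for url, text in pages.items()}  # 4. indexing
    query_vec = Counter(tokenize(prompt))
    ranked = sorted(index, key=lambda u: cosine(query_vec, index[u]), reverse=True)
    return ranked[:top_n]                                   # 5. contextual matching
```

With this toy data, `answer_context("Which Xeon processors are server CPUs?")` returns `["https://example.com/c"]`, because that page shares the most terms with the prompt.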

The workflow follows the architecture below:

[Figure: SearchQnA architecture]

The SearchQnA example is implemented using the component-level microservices defined in GenAIComps. The flow chart below shows the information flow between different microservices for this example.

---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 50px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style SearchQnA-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph SearchQnA-MegaService["SearchQnA MegaService "]
        direction LR
        EM([Embedding MicroService]):::blue
        RET([Web Retrieval MicroService]):::blue
        RER([Rerank MicroService]):::blue
        LLM([LLM MicroService]):::blue
    end
    subgraph UserInterface[" User Interface "]
        direction LR
        a([User Input Query]):::orchid
        UI([UI server<br>]):::orchid
    end



    TEI_RER{{Reranking service<br>}}
    TEI_EM{{Embedding service <br>}}
    VDB{{Vector DB<br><br>}}
    R_RET{{Web Retriever service <br>}}
    LLM_gen{{LLM Service <br>}}
    GW([SearchQnA GateWay<br>]):::orange

    %% Questions interaction
    direction LR
    a[User Input Query] --> UI
    UI --> GW
    GW <==> SearchQnA-MegaService
    EM ==> RET
    RET ==> RER
    RER ==> LLM

    %% Embedding service flow
    direction LR
    EM <-.-> TEI_EM
    RET <-.-> R_RET
    RER <-.-> TEI_RER
    LLM <-.-> LLM_gen

    direction TB
    %% Vector DB interaction
    R_RET <-.-> VDB
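
The main path through the flow chart (Embedding, Web Retrieval, Rerank, LLM) can be read as a chain of stages that each enrich a shared request state. The sketch below only illustrates that ordering. Every function name and every field in the state dictionary is hypothetical, and each stage body is a trivial placeholder for the backing service it would call.

```python
def embedding(state):
    # Placeholder for the embedding service: a fake one-dimensional vector.
    return {**state, "embedding": [float(len(state["query"]))]}


def web_retrieval(state):
    # Placeholder for the web retriever: pretend two pages were fetched.
    docs = ["doc about " + state["query"], "unrelated doc"]
    return {**state, "docs": docs}


def rerank(state):
    # Placeholder for the reranking service: keep the doc that mentions the query.
    ordered = sorted(state["docs"], key=lambda d: state["query"] in d, reverse=True)
    return {**state, "docs": ordered[:1]}


def llm(state):
    # Placeholder for the LLM service: echo the selected context.
    return {**state, "answer": "Based on: " + state["docs"][0]}


def run_chain(query, stages):
    """Pass the request state through each stage in order."""
    state = {"query": query}
    for stage in stages:
        state = stage(state)
    return state
```

Calling `run_chain("Gaudi2", [embedding, web_retrieval, rerank, llm])` walks the stages in the same left-to-right order as the chart.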


Deploy SearchQnA Service

The SearchQnA service can be deployed on either Intel Gaudi2 or Intel Xeon Scalable processors.

Two ways of deploying the SearchQnA services with Docker Compose are currently supported:

  1. Start services using the Docker image on Docker Hub:

    docker pull opea/searchqna:latest
  2. Start services using the docker images built from source: Guide

Set Up Environment Variables

To set up environment variables for deploying SearchQnA services, follow these steps:

  1. Set the required environment variables:

    # Example: host_ip="192.168.1.1"
    export host_ip="External_Public_IP"
    # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
    export no_proxy="Your_No_Proxy"
    export GOOGLE_CSE_ID="Your_CSE_ID"
    export GOOGLE_API_KEY="Your_Google_API_Key"
    export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
  2. If you are in a proxy environment, also set the proxy-related environment variables:

    export http_proxy="Your_HTTP_Proxy"
    export https_proxy="Your_HTTPs_Proxy"
  3. Set up other environment variables:

    source ./docker_compose/set_env.sh
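
Before starting the containers, it can help to confirm that none of the required variables is missing or still holds its placeholder text. The check below is a small sketch, not part of the project: `unset_or_placeholder` is a hypothetical helper, and the variable names and placeholder strings are copied from the export lines above.

```python
import os

# Variable names taken from step 1 above.
REQUIRED_VARS = [
    "host_ip",
    "no_proxy",
    "GOOGLE_CSE_ID",
    "GOOGLE_API_KEY",
    "HUGGINGFACEHUB_API_TOKEN",
]

# Placeholder values from the export lines above; leaving one in place is treated as unset.
PLACEHOLDERS = {
    "External_Public_IP",
    "Your_No_Proxy",
    "Your_CSE_ID",
    "Your_Google_API_Key",
    "Your_Huggingface_API_Token",
}


def unset_or_placeholder(env):
    """Return the required variable names that are empty, missing, or still placeholders."""
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or env[name] in PLACEHOLDERS
    ]


if __name__ == "__main__":
    problems = unset_or_placeholder(os.environ)
    print("Variables to fix:", problems if problems else "none")
```

Run it in the same shell where the exports were made, since it reads the variables from the current environment.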

Deploy SearchQnA on Gaudi

If your Habana driver version is below 1.16.0 (check with hl-smi), run the following commands directly to start the SearchQnA services. Find the corresponding compose.yaml.

cd GenAIExamples/SearchQnA/docker_compose/intel/hpu/gaudi/
docker compose up -d

Refer to the Gaudi Guide to build docker images from source.

Deploy SearchQnA on Xeon

Find the corresponding compose.yaml.

cd GenAIExamples/SearchQnA/docker_compose/intel/cpu/xeon/
docker compose up -d

Refer to the Xeon Guide for more instructions on building docker images from source.

Consume SearchQnA Service

There are two ways to consume the SearchQnA service:

  1. Use a cURL command in a terminal

    curl http://${host_ip}:3008/v1/searchqna \
        -H "Content-Type: application/json" \
        -d '{
            "messages": "What is the latest news? Give me also the source link.",
            "stream": "True"
        }'
  2. Access via frontend

    To access the frontend, open the following URL in your browser, replacing ${host_ip} with your host address: http://${host_ip}:5173.

    By default, the UI runs on port 5173 internally.
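
The same request as the cURL example can be sent from Python with the standard library alone. This is a sketch under assumptions: `build_searchqna_request` and `ask` are hypothetical helper names, the payload mirrors the cURL example exactly (including the string value for `stream`), and because the response format is not described here, `ask` simply yields each line of the response body as it arrives.

```python
import json
import urllib.request


def build_searchqna_request(host_ip, question, port=3008):
    """Build the POST request used in the cURL example above."""
    url = f"http://{host_ip}:{port}/v1/searchqna"
    payload = {"messages": question, "stream": "True"}  # same body as the cURL example
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host_ip, question, timeout=60):
    """Send the request and yield response lines as they are received."""
    request = build_searchqna_request(host_ip, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8", errors="replace").rstrip("\n")
```

For example, `for line in ask("192.168.1.1", "What is the latest news?"): print(line)` prints the reply line by line, assuming the service is reachable at that address.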

Troubleshooting

  1. If you get errors like "Access Denied", validate each microservice individually first. A simple example:

    http_proxy=""
    curl http://${host_ip}:3001/embed \
        -X POST \
        -d '{"inputs":"What is Deep Learning?"}' \
        -H 'Content-Type: application/json'
  2. (Docker only) If all microservices work well, check port 3008 on ${host_ip}. The port may already be allocated by another user, in which case you can change it in compose.yaml.

  3. (Docker only) If you get errors like "The container name is in use", change the container name in compose.yaml.
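
For item 2, one quick way to see whether something is already listening on a port is to attempt a TCP connection to it. The helper below is a hypothetical sketch: `port_in_use` is not part of SearchQnA, and a `True` result only says that some process accepts connections on that port, not which process it is.

```python
import socket


def port_in_use(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Running `port_in_use("127.0.0.1", 3008)` before `docker compose up -d` shows whether the port is already taken on the local machine.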