chain-server container keeps crashing (rag-app-text-chatbot.yaml) #133

Open
jbond00747 opened this issue Jun 13, 2024 · 0 comments
I'm trying to deploy a basic RAG chatbot using the rag-app-text-chatbot.yaml file, but the chain-server container crashes shortly after startup. I believe I've followed the directions at https://nvidia.github.io/GenerativeAIExamples/latest/local-gpu.html correctly, and I'm using the v0.6.0 tag of the GitHub repository. Here's the output of docker logs for the chain-server container:

===
INFO:     Started server process [1]
INFO:     Waiting for application startup.
/usr/local/lib/python3.10/dist-packages/langchain/embeddings/__init__.py:29: LangChainDeprecationWarning: Importing embeddings from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:

`from langchain_community.embeddings import HuggingFaceEmbeddings`.

To install langchain-community run `pip install -U langchain-community`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/langchain/vectorstores/__init__.py:35: LangChainDeprecationWarning: Importing vector stores from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:

`from langchain_community.vectorstores import FAISS`.

To install langchain-community run `pip install -U langchain-community`.
  warnings.warn(
INFO:faiss.loader:Loading faiss with AVX2 support.
INFO:faiss.loader:Successfully loaded faiss with AVX2 support.
/usr/local/lib/python3.10/dist-packages/tritonclient/grpc/service_pb2_grpc.py:21: RuntimeWarning: The grpc package installed is at version 1.60.0, but the generated code in grpc_service_pb2_grpc.py depends on grpcio>=1.64.0. Please upgrade your grpc module to grpcio>=1.64.0 or downgrade your generated code using grpcio-tools<=1.60.0. This warning will become an error in 1.65.0, scheduled for release on June 25, 2024.
  warnings.warn(
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
INFO:RetrievalAugmentedGeneration.common.utils:Using huggingface as model engine and WhereIsAI/UAE-Large-V1 and model for embeddings
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: WhereIsAI/UAE-Large-V1
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
INFO:RetrievalAugmentedGeneration.common.utils:Using triton-trt-llm as model engine for llm. Model name: ensemble
ERROR:    Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 734, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 610, in __aenter__
    await self._router.startup()
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 713, in startup
    handler()
  File "/opt/RetrievalAugmentedGeneration/common/server.py", line 158, in import_example
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/opt/RetrievalAugmentedGeneration/example/chains.py", line 56, in <module>
    set_service_context()
  File "/opt/RetrievalAugmentedGeneration/common/utils.py", line 131, in wrapper
    return func(*args_hashable, **kwargs_hashable)
  File "/opt/RetrievalAugmentedGeneration/common/utils.py", line 138, in set_service_context
    llm = LangChainLLM(get_llm(**kwargs))
  File "/opt/RetrievalAugmentedGeneration/common/utils.py", line 131, in wrapper
    return func(*args_hashable, **kwargs_hashable)
  File "/opt/RetrievalAugmentedGeneration/common/utils.py", line 270, in get_llm
    trtllm = TensorRTLLM(  # type: ignore
  File "/usr/local/lib/python3.10/dist-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for TensorRTLLM
__root__
  Channel.unary_unary() got an unexpected keyword argument '_registered_method' (type=type_error)

ERROR:    Application startup failed. Exiting.
Exception ignored in: <function InferenceServerClient.__del__ at 0x7561548a9750>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tritonclient/grpc/_client.py", line 257, in __del__
    self.close()
  File "/usr/local/lib/python3.10/dist-packages/tritonclient/grpc/_client.py", line 264, in close
    self.stop_stream()
  File "/usr/local/lib/python3.10/dist-packages/tritonclient/grpc/_client.py", line 1811, in stop_stream
    if self._stream is not None:
AttributeError: 'InferenceServerClient' object has no attribute '_stream'
===
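For what it's worth, the ValidationError at the bottom of the log wraps a TypeError raised inside the Triton gRPC client, and it looks consistent with the grpcio version warning earlier in the log (installed grpcio 1.60.0 vs. generated code expecting grpcio>=1.64.0). As a rough illustration of that failure mode (the class and function below are simplified stand-ins, not the actual grpcio/tritonclient code): stubs generated with a newer grpcio-tools pass a `_registered_method` keyword that older grpcio releases don't accept.

```python
# Illustrative sketch only: simplified stand-ins for grpcio/tritonclient code.

class OldStyleChannel:
    """Mimics a grpcio<=1.60.0 Channel: unary_unary() has no
    `_registered_method` parameter."""
    def unary_unary(self, method, request_serializer=None,
                    response_deserializer=None):
        return f"stub for {method}"

def make_stub(channel):
    # Mimics generated service code that targets a newer grpcio and
    # passes the new keyword argument.
    return channel.unary_unary(
        "/inference.GRPCInferenceService/ServerLive",
        _registered_method=True,
    )

try:
    make_stub(OldStyleChannel())
except TypeError as exc:
    # Same shape of error as in the traceback above:
    # "... got an unexpected keyword argument '_registered_method'"
    print(type(exc).__name__, exc)
```

If that's the actual cause, the container image presumably needs its grpcio pinned to match the version its generated gRPC code was built against.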

Here's some docker ps output:

$ docker ps -a --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"
CONTAINER ID   NAMES                  STATUS
f025cd96cc5c   milvus-standalone      Up 13 minutes
ca017bfe8648   milvus-etcd            Up 13 minutes (healthy)
b44caa6c6e9a   milvus-minio           Up 13 minutes (healthy)
4b812c48035b   rag-playground         Up 13 minutes
a686d2b3938f   chain-server           Exited (3) 13 minutes ago
7fe575e94855   llm-inference-server   Up 13 minutes
80f535f5a462   notebook-server        Up 13 minutes