This repository has been archived by the owner on Feb 10, 2025. It is now read-only.

1.0.0-beta: Intel® AI for Enterprise RAG

Pre-release
mzyczyns released this 18 Nov 13:51
· 1 commit to pre-release/1.0.0-beta since this release
18134a6

Introducing Intel® AI for Enterprise RAG – this first pre-release brings a variety of powerful features allowing users to quickly establish a highly optimized Chat Q&A RAG application with enterprise capabilities such as authentication and authorization, custom UI, and detailed telemetry.

Validated SW stack

Component       Version
OS              Ubuntu 22.04.4 LTS
Habana driver   1.18.0-ee698fb
Kubernetes      v1.29.5

Getting started

To deploy your ChatQ&A RAG application, please follow the instructions.

Click here to watch the demo recording!

Known issues

  • The data ingestion pipeline permits users to upload files of up to 64 MB. However, uploads may time out depending on the volume of content within the documents.
  • In the Grafana dashboard 'EnterpriseRAG/VectorDB/Redis', the memory usage graph shows infinite values because the Kubernetes pods have no memory limits set. Please ignore the displayed value.
  • Creating a Relevance scanner in the LLM Guard Output Guardrail does not work with the ONNX model. Workaround: when enabling the Relevance scanner, uncheck the "use_onnx" checkbox.
  • The Guardrail process may hang nondeterministically when a scanner is enabled via a .env file and a POST request then changes the scanner configuration. Also, scanners created via a .env file are not yet shown in the UI. Workaround: prefer enabling scanners in the UI.
  • pcm-sensor-server is an "experimental preview feature" based on the "work in progress" PCM Helm chart, so the "Intel Xeon telemetry PCM dashboard" will be empty unless PCM is installed manually.
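
Given the 64 MB upload limit noted in the first known issue, a client-side size check before sending files to the ingestion pipeline might look like the sketch below. It is not part of Intel® AI for Enterprise RAG. The helper names are hypothetical, and reading "64MB" as 64 MiB (64 * 1024 * 1024 bytes) is an assumption, since the release notes do not say which unit is meant.

```python
import os

# Assumption: "64MB" is read as 64 MiB; the release notes do not specify the unit.
MAX_UPLOAD_BYTES = 64 * 1024 * 1024


def within_upload_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit


def split_by_limit(paths, limit=MAX_UPLOAD_BYTES):
    """Partition `paths` into (acceptable, too_large) lists by file size."""
    acceptable, too_large = [], []
    for path in paths:
        if within_upload_limit(path, limit):
            acceptable.append(path)
        else:
            too_large.append(path)
    return acceptable, too_large
```

A check like this only filters out files that exceed the size limit. It does not prevent the timeouts described above, which depend on the content of a document, not only on its size.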