docs: showcase RAG with LlamaIndex and LangChain (#71)
Signed-off-by: Panos Vagenas <[email protected]>
vagenas authored Sep 11, 2024
1 parent 79932b7 commit 53569a1
Showing 6 changed files with 2,704 additions and 39 deletions.
30 changes: 17 additions & 13 deletions .pre-commit-config.yaml
@@ -2,39 +2,43 @@ fail_fast: true
 repos:
   - repo: local
     hooks:
-      - id: system
+      - id: black
         name: Black
         entry: poetry run black docling examples tests
         pass_filenames: false
         language: system
         files: '\.py$'
-  - repo: local
-    hooks:
-      - id: system
+      - id: isort
         name: isort
         entry: poetry run isort docling examples tests
         pass_filenames: false
         language: system
         files: '\.py$'
-#  - repo: local
-#    hooks:
-#      - id: system
+#      - id: flake8
 #        name: flake8
 #        entry: poetry run flake8 docling
 #        pass_filenames: false
 #        language: system
 #        files: '\.py$'
-#  - repo: local
-#    hooks:
-#      - id: system
+#      - id: mypy
 #        name: MyPy
 #        entry: poetry run mypy docling
 #        pass_filenames: false
 #        language: system
 #        files: '\.py$'
-  - repo: local
-    hooks:
-      - id: system
+      - id: nbqa_black
+        name: nbQA Black
+        entry: poetry run nbqa black examples
+        pass_filenames: false
+        language: system
+        files: '\.ipynb$'
+      - id: nbqa_isort
+        name: nbQA isort
+        entry: poetry run nbqa isort examples
+        pass_filenames: false
+        language: system
+        files: '\.ipynb$'
+      - id: poetry
         name: Poetry check
         entry: poetry check --lock
         pass_filenames: false
7 changes: 5 additions & 2 deletions README.md
@@ -23,8 +23,7 @@ Docling bundles PDF document conversion to JSON and Markdown in an easy, self-co
 * 📑 Understands detailed page layout, reading order and recovers table structures
 * 📝 Extracts metadata from the document, such as title, authors, references and language
 * 🔍 Optionally applies OCR (use with scanned PDFs)
-
-For RAG, check out [Quackling](https://github.com/DS4SD/quackling) to get the most out of your docs, be it using LlamaIndex, LangChain or your pipeline.
+* 🤖 Integrates easily with LLM app / RAG frameworks like 🦙 LlamaIndex and 🦜🔗 LangChain

 ## Installation

@@ -143,6 +142,10 @@ results = doc_converter.convert(conv_input)

 You can limit the CPU threads used by Docling by setting the environment variable `OMP_NUM_THREADS` accordingly. The default setting is using 4 CPU threads.

+### RAG
+Check out the following examples showcasing RAG using Docling with standard LLM application frameworks:
+- [Basic RAG pipeline with 🦙 LlamaIndex](https://github.com/DS4SD/docling/tree/main/examples/rag_llamaindex.ipynb)
+- [Basic RAG pipeline with 🦜🔗 LangChain](https://github.com/DS4SD/docling/tree/main/examples/rag_langchain.ipynb)

 ## Technical report

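Note on the README context line about `OMP_NUM_THREADS`: a minimal sketch of setting the variable from Python rather than from the shell. The helper name `limit_omp_threads` is hypothetical and not part of Docling. The sketch assumes the variable is read when the underlying numerical libraries initialize, so it should be set before they are imported.

```python
import os


def limit_omp_threads(num_threads: int) -> str:
    """Set OMP_NUM_THREADS for the current process and return the value set.

    Hypothetical helper for illustration only; it is not a Docling API.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be a positive integer")
    value = str(num_threads)  # environment variables are always strings
    os.environ["OMP_NUM_THREADS"] = value
    return value


# Assumption: call this before importing Docling (or any library that
# spins up OpenMP threads), since the variable is typically read once
# at initialization time.
limit_omp_threads(2)
```

Exporting the variable in the shell before launching the process has the same effect and avoids any dependence on import order.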