Update 2025-01-21-open-french-law-rag.md
matteocargnelutti committed Jan 21, 2025
1 parent d61a277 commit 4c8bec8
Showing 1 changed file with 12 additions and 10 deletions.
app/_posts/2025-01-21-open-french-law-rag.md
@@ -7,11 +7,11 @@ tags:
---


-<a href="/open-french-law-rag"><img src="https://lil-blog-media.s3.amazonaws.com/oflr-1400.jpg" alt="Open French Law RAG Case Study"/></a>
+<a href="/open-french-law-rag/"><img src="https://lil-blog-media.s3.amazonaws.com/oflr-1400.jpg" alt="Open French Law RAG Case Study"/></a>

Imagine that you are an English speaker visiting France, discussing a legal issue with a French local, but you speak little French and are unfamiliar with the French legal system. Fortunately, you have a laptop containing over 800,000 French law articles, where the answer to your question may be found. You also have access to open-source software and a multilingual large language model capable of reading these legal documents and answering questions about them in English. Could a tool like this help you overcome both language and knowledge barriers when exploring large collections of information? How might LLMs help people access and understand legal information that is either in a foreign language or requires specialized knowledge?

-We built the Open French Law Retrieval Augmented Generation (RAG) pipeline as part of [a case study](/open-french-law-rag) in which we explored how French law could be more accessible to non-French speakers. By experimenting with an off-the-shelf pipeline that combines LLMs with multilingual [Retrieval Augmented Generation](https://scriv.ai/guides/retrieval-augmented-generation-overview/) techniques, we aimed to investigate how such a tool might help non-French speakers of varying expertise ask questions in English to explore French law.
+We built the Open French Law Retrieval Augmented Generation (RAG) pipeline as part of **[a case study](/open-french-law-rag/)** in which we explored how French law could be made more accessible to non-French speakers. By experimenting with an off-the-shelf pipeline that combines LLMs with multilingual [Retrieval Augmented Generation](https://scriv.ai/guides/retrieval-augmented-generation-overview/) techniques, we aimed to investigate how such a tool might help non-French speakers of varying expertise ask questions in English to explore French law.
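The post describes the pipeline but does not include code. As a rough illustration of the retrieval-then-prompt pattern it names, here is a minimal, self-contained sketch; `embed()` is a toy stand-in for a real multilingual embedding model, and `ARTICLES` is a three-item placeholder for the 800,000+ article corpus:

```python
# Illustrative sketch only: not the study's actual pipeline or corpus.
import math

# Placeholder excerpts standing in for the full French law corpus.
ARTICLES = [
    "Article 1101: Le contrat est un accord de volontés entre deux ou plusieurs personnes.",
    "Article 1240: Tout fait quelconque de l'homme, qui cause à autrui un dommage, oblige à le réparer.",
    "Article 515-1: Un pacte civil de solidarité est un contrat conclu par deux personnes physiques.",
]

def embed(text: str) -> list[float]:
    """Toy bag-of-letters embedding; a real pipeline would call a multilingual model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank articles by cosine similarity to the question embedding, keep top k."""
    q = embed(question)
    def score(article: str) -> float:
        return sum(x * y for x, y in zip(q, embed(article)))
    return sorted(ARTICLES, key=score, reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Assemble the retrieved articles into a grounded prompt for the LLM."""
    context = "\n\n".join(retrieve(question))
    return (
        "Answer in English, using only the French law excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

print(build_prompt("What makes a contract valid under French law?"))
```

In the real system, the prompt built this way would be sent to the multilingual LLM, which is what lets an English question be answered from French source text.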

In the French civil law system, the emphasis is primarily on statutes, many of which are codified, rather than on case law, which does not constitute binding precedent. This framework provided a favorable environment for experimenting with the RAG approach to legal information retrieval, as it allows for the integration of structured information.

@@ -56,13 +56,15 @@ We analyzed the output for source relevance and accuracy, logical coherence, fac

With our particular experimental setup and analysis criteria, we identified the following trends in our study:

-- **Performance Comparison: English vs. French**
-  - English questions showed slightly better performance compared to French questions, although RAG helped mitigate this difference. Both models performed better in English than in French.
-- **Impact of RAG**
-  - While the use of RAG enhanced the accuracy and relevancy of some responses, it also introduced additional complexity and potential for errors.
-  - Incorporating RAG improved the system’s performance in both English and French.
-- **Accuracy and Relevancy**
-  - We observed the prevalence of partially inaccurate responses that mix true and false statements, along with different types of inaccuracies. We observed that errors in responses often arose from the model’s inability to properly determine material, geographical and temporal scope of rules. This is a significant limitation because it is a core skill of lawyers. In addition, the retrieval of irrelevant embeddings also introduced inaccuracies.
+**Performance Comparison: English vs. French**
+- English questions performed slightly better than French questions, although RAG helped narrow this gap. Both models performed better in English than in French.
+
+**Impact of RAG**
+- While the use of RAG enhanced the accuracy and relevancy of some responses, it also introduced additional complexity and potential for errors.
+- Incorporating RAG improved the system’s performance in both English and French.
+
+**Accuracy and Relevancy**
+- Partially inaccurate responses that mix true and false statements were prevalent, alongside other types of inaccuracies. Errors often arose from the model’s inability to properly determine the material, geographical, and temporal scope of rules. This is a significant limitation, since determining the scope of rules is a core skill of lawyers. In addition, the retrieval of irrelevant embeddings introduced further inaccuracies.
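As an aside, the per-response judgments described in these findings (answers mixing true and false statements, scope errors, irrelevant retrievals) could be recorded with a small structure like the following hypothetical sketch; the field names are ours, not the study's:

```python
# Hypothetical evaluation record; illustrates the categories above, not the study's tooling.
from dataclasses import dataclass

@dataclass
class ResponseEvaluation:
    question: str
    language: str                    # "en" or "fr"
    used_rag: bool
    accurate_claims: int             # statements judged true
    inaccurate_claims: int           # statements judged false
    scope_error: bool = False        # misjudged material/geographical/temporal scope
    irrelevant_retrieval: bool = False  # irrelevant embeddings pulled into context

    @property
    def partially_accurate(self) -> bool:
        """True when a single answer mixes true and false statements."""
        return self.accurate_claims > 0 and self.inaccurate_claims > 0

ev = ResponseEvaluation("Can tenants sublet?", "en", used_rag=True,
                        accurate_claims=3, inaccurate_claims=1, scope_error=True)
print(ev.partially_accurate)
```

Tallying records like these across question sets and languages is one way the English-vs-French and RAG-vs-no-RAG comparisons above could be aggregated.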

While our findings are interesting, we recognize the limits of our experimental scope and evaluation, and we caution against drawing broad conclusions about the generalizability and robustness of our results from these data.

@@ -78,7 +80,7 @@ Our key takeaways focused on the questions: _“How can legal AI tools be used e

We welcome feedback and contributions to this experiment and aim to spark cross-cultural, interdisciplinary conversations among librarians, engineers, and legal scholars about the use of RAG-based legal tools.

-If you’re interested in learning more, you can find detailed examples, analyses, and a thorough discussion of our experiment and findings in our case study. **[Explore the case study](/open-french-law-rag)**.
+If you’re interested in learning more, you can find detailed examples, analyses, and a thorough discussion of our experiment and findings in our case study. **[Explore the case study](/open-french-law-rag/)**.

---

