From 4c8bec89e005e5fc87e1fe42fa2894f647802f93 Mon Sep 17 00:00:00 2001
From: Matteo Cargnelutti
Date: Tue, 21 Jan 2025 17:32:39 -0500
Subject: [PATCH 1/2] Update 2025-01-21-open-french-law-rag.md

---
 app/_posts/2025-01-21-open-french-law-rag.md | 22 +++++++++++---------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/app/_posts/2025-01-21-open-french-law-rag.md b/app/_posts/2025-01-21-open-french-law-rag.md
index 819e4a54..3aaff6fc 100644
--- a/app/_posts/2025-01-21-open-french-law-rag.md
+++ b/app/_posts/2025-01-21-open-french-law-rag.md
@@ -7,11 +7,11 @@ tags:
 ---

-Open French Law RAG Case Study
+Open French Law RAG Case Study

 Imagine that you are an English speaker visiting France, engaged in discussion with a French local about a legal issue, but you are a novice French speaker and not familiar with the French legal system. Fortunately, you have a laptop containing over 800,000 French law articles, where the answer to your question may be found. You also have access to open-source software and a multilingual large language model, capable of reading these legal documents and answering questions about them in English. Could a tool like this help you overcome both language and knowledge barriers when exploring large collections of information? How might LLMs help people access and understand legal information that is either in a foreign language or requires specialized knowledge?

-We built the Open French Law Retrieval Augmented Generation (RAG) pipeline as part of [a case study](/open-french-law-rag) in which we explored how French law could be more accessible to non-French speakers. By experimenting with an off-the-shelf pipeline that combines LLMs with multilingual [Retrieval Augmented Generation](https://scriv.ai/guides/retrieval-augmented-generation-overview/) techniques, we aimed to investigate how such a tool might help non-French speakers of varying expertise ask questions in English to explore French law.
+We built the Open French Law Retrieval Augmented Generation (RAG) pipeline as part of **[a case study](/open-french-law-rag/)** in which we explored how French law could be more accessible to non-French speakers. By experimenting with an off-the-shelf pipeline that combines LLMs with multilingual [Retrieval Augmented Generation](https://scriv.ai/guides/retrieval-augmented-generation-overview/) techniques, we aimed to investigate how such a tool might help non-French speakers of varying expertise ask questions in English to explore French law.

 In the French civil law system, the emphasis is primarily on statutes—many of which are codified—rather than on case law which does not constitute binding precedents. This framework provided a favorable environment for experimenting with the RAG approach for legal information retrieval as it allows for the integration of structured information.
@@ -56,13 +56,15 @@ We analyzed the output for source relevance and accuracy, logical coherence, fac
 With our particular experimental setup and analysis criteria, we identified the following trends in our study:

-- **Performance Comparison: English vs. French**
- - English questions showed slightly better performance compared to French questions, although RAG helped mitigate this difference. Both models performed better in English than in French.
-- **Impact of RAG**
- - While the use of RAG enhanced the accuracy and relevancy of some responses, it also introduced additional complexity and potential for errors.
- - Incorporating RAG improved the system’s performance in both English and French.
-- **Accuracy and Relevancy**
- - We observed the prevalence of partially inaccurate responses that mix true and false statements, along with different types of inaccuracies. We observed that errors in responses often arose from the model’s inability to properly determine material, geographical and temporal scope of rules. This is a significant limitation because it is a core skill of lawyers. In addition, the retrieval of irrelevant embeddings also introduced inaccuracies.
+**Performance Comparison: English vs. French**
+- English questions showed slightly better performance compared to French questions, although RAG helped mitigate this difference. Both models performed better in English than in French.
+
+**Impact of RAG**
+- While the use of RAG enhanced the accuracy and relevancy of some responses, it also introduced additional complexity and potential for errors.
+- Incorporating RAG improved the system’s performance in both English and French.
+
+**Accuracy and Relevancy**
+- We observed the prevalence of partially inaccurate responses that mix true and false statements, along with different types of inaccuracies. We observed that errors in responses often arose from the model’s inability to properly determine material, geographical and temporal scope of rules. This is a significant limitation because it is a core skill of lawyers. In addition, the retrieval of irrelevant embeddings also introduced inaccuracies.

 While our findings are interesting, we recognize the limitations in our experimental scope and evaluation. Interpreting these results requires caution in drawing broad conclusions about the generalizability and robustness of our data.
@@ -78,7 +80,7 @@ Our key takeaways focused on the questions: _“How can legal AI tools be used e
 We welcome feedback and contributions to this experiment and aim to spark cross-cultural, interdisciplinary conversations among librarians, engineers, and legal scholars about the use of RAG-based legal tools.

-If you’re interested in learning more, you can find detailed examples, analyses, and a thorough discussion of our experiment and findings in our case study. **[Explore the case study](/open-french-law-rag)**.
+If you’re interested in learning more, you can find detailed examples, analyses, and a thorough discussion of our experiment and findings in our case study. **[Explore the case study](/open-french-law-rag/)**.

 ---

From f58351dc3cbcac34c9a0e52f3c6940b55eeb7ca6 Mon Sep 17 00:00:00 2001
From: Matteo Cargnelutti
Date: Tue, 21 Jan 2025 17:37:58 -0500
Subject: [PATCH 2/2] Update index.css

---
 app/open-french-law-rag/assets/css/index.css | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/open-french-law-rag/assets/css/index.css b/app/open-french-law-rag/assets/css/index.css
index 6d7008af..60a67bce 100644
--- a/app/open-french-law-rag/assets/css/index.css
+++ b/app/open-french-law-rag/assets/css/index.css
@@ -297,7 +297,7 @@ section blockquote {
 Header
 ------------------------------------------------------------------------------*/
 header {
- background-image: url("../images/oflr-1400.jpg");
+ background-image: url("https://lil-blog-media.s3.amazonaws.com/oflr-1400.jpg");
 min-height: 100vh;
 background-size: cover;
 display: flex;