By experimenting with AI technologies that could have a significant impact on the City of Amsterdam, we learn about their capabilities, potential value, performance, and side effects. Recently, we researched the use of Retrieval Augmented Generation within the municipal context, and more specifically, for answering citizens' questions on the open.amsterdam website. This repository contains the corresponding code.
The Open Government Act (Wet Open Overheid, WOO) requires governmental organizations to disclose information actively. Documents published by the City of Amsterdam can currently be found on https://open.amsterdam/, https://amsterdam.raadsinformatie.nl/ and https://openresearch.amsterdam/.
While information is now publicly available, searching and interpreting the corresponding documents can be a challenging task for citizens. We researched how Retrieval Augmented Generation (RAG) can help us increase transparency, inclusivity, user-friendliness and efficiency by allowing citizens to ask questions about publicly available documents in natural language.
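To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt pattern behind RAG, using only the Python standard library. It is not the pipeline from this repository: the function names (`retrieve`, `build_prompt`), the bag-of-words similarity, and the sample passages are all assumptions made up for illustration. A real setup would typically use learned embeddings and pass the prompt to a language model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, top_k=2):
    """Return the top_k passages most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(cosine_similarity(q, Counter(tokenize(p))), p) for p in passages]
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


def build_prompt(question, context_passages):
    """Assemble a prompt that asks a model to answer from the context only."""
    context = "\n".join(f"- {p}" for p in context_passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Invented placeholder passages, not real municipal documents.
    passages = [
        "Document A describes the planned renovation of a bridge.",
        "Document B lists the opening hours of the public library.",
        "Document C summarises a council debate about bicycle parking.",
    ]
    question = "What are the opening hours of the library?"
    print(build_prompt(question, retrieve(question, passages, top_k=1)))
```

The generation step is deliberately left out: the prompt returned by `build_prompt` would be sent to whichever language model is in use.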
The final report for the project can be found on openresearch.amsterdam in English and Dutch.
- data: sample data for demo purposes
- notebooks: examples of the RAG pipeline using llama-index, langchain and transformers, as well as a comparison of chunking strategies
- src: some LLM helpers
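As a rough illustration of what a comparison of chunking strategies involves, the sketch below contrasts two common approaches: fixed-size character windows with overlap, and greedy packing of whole sentences. It is not taken from the notebooks, which rely on llama-index, langchain and transformers. The function names, default sizes, and the simple sentence-splitting regex are assumptions chosen for this example.

```python
import re


def chunk_fixed(text, size=200, overlap=50):
    """Split text into character windows of `size` that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once the current window reaches the end of the text.
        if start + size >= len(text):
            break
        start += step
    return chunks


def chunk_sentences(text, max_chars=200):
    """Greedily pack whole sentences into chunks of at most `max_chars`.

    A single sentence longer than `max_chars` is kept whole, not split.
    """
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Invented sample text, used only to show the difference in output.
    sample = "The council met on Monday. It discussed parking. A vote followed."
    print(chunk_fixed(sample, size=30, overlap=10))
    print(chunk_sentences(sample, max_chars=50))
```

Fixed-size windows are predictable in length but can cut sentences in half; sentence packing keeps sentences intact at the cost of uneven chunk sizes. Which works better for retrieval depends on the documents and is the kind of question such a comparison is meant to answer.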
- Clone this repository:
git clone https://github.com/Amsterdam-AI-Team/Woo-document-analysis.git
The code has been tested with Python 3.9 on Linux.
Feel free to help out! Open an issue, submit a PR or contact us.
This repository was created by Amsterdam Intelligence for open.amsterdam and the City of Amsterdam.
This project is licensed under the terms of the European Union Public License 1.2 (EUPL-1.2).