R2GQA: Retriever-Reader-Generator Question Answering System to Support Students Understanding Legal Regulations in Higher Education
R2GQA is an automated question answering system developed to help students better understand legal regulations in higher education. The system combines advanced methods in information retrieval and answer generation, including:
- Searching for relevant context from legal document databases
- Extracting precise answers from the question and the retrieved context
- Generating natural, user-friendly responses
This project is part of research published in the paper R2GQA: Retriever-Reader-Generator Question Answering System to Support Students Understanding Legal Regulations in Higher Education.
Note: The dataset in this repository is only permitted for research purposes, not for commercial use.
- Libraries listed in requirements.txt
git clone https://github.com/dpptinh/R2GQA-system
cd R2GQA-system
pip install -r requirements.txt
- Create a .env file with the following environment variables:
FULL_DATA_PATH=<path to data file>
DOCUMENTS_LINK_PATH=<path to links file>
EMBEDDING_MODEL_PATH=<path to embedding model>
EXTRACTIVE_MODEL_PATH=<path to extractive model>
ABSTRACTIVE_MODEL_PATH=<path to abstractive model>
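Before running the application, it can help to confirm that every variable listed above is present in the .env file. The following is a minimal sketch using only the Python standard library. The helper names `parse_env_file` and `missing_vars` are hypothetical and are not part of this repository, and how the project itself loads .env is not shown here, so treat this as a standalone sanity check rather than the project's own loader.

```python
from pathlib import Path

# Variable names taken from the list above.
REQUIRED_VARS = [
    "FULL_DATA_PATH",
    "DOCUMENTS_LINK_PATH",
    "EMBEDDING_MODEL_PATH",
    "EXTRACTIVE_MODEL_PATH",
    "ABSTRACTIVE_MODEL_PATH",
]


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_vars(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_file(env_path) if env_path.exists() else {}
    missing = missing_vars(values)
    if missing:
        print("Missing in .env:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Saved under any file name in the repository root, the script reports which variables still need a value. It only checks that each entry is non-empty, not that the path exists.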
Run the application:
python main.py
The system consists of 3 main components:
- Retriever: Searches and retrieves text passages relevant to the question
- Reader: Extracts precise answers from text passages
- Generator: Generates natural responses based on extracted answers
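The flow of data through the three components can be sketched with toy stand-ins. This is an illustration only, not the repository's code. The actual system relies on the embedding, extractive, and abstractive models configured in .env, whereas this sketch assumes simple word overlap for retrieval and reading and a fixed template for generation. All function names and the sample passages are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real text processing."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, passages, top_k=1):
    """Retriever stand-in: rank passages by word overlap with the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:top_k]


def read(question, passage):
    """Reader stand-in: pick the sentence sharing the most words with the question."""
    q = tokenize(question)
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    return max(sentences, key=lambda s: len(q & tokenize(s)))


def generate(question, extracted):
    """Generator stand-in: wrap the extracted span in a fixed response template."""
    return f"According to the regulations: {extracted}"


def answer(question, passages):
    """Chain the three stages: retrieve a passage, extract a span, phrase a reply."""
    context = retrieve(question, passages, top_k=1)[0]
    span = read(question, context)
    return generate(question, span)


if __name__ == "__main__":
    # Made-up passages for illustration only; not taken from the project's dataset.
    passages = [
        "Students must register for courses before the semester starts. "
        "Late registration requires approval.",
        "The library opens at 8 am. Books can be borrowed for two weeks.",
    ]
    print(answer("When must students register for courses?", passages))
```

The point of the sketch is the interface between stages: the Retriever narrows the corpus to a passage, the Reader narrows the passage to a span, and the Generator turns the span into a full response. Each stand-in can be replaced by a model without changing `answer`.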
Contributions are welcome. Please:
- Fork the repository
- Create a new branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Create a Pull Request
This project is distributed under the CC BY-NC-SA 4.0 License for research purposes only. See LICENSE for more information.