Collection of Python code to re-use across Python-based scrapers
- This library is meant to be installed via PyPI (zimscraperlib).
- Make sure to reference it using a version code as the API is subject to frequent changes.
- The API should remain the same only within a given minor version.
Example usage:
zimscraperlib>=1.1,<1.2
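As an illustration of the pinning rule above, here is a minimal sketch that derives a same-minor-version requirement string from a version number. The helper name `minor_pin` is hypothetical and is not part of scraperlib; it only assumes versions of the form `MAJOR.MINOR[.PATCH]`.

```python
def minor_pin(package: str, version: str) -> str:
    """Build a requirement that stays within one minor version.

    Hypothetical helper for illustration; not provided by scraperlib.
    Assumes `version` starts with numeric MAJOR.MINOR components.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    # Lower bound is the current minor, upper bound excludes the next minor.
    return f"{package}>={major}.{minor},<{major}.{minor + 1}"


if __name__ == "__main__":
    print(minor_pin("zimscraperlib", "1.1.0"))
```

The resulting string can be placed in a `requirements.txt` or in a project's dependency list.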
See functional architecture, software architecture and technical architecture for more details on scraperlib (not all aspects are covered yet, this is a WIP).
- libmagic
- wget
- libzim (auto-installed, not available on Windows)
- Pillow
- FFmpeg
- gifsicle (>=1.92)
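Some of the dependencies above are command-line tools. A quick way to check that they are reachable is sketched below using only the standard library. The helper name `missing_tools` is hypothetical, the executable names are assumed to match the package names, and the check covers neither libraries such as libmagic or Pillow nor the gifsicle minimum version.

```python
import shutil


def missing_tools(tools=("wget", "ffmpeg", "gifsicle")):
    """Return the names of the given executables that are not on PATH.

    Hypothetical helper for illustration; not provided by scraperlib.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing command-line tools:", ", ".join(missing))
    else:
        print("All command-line tools found.")
```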
brew install libmagic wget libtiff libjpeg webp little-cms2 ffmpeg gifsicle
sudo apt install libmagic1 wget ffmpeg \
libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \
libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
libharfbuzz-dev libfribidi-dev libxcb1-dev gifsicle
apk add ffmpeg gifsicle libmagic wget libjpeg
This project adheres to openZIM's Contribution Guidelines.
This project has implemented openZIM's Python bootstrap, conventions and policies v1.0.2.
pip install hatch
pip install ".[dev]"
pre-commit install
# For tests
invoke coverage
Non-exhaustive list of scrapers using it (check status when updating API):