ArchiveBox
Repositories
- abx-spec-behaviors Public
Proposal for a shared user script specification between scraping, crawling, archiving, and AI tools. Allows user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser environments, Puppeteer, Playwright, extensions, and many other contexts with minimal adjustments.
- ArchiveBox Public
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
- abx-dl Public
⬇️ A CLI tool to download all discovered content from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git sources, and more...
- pip-archivebox Public archive
Official Python package for ArchiveBox, the self-hosted internet archiving solution.
- homebrew-archivebox Public archive
Homebrew formula for the ArchiveBox self-hosted internet archiving solution.
- debian-archivebox Public archive
Home of the official apt/deb package for Ubuntu/Debian-based systems.
- readability-extractor Public
JavaScript/Node.js wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.