A high-performance code scanning tool written in Rust that detects licenses, copyrights, and other relevant metadata in source code.
scancode-rust is designed to be a faster alternative to the Python-based ScanCode Toolkit, aiming to produce compatible output formats while delivering significantly improved performance. This tool currently scans codebases to identify:
- License information
- File metadata
- System information
More ScanCode features coming soon!
- Efficient file scanning with multi-threading
- Compatible output format with ScanCode Toolkit
- Progress indication for large scans
- Configurable scan depth
- File/directory exclusion patterns
cargo install scancode-rust
Download the appropriate binary for your platform from the GitHub Releases page:
- Linux (x64): scancode-rust-x86_64-unknown-linux-gnu.tar.gz
- Linux (ARM64): scancode-rust-aarch64-unknown-linux-gnu.tar.gz
- macOS (Apple Silicon & Intel): scancode-rust-aarch64-apple-darwin.tar.gz
  - Intel Macs: Use Rosetta 2 for native-like performance
- Windows: scancode-rust-x86_64-pc-windows-msvc.zip
Extract and place the binary in your system's PATH:
# Example for Linux/macOS
tar xzf scancode-rust-*.tar.gz
sudo mv scancode-rust /usr/local/bin/
git clone https://github.com/mstykow/scancode-rust.git
cd scancode-rust
./setup.sh # Initialize the submodule and configure sparse checkout
cargo build --release
The compiled binary will be available at target/release/scancode-rust.
scancode-rust [OPTIONS] <DIR_PATH> --output-file <OUTPUT_FILE>
Options:
-o, --output-file <OUTPUT_FILE> Output JSON file path
-d, --max-depth <MAX_DEPTH> Maximum directory depth to scan [default: 50]
-e, --exclude <EXCLUDE>... Glob patterns to exclude from scanning
--no-assemble Disable package assembly (merging related manifest/lockfiles)
-h, --help Print help
-V, --version Print version
scancode-rust ~/projects/my-codebase -o scan-results.json --exclude "*.git*" "target/*" "node_modules/*"
scancode-rust is designed to be significantly faster than the Python-based ScanCode Toolkit, especially for large codebases, thanks to native Rust performance and parallel processing. See Architecture: Performance Characteristics for details.
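For anyone driving the tool from another Rust program, the documented options can be assembled programmatically. The following is a minimal sketch, assuming `scancode-rust` is on the PATH; the helper names `build_scan_args` and `run_scan` are hypothetical and not part of the project. It uses only the flags listed in the options above.

```rust
use std::process::Command;

/// Builds the argument list for a `scancode-rust` invocation.
/// Hypothetical helper: only flags documented in this README are emitted.
fn build_scan_args(
    dir: &str,
    output_file: &str,
    max_depth: Option<u32>,
    excludes: &[&str],
) -> Vec<String> {
    // The directory to scan is the positional <DIR_PATH> argument.
    let mut args = vec![
        dir.to_string(),
        "--output-file".to_string(),
        output_file.to_string(),
    ];
    // Leaving --max-depth out keeps the documented default of 50.
    if let Some(depth) = max_depth {
        args.push("--max-depth".to_string());
        args.push(depth.to_string());
    }
    // --exclude takes several glob patterns after one flag, so it goes last.
    if !excludes.is_empty() {
        args.push("--exclude".to_string());
        args.extend(excludes.iter().map(|p| p.to_string()));
    }
    args
}

/// Runs the scan and reports whether the process exited successfully.
/// Assumption: the `scancode-rust` binary is installed and on the PATH.
fn run_scan(args: &[String]) -> std::io::Result<bool> {
    let status = Command::new("scancode-rust").args(args).status()?;
    Ok(status.success())
}
```

Calling `build_scan_args("src", "scan-results.json", Some(3), &["target/*"])` and passing the result to `run_scan` would correspond to the command-line example above with a depth limit added. Keeping argument construction separate from process spawning makes the argument list easy to inspect without running a scan.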
The tool produces JSON output compatible with ScanCode Toolkit, including:
- Scan headers with timestamp information
- File-level data with license and metadata information
- System environment details
- Architecture - System design, processing pipeline, and design decisions
- Supported Formats - Complete list of supported package ecosystems and file formats
- How to Add a Parser - Step-by-step guide for adding new parsers
- Testing Strategy - Testing approach and guidelines
- ADRs - Architectural decision records
- Beyond-Parity Improvements - Features where Rust exceeds the Python original
Contributions are welcome! Please feel free to submit a Pull Request.
To contribute to scancode-rust, follow these steps to set up the repository for local development:
- Install Rust
  Ensure you have Rust installed on your system. You can install it using rustup:
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Clone the Repository
  Clone the scancode-rust repository to your local machine:
  git clone https://github.com/mstykow/scancode-rust.git
  cd scancode-rust
- Initialize the License Submodule
  Use the following script to initialize the submodule and configure sparse checkout:
  ./setup.sh
- Install Dependencies
  Install the required Rust dependencies using cargo:
  cargo build
- Run Tests
  Run the test suite to ensure everything is working correctly:
  cargo test
- Set Up Pre-commit Hooks
  This repository uses pre-commit to run checks before each commit:
  # Using pip
  pip install pre-commit
  # Or using brew on macOS
  brew install pre-commit
  # Install the hooks
  pre-commit install
- Start Developing
  You can now make changes and test them locally. Use cargo run to execute the tool:
  cargo run -- [OPTIONS] <DIR_PATH>
Releases are automated using cargo-release and GitHub Actions.
One-time setup:
- Install cargo-release CLI tool:
  cargo install cargo-release
- Authenticate with crates.io (one-time only):
  cargo login
  Enter your crates.io API token when prompted. This is stored in ~/.cargo/credentials.toml and persists across sessions.
Use the release.sh script:
# Dry-run first (recommended)
./release.sh patch
# Then execute the actual release
./release.sh patch --execute
Available release types:
- patch: Increments the patch version (0.0.4 → 0.0.5)
- minor: Increments the minor version (0.0.4 → 0.1.0)
- major: Increments the major version (0.0.4 → 1.0.0)
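The version arithmetic behind the three release types can be illustrated with a short sketch. This is not how `release.sh` or cargo-release is implemented; the `ReleaseType` enum and `bump` function are hypothetical names used only to show the increment-and-reset rule in the examples above.

```rust
/// The three release types accepted by the release script.
enum ReleaseType {
    Patch,
    Minor,
    Major,
}

/// Returns the next version for a `MAJOR.MINOR.PATCH` string, or `None` if
/// the input does not consist of exactly three dot-separated integers.
/// Hypothetical helper for illustration only.
fn bump(version: &str, release: ReleaseType) -> Option<String> {
    // Parse every component; any non-numeric part makes the whole parse fail.
    let parts = version
        .split('.')
        .map(|p| p.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;
    let (major, minor, patch) = match parts.as_slice() {
        &[major, minor, patch] => (major, minor, patch),
        _ => return None,
    };
    // Bumping a component resets every component to its right to zero.
    let (major, minor, patch) = match release {
        ReleaseType::Patch => (major, minor, patch + 1),
        ReleaseType::Minor => (major, minor + 1, 0),
        ReleaseType::Major => (major + 1, 0, 0),
    };
    Some(format!("{}.{}.{}", major, minor, patch))
}
```

For example, `bump("0.0.4", ReleaseType::Minor)` yields `Some("0.1.0")`, matching the minor example in the list above.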
What happens automatically:
- Updates SPDX license data to the latest version from upstream
- Commits the license data update (if changes detected)
- cargo-release updates the version in Cargo.toml and Cargo.lock
- Creates a git commit: chore: release vX.Y.Z
- Creates a GPG-signed git tag: vX.Y.Z
- Publishes to crates.io
- Pushes commits and tag to GitHub
- GitHub Actions workflow is triggered by the tag
- Builds binaries for all platforms:
- Linux: x64 and ARM64
- macOS: ARM64 (Apple Silicon, works on Intel via Rosetta 2)
- Windows: x64
- Creates archives (.tar.gz/.zip) and SHA256 checksums
- Creates a GitHub Release with all artifacts and auto-generated release notes
Note: The release script ensures every release ships with the latest SPDX license definitions. It also handles a sparse checkout workaround for cargo-release.
Monitor the GitHub Actions workflow to verify completion.
This project is licensed under the Apache License 2.0.