Add PyPI publishing action
gbenson committed May 15, 2024
1 parent 2719ff5 commit 61f46bb
Showing 4 changed files with 129 additions and 3 deletions.
101 changes: 101 additions & 0 deletions .github/workflows/publish-to-pypi.yml
@@ -0,0 +1,101 @@
name: Publish Python distribution to PyPI

on: push

jobs:
  build:
    name: Build and test the Python distribution
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.x"

      - name: Install build+test requirements
        run: >-
          python3 -m
          pip install --user --editable .[dev]

      - name: Lint
        run: flake8

      - name: Build a binary wheel and a source tarball
        run: python3 -m build

      - name: Store the distribution packages
        uses: actions/upload-artifact@v3
        with:
          name: python-package-distributions
          path: dist/

  publish-to-pypi:
    name: Publish the Python distribution to PyPI
    if: startsWith(github.ref, 'refs/tags/')  # only publish to PyPI on tag pushes
    needs:
      - build
    runs-on: ubuntu-latest

    environment:
      name: pypi
      url: https://pypi.org/p/dom-tokenizers

    permissions:
      id-token: write  # IMPORTANT: mandatory for trusted publishing

    steps:
      - name: Restore required artifacts
        uses: actions/download-artifact@v3
        with:
          name: python-package-distributions
          path: dist/

      - name: Publish Python distribution to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1

  github-release:
    name: >-
      Sign the Python distribution with Sigstore
      and upload them to GitHub Release
    needs:
      - publish-to-pypi
    runs-on: ubuntu-latest

    permissions:
      contents: write  # IMPORTANT: mandatory for making GitHub Releases
      id-token: write  # IMPORTANT: mandatory for sigstore

    steps:
      - name: Restore required artifacts
        uses: actions/download-artifact@v3
        with:
          name: python-package-distributions
          path: dist/

      - name: Sign the dists with Sigstore
        uses: sigstore/[email protected]
        with:
          inputs: >-
            ./dist/*.tar.gz
            ./dist/*.whl

      - name: Create GitHub Release
        env:
          GITHUB_TOKEN: ${{ github.token }}
        run: >-
          gh release create
          '${{ github.ref_name }}'
          --repo '${{ github.repository }}'
          --notes ""

      - name: Upload packages and signatures to GitHub Release
        env:
          GITHUB_TOKEN: ${{ github.token }}
        run: >-
          gh release upload
          '${{ github.ref_name }}' dist/**
          --repo '${{ github.repository }}'
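
The job gating in this workflow can be modelled in a few lines of Python. This is an illustrative sketch only and not part of the commit: the helper name `jobs_for_ref` and the example refs are hypothetical. It assumes that every job that starts also succeeds, and relies on the GitHub Actions default that a job is skipped when a job it `needs` was skipped.

```python
def jobs_for_ref(ref: str) -> list[str]:
    """Return the workflow jobs expected to run for a pushed ref.

    Hypothetical helper mirroring the workflow's conditions; it assumes
    every job that starts also succeeds.
    """
    # `on: push` with no `if:` means the build job runs for every push.
    jobs = ["build"]

    # Mirrors `if: startsWith(github.ref, 'refs/tags/')` on publish-to-pypi.
    if ref.startswith("refs/tags/"):
        jobs.append("publish-to-pypi")
        # github-release `needs` publish-to-pypi, so it only runs when
        # that job ran; a skipped dependency skips its dependents.
        jobs.append("github-release")

    return jobs


print(jobs_for_ref("refs/heads/main"))   # ['build']
print(jobs_for_ref("refs/tags/v0.0.1"))  # ['build', 'publish-to-pypi', 'github-release']
```

In other words, branch pushes are only linted and built, while a tag push additionally publishes to PyPI and creates the GitHub Release.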
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -0,0 +1 @@
include .flake8
4 changes: 2 additions & 2 deletions README.md
@@ -5,10 +5,10 @@ HTML DOM-aware tokenizers for Hugging Face language models.
 ## Setup for development

 ```sh
-git clone --recursive https://github.com/gbenson/dom-tokenizers.git
+git clone https://github.com/gbenson/dom-tokenizers.git
 cd dom-tokenizers
 python3 -m venv .venv
 . .venv/bin/activate
 pip install --upgrade pip
-pip install -e .[dev]
+pip install -e .[dev,train]
 ```
26 changes: 25 additions & 1 deletion pyproject.toml
@@ -1,16 +1,40 @@
[project]
name = "dom-tokenizers"
version = "0.0.1"
authors = [{ name = "Gary Benson" }]
description = "HTML DOM-aware tokenizers for Hugging Face language models"
readme = "README.md"
license = { file = "LICENSE" }
requires-python = ">=3.10" # match..case
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: Apache Software License",
    "Operating System :: OS Independent",
    "Development Status :: 4 - Beta",
    "Topic :: Internet :: WWW/HTTP",
    "Topic :: Software Development :: Libraries",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Topic :: Scientific/Engineering :: Information Analysis",
    "Topic :: Text Processing :: Markup :: HTML",
]
dependencies = [
    "python-magic",
    "tokenizers",
    "transformers",
]

[project.urls]
Homepage = "https://github.com/gbenson/dom-tokenizers"
Repository = "https://github.com/gbenson/dom-tokenizers"
"Bug Tracker" = "https://github.com/gbenson/dom-tokenizers/issues"

[project.optional-dependencies]
dev = [
    "datasets",
    "build",
    "flake8",
]
train = [
    "datasets",
    "pillow",
]
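
As a rough illustration of what `pip install -e .[dev,train]` asks for, the sketch below merges the optional-dependency groups by hand. It is not how pip resolves dependencies (pip also installs the core `dependencies` and handles versions), and the names `OPTIONAL_DEPENDENCIES` and `requirements_for_extras` are hypothetical. The lists are copied from the `[project.optional-dependencies]` table exactly as it is displayed in this diff.

```python
# Entries copied from the table as displayed in the diff; hypothetical
# stand-in for reading pyproject.toml.
OPTIONAL_DEPENDENCIES = {
    "dev": ["datasets", "build", "flake8"],
    "train": ["datasets", "pillow"],
}


def requirements_for_extras(extras: str) -> list[str]:
    """Expand a comma-separated extras string such as "dev,train".

    Returns the requested groups merged in order, with duplicates
    (for example a package listed in two groups) kept only once.
    """
    merged: list[str] = []
    for extra in extras.split(","):
        for requirement in OPTIONAL_DEPENDENCIES[extra.strip()]:
            if requirement not in merged:
                merged.append(requirement)
    return merged


print(requirements_for_extras("dev,train"))
# ['datasets', 'build', 'flake8', 'pillow']
```

The CI workflow installs only `.[dev]`, which is enough for `flake8` and `python3 -m build`; the `train` group is needed only for local development as described in the README.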
