
Scripts for producing PyPI-compatible manylinux wheel files #1028

Open

thammegowda wants to merge 3 commits into master

Conversation

thammegowda (Collaborator)

Description

While the cmake build produces *.whl files, they are not distributable via PyPI.
PyPI enforces certain rules to improve compatibility across different Linux distributions.
This PR adds scripts for producing pymarian wheel files that can be distributed on PyPI.

List of changes:

  • Add src/python/build.sh and src/python/build-manylinux.sh scripts.
    The former invokes docker run, whereas the latter runs inside the Docker environment to create wheels for Python versions 3.8 to 3.12 (see the sketch after this list).
  • Fix a Python compatibility issue (previously the wrong headers were included for some Python versions). Solution: set(PYBIND11_NOPYTHON On) before adding pybind11.
  • Loosen the huggingface-hub version constraint, since the strict pin conflicts with other libraries (such as transformers) and is unavailable for some Python versions.
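
For readers unfamiliar with the manylinux workflow, here is a minimal sketch of what such a two-step build typically looks like. This is not the contents of build.sh or build-manylinux.sh from this PR; the image tag, paths, and package location are assumptions for illustration.

    #!/usr/bin/env bash
    # Sketch only -- not the scripts added in this PR; image tag, paths, and
    # package location are illustrative assumptions.
    set -euo pipefail

    # Outer step (build.sh-style): run the inner script inside a manylinux container.
    #   docker run --rm -v "$PWD":/src -w /src quay.io/pypa/manylinux_2_28_x86_64 \
    #       bash src/python/build-manylinux.sh

    # Inner step (build-manylinux.sh-style): build one wheel per CPython 3.8-3.12,
    # then repair each with auditwheel so it carries a PyPI-acceptable manylinux tag.
    OUT=build-python/manylinux
    mkdir -p "$OUT" /tmp/wheels

    for PYBIN in /opt/python/cp3{8,9,10,11,12}*/bin; do  # interpreters bundled in manylinux images
        "$PYBIN/pip" wheel src/python -w /tmp/wheels --no-deps
    done

    for whl in /tmp/wheels/*.whl; do
        auditwheel repair "$whl" -w "$OUT"  # vendors shared libs and rewrites the platform tag
    done

The auditwheel repair step is what turns a plain linux_x86_64 wheel into a manylinux-tagged one that PyPI accepts.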

Added dependencies: Docker is required.

How to test

Run src/python/build.sh to produce wheel files at build-python/manylinux/*.whl
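
One way to sanity-check the output (an assumed verification step, not one prescribed by this PR) is to confirm that the produced wheels really carry a manylinux platform tag:

    # Assumed check: auditwheel reports the platform tag of each produced wheel.
    for whl in build-python/manylinux/*.whl; do
        auditwheel show "$whl"   # should report a manylinux_* tag, not linux_x86_64
    done

Installing the wheel that matches the local interpreter and importing pymarian is then a quick smoke test.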

Checklist

  • I have tested the code manually
  • I have run regression tests
  • I have read and followed CONTRIBUTING.md
  • I have updated CHANGELOG.md

@thammegowda (Collaborator, Author) commented Aug 9, 2024

It turns out PyPI has a default file size limit of 100 MB.
Our statically linked, CUDA-enabled native extensions are ~600 MB, and CMAKE_BUILD_TYPE=Slim did not reduce the extension size when CUDA is enabled.
So I was unable to upload our packages today.

There is a process for requesting an increase to the file size limit. For instance, pytorch packages are ~800 MB and have been uploaded to PyPI successfully. I have followed PyPI's suggested process for requesting a limit increase.
I am not sure how long they will take to review and approve our request. Fingers crossed.

Reference to track progress: pypi/support#4520

@thammegowda (Collaborator, Author)

Update: the limit was increased this morning, and I have uploaded packages to PyPI that are compatible with both CUDA and Intel MKL backends.
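
Assuming the published package keeps the pymarian name used throughout this PR, installation for end users should be the usual pip flow (an illustrative example, not text from the PR):

    # Assumes the package is published on PyPI under the name "pymarian".
    python3 -m pip install pymarian
    python3 -c "import pymarian"   # quick check that the native extension loads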

# this won't work if pybind11 is a git submodule
#find_package(pybind11 REQUIRED)

# NOTE: this property must be set before including pybind11
Member

This comment seems confusing given the commented line below. Please check if it's okay and explain in the comment or fix.

@@ -47,7 +47,7 @@ demos = [
 "flask",
 "sacremoses",
 "pyqt5",
-"sentence-splitter@git+https://github.com/mediacloud/sentence-splitter",
+# "sentence-splitter@git+https://github.com/mediacloud/sentence-splitter",
Member

Please explain in the comment why it's commented (because not used yet but will be in pymarian-webapp?) or remove.

@thammegowda (Collaborator, Author)

Moved this PR to the internal fork. Leaving it open here in case somebody wants to build manylinux wheels; we should close this PR once the code is synced.
