Improve the communication on the positive impact of HPy on the Python ecosystem #419

Open
paugier opened this issue Mar 14, 2023 · 5 comments

Comments

@paugier
Contributor

paugier commented Mar 14, 2023

I worked on a presentation about HPy for the French "Calcul group".

The pdf of the presentation is here.

I wrote some notes about HPy, mostly for myself, to prepare the presentation. When HPy 0.9 is released, and if I feel that it can be useful, I could share another version of this text (a bit more positive), for example on the NumPy mailing list.

This version contains some parts that could potentially be taken into account to improve the project, so I'm going to open a few issues here to share my (user) point of view.

I start with the section on communication, which I simply copy here. Reading the whole text may make my point easier to understand.

Communication about HPy

Currently, the external communication on HPy is mostly done through the project README, the project website/blog and the project documentation.

Most of these documents target people working with the CPython C API, and this technical documentation is IMO remarkably clear and educational. Unfortunately, HPy communication is less effective at motivating project maintainers and end-users to support the project. There is a very interesting page, HPy overview, which nicely presents the motivation and goals. However, some high-level points of view are IMHO missing:

  • the general picture (deep technical issues of our ecosystem, which block the use and the development of alternative Python implementations)
  • the targeted new state for the Python ecosystem (and the direct consequences for users)
  • the proposed big transition towards HPy of nearly the whole Python ecosystem.

There are also very few mentions of HPy at Python conferences and on social media. Victor Stinner mentions the project in some of his talks, for example Python Performance: Past, Present and Future at EuroPython 2019 and Introducing incompatible changes in Python at PyCon US 2023. However, there won't be any talks centered on HPy at PyCon 2023 (except in the Language Summit).

It is interesting to compare the respective communications of HPy and of the Faster CPython project. The Faster CPython project presents very high ambitions (a x5 speedup for CPython in a few years) and explains its detailed plan through many channels, for example a talk by Mark Shannon at PyCon 2023. In contrast, HPy is very shy and conservative. The documentation repeats that "HPy is still in the early stages of development" and that "there is still a long road before HPy is usable for the general public". This is partially true, but it is not very positive or motivating. The concrete consequences for end-users are somewhat hidden behind quite technical data.

However, with all the respect that I have for the people working on the Faster CPython project, a successful HPy project (i.e. most popular packages with extensions using HPy) would lead to a deeper and better improvement of the Python ecosystem. In particular, it would bring a great speedup for many users, who would be able to use Python implementations that are really "5 times faster" than Python 3.10 for many real-world applications. Having specialized interpreters for different tasks would be much better than one interpreter that still has to support its problematic legacy C API for years. Without improvements to the CPython C API, and with the constraint of not degrading the performance of extensions using the legacy C API, we are starting to see that the "x5 faster" target of the Faster CPython project is very ambitious. Thus, a successful HPy would actually help the Faster CPython project a lot.

@hodgestar
Contributor

@paugier Thank you for giving the talk and for writing all of this up. I agree that we could be a lot more upbeat. Much of the documentation was written a couple of years ago when HPy was just starting out. It definitely needs an update to match the current much more usable and mature status of the project.

Would incorporating the manifesto and the What needs to change and why document help explain the purpose to HPy users / the broader community, or is it still too technical?

@paugier
Contributor Author

paugier commented Mar 14, 2023

These texts (manifesto and What needs to change and why) are nice and we can see if the current documentation can be improved with them.

But I also think that when a standard Python user starts to read the README, the doc and the website, they should think after the first 5 lines: "It seems technical but I understand what it is about. This project is useful for me and my colleagues. I hope it is going to be successful and I understand how we can help."

So it seems to me that a paragraph or a note that is deliberately simple (even too simple) and catchy would be useful. Something like:

  • The Python ecosystem is somewhat broken because of technical issues in the CPython C API.
  • This is the main reason why Python is slow.
  • This project is a clean solution to fix this situation by providing a new C API for the Python language.
  • The C extensions of most popular packages (Numpy, Matplotlib, Pandas, Scipy, Tensorflow, etc.) will have to be rewritten using HPy.
  • It will then be very easy for end-users to use much more efficient Python interpreters, to get much faster numerical computation and to be much less limited in what can be done efficiently in Python.

@mattip
Contributor

mattip commented Mar 15, 2023

One way to think about documentation is via the Diátaxis framework, which identifies four modes of documentation: tutorials, how-to guides, technical reference and explanation.

NumPy and others have adopted this categorization and structured the top-level documentation categories loosely around it.

Help is welcome to think about better structuring the documentation, using Diátaxis or any other framework.

@paugier
Contributor Author

paugier commented Mar 16, 2023

I gave my seminar on HPy yesterday. I am quite happy with how it went and I got very positive feedback. It means that it is indeed possible to communicate about HPy to a general audience of Python developers who know nothing about the CPython C API or about alternative Python implementations.

A small detail: for this presentation, I finally avoided C examples completely. If I had to do it again, I would add a few very simple C examples, because during the questions I had to explain things that would have been easier to explain with such examples (in particular pointers to PyObjects and explicit reference counting versus handles).
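To make the "pointers and explicit reference counting versus handles" contrast concrete, here is a minimal sketch in plain C. It is a toy model only: it uses neither the CPython C API nor HPy, and every name in it (`ToyObject`, `ToyContext`, `toy_handle_open`, ...) is hypothetical. The first half mimics the pointer style, where extension code sees the object's address and adjusts its reference count by hand. The second half mimics the handle style, where extension code only holds small opaque handles, duplicates them when it needs another one, and closes each handle exactly once (in the spirit of HPy's `HPy_Dup` and `HPy_Close`). This toy runtime happens to use reference counts behind the handles, but the point is that the extension code never sees them, so a runtime would be free to use a garbage collector instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Style 1: raw pointer to an object whose reference count is visible. */
typedef struct {
    long refcnt;
    long value;
} ToyObject;

int toy_live_objects = 0;        /* objects allocated and not yet freed */

ToyObject *toy_new(long value) {
    ToyObject *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcnt = 1;               /* the caller owns this first reference */
    o->value = value;
    toy_live_objects++;
    return o;
}

void toy_incref(ToyObject *o) { o->refcnt++; }

void toy_decref(ToyObject *o) {
    if (--o->refcnt == 0) {      /* last reference gone: free the object */
        toy_live_objects--;
        free(o);
    }
}

/* Style 2: opaque handles. Extension code sees only a slot number. */
#define TOY_MAX_HANDLES 16

typedef struct { int slot; } ToyHandle;      /* slot 0 means "no handle" */

typedef struct {
    ToyObject *slots[TOY_MAX_HANDLES];       /* private to the runtime */
} ToyContext;

ToyHandle toy_handle_open(ToyContext *ctx, ToyObject *o) {
    ToyHandle h = { 0 };
    if (o == NULL) return h;
    for (int i = 1; i < TOY_MAX_HANDLES; i++) {
        if (ctx->slots[i] == NULL) {
            ctx->slots[i] = o;
            toy_incref(o);       /* runtime detail, hidden from the caller */
            h.slot = i;
            return h;
        }
    }
    return h;                    /* table full: return the null handle */
}

/* Duplicating a handle yields a new, independent handle to the same object. */
ToyHandle toy_handle_dup(ToyContext *ctx, ToyHandle h) {
    return toy_handle_open(ctx, ctx->slots[h.slot]);
}

long toy_handle_value(ToyContext *ctx, ToyHandle h) {
    assert(ctx->slots[h.slot] != NULL);
    return ctx->slots[h.slot]->value;
}

void toy_handle_close(ToyContext *ctx, ToyHandle h) {
    ToyObject *o = ctx->slots[h.slot];
    assert(o != NULL);           /* trips if the same handle is closed twice */
    ctx->slots[h.slot] = NULL;
    toy_decref(o);
}
```

In the pointer style, forgetting a `toy_decref` leaks the object silently and an extra one frees it too early. In the handle style, each handle is a separate slot, so a missing or doubled `toy_handle_close` is tied to one specific handle and can be detected by the runtime, as the assertion in `toy_handle_close` hints.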

Regarding improving the website and documentation for ordinary Python users (who, again, know nothing about the CPython C API and alternative Python implementations), I think it could be useful to have a specific page for them on the website or in the documentation, with links to it at the beginning of the README, the website and the documentation.

Currently, the first texts available on these pages are way too technical for ordinary Python users and even for many maintainers of Python packages. In the Summary section of the README, we feel a real effort to explain things, but even this part is too technical ("GC instead of refcounting", "GIL", "binary stability", "API/ABI", consequences expressed in terms of C developers/extensions and not in terms of end-users).

To better explain what I mean, I'm going to try to write something explaining HPy to my colleagues/friends who use Python. My English is not good and I don't have time to really work on this text, but I guess the ideas will be understandable for you HPy developers.

Introduction for Python users who don't know what the CPython C API is

The Python ecosystem is based on the CPython C API (i.e. on C functions to interact with the Python interpreter). Some of your favorite packages (in particular Numpy, Matplotlib, Scipy, Pandas, ...) contain extensions produced from C code using this API. Unfortunately, the CPython C API has deep technical issues, which block improvements of the ecosystem. On the one hand, it is very difficult to improve the performance of CPython (the reference Python implementation). On the other hand, supporting the CPython C API is a nightmare for alternative Python implementations (like PyPy, GraalPython, RustPython, MicroPython, etc.) and completely blocks their usage, because it leads to very bad performance for code using the CPython C API.

Blabla/data on how much better/faster the alternative Python implementations are and what it would change for the user to be able to use them.

HPy is a better API for extending Python in C. When most popular packages using the CPython C API are ported to this new API, their wheels (what you usually install with pip install) will be compatible and efficient with CPython and with all alternative implementations supporting HPy (currently PyPy and GraalPy). It will become very easy for Python users to choose the most efficient Python implementations, which will be able to specialize themselves for particular use cases. By relaxing the constraint of maintaining good performance for the legacy CPython API, it will become much easier to change some CPython internals to improve its overall performance.

We are now convinced that there is a clean technical solution to reach this better state; however, there is still a long road ahead.

Blabla on what remains to be done, in particular an evaluation of the amount of work needed to port the most popular packages. + To motivate people, it's good to give a rough estimate of when Python users could start to feel the changes related to HPy. (For example the Faster CPython project mentions x5 faster in 2025!)

To foster this project and the associated big transition in the ecosystem, you can ...

End of the introduction

What I am trying to say is that it is possible to explain HPy from the point of view of Python users and with very few technical terms.

There is also the idea that the project behind HPy is to port most popular packages using the legacy API to HPy. I feel that this is not really explicit in the README/website/doc.

Note that such a text could be filled with hyperlinks to more in-depth content with deeper explanations.

@mattip
Contributor

mattip commented Mar 16, 2023

Could you create a blog post at https://github.com/hpyproject/hpyproject.org with your presentations or with a link to your presentation?

That repo holds the code used to create hpyproject.org using nikola. I see it is missing a README, which might be nice to add...

I think a non-technical 10,000 meter view would be nice. PRs welcome.
