title | parent |
---|---|
2023-11-08 |
Meetings |
Meeting: 08 November 2023
https://zoom-lfx.platform.linuxfoundation.org/meeting/97730566101
- Jean-Francois Panisset, VES Technology Committee
- Kerby Geffrard, OpenRV
- Andrew Grimberg, LF Release Engineering
- Christina Tempelaar-Lietz, ILM
-
aswf-docker updates
- Very close to local Conan cache solution, will be in 2023.2
- WIP branch
- Docker BuildKit (buildx) builds for Conan, still serialized but will eventually be able to parallelize Conan builds (still using a non-buildx container run for login to Conan repo)
- Big win for local builds: local Conan builds now live in BuildKit caches, no longer pulled into Docker build context (got huge with Clang / Qt / PySide Conan builds)
- OpenEXR and OpenVDB getting pybind11 in their ci- build containers
- May not get to fixing issue with GLfw cmake setup for OpenVDB in 2023.1 (transition to Conan) (maybe a short script that can be used in their CI?)
- Immediate plans for 2024 (as soon as 2023.2 is released)
- Qt 6
- Clang 15/16: OK to drop 14? (question for Larry)
- Does OpenEXR have a Clang build? Christina: yes we do, but will only support prior year in the CI. Don't think our CI builds with Clang 14. JF: the VFX Platform is silent about Clang versions. ci-openexr build container doesn't include Clang. Christina: may just be building on the GHA runner? We have Clang 14 and 15 builds. JF: trying to avoid combinatorial explosion, so drop 14 to add 16. Christina: should be fine. JF: Clang is actually in the ci-common container. TODO: JF to review what OpenEXR / Imath is doing with its CI.
- Kerby: we use Clang through Xcode for macOS, otherwise Visual Studio for Windows and GCC for Linux. JF: comment in Slack from Neil Gompa about standard ffmpeg build he maintains for Fedora and EPEL, could be useful. Interesting perhaps for codec IP issues? Kerby: discussion with Apple about ProRes. In the past we tried to be as similar as possible on each OS, but getting more complicated.
- Python 3.11
- Mandated by platform
- CUDA 12.x
- Depending on GPU runners
- OptiX 8
- Non-platform package updates
- More packages Conan-only (Clang, Ninja, Pybind11 Qt, PySide)
- Longer term plans for 2024
- OIIO build container
- ORI build containers
- Container size reduction effort
- Get rid of
nsight-compute
from nvidia/cuda base container if possible - An issue for builds that also need Intel compilers
- aswf-docker GHA removes a number of packages from GHA runners to free up disk space: aswf-docker docker-builds.yml GHA
- Get rid of
- Windows / macOS...
- Your wishes go here
- Linux Containers based on other OSes than Rocky?
- CentOS Stream
- Alma Linux / UBI / ...
- Ubuntu
- Building our own base image instead of using
nvidia/cuda
could help
- Christina: request for mingw on Windows, so not until we have a Windows container. Windows is higher priority than macOS. We get more reports of issues on Windows builds that are not captured by our CI than we do for MacOS. We have had reports of issues building with mingw, looking at adding a CI job. JF: mingw can be an alternative to Visual Studio (even though it has a license for open source use)
- Kerby: nothing right now
- Linux Containers based on other OSes than Rocky?
- Very close to local Conan cache solution, will be in 2023.2
-
Managing for pay resources
- GPU / larger runners / ARM Linux runners / Apple Silicon runners
- More visibility to TAC
- Integration with LFX dashboard. Andrew: GitHub has no API for that, they say something is coming soonish. Want an API also to see how many free resources we're consuming, zero visibility into how many free resources are being used.
- Resource allocation to projects
- Dealing with bursts (OpenVDB October release)
- Mechanism for projects to request additional resources
- Carrying balance forward
- $1450 cap/month (with $1500 hard cap). JF is it "use it or lose it"? Andrew: not really, LF IT keeps track of exact spend per month, LF AP passes all invoices to us for validation, we keep running tally, since we know what the cap is supposed to be, we know how much free space we have left in the budget. So last month was able to raise limit on credit card for October since we had enough budget. So there's a manual carry forward process. JF: would be great if it could somehow be automated.
- Are we still one of the few LF sub foundations that use non free resources?
- Andrew: CNCF are a big user, but they self support / built a shadow IT organization, but CNCF is 100x the size of ASWF
- Andrew: asked John to double CI budget for next year since we know the GPU runners are now in open beta, ARM runners are expected some time in next 2 quarters (we will be taking part in the beta). I expect we'll see these things come in, ASWF will want to use them, need to have the capacity for that. Also additional projects will be joining ASWF. Right now we're running so close to the budget that we need to double it, but it needs to be approved. JF: several TSC chairs should be happy to provide supporting quotes for an increase.
- JF: better coordination via #wg-ci TODO: post to various project channels about this
-
OpenRV Updates
- Dockerfile(s) ready, but not à la asfw-docker... What's next?
- At Autodesk we build on CentOS 7 and Rocky 9, tried to use in aswf-docker but realized the Dockerfiles are a bit "special".
- Don't need GPU driver or anything like that.
- JF: can walk you through how to add a ci-openrv build container
- JF: you can start with the closest container and just run a script in your CI to add missing dependencies
- Pre-caching dependencies using artifactory
- OpenRV Dependencies
- Automate the build, if you click on link above, all the dependencies in OpenRV
- More than half our build time is taken up by dependency builds, and that's the most difficulty. Becomes really painful, and need a bunch of stuff to your own environment. Starting to talk about caching some dependencies that don't change.
- Anybody else have a project in mind? Do you know if OIIO or OCIO have talked about providing pre-compiled binaries? Want't to know what's the right way to do this, don't want to put yet another Python / Qt in Artifactory for now good reason. But need to know the info about similar projects. xStudio might benefit from this, first step of their repo is to gather all their dependencies, so a script that can go and get them would help.
- Andrew: we have to be careful about what we put in there, must be stuff we produced ourselves? Kerby: could we use GitHub releases? Andrew: you may be using open source projects, but don't necessarily have the license to redistribute. We have to be careful about.
- JF: aswf-docker uses both Conan/Artifactory and Docker Hub
- Andrew: our agreement with Artifatory is that anything we put in there is produced by our projects, so we aren't authorized to host other projects.
- Kerby: we want to pre-compile Qt, OpenSSL... Andrew: if the other project is vendored into our own...
- Andrew: the issue is with Artifactory specifically. Kerby: so GitHub releases would be fine? Andrew: that should be OK, we just can't do that with Artifactory because of our JFrog agreement. Kerby: so we could put resulting OpenRV build in there? What if it bundles a bunch of other stuff? Andrew: I am not a lawyer! Kerby: we find way too much time fighting dependencies. This is coming, we might ask more questions in the future.
- Cost of CI, what is required to be known?
- If we want to have a CI for OpenRV, what do we need to know to put something live? We don't need GPU, we don't need ARM / Apple Silicon.
- Andrew: if you can fit in 2 core / 2 GB... Otherwise you need the larger runners. We run about $600/month below budget.
- Kerby: at Autodesk we build with 8 cores? JF: look at definitions for for-pay Linux builders in #wg-ci. Kerby: what would be a correct target of spend per month? I know nothing about the cost of open source projects. At Autodesk we have a datacenter where we deploy VMs, now we need to think about the cost of what we're doing, don't know what's "reasonable". Andrew: that's hard to say, we get alerts from GitHub when we near our budget cap. If we run out of capacity near middle of the month we'll pull reports and check, which we did last month for OpenVDB. If we are starting to run close to the budget. "Reasonable" is a matter of opinion across projects.
- Andrew: we have unlimited free minutes across public repos, but are limited in concurrency. But we are an "enterprise" GitHub plan, so we have largest capacity for concurrency, it's larger than what we should actually have.
- Kerby: we don't use ccache any more to make the build simpler.
- Dockerfile(s) ready, but not à la asfw-docker... What's next?
-
Current available for-pay runners (Andrew)
- ubuntu-20.04-4c-16g-150h
- ubuntu-20.04-8c-32g-300h
- ubuntu-20.04-16c-64g-600h
- windows-2022-8c-32g-600h
- windows-2022-16c-64g-600h
- ubuntu-20.04-gpu-6c-112g-336h-16vr
- ubuntu-20.04-gpu-12c-224g-672h-32vr
- ubuntu-20.04-gpu-24c-448g-1344h-64vr
-
Maintaining custom ASWF GH actions
- Does it make sense / would it be useful for projects?
- Restoring access to GHA GPU runners
- GHA GPU runners are back alive after updating the images: sample GPU enabled GHA run of nvidia-smi
- Official announcement of private beta
- Worked with
ubuntu-20.04-gpu-12c-224g-672h-32vr
runner, likely very expensive (was previously usingubuntu-20.04-gpu-6c-112g-336h-16vr
) - Need a writeup of how projects can use this and what they need to set up in their CI (volunteers?)
-
$0.07/min for Linux, $ .014/min for Windows for Azure nct-v3-series machines- 4 to 64 AMD EPYC Rome CPUs
- 28-440GB
- 180-2880 GB storage
- NVIDIA T4 GPUs with 16-64GB of VRAM
- Pricing above for smallest instance presumably
- Linux image based on Ubuntu
- Windows image based on Server 2019
- See discussion above about for pay resources
-
Apple Silicon GHA runners in early access
- Expensive!
- Linux ARM GHA runners private beta
- CI WG Project Requirements Survey
- No progress yet
- Breakdown of outstanding CII badge issues for OpenEXR and MaterialX
- PyPI organization level accounts
- Raven: CI/CD Security Analyzer
- OSV vulnerability scanner C/C++ support
- Available as a OSV-Scanner GitHub Action
- Complement SonarCloud scanning / Snyk?
- Pinning dependencies by hash instead of version number
- Avoid issues with version labels changing (malicious or not)
- Fedora Datagrepper Feed to monitor updates
- FFmpeg from Fedora / EPEL as standard version?
- C++ 20 modules with Clang and CMake
- C++ Modules and Conan
- OIIO PR for Apple Silicon builds