Release 0.54.0 "Dove"
This new release of OSv focuses on improving Linux compatibility and on tooling that makes it possible to run unmodified Linux apps on OSv "as-is".
Overview
From the beginning, OSv was designed to implement a subset of the Linux API, which is itself a superset of POSIX. Until this release, however, most Linux applications had to be recompiled from source as shared libraries, and some, like Java, relied on the OSv version of the /usr/bin/java wrapper to run. This meant that one could NOT run a Linux executable "as is". In other words, OSv has always been Linux-compatible at the source level but not at the binary level.
This release offers a breakthrough: it allows running unmodified Linux position-independent executables (so-called "PIEs") and position-dependent executables "as-is", as long as they do not use "fork/execve" or other unsupported Linux APIs. This means that very often one can take a binary from a Linux host and run it on OSv without having to locate the source code on the Internet and build it as a shared library.
In addition, this release makes OSv more Linux-compatible from another end: booting on a hypervisor. The previous release 0.53 made the OSv kernel "look like" an uncompressed ELF64 Linux kernel. The new release 0.54 enhances the OSv loader to "look like" vmlinuz and thus allows booting on Docker's Hyperkit on OSX. The OSv loader has also been enhanced to boot as a Linux ELF64 PVH/HVM loader on QEMU with the --kernel option.
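The difference between the two kinds of executables described above is recorded in the ELF header. As a minimal sketch (not part of OSv; the names elf_kind and elf_kind_of_file are made up for illustration), the following Python reads the e_type field at offset 16 of the header. ET_EXEC marks a position-dependent executable, while ET_DYN marks either a PIE or a shared library, which this field alone cannot tell apart.

```python
import struct

ET_EXEC = 2  # position-dependent executable
ET_DYN = 3   # shared object; PIEs are also marked ET_DYN


def elf_kind(header: bytes) -> str:
    """Classify an ELF file from the first 18 bytes of its header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return "not-elf"
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field that follows the 16-byte e_ident array
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    if e_type == ET_EXEC:
        return "position-dependent"
    if e_type == ET_DYN:
        return "pie-or-shared-object"
    return "other"


def elf_kind_of_file(path: str) -> str:
    """Hypothetical helper: classify a file on disk."""
    with open(path, "rb") as f:
        return elf_kind(f.read(18))
```

For example, elf_kind_of_file("/usr/bin/ls") on a Linux host would report which of the two kinds that binary is; whether it then runs on OSv still depends on the APIs it uses.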
Highlights
Linux compatibility
- Applications
- Enhanced getopt family of functions to work correctly with both position-independent executables and position-dependent executables in order to allow receiving program arguments
- Enhanced dynamic linker to be capable of executing position-dependent executables
- Mapped kernel higher in virtual memory - from 0x00200000 to 0x40200000 (2nd GiB) in order to make space for position-dependent executables
- Added new GNU libc extensions: error(), __progname and __progname_full
- Added missing pseudo-files to procfs and a minimal implementation of sysfs in order to support libnuma, so that programs like ffmpeg using the x265 codec can run on OSv "as-is"
- Enhanced /proc/self/maps to include i-node number and device ID to support GraalVM apps with isolates
- Enhanced epoll_pwait() implementation
- Improved dynamic linker by making it:
- Ignore old version symbols so that new version symbols are resolved correctly instead
- Delay resolving symbols found missing during relocate_rela() phase for certain relocation types to allow more unmodified Linux executables to run on OSv
- Handle DT_RUNPATH
- Booting
- Added vmlinuz-compatible version of the kernel to allow OSv to boot on Docker's Hyperkit
- Enhanced loader to support PVH/HVM boot to allow OSv to run on QEMU with the --kernel option
- Added support of QEMU 4.x
- Enhanced HPET driver to support 32-bit main counter
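A 32-bit main counter wraps around, so a driver has to account for that when deriving elapsed time. The sketch below shows one common way to do it; it is an illustration of the general technique, not OSv's actual driver code, and the class name WideCounter is hypothetical. It assumes the counter is read at least once per wrap period.

```python
MASK32 = 0xFFFFFFFF


class WideCounter:
    """Extend a wrapping 32-bit counter into a non-wrapping tick total."""

    def __init__(self, first_reading: int):
        self.last = first_reading & MASK32
        self.total = 0  # ticks accumulated since first_reading

    def update(self, reading: int) -> int:
        reading &= MASK32
        # Modular subtraction yields the correct delta even across a wrap,
        # provided fewer than 2**32 ticks elapsed between two updates.
        self.total += (reading - self.last) & MASK32
        self.last = reading
        return self.total
```

A reading of 0x10 taken after a previous reading of 0xFFFFFFF0 is treated as 0x20 elapsed ticks rather than as the counter going backwards.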
Filesystem improvements
- VFS
- Hardened implementation of open()/sys_open()/task_conv() to handle null path
- Enhanced __fxstatat to handle AT_SYMLINK_NOFOLLOW
- RAMFS
- Greatly improved speed of write/append operations
- Fixed bugs
- Delay freeing data until i-node closed
- Keep i-node number the same
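To illustrate what AT_SYMLINK_NOFOLLOW means for an fstatat-style call, here is a small sketch that runs on a Linux host rather than on OSv. It uses Python's os.stat with dir_fd and follow_symlinks=False, which has the same semantics as that flag: the final symlink is examined itself instead of being followed. The helper names is_symlink_at and demo are made up for this example.

```python
import os
import stat
import tempfile


def is_symlink_at(dir_fd: int, name: str) -> bool:
    """Stat `name` relative to dir_fd without following a final symlink."""
    st = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
    return stat.S_ISLNK(st.st_mode)


def demo() -> tuple:
    """Create a file and a symlink to it, then stat both without following."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "target"), "w").close()
        os.symlink("target", os.path.join(d, "link"))
        dir_fd = os.open(d, os.O_RDONLY)
        try:
            return is_symlink_at(dir_fd, "link"), is_symlink_at(dir_fd, "target")
        finally:
            os.close(dir_fd)
```

Without the flag, the stat of "link" would describe the regular file it points to.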
Tools
- Added script manifest_from_host.sh to allow building images from artifacts on Linux host "as-is" without need to compile
- Added script build-capstan-mpm-packages to create capstan MPM packages
- Added Ubuntu- and Fedora-based Docker files to help create build and test environment
- Enhanced test.py to allow executing unit tests on Firecracker
Bugs and other enhancements
- Fixed sem_trywait(), which for example allows Java 12 to run properly on OSv
- Improved memory utilization by using memory below the kernel
- Introduced new command line suffix ! that forces termination of lingering threads
- Revamped building of the cli and httpserver apps to use OpenSSL 1.1 and Lua 5.3 and minimize compilation
- Tweaked OSv code to support compilation by GCC 9
Improved Documentation
- Refreshed main README
- OSv-apps
- Scripts
Apps
- Added a number of *-from-host apps that demonstrate building images out of binaries from Linux host:
- Java
- Python
- Node
- Lua
- Ffmpeg
- Added demo app openjdk12-jre-from-docker that creates an image out of a Docker image
- Added demo app that demonstrates running GraalVM isolates
- Added an example of a basic mono app
- Improved support of Golang PIEs
Closed issues
- #1050 - Can't run anything with 1.01G of memory
- #1049 - tst_huge hangs with memory over 1GB.
- #1048 - VM with memory larger than 4GB doesn't boot
- #1043 - Map kernel higher in virtual memory
- #1039 - Handle new DT_RUNPATH in object::load_needed()
- #1035 - iperf3 fails with exception nested to deeply on ROFS/RamFS image
- #1034 - Build failures when build directory's pathname has a space
- #1031 - The graalvm-example fails with graalvm 1.0.0-rc13
- #1026 - golang-pie-httpserver crashes on control-C
- #1023 - Ignore missing symbols when loading objects with BIND_NOW in relocate_rela()
- #1022 - lua package requires openssl 1.0
- #1012 - Improve physical memory utilization by using memory below 2MB
- #884 - slow write/append to files on ramfs
- #689 - PIE applications using "optarg" do not work on newer gcc
- #561 - OSv failed to run a pthread application.
- #534 - imgedit.py can't always connect to qemu-nbd
- #305 - Fail to run iperf3 on osv
- #190 - Allow running a single unmodified regular (non-PIE) Linux executable
- #34 - Mono support
Commits by author
KANATSU Minoru (1):
Nadav Har'El (9):
- scripts/build: gracefully handle spaces in image= parameter
- build: don't fail build if pathname has space
- trace.py: fix failure on newest Python
- tracepoints: fix for compiling on gcc 9
- sched: fix gcc 9 warning
- libc: avoid weak_alias() warnings from gcc 9.
- acpi: ignore new gcc 9 warning
- imgedit.py: do not open a port to the entire world
- sched.hh: add missing include
Waldemar Kozaczuk (86):
- Added initial version of README under scripts directory
- Lowered default ZFS qcow2 image size from 10GB to to 256MB
- Add script to setup external bridge
- Refactor and enhance firecracker script
- Changed loader to print total boot time by default
- Enhanced setup.py to support Ubuntu 18.10 and 19.04
- Add GNU libc extension function error()
- Add GNU libc extension variables __progname and __progname_full
- Update nbd_client.py to support both old- and new-style handshake
- Simplify building images out of artifacts found on host filesystem
- Move getopt* files to libc folder and convert to C++
- Enhance getopt family of functions to work with PIEs
- Tweaked nbd_client.py to properly handle handshake and transport flags in new handshake protocol
- Tweak open() and sys_open() to return EFAULT error when pathname null
- elf: handle new DT_RUNPATH
- Enhanced __fxstata to handle AT_SYMLINK_NOFOLLOW
- vfs: Harden task_conv() to return EFAULT when cpath argument is null
- Added option suffix "!" to force termination of remaining application threads
- Move _post_main invocation to run_main
- Provide full implementation of epoll_pwait
- Start using memory below kernel
- Change vmlinux_entry64 to switch to protected mode and jump to start32
- Fixed indentation in xen.cc
- Move kernel to 0x40200000 address (1 GiB higher) in virtual memory
- Allow running non-PIE executables that do not collide with kernel
- Make RAMFS not to free file data when file is still opened
- Fix slow write/append to files on ramfs
- hpet: Support 32-bit counter
- mprotect: page-align len parameter instead of returning error
- procfs: populate maps file with i-node numbers
- procfs: Add device ID information to the maps file
- syscall: add getpid
- signal: tag user handler thread as an application one
- Make OSv boot as vmlinuz
- Prepare for local-exec TLS patch
- Clean boot logic from redundant passing OSV_KERNEL_BASE
- Support PVH/HVM direct kernel boot
- Enhance scripts/test.py to allow running unit tests on firecracker
- Refine confstr() to conform more closely to Linux spec
- Enhanced manifest_from_host.sh to support building apps from host and docker images
- Fixed compilation errors in modules mostly related to strlcpy
- Enhance scripts/build to allow passing arguments to modules/apps
- Remove obsolete loader.bin build artifact and related source files
- Add --help|-h option to build script to explain usage
- Remove reference to external from httpserver-api makefile
- Removed remains of externals reference from httpserver-api Makefile
- scripts: hardened manifest_from_host.hs to verify lddtree is installed on the system
- scripts: remove old Ubuntu and Fedora from setup.py; added support of Fedora 29
- scripts: fixed typo in setup.py
- Reverted some changes related to upgrading openssl that got checked in prematurely
- scripts: Enhanced manifest_from_host.sh to better support regular expressions and filter x86_64 ELF files
- Fixed compilation errors in modules httpserver-jolokia-plugin, josvsym and monitoring-agent mostly related to strlcpy
- Fixed httpserver file system integration test
- ramfs: fix arithmetic bug leading to write overflows
- scheduler: Initialize _cpu field in detached_state struct
- semaphores: fix sem_trywait
- pthreads: implement pthread_attr_getdetachstate
- pthreads: make implementation of pthread_attr_getdetachstate more correct
- java: add basic java test that does NOT use OSv wrapper
- Made maven more quiet and only show errors
- java: tweaked openjdk7 to add /usr/bin/java symlink
- setup: add pax-utils package for Ubuntu and Fedora
- Add docker files to help setup OSv build environment for Ubuntu/Fedora
- Upgrade cli, lua and httpserver-api modules to use OpenSSL 1.1 and Lua 5.3
- dynamic linker: adjust message when symbol missing
- docs: Updated main README to make it better reflect current state of OSv
- hpet: handle wrap-around with 32-bit counter
- Fix bug in arch_setup_free_memory
- memory: enforce physical free memory ranges do not start at 0
- pthreads: provide minimal implementation to handle SCHED_OTHER policy
- memory setup: ignore 0-sized e820 region
- libc: added __exp2_finite wrapper needed by newer libx265
- elf: skip old version symbols during lookup
- syscall: add set_mempolicy and sched_setaffinity needed by libnuma
- Ignore missing symbols when processing certain relocation types on load
- procfs: add minimum subset of status file intended for linuma consumption
- fs: extracted common pseudofs logic
- fs: add subset of sysfs implementation needed by numa library
- ramfs: make sure to pass absolute paths for mkbootfs.py
- ramfs: make sure i-node number stay the same after node allocation
- scripts: fix export_manifest.py to handle all symlinks properly
- scripts: enhance manifest_from_host.sh to put cmdline example for executables
- httpserver: make test images use openjdk8 as new jetty app requires it
- httpserver: enhance test script to accept different location of test image
- loader: print boot command line and expanded runscript line if applicable
- capstan: add script to automate building capstan MPM packages
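Two of the procfs commits above add i-node numbers and device IDs to the maps file. As a rough sketch of how a consumer reads those fields, the following Python parses one line in the layout Linux documents for /proc/[pid]/maps (address range, permissions, offset, device as major:minor, i-node, optional pathname). It assumes OSv's output follows that layout, which is not verified here; MapEntry and parse_maps_line are illustrative names.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev_major: int
    dev_minor: int
    inode: int
    path: str


def parse_maps_line(line: str) -> MapEntry:
    """Parse one line of a Linux-style maps file."""
    # The pathname may be absent or contain spaces, so split at most 5 times.
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].rstrip("\n") if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    major, minor = (int(x, 16) for x in dev.split(":"))
    return MapEntry(start, end, perms, int(offset, 16), major, minor, int(inode), path)
```

Anonymous mappings have no pathname and report i-node 0, which the parser returns as an empty path.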
Acknowledgments
We want to thank all contributors to the project. Special thanks go to:
- Nadav Har’El for contributing and reviewing many patches and providing guidance for many others
- Waldemar Kozaczuk for contributing most patches