FlexSC (Flexible System Call) is a mechanism for processing system calls, introduced at OSDI'10 by Livio Soares.
The main idea of FlexSC is to process syscalls in batches, which gives better cache locality and involves almost no CPU mode switches. For more details, the link above provides a link to the paper. You can also refer to my porting notes on HackMD.
Syscalls are processed through the following steps:
- When a user thread requests a syscall, it simply grabs a free syscall entry, populates the entry with the syscall's arguments, and then submits it (by changing the state of the entry).
- Once there are no free entries, the kernel-visible thread starts submitting the entries to the kthreads (by marking each syscall entry with a different state).
- A kthread detects that it has work to do (by scanning the syscall entries) and starts queuing work to the CMWQ workqueue.
- After the work (the syscall) is done, the kthread changes the state of the syscall entry.
- The FlexSC library (user space) detects that the syscall is done and simply returns the syscall's retval to the application thread.
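The steps above can be sketched as a small state machine over the shared entries. This is a single-threaded illustration only, not the code in this repo: the state names (`ENTRY_FREE`, `ENTRY_SUBMITTED`, `ENTRY_MARKED`, `ENTRY_DONE`), the struct layout, `NUM_ENTRIES`, and all function names are assumptions made for the example, and a toy handler stands in for the kthread plus CMWQ work. A real implementation would also need atomic accesses or memory barriers, since the entries are shared between user and kernel threads.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ENTRIES 4 /* illustrative size, not the real syscall page size */

/* Hypothetical entry states for this sketch. */
enum entry_state { ENTRY_FREE, ENTRY_SUBMITTED, ENTRY_MARKED, ENTRY_DONE };

struct sc_entry {
    enum entry_state state;
    long nr;      /* syscall number */
    long args[6]; /* syscall arguments */
    long retval;  /* filled in when the work is done */
};

/* Zero-initialized, so every entry starts as ENTRY_FREE. */
static struct sc_entry page[NUM_ENTRIES];

/* User thread: grab a free entry, populate it, then submit it. */
static struct sc_entry *submit(long nr, long arg0)
{
    for (size_t i = 0; i < NUM_ENTRIES; i++) {
        if (page[i].state == ENTRY_FREE) {
            page[i].nr = nr;
            page[i].args[0] = arg0;
            page[i].state = ENTRY_SUBMITTED; /* change state last */
            return &page[i];
        }
    }
    return NULL; /* no free entry left */
}

/* Kernel-visible thread: hand every submitted entry over to the kthreads. */
static int mark_all(void)
{
    int n = 0;
    for (size_t i = 0; i < NUM_ENTRIES; i++) {
        if (page[i].state == ENTRY_SUBMITTED) {
            page[i].state = ENTRY_MARKED;
            n++;
        }
    }
    return n;
}

/* Stand-in for kthread + workqueue: scan, run the work, mark it done. */
static int process_marked(long (*handler)(long nr, long arg0))
{
    int n = 0;
    for (size_t i = 0; i < NUM_ENTRIES; i++) {
        if (page[i].state == ENTRY_MARKED) {
            page[i].retval = handler(page[i].nr, page[i].args[0]);
            page[i].state = ENTRY_DONE;
            n++;
        }
    }
    return n;
}

/* Library: return the retval to the caller and recycle the entry. */
static long collect(struct sc_entry *e)
{
    assert(e->state == ENTRY_DONE);
    long ret = e->retval;
    e->state = ENTRY_FREE;
    return ret;
}

/* Toy handler standing in for a real syscall: doubles its argument. */
static long toy_handler(long nr, long arg0)
{
    (void)nr;
    return arg0 * 2;
}
```

Calling `submit()` until it returns `NULL`, then `mark_all()`, `process_marked(toy_handler)`, and `collect()` on each entry walks one batch through the whole cycle.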
Illustration of FlexSC:
+---------------------------+
| |
| user thread requesting | .....
| syscalls |
| |
+---------------------------+
+---------------------------+
| |
| kernel-visible thread |
| | +-----------+
+---------------------------+ | |
USER SPACE | shared |
--------------------------------------------------------| syscall | .....
KERNEL SPACE | entry |
+---------------------------+ | |
| | +-----------+
| kthreads dispatching |
| work to CMWQ workqueue |
| |
+---------------------------+
The repo was originally downloaded from splasky/flexsc (c69213). At that commit it was missing much of the FlexSC implementation; what I've implemented is the following:
- per-kthread syscall entries
- kernel-visible thread (pthread)
- performance measurement program (write() and getpid() syscall)
- the kthread function
- mechanism to get free syscall entry
- allocation of CMWQ (Concurrency Managed Workqueue) and its work
Currently, flexsc.c at libflexsc/ and linux-5.0.10/flexsc/ have some hard-coded sections that are used to test the write() syscall; a clearer version can be found here (library) and here (kernel code).
The following analysis was done with 7 kthreads (kernel CPUs, each kthread handling its own CMWQ work) and 1 kernel-visible thread (user CPU) on an 8th-gen Intel CPU (i5-8350U) with Hyper-Threading enabled (4C8T).
- write() syscall:
  - Time elapsed for finding a marked syscall entry (measured from the time the entry is marked as FLEXSC_STATUS_MARKED):
(You may notice that the thread no. of FlexSC is offset by ~2xx compared to the normal one; they actually have the same meaning. This is because I use gettid() for the thread no., whereas the normal one uses the order of thread creation.)

Summing up the analysis above, in the end we might only be able to optimize FlexSC to a result similar to the typical syscall mechanism, because processing a syscall (write) in CMWQ alone costs ~500ns, and adding the other costs on top might lead to the outcome I just mentioned.
- getpid() syscall:
It's been 10 years since FlexSC was released, and computer organization may have changed a lot (e.g. a CPU mode switch on a modern processor takes less than 50ns for a round trip). Therefore, even though FlexSC doesn't perform better than typical syscalls here, this is still a record showing that the improvements in cache locality and mode switching still can't beat the time cost of a typical syscall. Or there may be some overhead in my implementation of FlexSC; feel free to open an issue if you find anything. Thank you!
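To get a rough feel for the typical-syscall baseline on your own machine, a minimal timing sketch like the one below can help. It is not the measurement program in this repo; the function names `now_ns` and `avg_getpid_ns` are made up for the example. It assumes Linux and uses `syscall(SYS_getpid)` so the call goes through the kernel rather than any libc wrapper. Note that the number includes loop overhead and the full syscall path, not only the mode switch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Current time on the monotonic clock, in nanoseconds. */
static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Average wall-clock cost of one raw getpid syscall over `iters` calls. */
static double avg_getpid_ns(long iters)
{
    assert(iters > 0);
    long pid = 0;
    double start = now_ns();
    for (long i = 0; i < iters; i++)
        pid = syscall(SYS_getpid); /* raw syscall, one mode switch round trip */
    double end = now_ns();
    assert(pid == (long)getpid()); /* sanity check on the last result */
    return (end - start) / (double)iters;
}
```

Running `avg_getpid_ns()` with a large iteration count (for example one million) and pinning the process to one CPU should give a more stable figure.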
For the library, you can run $ make directly after modifying the code. I have pre-defined macros in the Makefile, so a single $ make command produces two executables, for testing the typical syscall and the FlexSC syscall respectively.
For the FlexSC-flavored kernel, I've added a .config file for the kernel build; you can use $ make bzImage to compile the kernel directly if your machine already meets the requirements for compiling the Linux kernel.
- (proposed by @afcidk) Find a policy that puts the user process to sleep while its requested syscalls are being processed, and wakes it up (maybe by a signal) once processing is done. The goal is to reduce the time spent processing syscalls. Since the user process currently busy-waits for the syscall result (using pthread_yield(); we've tried pthread_cond_wait(), but it's worse than pthread_yield(), which might be caused by my implementation), we want to test whether putting it to sleep gives better performance. Moreover, I've noticed that the syscall page entries (we have 64 * 7 CPUs => 448) are not all exhausted by the user process; this should also be taken into consideration.
- On exit of the test program (calling flexsc_exit()), an oops and panic will occur since the cleanup is not done correctly. This is harmless to the test, since the result file is forced to flush before FlexSC's exit is called. The workaround is to restart QEMU if you are running on QEMU.
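For the first item above, the two waiting strategies can be compared in isolation with a sketch like the following. It is an assumption-laden illustration, not the library's code: `struct pending` and all function names are hypothetical, and an ordinary pthread stands in for the side that completes the syscall (in FlexSC that side is in the kernel, so a real wake-up would need another channel, such as the signal suggested above). It uses `sched_yield()` in place of `pthread_yield()`, which is non-standard.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* One in-flight request and the primitives used to wait for it. */
struct pending {
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    atomic_int done; /* 0 = in flight, 1 = finished */
    long retval;
};

static void pending_init(struct pending *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->done_cv, NULL);
    atomic_init(&p->done, 0);
    p->retval = 0;
}

/* Completion side: publish the result and wake any sleeper. */
static void complete(struct pending *p, long retval)
{
    pthread_mutex_lock(&p->lock);
    p->retval = retval;
    atomic_store(&p->done, 1);
    pthread_cond_signal(&p->done_cv);
    pthread_mutex_unlock(&p->lock);
}

/* Variant A: busy-wait, yielding the CPU between checks. */
static long wait_yield(struct pending *p)
{
    while (!atomic_load(&p->done))
        sched_yield();
    return p->retval;
}

/* Variant B: sleep on a condition variable until signalled. */
static long wait_sleep(struct pending *p)
{
    pthread_mutex_lock(&p->lock);
    while (!atomic_load(&p->done))
        pthread_cond_wait(&p->done_cv, &p->lock);
    long ret = p->retval;
    pthread_mutex_unlock(&p->lock);
    return ret;
}

/* Toy worker standing in for the side that finishes the syscall. */
static void *worker(void *arg)
{
    complete((struct pending *)arg, 42);
    return NULL;
}

/* Run one request through either waiting strategy and return its result. */
static long run_once(int use_sleep)
{
    struct pending p;
    pthread_t t;
    pending_init(&p);
    int rc = pthread_create(&t, NULL, worker, &p);
    assert(rc == 0);
    long ret = use_sleep ? wait_sleep(&p) : wait_yield(&p);
    pthread_join(t, NULL);
    pthread_cond_destroy(&p.done_cv);
    pthread_mutex_destroy(&p.lock);
    return ret;
}
```

Timing `run_once(0)` against `run_once(1)` over many iterations is one way to check whether sleeping really costs more than yielding on a given machine. Build with `-pthread`.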