Load Access Fault in nxsem_trywait Due to Invalid Semaphore Pointer on RISC-V NuttX #15178
Comments
@GuoyuYin Can you give the source code, build commands and NSH commands you used, so I can reproduce it on QEMU RISC-V? (32-bit or 64-bit?) Thanks!
Hi @lupyuen, the source code is a bit complicated. Basically, we are fuzz testing NuttX. Our core idea is to predefine some functions (each wrapping certain NuttX APIs) and expose specific arguments; our fuzzing tool then generates arguments for these functions and tests NuttX for unknown bugs. For this bug, the predefined function is:

    static long syz_sem_timedwait(volatile long sem_ptr, volatile long abstime_ptr)
    {
      sem_t *sem = (sem_t *)sem_ptr;
      const struct timespec *abstime = (const struct timespec *)abstime_ptr;
      return (long)sem_timedwait(sem, abstime);
    }

and its triggering argument is:

    syz_sem_timedwait(0x2, 0x0)

During execution, we triggered this error:

    riscv_exception: BUG: EXCEPTION: Load access fault. MCAUSE: 0000000000000005, EPC: 00000000800078da, MTVAL: 0000000000000002

According to the EPC value, something is wrong within the function. To reproduce it, I believe you can simply add the predefined function and call it with this argument; this should trigger the error.
@Rrooach are you using your Rtkaller fuzzer to do that? Could you please send a guide on how to use it with NuttX?
No, Rtkaller is not capable of fuzzing OSs like NuttX.
Expected Behavior
The program should execute without any memory access violations or crashes. Specifically, the `nxsem_trywait` function should correctly attempt to wait on a semaphore without causing system instability.
Actual Behavior
Instead of executing normally, the program crashes due to a load access fault when trying to read a half-word from what seems to be an invalid or inaccessible memory location during the execution of the `nxsem_trywait` function. The crash happens at the address 0x800078da.
Description
- EPC: 0x800078da (program counter at the exception, inside `nxsem_trywait`)
- MTVAL: 0x2 (the value associated with the exception; for a load access fault, a nonzero MTVAL is the faulting address)
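To make the register values easier to read, here is a minimal sketch that maps an MCAUSE value to its exception name. The codes come from the RISC-V privileged specification; the helper name `riscv_exception_name` is hypothetical and is not part of NuttX, and the interrupt-bit test assumes a 64-bit MCAUSE, as in the log above.

```c
#include <stdint.h>

/* Hypothetical helper (not a NuttX API): name the synchronous exception
 * encoded in mcause. Assumes a 64-bit mcause, where bit 63 flags an
 * interrupt rather than an exception.
 */
static const char *riscv_exception_name(uint64_t mcause)
{
  if (mcause >> 63)
    {
      return "interrupt";
    }

  switch (mcause)
    {
      case 0:  return "instruction address misaligned";
      case 1:  return "instruction access fault";
      case 2:  return "illegal instruction";
      case 3:  return "breakpoint";
      case 4:  return "load address misaligned";
      case 5:  return "load access fault";
      case 6:  return "store/AMO address misaligned";
      case 7:  return "store/AMO access fault";
      case 8:  return "environment call from U-mode";
      case 9:  return "environment call from S-mode";
      case 11: return "environment call from M-mode";
      case 12: return "instruction page fault";
      case 13: return "load page fault";
      case 15: return "store/AMO page fault";
      default: return "reserved or not listed here";
    }
}
```

With MCAUSE = 5 this yields "load access fault", which is distinct from both "load address misaligned" (4) and "load page fault" (13).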
Querying the nut-img file reveals that the exception happened during the execution of the `nxsem_trywait` function, specifically at offset +0x54 from its start. The error log indicates that the system was attempting a load half-word unsigned (`lhu`) operation from the address held in register s0.
The assembly instruction at this location is `lhu a1,0(s0)`, which loads an unsigned half-word into register a1 from the address pointed to by s0. MTVAL contains 0x2, so the load targeted address 0x2. That address is 2-byte aligned, so the half-word load itself is not misaligned; this suggests instead that the address does not exist or is not accessible (for example, there is no valid mapping or permission for it), leading to the access fault.
Debug Logs
The debug logs show that the system was executing various system calls before encountering the exception. These calls are executed sequentially by number (call_num) until a final, very large value, call_num = 18446744073709551615. This is the maximum value of an unsigned 64-bit integer (2^64 - 1, the same bit pattern as -1), which could indicate an overflow or an invalid argument.
Upon reaching the `nxsem_trywait` function, the system attempted to load a half-word from the address stored in s0. However, this address appears to be invalid or out of bounds, resulting in the load access fault.
Steps to Reproduce
To reproduce this issue, one can use Syzkaller to execute system calls against the NuttX kernel. The specific sequence leading up to the crash includes calls such as `syz_sem_timedwait`, `syz_putenv`, `syz_setenv`, and `syz_sem_timedwait`, culminating in the problematic call to `nxsem_trywait`.
The corresponding syscall-specific implementation code is as follows:
Suggested Fix
The nxsem_trywait function is in sched/semaphore/sem_trywait.c with the following code:
After analysis, the following recommendations were given:
- Pointer Validation: Ensure that the pointer passed to `nxsem_trywait` is properly initialized and aligned according to RISC-V's requirements. Check whether the semaphore structure (`sem`) is corrupted, or whether memory allocation or deallocation patterns could lead to accessing invalid addresses.
- Semaphore Initialization: Verify that all semaphores are correctly initialized before being used. Uninitialized or improperly initialized semaphores can cause undefined behavior, including invalid memory accesses.
- Memory Alignment: Review the memory alignment of data structures used in conjunction with `nxsem_trywait`. Misaligned data can result in faults on architectures like RISC-V, where misaligned accesses may trap depending on the implementation.
- Runtime Checks: The problem is a load access fault when trying to access `sem->flags` or `NXSEM_COUNT(sem)`. This indicates that the incoming `sem` pointer may be invalid (e.g., NULL, uninitialized, or pointing to an illegal memory address). To fix this, we need to make sure that the `sem` pointer is properly initialized and points to a valid memory location before actually using it.
We introduce an auxiliary function is_valid_semaphore(), which is assumed to exist, to check if a semaphore structure is legal. This function can be defined on an implementation-specific basis, e.g., to check if the semaphore has been properly initialized, etc.
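As an illustration only, the following sketch shows what such a check could look like. Everything in it is an assumption: `struct demo_sem` is a stand-in and not the real `sem_t` layout, `demo_is_valid_semaphore` is a hypothetical name, and the RAM bounds are passed in by the caller because the valid address range is platform-specific. It only tests the address (non-NULL, aligned, inside a known RAM range); it cannot tell whether the memory actually holds an initialized semaphore.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel semaphore type; the real sem_t differs. */
struct demo_sem
{
  int16_t count;
  uint8_t flags;
};

/* Hypothetical address check: accept addr only if it is non-NULL,
 * suitably aligned, and the whole structure lies in [ram_start, ram_end).
 */
static bool demo_is_valid_semaphore(uintptr_t addr,
                                    uintptr_t ram_start,
                                    uintptr_t ram_end)
{
  if (addr == 0)
    {
      return false;                 /* NULL pointer */
    }

  if (addr % alignof(struct demo_sem) != 0)
    {
      return false;                 /* misaligned for the structure */
    }

  if (addr < ram_start || addr >= ram_end)
    {
      return false;                 /* outside the known RAM range */
    }

  if (ram_end - addr < sizeof(struct demo_sem))
    {
      return false;                 /* structure would run past the end */
    }

  return true;
}
```

Note that the fuzzer's argument 0x2 is neither NULL nor misaligned for a half-word, so a NULL check or an alignment check alone would not catch it; in this sketch it is rejected by the range check, for example `demo_is_valid_semaphore(0x2, 0x80000000, 0x88000000)` returns false (the bounds here are illustrative, not taken from the board configuration).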
On which OS does this issue occur?
[OS: Linux]
What is the version of your OS?
Ubuntu 20.04
NuttX Version
2ff2b82
Issue Architecture
[Arch: risc-v]
Issue Area
[Area: Kernel]
Verification