Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nxsem_destroyholder: Use critical section when destroying holder(s) #15243

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pussuw
Copy link
Contributor

@pussuw pussuw commented Dec 17, 2024

Summary

Must use critical section here, otherwise the free holder list will leak, causing either a crash due to holder->htcb = NULL, or the free holder list becomes (erroneously) empty even though most of the holder entries are free.

Impact

Fix g_freeholders list integrity.

Testing

rv-virt:knsh64
rv-virt:ksmp64
MPFS target with > 100 threads and processeds, PI=yes, SMP=yes

Otherwise the free holder list will leak, causing either a crash due to
holder->htcb = NULL, or the free holder list becomes (erroneously) empty
even though most of the holder entries are free.
@github-actions github-actions bot added Area: OS Components OS Components issues Size: XS The size of the change in this PR is very small labels Dec 17, 2024
@@ -566,6 +566,8 @@ void nxsem_initialize_holders(void)

void nxsem_destroyholder(FAR sem_t *sem)
{
irqstate_t flags = enter_critical_section();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if other place doesn't hold the critical section, how to avoid them modifying the list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every other call that modifies the list comes from within critical section, this is the only hole currently.

Copy link
Contributor

@xiaoxiang781216 xiaoxiang781216 Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, should we add the lock in caller site, like the approach taken by other function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do it like that as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since other funtion in this file assume the caller hold the critical section, nxsem_destroyholder is better to follow this direction too, otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaoxiang781216 I was looking into doing the change and I noticed that it will make the caller's side very messy. nxsem_destroyholder is defined as empty macro if priority inheritance is not enabled:

#ifdef CONFIG_PRIORITY_INHERITANCE
void nxsem_initialize_holders(void);
void nxsem_destroyholder(FAR sem_t *sem);
void nxsem_add_holder(FAR sem_t *sem);
void nxsem_add_holder_tcb(FAR struct tcb_s *htcb, FAR sem_t *sem);
void nxsem_boost_priority(FAR sem_t *sem);
void nxsem_release_holder(FAR sem_t *sem);
void nxsem_restore_baseprio(FAR struct tcb_s *stcb, FAR sem_t *sem);
void nxsem_canceled(FAR struct tcb_s *stcb, FAR sem_t *sem);
void nxsem_release_all(FAR struct tcb_s *stcb);
#else

The callers that do not run inside critical section are as follows:

int nxsem_destroy(FAR sem_t *sem)
{
int32_t old;
DEBUGASSERT(sem != NULL);
/* There is really no particular action that we need
* take to destroy a semaphore. We will just reset
* the count to some reasonable value (0) and release
* ownership.
*
* Check if other threads are waiting on the semaphore.
* In this case, the behavior is undefined. We will:
* leave the count unchanged but still return OK.
*/
old = atomic_read(NXSEM_COUNT(sem));
do
{
if (old < 0)
{
break;
}
}
while (!atomic_try_cmpxchg_release(NXSEM_COUNT(sem), &old, 1));
/* Release holders of the semaphore */
nxsem_destroyholder(sem);
return OK;
}

int nxsem_set_protocol(FAR sem_t *sem, int protocol)
{
DEBUGASSERT(sem != NULL);
switch (protocol & SEM_PRIO_MASK)
{
case SEM_PRIO_NONE:
/* Remove any current holders */
nxsem_destroyholder(sem);
break;
case SEM_PRIO_INHERIT:
break;
case SEM_PRIO_PROTECT:
#ifdef CONFIG_PRIORITY_PROTECT
break;
#else
/* Not yet supported */
return -ENOTSUP;
#endif
default:
return -EINVAL;
}
sem->flags = protocol;
return OK;
}

If the critical section is moved to the caller side, it will be quite ugly because we need to flag the critical section out if priority inheritance is off:

int nxsem_set_protocol(FAR sem_t *sem, int protocol)
{
  DEBUGASSERT(sem != NULL);

#ifdef CONFIG_PRIORITY_INHERITANCE
  irqstate_t flags;
#endif

  switch (protocol & SEM_PRIO_MASK)
    {
      case SEM_PRIO_NONE:

        /* Remove any current holders */

#ifdef CONFIG_PRIORITY_INHERITANCE
        flags = enter_critical_section();
#endif
        nxsem_destroyholder(sem);
#ifdef CONFIG_PRIORITY_INHERITANCE
        leave_critical_section(flags);
#endif
        break;

      case SEM_PRIO_INHERIT:

        break;

      case SEM_PRIO_PROTECT:
#ifdef CONFIG_PRIORITY_PROTECT
        break;
#else
        /* Not yet supported */

        return -ENOTSUP;
#endif

      default:
        return -EINVAL;
    }

  sem->flags = protocol;
  return OK;
}

int nxsem_destroy(FAR sem_t *sem)
{
  int32_t old;
#ifdef CONFIG_PRIORITY_INHERITANCE
  irqstate_t flags;
#endif

  DEBUGASSERT(sem != NULL);

  /* There is really no particular action that we need
   * take to destroy a semaphore.  We will just reset
   * the count to some reasonable value (0) and release
   * ownership.
   *
   * Check if other threads are waiting on the semaphore.
   * In this case, the behavior is undefined.  We will:
   * leave the count unchanged but still return OK.
   */

  old = atomic_read(NXSEM_COUNT(sem));
  do
    {
      if (old < 0)
        {
          break;
        }
    }
  while (!atomic_try_cmpxchg_release(NXSEM_COUNT(sem), &old, 1));

  /* Release holders of the semaphore */

#ifdef CONFIG_PRIORITY_INHERITANCE
  flags = enter_critical_section();
#endif
  nxsem_destroyholder(sem);
#ifdef CONFIG_PRIORITY_INHERITANCE
  leave_critical_section(flags);
#endif
  return OK;
}

Maybe some of the #ifdef areas could be merged but it will still be pretty ugly IMO. I suggest we keep this patch as is. What do you think ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: OS Components OS Components issues Size: XS The size of the change in this PR is very small
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants