Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Actors list locking #147

Open
gvissers opened this issue Aug 19, 2021 · 4 comments
Open

Actors list locking #147

gvissers opened this issue Aug 19, 2021 · 4 comments
Assignees

Comments

@gvissers
Copy link
Collaborator

I had a quick look at the locking of the actors list, triggered by the current thread on eye candy issues. I don't know if these are related, but the actors list is is accessed in multiple places while another thread is holding the mutex.

I created a macro to check if the current thread is holding the lock (almost, there could also be no thread holding the lock at all):

#define CHECK_ACTORS_LIST_LOCK() \
	do \
	{ \
		if (SDL_TryLockMutex(actors_lists_mutex) == SDL_MUTEX_TIMEDOUT) \
		{ \
			fprintf(stderr, "%s:%d: The actors list mutex is held by another thread!\n", \
				__FILE__, __LINE__); \
		} \
		else \
		{ \
			SDL_UnlockMutex(actors_lists_mutex); \
		} \
	} while(0)

Since the SDL locks are recursive, the TryLock call will succeed if the current thread is holding the lock (or the mutex is not locked at all), and fail if another thread is holding it.

I sprinkled these across the source where actors_list is used, took a short walk around Isla Prima, killed some creatures, and talked to Lasud, and got the following:

~> ./el.linux.bin test 2> log
~> grep mutex log | sort | uniq -c
     72 actors.c:1137: The actors list mutex is held by another thread!
    171 actors.c:1298: The actors list mutex is held by another thread!
     67 actors.c:1338: The actors list mutex is held by another thread!
     26 actors.c:1465: The actors list mutex is held by another thread!
     82 actors.c:1749: The actors list mutex is held by another thread!
      7 gamewin.c:1136: The actors list mutex is held by another thread!
      1 select.cpp:135: The actors list mutex is held by another thread!
      7 select.cpp:328: The actors list mutex is held by another thread!

(the line numbers may not match exactly with the current source, because the calls to CHECK_ACTORS_LIST_LOCK() add lines as well).

This is only checking actors_list, not your_actor, MY_HORSE, MY_HORSE_ID, IS_HORSE, ACTOR, ACTOR_WEAPON, or other ways in which the actors list is abused.

As I said, I have no idea if this is related to the eye candy issues, and given the fact that most of the time, things work well enough it's probably not immediatly dangerous, but I really think this needs fixing.

@gvissers gvissers self-assigned this Aug 21, 2021
@pjbroad
Copy link
Collaborator

pjbroad commented Aug 26, 2021

I've looked at this before but abandoned the work because it was too big, but I agreed it needs fixing. There are likely places where read access could be allowed in parallel too.

@gvissers
Copy link
Collaborator Author

I've looked at this before but abandoned the work because it was too big

Yes, that pesky list seems to have put its tendrils into everything (which is not altogether surprising). I am working on it, but fixing this is definitely not for the upcoming release.

There are likely places where read access could be allowed in parallel too.

That's what I thought at first. However, although in most places the pointers in the actors list itself remain unchanged, the actor data they point to is written to. Perhaps we could protect the actors list with an rwlock, but we'd have to add a mutex to the individual actors. It's something we can look into, but for now I'm still using a single mutex for all of the actor data. Let's see what the impact of that is once everything is properly locked.

@gvissers
Copy link
Collaborator Author

gvissers commented Sep 3, 2021

Short update:

I have the actors list (and the actors within) now properly contained in my actors_list_lock branch. It should no longer be possible to retrieve an actor pointer without locking the list. Since the client is not written in Rust, you can of course still hold on to it after you have released the lock, thus invalidating this work, but at least you will have to work at it 😆

Unfortunately the diff is moderately huge again (64 changed files with 6,262 additions and 4,900 deletions at the time of writing), sorry for that. I have not seen a performance regression, a quick analysis shows that actor lookup constitues a negligible part of the time usage. The code by default still uses a recursive mutex, AFAICT due to a single function in the sound code that can be called through too many paths to simply pass down the required actors. I will see if I can remedy this, so that we can switch to a normal mutex.

I will hold off on submitting a PR until after the new release, that will also give me the time to test it properly. With actors being rather central to the game play, basically everything will have to be tested, but if anyone wants to help, focusing on eye candy effects (magic, mostly), multi-combat, and ranging would be a good idea.

@gvissers
Copy link
Collaborator Author

Pull request made, #164.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants