Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tsan fix browsethread #13872
base: 2.5
Are you sure you want to change the base?
tsan fix browsethread #13872
Changes from 2 commits
6ce3d0b
043971e
1aec9f9
7dd247d
2433eae
a286eb4
10d83fa
f39cf96
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this fixes the data race
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed. This is the fix for the data race. Connecting the signals/slots while the thread is not yet running triggers the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not understood the loop. Isn't there only one event/codition we are waiting for? m_bRun shall become a semaphore? On the other hand syncronizing the thread start by halting the main thread feels wrong. It would be better to protect the shared sources explicit. What are the shared resources in this case we have tsan complains?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we are waiting for m_bRun to be set to true, as set by the thread::run method. A semaphore would work as well, but since we already have the condition, I take advantage of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not very clear to me what this problem is exactly, but this is the tsan report:
The code in the main thread:
So, my conclusion was that the connection to the thread before it was running was problematic.
I will re-assess that this is indeed the case, but I do think that is it. And if so, I don't see a problem with waiting for the thread to have started. That happens only once, and it will be fast anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reusing makees the code harder to read. So I would prefer two primitived for two tasks.
The ThreadSanitizer warning is really confusing, what does it check for? Whether a variable is written/read from two threads? It must be a control structure in Qt. However waiting for the thread to start should not change the issue for my undertsanding.
But if it is the solution i don't mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the solution work? I am afraid instead of a solution, we trick ThreadSanitizer.
Can, it happen that it treated the unrelated mutex as a proper guard?
Qt allows to send signals and reconnect them concurrently. So it should not be a problem.
Maybe only a missing atomic annotation in Qt?
Is there a way to find out what 0x0001466d5f28 is and which mutexted I are considered?
Or is it s QT bug? Is 2.4 with Qt 5 affected as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I am now building Qt5. But it is on purpose that the 2.5 branch of mixxx vcpkg has Qt6 as the default? I am confused now as to what we are supposed to use for Mixxx 2.5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am running into lots of problems trying to build against Qt5, so it would be much appreciated if you could help me building Qt6 with vcpkg with this Qt sanitize thread option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixxx 2.4.2 will be our last Qt5 release. So there is IMHO no need to deal with QT5 in this case. We just keep the Qt5 support in our 2.5 branch just in case we want to test regresions.
I suggest to issue a PR against our vcpkg branch and use the build package downloaded from GitHub. For my undersanding you jsut need to add the sanitizer flag at the position I pointed out above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I prefer to use Qt 6, but you pointed it out in the qt5 portfile.cmake. The qt6 portfile.cmake is completely different, and I don't see where to add this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this works. I'd prefer to set this from my triplets file arm64-osx-tsan.cmake, but for now this have to do.
diff --git a/ports/qtbase/portfile.cmake b/ports/qtbase/portfile.cmake
index 8b9e3b0d3..4a96412d7 100644
--- a/ports/qtbase/portfile.cmake
+++ b/ports/qtbase/portfile.cmake
@@ -92,7 +92,7 @@ INVERTED_FEATURES
)
list(APPEND FEATURE_OPTIONS -DCMAKE_DISABLE_FIND_PACKAGE_Libudev:BOOL=ON)
-list(APPEND FEATURE_OPTIONS -DFEATURE_xml:BOOL=ON)
+list(APPEND FEATURE_OPTIONS -DFEATURE_xml:BOOL=ON -DFEATURE_sanitize_thread:BOOL=ON)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe this mutex placement. either way, both are improvements.
EDIT: no, I reverted this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's understand and double check that. The old code IMHO does not suffere any issues: https://doc.qt.io/qt-6/qsharedpointer.html#thread-safety -> Reference counting is thread safe. We only must protect the
s_weakInstanceRef = strong.toWeakRef();
case which the rety code does in the original version. The old code is lock free exept on the first invocation.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will check again where the issue comes from, here or the thread being used before being started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. The original code was correct. I have reverted it, except for making
local variables in the function instead of globals.
Please resolve if you are okay with the latest version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is explained in the comment in line 76 seems weird to me. it makes sense when the client is the same, but what if it's a different client? anyway, I didn't want to change the behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it works, because only one table is shown at a time. But indeed an area of improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What can happen if the thread is not yet waiting? In this case no one is waked up.
But the
m_modelObserver
andm_path
have a value. I think the thread can first check if m_modelObserver is not null and pick that up before waiting. Probably the waiting in the constructor is obsolete than?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if m_path has a value, the thread will not enter the wait state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wakeOne() and wait() below is confusing, Let's use a second sync method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put the request in a separate method.
Check warning on line 103 in src/library/browse/browsethread.cpp
GitHub Actions / clang-tidy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a gurantee that the m_modelObserver outlives the thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, because it's holds a shared pointer to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_modelObserver holds a share pointer to the Browsthread? Maybe you can add a comment to the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a single, refcounted, BrowseThread. This is what is returned by BrowseThread::getInstanceRef();
In BrowseTableModel.cpp it is stored: m_pBrowseThread = BrowseThread::getInstanceRef();
So, when all BrowseTableModel instances are gone, the ref count is 0, and the BrowseThread instance is destroyed.
But I am not sure why I have to explain this or add more comments. This is not my code! I only want to solve the thread sanitizer warnings. If you feel the (existing!) code needs more explanation, go ahead.
Check failure on line 144 in src/library/browse/browsethread.cpp
GitHub Actions / clazy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QWaitConditions are normally named for the condition that shall become true. Just "condition" is a bit poor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing is, I am using it for 2 things: waiting for the thread to be started and waiting for a new populate request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed the member vars for clarity, to express that these are requests (between the main thread and the browse thread) and what. Please review and resolve.