Use copy of state for state dumps to file #365
Open
johannaengland wants to merge 1 commit into Uninett:master from johannaengland:bugfix/use-state-copy-for-statedump
@@ -0,0 +1 @@
+Use a copy of the Zino state for saving it to a file to avoid errors due to changes to the state in the middle of saving
I think you may actually want a deep copy of the state to ensure nothing is being updated from another thread:
https://docs.pydantic.dev/latest/api/base_model/#pydantic.BaseModel.model_copy
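For illustration, roughly what that would look like - assuming the state is a pydantic `BaseModel` serialized with `model_dump_json()`; the function body below is a sketch, not the actual Zino implementation:

```python
from pydantic import BaseModel


def dump_state_to_file(state: BaseModel, filename: str) -> None:
    # Deep-copy the state first, so serialization operates on a private
    # snapshot rather than on the live object other threads may be mutating.
    snapshot = state.model_copy(deep=True)
    with open(filename, "w") as statefile:
        statefile.write(snapshot.model_dump_json())
```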
If I try to use deep copy I get the following exception
Oooh, that seems pretty weird... what could be putting a `Future` object into this structure?
I'm investigating it - it is not `FlappingStates` - I checked out a commit before it was added and the same error happens.
Same with `PlannedMaintenances`.
Alright, I have figured out that the problem lies with `ZinoState.events` - I will investigate further which of them lead to problems.
But then again, I'm not sure that this actually solves the issue at hand. I still get the occasional error about dicts changing size while Pydantic is iterating them to copy them, which was the reason for doing a copy to begin with. And, of course, this makes sense. The copy is being made in the same thread that is doing the dumping, so if the structure is being changed by another thread, the same problem will still occur.
The only way to ensure we don't get this error is to ensure that no other threads are concurrently modifying the data structure while it's being serialized to disk (or while it's being copied). This can be done by either making sure we're not running multiple threads, or by adding locks (which will complicate things quite a bit).
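To illustrate the locking option (hypothetical names, not existing Zino code) - every code path that mutates the state, and the dumper itself, would have to share the same lock:

```python
import threading

# Hypothetical module-level lock shared by every writer and by the dumper
state_lock = threading.Lock()


def update_event(state, event):
    # Each mutation of the shared state must hold the lock...
    with state_lock:
        state.events[event.id] = event


def dump_state_to_file(state, filename):
    # ...and so must serialization, so no mutation can happen mid-dump.
    with state_lock:
        data = state.model_dump_json()
    with open(filename, "w") as statefile:
        statefile.write(data)
```

This is also why it complicates things: the lock has to be threaded through every place the state is touched, and a slow dump stalls all writers while it runs.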
It could also be "worked around" by simply retrying the serialization several times if it fails due to these errors - which would be safe enough when #366 is merged, as that would protect us from leaving a corrupt state file when the error inevitably occurs.
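A retry wrapper along these lines would be enough for that workaround - a sketch, assuming the concurrent-modification error surfaces as a `RuntimeError` from dict iteration:

```python
def dump_state_with_retries(state, filename, attempts=3):
    # Hypothetical wrapper: retry the dump a few times, since a concurrent
    # mutation typically raises "RuntimeError: dictionary changed size
    # during iteration".
    for attempt in range(attempts):
        try:
            dump_state_to_file(state, filename)
            return
        except RuntimeError:
            if attempt == attempts - 1:
                raise
```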
I do believe the reason the dumping is happening in a separate thread is that the method `dump_state_to_file()` is not declared as async. When this function is scheduled to run using APScheduler, APScheduler will automatically defer its execution to a separate worker thread, because it is a non-async function and would block the entire event loop.

I think the safest way to run the function would be to declare it as an `async` function, which lets it run in the main event loop thread. It will, of course, block the event loop while serializing - which could be OK if it's fast, disastrous if it is slow. A more complicated implementation would be to copy the state in the main thread and then defer the actual serialization of this copy to a worker thread (since the disk I/O is what would block the most).
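A rough sketch of that second variant, assuming the scheduler can run coroutines on the main event loop and that the deep-copy problem discussed above gets resolved (names are illustrative):

```python
import asyncio


async def dump_state_to_file(state, filename):
    # Copy the state in the event loop thread, where the mutations happen,
    # so nothing can change underneath us while the copy is being taken...
    snapshot = state.model_copy(deep=True)

    def _write():
        with open(filename, "w") as statefile:
            statefile.write(snapshot.model_dump_json())

    # ...then push serialization and disk I/O to a worker thread so the
    # event loop is not blocked while the file is written.
    await asyncio.to_thread(_write)
```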