Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink state improvements #214

Open
Zannick opened this issue Sep 3, 2024 · 3 comments
Open

Rethink state improvements #214

Zannick opened this issue Sep 3, 2024 · 3 comments

Comments

@Zannick
Copy link
Owner

Zannick commented Sep 3, 2024

The statedb contains our best-seen path to any particular state, with a state referencing its total time and the step taken from its parent state. Each is updated individually, so may be stale. We also have the next column which shows us the children states from any parent, which means we don't have to deserialize and evaluate again if we encounter the same state later.

When we improve part of a route, the later states will have to get improved as well so we can appropriately reprioritize them for evaluation. Right now we do that eagerly in HeapDB::record_one_internal, including the entire recursive tree. As I write this, it's been almost an hour waiting for a recreate step that's around 70% of the way through the route, for a fourth canon replacement from #182.

This can potentially explain a lot of things:

  1. Why the best routes don't seem to improve upon the early stages of the input or the first solution found.
  2. Why stopping the program and restarting it (deleting the db in the process) is better for finding improvements.
  3. Why the program throughput slows down so much after running a long time.
  4. Why mutations from Improve minimizes and mutates #182 seem to be very slow.
  5. Why mutations don't seem to produce better solutions (they are updating states for so long first that another thread might take an improved state and find the solution before the mutation thread is done with the db).
  6. Why sometimes solutions find discrepancies between their state data and their elapsed times.

We need to find a more lazy approach to this. We're already geared towards understanding stale information in the statedb; what happens if we only update the new states produced in any step (including all of any solution)? Perhaps later when we retrieve a state from the queue we can check its predecessor (or all of them??)? Perhaps we can add a versioning tag somewhere that makes it easier to see if we need to rescore a state. Perhaps we can add another background thread.

I think if we only check a state's immediate predecessor when we pop from the queue, then we might not pick up on its actual best time from ancestors further away. However, if we find any solutions with that state (e.g. from the solve trie search), then we will recreate_store the entire solution and thus update those times.

@Zannick
Copy link
Owner Author

Zannick commented Sep 3, 2024

Here are some things we need to do:

  • Stop recursing into next states in record_one_internal.
  • When evaluating a state, don't drop any of its children for being more than the max time. These all need to be added into the statedb, but we can exclude them from the queue.
  • When an unevaluated state's score is improved in the db from above max time to below it, add it to the queuedb.
  • When recreating a route, if we find an already-evaluated state, update it and all its children, rather than ignore it.
  • Change handling of max time in trie_search in case the state being looked up has already been improved. (We find out by replaying the whole history.)

Zannick added a commit that referenced this issue Sep 4, 2024
* Stop reevaluating children states
* Change NextData to be just a list of history sequences
* Remove need to save the solution collector and the world in the db.
* Move private class functions to be private module functions.
* Retrieve next steps in `recreate_store` and apply them all.
* Don't bother applying local movements in the other case since we'll include the history step anyway
* Perform all applicable steps when single-stepping.
* Don't drop non-solutions for being over the max time.
* No need to convert the trie search vec into a deque.
Zannick added a commit that referenced this issue Sep 4, 2024
And fix single-item adds to the db not having the right time_since.
Zannick added a commit that referenced this issue Sep 4, 2024
@Zannick
Copy link
Owner Author

Zannick commented Sep 4, 2024

Wow! It's much faster at reducing the best time now!

@Zannick
Copy link
Owner Author

Zannick commented Sep 4, 2024

This should be accomplished now but we really ought to have some further way to check, such as a background thread. For example, we could improve a state that has already been processed. Because it's already been processed, we never look it up in the solve trie, or we never process some descendent of that path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant