Rethink state improvements #214

Zannick · 2024-09-03T17:55:42Z

The statedb contains our best-seen path to any particular state, with a state referencing its total time and the step taken from its parent state. Each is updated individually, so may be stale. We also have the next column which shows us the children states from any parent, which means we don't have to deserialize and evaluate again if we encounter the same state later.

When we improve part of a route, the later states will have to get improved as well so we can appropriately reprioritize them for evaluation. Right now we do that eagerly in HeapDB::record_one_internal, including the entire recursive tree. As I write this, it's been almost an hour waiting for a recreate step that's around 70% of the way through the route, for a fourth canon replacement from #182.

This can potentially explain a lot of things:

Why the best routes don't seem to improve upon the early stages of the input or the first solution found.
Why stopping the program and restarting it (deleting the db in the process) is better for finding improvements.
Why the program throughput slows down so much after running a long time.
Why mutations from Improve minimizes and mutates #182 seem to be very slow.
Why mutations don't seem to produce better solutions (they are updating states for so long first that another thread might take an improved state and find the solution before the mutation thread is done with the db).
Why sometimes solutions find discrepancies between their state data and their elapsed times.

We need to find a more lazy approach to this. We're already geared towards understanding stale information in the statedb; what happens if we only update the new states produced in any step (including all of any solution)? Perhaps later when we retrieve a state from the queue we can check its predecessor (or all of them??)? Perhaps we can add a versioning tag somewhere that makes it easier to see if we need to rescore a state. Perhaps we can add another background thread.

I think if we only check a state's immediate predecessor when we pop from the queue, then we might not pick up on its actual best time from ancestors further away. However, if we find any solutions with that state (e.g. from the solve trie search), then we will recreate_store the entire solution and thus update those times.

The text was updated successfully, but these errors were encountered:

Zannick · 2024-09-03T19:15:41Z

Here are some things we need to do:

Stop recursing into next states in record_one_internal.
When evaluating a state, don't drop any of its children for being more than the max time. These all need to be added into the statedb, but we can exclude them from the queue.
When an unevaluated state's score is improved in the db from above max time to below it, add it to the queuedb.
When recreating a route, if we find an already-evaluated state, update it and all its children, rather than ignore it.
Change handling of max time in trie_search in case the state being looked up has already been improved. (We find out by replaying the whole history.)

* Stop reevaluating children states * Change NextData to be just a list of history sequences * Remove need to save the solution collector and the world in the db. * Move private class functions to be private module functions. * Retrieve next steps in `recreate_store` and apply them all. * Don't bother applying local movements in the other case since we'll include the history step anyway * Perform all applicable steps when single-stepping. * Don't drop non-solutions for being over the max time. * No need to convert the trie search vec into a deque.

And fix single-item adds to the db not having the right time_since.

Zannick · 2024-09-04T02:09:48Z

Wow! It's much faster at reducing the best time now!

Zannick · 2024-09-04T03:25:43Z

This should be accomplished now but we really ought to have some further way to check, such as a background thread. For example, we could improve a state that has already been processed. Because it's already been processed, we never look it up in the solve trie, or we never process some descendent of that path.

Zannick added a commit that referenced this issue Sep 4, 2024

Re-add eligible improved states to the queue #214

8db8383

And fix single-item adds to the db not having the right time_since.

Zannick added a commit that referenced this issue Sep 4, 2024

Don't replay two steps at once #214

e7dfe78

Zannick added a commit that referenced this issue Sep 4, 2024

Continue keeping track of size properly #214

4387699

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rethink state improvements #214

Rethink state improvements #214

Zannick commented Sep 3, 2024

Zannick commented Sep 3, 2024 •

edited

Loading

Zannick commented Sep 4, 2024

Zannick commented Sep 4, 2024

Rethink state improvements #214

Rethink state improvements #214

Comments

Zannick commented Sep 3, 2024

Zannick commented Sep 3, 2024 • edited Loading

Zannick commented Sep 4, 2024

Zannick commented Sep 4, 2024

Zannick commented Sep 3, 2024 •

edited

Loading