Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MVP Memo Table (DO NOT MERGE) #49

Draft
wants to merge 13 commits into
base: clean-slate
Choose a base branch
from
Draft

MVP Memo Table (DO NOT MERGE) #49

wants to merge 13 commits into from

Conversation

connortsui20
Copy link
Member

@connortsui20 connortsui20 commented Nov 27, 2024

Very incomplete

Less incomplete

TODO:

  • We should stabilize the on-disk representation of expressions (kind + serializable and deserializable data, where each operator kind needs to know how to retrieve their children from this data field). Once this is stabilized, it should not change.
  • Potentially make APIs clearer. There are a bunch of Result<Result<_, _>> which can be unclear, but I have no idea how to make this better.
  • Potentially should make a distinction at the type level between a GroupId and a RootGroupId, where impl From<RootGroupId> for GroupId is implemented but not the other way around...
  • Since fingerprints are in a separate table, it might be good to do updates instead of additional inserts because the old fingerprints will all be stale and there shouldn't be that much contention on them?
  • Should the union-find pointer data structure be separated out into its own table? Right now it is embedded into the group table records, but if we separate it out into a separate table we will have a level of abstraction that allows us to change the implementation when we need to.
  • Add union by size optimization after deciding if the union find data structure should be separated from the main group records.
  • Logical expression semantics should be formalized and a trait should be made that represents these semantics.
  • All methods on the memo table should be serializable transactions, and the implementations should reflect that. This might mean that the transaction object needs to be passed around to different function calls, as the nested transaction API SeaORM might not be ideal (but maybe it is, we'll have to see).
  • Add much more tests
  • Add support for running tests in CI (should test both in-memory and disk based DBMS)
  • Add code coverage in CI
  • Add tracing and log support

@connortsui20 connortsui20 force-pushed the mvp branch 9 times, most recently from 2cbdae0 to f0e4b45 Compare November 28, 2024 17:25
This commit adds the boilerplate files and fixes the CI workflows.
@connortsui20 connortsui20 force-pushed the mvp branch 2 times, most recently from 4d60811 to 8610184 Compare November 28, 2024 19:00
This commit adds a first draft of a memo table trait and a persistent
memo table implementation backed by SeaORM entities.
.gitignore Outdated Show resolved Hide resolved
optd-mvp/src/entities/prelude.rs Show resolved Hide resolved
optd-mvp/src/memo/interface.rs Outdated Show resolved Hide resolved
@connortsui20 connortsui20 force-pushed the mvp branch 6 times, most recently from b3467f3 to 24df49f Compare November 29, 2024 19:38
This commit adds the `src/expression` module which contains a very
simple representation of Cascades expressions.

The `Memo` trait interface and implemenation has also changed, where
it now correctly detects exact match duplicates, and it does not track
fingerprints for physical expressions (only logical).

TODO: Add more tests.
TODO: Figure out how to test in CI.
@connortsui20 connortsui20 force-pushed the mvp branch 4 times, most recently from a74f03c to aaba197 Compare November 30, 2024 20:09
This commit completely refactors the memo table, removing the `Memo`
trait and instead placing all methods directly on the `PersistentMemo`
structure itself.

This also cleans up some code in other places.
@connortsui20 connortsui20 force-pushed the mvp branch 2 times, most recently from 4b06b81 to 9d9db84 Compare November 30, 2024 22:57
@connortsui20 connortsui20 force-pushed the mvp branch 2 times, most recently from 1fe79ad to c4462fb Compare December 1, 2024 22:54
@connortsui20
Copy link
Member Author

To run tests, use:

cargo t -- --test-threads 1 --ignored

They are ignored because tests don't work in CI. I need to figure out how to run migrations in CI before those will work.

Also, each of the tests assumes they are isolated. I need to convert all of the implementation methods to use transaction instead in order to make this thread safe, which won't be hard but will be a bit tedious.

@connortsui20 connortsui20 changed the title MVP QO (DO NOT MERGE) MVP Memo Table (DO NOT MERGE) Dec 1, 2024
@skyzh
Copy link
Member

skyzh commented Dec 2, 2024

You can create a sqlite in memory database and run the migrate function on it for every test case?

@connortsui20
Copy link
Member Author

Yes itll be easy I just haven't had time to do it yet and it's not that important

@connortsui20 connortsui20 force-pushed the mvp branch 2 times, most recently from c09012e to 09b868f Compare December 2, 2024 17:22
This commit replaces the specific expression types with traits that
define the behavior the in-memory represenations of both logical and
physical expressions need to have. Right now, the `PhysicalExpression`
trait does not do that much, but the `LogicalExpression` trait is super
important to how the persistent memo table works.
This commit adds the rank by size optimization into the embedded
union-find data structure of the group sets. It also bumps the version
number of `sea-orm-cli` to `1.1.2`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants