DB schema #13

Open
cspotcode opened this issue Jul 14, 2021 · 7 comments

@cspotcode
Collaborator

cspotcode commented Jul 14, 2021

Thinking about a DB schema:

Table: lib
Each library we render has a row in this table
Columns

  • name - npm name
  • last-checked - time the npm API was last queried to check for new versions

Table: render_queue
When a new npm version of a lib is identified, it's added to this queue.
We can also manually add items to the queue to force a re-render.
Columns

  • lib
  • version
  • state: pending, in-progress
  • started: time a worker picked the item up; if that was too long ago, assume the worker died?

Table: lib_render
Each row describes a single render of a single lib@version
Columns:

  • lib_name
  • lib_version
  • renderer_version - if we update the renderer, we'll know that docs have been rendered using an older version and need to be eventually re-rendered
  • datetime
  • result - success, failure. If rendering fails, it may be nice to show all failures in a list, or to prioritize re-rendering them after we bugfix the reasons they failed.
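
For reference, the three tables above could be modelled in TypeScript roughly as follows. This is only a sketch: the interface names, the field types, the ISO-string timestamps, and the `isStale` helper with its `timeoutMs` parameter are all assumptions and were not decided in this thread.

```typescript
// Hypothetical row shapes for the tables sketched above.
// Timestamps are assumed to be ISO-8601 strings; the thread does not fix a format.
interface LibRow {
  name: string;        // npm name
  lastChecked: string; // when the npm API was last queried for new versions
}

interface RenderQueueRow {
  lib: string;
  version: string;
  state: 'pending' | 'in-progress';
  started: string | null; // assumed null until a worker picks the item up
}

interface LibRenderRow {
  libName: string;
  libVersion: string;
  rendererVersion: string;
  datetime: string;
  result: 'success' | 'failure';
}

// "If it was started too long ago, assume the worker died":
// an in-progress item older than timeoutMs is treated as abandoned.
function isStale(row: RenderQueueRow, now: Date, timeoutMs: number): boolean {
  if (row.state !== 'in-progress' || row.started === null) return false;
  return now.getTime() - new Date(row.started).getTime() > timeoutMs;
}
```

A stale item could then be reset to pending or picked up again by another worker; which of the two is preferable is left open here.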
@cspotcode
Collaborator Author

EDIT: I added result to the lib_render table so that failed renders are tracked and a human can more easily investigate.

@keithlayne
Collaborator

A (GH Actions) build ID might be good for the lib_render table.

I was also thinking about schema in general terms. Since we have a single db file and fully control it, I think we can make any DDL changes in a pretty straightforward way. It could get weird though: we don't want to touch the schema while a run is happening, and things could also get weird if code depends on it.

Possibly the simplest thing to do is keep an idempotent DDL script in code and run that at the start of every run.
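
A minimal sketch of what such an idempotent script might look like, assuming the tables proposed at the top of this issue. The `SqlExecutor` interface, the `ensureSchema` name, and all column types are hypothetical; the only thing the sketch relies on is that SQLite's `CREATE TABLE IF NOT EXISTS` is a no-op when the table already exists.

```typescript
// Minimal surface assumed from the DB handle. Hypothetical, chosen so the
// sketch does not depend on a particular SQLite library.
interface SqlExecutor {
  exec(sql: string): unknown;
}

// Every statement uses IF NOT EXISTS, so running the list on every start is safe.
// Table and column names follow the schema proposed in this issue;
// the column types are assumptions.
const SCHEMA_STATEMENTS: readonly string[] = [
  `CREATE TABLE IF NOT EXISTS lib (
     name TEXT PRIMARY KEY,
     last_checked TEXT
   )`,
  `CREATE TABLE IF NOT EXISTS render_queue (
     lib TEXT NOT NULL,
     version TEXT NOT NULL,
     state TEXT NOT NULL DEFAULT 'pending',
     started TEXT
   )`,
  `CREATE TABLE IF NOT EXISTS lib_render (
     lib_name TEXT NOT NULL,
     lib_version TEXT NOT NULL,
     renderer_version TEXT NOT NULL,
     datetime TEXT NOT NULL,
     result TEXT NOT NULL
   )`,
];

// Run at the start of every run, before any other DB access.
function ensureSchema(db: SqlExecutor): void {
  for (const statement of SCHEMA_STATEMENTS) {
    db.exec(statement);
  }
}
```

Note that `IF NOT EXISTS` only covers creating tables. Changing the columns of a table that already exists would still need a separate, guarded step.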

And if we wanna test branches somehow we can just not persist/push the db when done.

One concern might be potential unbounded growth of the db over time. Don't know if this is a real concern.

I found this: https://www.npmjs.com/package/better-queue which has a sqlite backend but looks mostly dead. Something like that out of the box might be helpful. Sorta thinking that avoiding persisting the queue might be fine to start.

@cspotcode
Collaborator Author

cspotcode commented Jul 15, 2021

> Possibly the simplest thing to do is keep an idempotent DDL script in code and run that at the start of every run.

I like this idea.

> And if we wanna test branches somehow we can just not persist/push the db when done.

Good point. I didn't think about this when writing db.ts. Right now the db is stored in refs/heads/db. We could refactor it to store the db in refs/heads/db/<branch you are working on>, e.g. refs/heads/db/main and refs/heads/db/my-topic-branch.

> One concern might be potential unbounded growth of the db over time. Don't know if this is a real concern.

render_queue seems like the only table that could grow unbounded, unless we clean up old rows periodically.
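
The per-branch naming could be as small as the helper below. `dbRefForBranch` is a hypothetical name and not something that exists in db.ts today.

```typescript
// Hypothetical helper: maps the branch being worked on to the ref holding its DB.
function dbRefForBranch(branch: string): string {
  if (branch.length === 0) {
    throw new Error('branch name must not be empty');
  }
  return `refs/heads/db/${branch}`;
}

const mainRef = dbRefForBranch('main');            // 'refs/heads/db/main'
const topicRef = dbRefForBranch('my-topic-branch'); // 'refs/heads/db/my-topic-branch'
```

One caveat: git cannot hold a ref named refs/heads/db alongside refs under refs/heads/db/, so the existing db branch would have to be renamed or deleted before switching to this layout.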

@keithlayne
Collaborator

I think all the pieces we need are pretty much already there in db.ts. Just need to factor out some things and make multiple entry points or allow CLI args in the worker (meh) or whatever. I'll play with this some when I get a chance.

I'm generally in favor of making entry points really thin in cases where everything doesn't basically fit in a single module.

@cspotcode
Collaborator Author

Would it be ok if I stub out some sort of a shim adapter to the DB, and then update work.ts to use this shim?

I'm thinking maybe the shim will have one or two methods to start, but it'll let me update work.ts to do the work requested by the DB.

Could be something like:

const dbShim = {
  // Stub: always reports one hard-coded pending render.
  getPendingRenders() {
    return [{lib: 'lodash', version: '0.0.1'}];
  },
  addToQueue(libVersionPair) {/* no op */},
  isRendered(libVersionPair) {return false;}
};

It'd let me start refactoring work.ts, and then when the DB is ready, we can update the shim to talk to the DB properly.
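
To make the idea a bit more concrete, here is a typed, in-memory variant of the shim. It is only a sketch: the `LibVersionPair` and `DbShim` types, the `createInMemoryDbShim` factory, and the extra `markRendered` method are assumptions, and it assumes `getPendingRenders` returns a list.

```typescript
interface LibVersionPair {
  lib: string;
  version: string;
}

// The three methods proposed above, with assumed signatures.
interface DbShim {
  getPendingRenders(): LibVersionPair[];
  addToQueue(pair: LibVersionPair): void;
  isRendered(pair: LibVersionPair): boolean;
}

// Hypothetical addition so a worker can report a finished render.
interface InMemoryDbShim extends DbShim {
  markRendered(pair: LibVersionPair): void;
}

// In-memory stand-in; all state is lost when the process exits.
function createInMemoryDbShim(): InMemoryDbShim {
  const key = (p: LibVersionPair): string => `${p.lib}@${p.version}`;
  const pending = new Map<string, LibVersionPair>();
  const rendered = new Set<string>();
  return {
    getPendingRenders() {
      return Array.from(pending.values());
    },
    addToQueue(pair) {
      // Already-rendered pairs may be queued again to force a re-render.
      pending.set(key(pair), { ...pair });
    },
    isRendered(pair) {
      return rendered.has(key(pair));
    },
    markRendered(pair) {
      pending.delete(key(pair));
      rendered.add(key(pair));
    },
  };
}
```

work.ts could then depend only on the `DbShim` interface, so swapping in a real DB-backed implementation later would not change the worker code.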

@keithlayne
Collaborator

yeah I started working on it over the weekend and then got distracted :( I was thinking something like that.

I kinda waffled on the sql.js vs. actual file access, but if all the db access happens in one process I don't think we need the native library.

I sometimes get hung up in minutiae and overthinking in early stages of projects. I'll try to find time tonight to crank something out, even if it is garbage.

@cspotcode
Collaborator Author

cspotcode commented Jul 21, 2021 via email
