Note: this mostly excludes indexes from the picture, because I haven't used them and haven't needed them, so I can't comment well on their API.
The APIs are either too low level, requiring consumers to have a copy of the car spec at hand, or they sit a level above the CAR format and require consumers to provide a bunch of features they might not have.
APIs that are too low level to be used without a copy of the car spec:
The things above are useful, but I don't think they are enough to claim this library can be used to easily decode car files.
It's like trying to use encoding/json but you can only use json.Decoder.Token.
APIs that are too high level and provide features and types that are not needed to interact with the car format:
DefaultWalkFunc, WriteCar, WriteCarWithWalker, Dag, MaxTraversalLinks, TraverseLinksOnlyOnce, SelectiveCar, SelectiveCarPrepared, TraverseToFile, TraverseV1
Those are useful, but they are specialised helper functions: if I am not creating a car file from a random-access CID block interface (github.com/ipfs/go-ipld-format.NodeGetter), or if I am not using https://pkg.go.dev/github.com/ipfs/go-merkledag or https://pkg.go.dev/github.com/ipld/go-ipld-prime, I cannot use them.
Things I think are good:
CarReader
It is simple and has one job (provide an iterator that reads from an io.Reader and returns blocks as they are found in the carv1 stream), with a sane, safe API. It does not require consumers to understand deep things about the carv1 spec.
BlockReader
Same as above.
Streaming a carv1 body from a stream of blocks.
This can't be found in either the v1 or v2 packages.
You have to write this code:
util.LdWrite(writer, block.Cid().Bytes(), block.RawData())
Which is impossible to figure out for any newcomer without a deep read and exploration of the car spec, or without looking up some code that already does this.
Note that it's also really easy to mess up: the ...[]byte parameter might lead you to think you can do this, for example: util.LdWrite(writer, block.Cid().Bytes(), block.RawData(), block2.Cid().Bytes(), block2.RawData()) but no, this does not follow the car spec and will silently be serialised incorrectly.
I get why this API exists, and I can see edge cases where it would be useful, but I don't think it is acceptable as the only way to stream a stream of blocks.
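To make the pitfall concrete, here is a minimal sketch using only the standard library. ldWrite is a local stand-in that is assumed to mirror how util.LdWrite frames its arguments (one uvarint holding the combined length, then every slice); it is not the library function itself. The CID bytes are short placeholders, and countSections, oneCall and perBlock are hypothetical helpers introduced for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"io"
)

// ldWrite is a stand-in assumed to mirror util.LdWrite: it writes ONE uvarint
// with the combined length of all slices, then the slices back to back.
func ldWrite(w io.Writer, d ...[]byte) error {
	var sum uint64
	for _, s := range d {
		sum += uint64(len(s))
	}
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, sum)
	if _, err := w.Write(buf[:n]); err != nil {
		return err
	}
	for _, s := range d {
		if _, err := w.Write(s); err != nil {
			return err
		}
	}
	return nil
}

// countSections walks a length-delimited stream and counts its frames.
// It returns -1 if the stream is malformed or truncated.
func countSections(stream []byte) int {
	count := 0
	for len(stream) > 0 {
		length, n := binary.Uvarint(stream)
		if n <= 0 || uint64(len(stream)-n) < length {
			return -1
		}
		stream = stream[n+int(length):]
		count++
	}
	return count
}

// oneCall passes every (cid, data) pair to a single ldWrite call.
// This is the tempting but wrong usage: everything ends up in one frame.
func oneCall(pairs ...[]byte) []byte {
	var buf bytes.Buffer
	_ = ldWrite(&buf, pairs...) // bytes.Buffer writes do not fail
	return buf.Bytes()
}

// perBlock calls ldWrite once per (cid, data) pair, giving one frame per block.
func perBlock(pairs ...[]byte) []byte {
	var buf bytes.Buffer
	for i := 0; i+1 < len(pairs); i += 2 {
		_ = ldWrite(&buf, pairs[i], pairs[i+1])
	}
	return buf.Bytes()
}
```

With two placeholder blocks, oneCall yields a single frame whose payload is both CIDs and both payloads glued together, while perBlock yields two frames. A reader would parse the first output as one block with a very long body and report no error, which is why the mistake goes unnoticed.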
Things I think would make this better:
Provide a util.LdWrite-free way to stream a car body.
An API like this would be enough:
// BlockWriter streams blocks to an io.Writer.
type BlockWriter struct{/* ... */}
func NewBlockWriter(w io.Writer, roots []cid.Cid, opts ...WriteOption) (*BlockWriter, error) {/* ... */}
func (bw *BlockWriter) Write(b blocks.Block) error {/* ... */}
// WriteFromReader allows for zero copy through [io.ReaderFrom] or [io.WriterTo].
func (bw *BlockWriter) WriteFromReader(c cid.Cid, r io.Reader) error {/* ... */}
I would also move the helpers and lower level functions away into different packages. Given the current state, creating a new package like github.com/ipld/go-car/simple that bundles easy, safe wrappers around the car format sounds simpler (no need to have a tool rewrite consumers to a new import path).
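As a rough illustration of how small the body-writing half of the BlockWriter proposal could be, here is a standard-library-only sketch. It makes several assumptions: bodyWriter is a hypothetical name, CIDs are passed as raw bytes rather than cid.Cid, and the carv1 header (which NewBlockWriter would write from the roots) is left out entirely. It also adds an explicit size parameter to WriteFromReader, which the proposal above does not have, because the length prefix has to be written before the block bytes can be streamed.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// bodyWriter is a hypothetical, body-only stand-in for the proposed BlockWriter.
type bodyWriter struct {
	w   io.Writer
	buf [binary.MaxVarintLen64]byte // scratch space for the length prefix
}

// writePrefix writes n as an unsigned varint.
func (bw *bodyWriter) writePrefix(n uint64) error {
	k := binary.PutUvarint(bw.buf[:], n)
	_, err := bw.w.Write(bw.buf[:k])
	return err
}

// Write emits one section: varint(len(cid)+len(data)), cid, data.
func (bw *bodyWriter) Write(cidBytes, data []byte) error {
	if err := bw.writePrefix(uint64(len(cidBytes)) + uint64(len(data))); err != nil {
		return err
	}
	if _, err := bw.w.Write(cidBytes); err != nil {
		return err
	}
	_, err := bw.w.Write(data)
	return err
}

// WriteFromReader streams exactly size bytes of block data from r.
// io.CopyN can use io.ReaderFrom / io.WriterTo on the underlying types.
func (bw *bodyWriter) WriteFromReader(cidBytes []byte, size int64, r io.Reader) error {
	if size < 0 {
		return fmt.Errorf("negative block size %d", size)
	}
	if err := bw.writePrefix(uint64(len(cidBytes)) + uint64(size)); err != nil {
		return err
	}
	if _, err := bw.w.Write(cidBytes); err != nil {
		return err
	}
	if n, err := io.CopyN(bw.w, r, size); err != nil {
		return fmt.Errorf("copied %d of %d block bytes: %w", n, size, err)
	}
	return nil
}

// exampleBody writes two blocks with placeholder CIDs ("c1", "c2") and
// returns the resulting byte stream as a string.
func exampleBody() (string, error) {
	var out bytes.Buffer
	bw := &bodyWriter{w: &out}
	if err := bw.Write([]byte("c1"), []byte("abc")); err != nil {
		return "", err
	}
	src := bytes.NewReader([]byte("xyz"))
	if err := bw.WriteFromReader([]byte("c2"), int64(src.Len()), src); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

The point of the design is that the caller can only hand over one block per call, so the one-frame-per-block rule cannot be violated by accident. If the reader passed to WriteFromReader runs short, the stream is already corrupted by the time the error surfaces, so a real implementation would have to document that the writer is unusable after such a failure.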
Somewhat out of scope notes:
It is impossible to do anything allocation-free. A random example about reading blocks:
It would be nice if BlockReader objects had a Peek() (cid []byte, block []byte, err error) method. The difference is that it would use bufio.Reader.Peek and return slices pointing into bufio.Reader's internal buffer, which allows reading a block without allocating.
Just so you know, I'll make those changes to github.com/ipfs/boxo/car and provide a lighter API (just exposing BlockReader and BlockWriter).
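The Peek idea could look roughly like the sketch below, written against the standard library only. Everything here is an assumption for illustration: peekSection, cidLen and exampleStream are hypothetical names, and the CID boundary is found with a simplified hand-rolled parser (CIDv0 as a bare sha2-256 multihash, CIDv1 as four varints plus digest) rather than go-cid.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
)

// cidLen returns how many leading bytes of b form a CID (simplified parser).
func cidLen(b []byte) (int, error) {
	// CIDv0: bare sha2-256 multihash, 0x12 0x20 followed by a 32-byte digest.
	if len(b) >= 34 && b[0] == 0x12 && b[1] == 0x20 {
		return 34, nil
	}
	// CIDv1: version, codec, multihash code, digest length, all varints.
	off := 0
	var last uint64
	for i := 0; i < 4; i++ {
		v, n := binary.Uvarint(b[off:])
		if n <= 0 {
			return 0, errors.New("malformed CID")
		}
		if i == 0 && v != 1 {
			return 0, errors.New("unsupported CID version")
		}
		off += n
		last = v // after the loop this holds the digest length
	}
	if last > uint64(len(b)-off) {
		return 0, errors.New("truncated CID")
	}
	return off + int(last), nil
}

// peekSection returns the CID and block bytes of the next section without
// copying and without advancing br. The slices alias br's internal buffer and
// are only valid until the next read; call br.Discard(size) to move on.
func peekSection(br *bufio.Reader) (cid, data []byte, size int, err error) {
	head, err := br.Peek(binary.MaxVarintLen64)
	if len(head) == 0 {
		return nil, nil, 0, err // io.EOF at a clean end of stream
	}
	length, n := binary.Uvarint(head)
	if n <= 0 {
		return nil, nil, 0, errors.New("malformed section length")
	}
	size = n + int(length)
	// Fails with bufio.ErrBufferFull if the section exceeds br's buffer size.
	section, err := br.Peek(size)
	if err != nil {
		return nil, nil, 0, err
	}
	payload := section[n:]
	cl, err := cidLen(payload)
	if err != nil {
		return nil, nil, 0, err
	}
	return payload[:cl], payload[cl:], size, nil
}

// exampleStream builds two sections sharing a placeholder CIDv1
// (raw codec 0x55, sha2-256, zeroed digest) with payloads "hello" and "world!".
func exampleStream() *bufio.Reader {
	cid := append([]byte{0x01, 0x55, 0x12, 0x20}, make([]byte, 32)...)
	var buf bytes.Buffer
	for _, data := range []string{"hello", "world!"} {
		var prefix [binary.MaxVarintLen64]byte
		n := binary.PutUvarint(prefix[:], uint64(len(cid)+len(data)))
		buf.Write(prefix[:n])
		buf.Write(cid)
		buf.WriteString(data)
	}
	return bufio.NewReader(&buf)
}
```

The trade-off is that the bufio.Reader must be sized to hold the largest block (bufio.NewReaderSize), and the returned slices must not be retained across reads. In exchange, a consumer that only hashes or forwards blocks never needs a per-block allocation.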