Bacalhau project report 20220808
We now support parallelizing a job across many nodes.
You can have an IPFS CID which is a directory of many files, and you can submit the job such that Bacalhau will automatically split the work across multiple servers. Results from the sub-jobs will be combined by the CLI when it downloads them. This assumes the files are separable: each input file gets processed into a corresponding output file, independently of the others.
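As a rough illustration of the separability assumption, here is a minimal Python sketch of splitting a directory listing into shards and merging the per-shard outputs. It is not Bacalhau's implementation: the names `split_into_shards`, `process_shard`, `combine_results` and the `batch_size` parameter are hypothetical, and the "processing" step is a stand-in for whatever the real sub-job does.

```python
from typing import Dict, List


def split_into_shards(files: List[str], batch_size: int) -> List[List[str]]:
    """Group input file paths into shards of at most batch_size files.

    Hypothetical helper: each shard stands for one sub-job that could
    run on a different node.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(files)  # deterministic shard contents
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def process_shard(shard: List[str]) -> Dict[str, str]:
    """Stand-in for one sub-job: produces exactly one output per input file."""
    return {path: f"processed:{path}" for path in shard}


def combine_results(shard_outputs: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge per-shard outputs into one result set.

    Because inputs are assumed separable, no two shards should ever
    produce an output for the same input file.
    """
    combined: Dict[str, str] = {}
    for output in shard_outputs:
        for path, result in output.items():
            if path in combined:
                raise ValueError(f"duplicate output for {path}")
            combined[path] = result
    return combined


files = ["c.txt", "a.txt", "b.txt", "d.txt", "e.txt"]
shards = split_into_shards(files, batch_size=2)
combined = combine_results([process_shard(shard) for shard in shards])
```

The merge step is only this simple because each output depends on a single input; a job that aggregates across files would need a different combining strategy.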
Demo here: https://drive.google.com/file/d/1eSaECJ4IT5mEk_sWwmMuxdKJhzb5KVq1/view?usp=sharing
This was a major goal for July, so we are glad to have shipped it :-) https://github.com/filecoin-project/bacalhau/blob/main/ROADMAP.md#july
More detail on how this works here: https://github.com/filecoin-project/bacalhau/pull/442
We are working on improving performance in large networks; this is still a work in progress. So far, we've improved performance 3x on 250-node clusters. Lots to do here still.
The 3x improvement came from having nodes skip bidding on a job once they have seen concurrency-many BidAccepted messages for it, or 1.5x concurrency-many Bid messages. As a result, the number of messages exchanged for a small job in a large network has gone down from O(250) to O(15).
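The bid-suppression rule can be sketched as a small per-job counter. This is a simplified Python illustration, not the actual Bacalhau code: the `JobBidState` class, its field names, and the choice of `>=` for both thresholds are assumptions, and the concurrency value of 10 in the usage lines is picked only for illustration.

```python
from dataclasses import dataclass


@dataclass
class JobBidState:
    """What one node has observed on the network for a single job (hypothetical)."""
    concurrency: int              # how many nodes the job wants to run on
    bids_seen: int = 0            # Bid messages observed from other nodes
    bids_accepted_seen: int = 0   # BidAccepted messages observed

    def should_bid(self) -> bool:
        """Return False once enough other nodes have bid or been accepted."""
        if self.bids_accepted_seen >= self.concurrency:
            return False  # the job already has all the nodes it asked for
        if self.bids_seen >= 1.5 * self.concurrency:
            return False  # enough competing bids are already in flight
        return True


state = JobBidState(concurrency=10)
fresh = state.should_bid()             # nothing seen yet

state.bids_seen = 14
below_cap = state.should_bid()         # still under 1.5 x 10 = 15 bids

state.bids_seen = 15
at_cap = state.should_bid()            # bid cap reached

accepted = JobBidState(concurrency=10, bids_accepted_seen=10).should_bid()
```

The effect is that the number of bids per job is bounded by a multiple of the requested concurrency rather than by the size of the network.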
We're still seeing a fairly high error rate (timeouts) in large networks, so addressing this is the next priority.
We have started designing how we will integrate Bacalhau with Filecoin. The main initial idea is to support writing job outputs to Filecoin via the lotus CLI. This way, Bacalhau can be used as a bridge: reading IPFS (or HTTP!) datasets, processing them, and publishing the results to Filecoin.
We'll also look at how we can make it easy to integrate with gateways like Web3.storage and Estuary in the future.
There were various bug fixes and improvements to `bacalhau list` and `bacalhau describe`, e.g. supporting short ids. We also added support for the `-w` working directory argument in `bacalhau docker run`.
- Continue performance work
- Filecoin integration implementation