Bacalhau project report 20220520
The CLI now works end to end with the new codebase - you can submit a job against 3 nodes and then view the status of that job from the CLI. The JSONRPC code has been refactored to make it much easier to work with, and (basic) validation has been added to the "submit job" endpoint.
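To give a sense of what that validation might cover, here is a minimal sketch - the JobSpec type and its fields below are assumptions for illustration, not the real codebase types:

package main

import (
	"errors"
	"fmt"
)

// JobSpec is a stand-in for the real job spec type in the codebase.
type JobSpec struct {
	Image       string
	Concurrency int
}

// validateSubmitJob sketches the kind of basic checks a "submit job"
// endpoint might run before accepting a job.
func validateSubmitJob(spec JobSpec) error {
	if spec.Image == "" {
		return errors.New("job spec must name a docker image")
	}
	if spec.Concurrency < 1 {
		return fmt.Errorf("concurrency must be at least 1, got %d", spec.Concurrency)
	}
	return nil
}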
There is now a Verifier interface that handles transporting the results of a job back to the requester node. There are 2 implementations of this interface and it's the job that "selects" which verification strategy it wants to use - ideally it would choose one that makes sense for the executor engine the job has chosen.
- noop - does nothing and just returns the local folder given to it as the "results" (useful for tests)
- ipfs - publishes the results to ipfs so the client / requester can download the files produced by the job
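To give a feel for the shape of this interface, here is a hedged sketch - the method name and signature are illustrative assumptions, not the actual codebase API:

package verifier

// Verifier, as sketched here, captures the role described above: take the
// local folder a job wrote its results into and return a reference the
// requester can use to fetch them (the local path itself for noop, an
// ipfs CID for ipfs).
type Verifier interface {
	ProcessResultsFolder(jobID string, resultsFolder string) (string, error)
}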
NOTE: neither of these verifiers actually does any verification - they are concerned only with transporting the results. However, by introducing the verifier at this stage, the workflow of a job now includes a call to the verification engine to process the results.
This will come in handy when we introduce the WASM executor, which will pair nicely with the DeterministicHash based verification engine (where the hash of all outputs needs to match for the results to be accepted).
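The core rule of that engine is easy to sketch - assuming each node reports one hash over its outputs, results are accepted only when every hash agrees (the code below illustrates the idea, it is not the real engine):

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashOutput stands in for hashing a job's output; a real engine would
// hash the contents of the results folder deterministically.
func hashOutput(output []byte) string {
	sum := sha256.Sum256(output)
	return hex.EncodeToString(sum[:])
}

// resultsAccepted implements the core rule: every node's output hash
// must match for the results to be accepted.
func resultsAccepted(hashes []string) bool {
	if len(hashes) == 0 {
		return false
	}
	for _, h := range hashes[1:] {
		if h != hashes[0] {
			return false
		}
	}
	return true
}

func main() {
	a := hashOutput([]byte("deterministic output"))
	b := hashOutput([]byte("deterministic output"))
	c := hashOutput([]byte("different output"))
	fmt.Println(resultsAccepted([]string{a, b})) // true
	fmt.Println(resultsAccepted([]string{a, c})) // false
}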
A job can now run any docker image as specified by the user - the entrypoint can also be overridden in the same way as docker run.
A job now supports the concept of input and output volumes, and the docker executor implements support for these. This means you can mount ipfs CIDs at input paths and also write results to an output volume - this can be seen in the following example:
cid=$(ipfs add -q file.txt)
bacalhau run \
-v $cid:/file.txt \
-o apples:/output_folder \
ubuntu \
bash -c 'cat /file.txt > /output_folder/file.txt'
The above job shows off an input volume (-v $cid:/file.txt) and an output volume (-o apples:/output_folder).
Once the job has run on the executor, the contents of stdout and stderr will be added alongside any named output volumes the job has used (in this case apples), and all of those entities will be packaged into the results folder, which is then published to ipfs via the verifier.
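For the example above, the published results folder would then contain something like this (the exact layout is an assumption, not confirmed from the codebase):

stdout
stderr
apples/
    file.txt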
The test suites have had a lot of work and it's now very easy to add new scenarios to both:
- executor tests (which focus just on a single job producing the correct results)
- devstack tests (which focus on the end to end example of multiple nodes using the transport interface to produce correct results)
This is captured in the scenarios folder of the tests - here is an example of how to specify a simple test that checks the results of a job that just does "cat file.txt":
func CatFileToStdout(t *testing.T) TestCase {
	return TestCase{
		Name: "cat_file_to_stdout",
		SetupStorage: singleFileSetupStorageWithData(
			t,
			HELLO_WORLD,
			SIMPLE_MOUNT_PATH,
		),
		ResultsChecker: singleFileResultsChecker(
			t,
			STDOUT,
			HELLO_WORLD,
			ExpectedModeEquals,
			1,
		),
		GetJobSpec: func() types.JobSpecVm {
			return types.JobSpecVm{
				Image: "ubuntu:latest",
				Entrypoint: []string{
					"cat",
					SIMPLE_MOUNT_PATH,
				},
			}
		},
	}
}
Methods like singleFileSetupStorageWithData and singleFileResultsChecker can easily be replaced by more complicated use cases (such as entireFolderSetupStorage).
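For illustration, a more involved checker in the same spirit might look like the sketch below - the signature and package are assumptions and won't match the real test harness exactly:

package scenario

import (
	"os"
	"path/filepath"
	"testing"
)

// fileContentsChecker is illustrative: a results checker in the spirit of
// singleFileResultsChecker that asserts a named file inside the results
// folder holds the expected contents.
func fileContentsChecker(t *testing.T, relPath, expected string) func(resultsFolder string) {
	return func(resultsFolder string) {
		data, err := os.ReadFile(filepath.Join(resultsFolder, relPath))
		if err != nil {
			t.Fatalf("reading %s: %v", relPath, err)
		}
		if got := string(data); got != expected {
			t.Fatalf("unexpected contents of %s: got %q, want %q", relPath, got, expected)
		}
	}
}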
Next up:
- Docs for bacalhau.org
- DevOps to get our first 10 nodes deployed to GCP
- Bootstrapping list
- Manual QA
- Implement Python FaaS in beta