Bacalhau project report 20220520

CLI working with refactor branch

The CLI now works end to end with the new codebase - you can submit a job against 3 nodes and view the status of the job from the CLI. The JSONRPC code has been refactored to make it much easier to work with, and (basic) validation has been added to the "submit job" endpoint.

verifier interface

There is now a Verifier interface that handles transporting the results of a job back to the requester node.

There are two implementations of this interface, and it is the job that selects which verification strategy it wants to use - ideally one that makes sense for the executor engine the job has chosen (see the sketch after the list):

  • noop - does nothing and just returns the local folder given to it as the "results" (useful for tests)
  • ipfs - publishes the results to ipfs so the client / requester can download the files produced by the job
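
As a rough sketch of the shape this interface takes (the names here are illustrative - the actual definitions live in the codebase):

type Verifier interface {
	// ProcessResultsFolder takes the local folder an executor wrote its
	// results into and returns a reference the requester can use to fetch
	// them - the folder path itself for noop, or an ipfs CID for ipfs.
	ProcessResultsFolder(jobID string, resultsFolder string) (string, error)
}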

NOTE neither of these verifiers actually does any verification - they are concerned only with transporting the results. However - by introducing the verifier at this stage - the workflow of a job now includes calls to the verification engine to process the results.

This will come in handy when we introduce the WASM executor, which will pair nicely with the DeterministicHash based verification engine (where the hash of all outputs needs to match across nodes for the results to be accepted).
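
To make the DeterministicHash idea concrete, here is a minimal sketch of hashing a results folder so that digests from different nodes can be compared - a hypothetical helper, not the codebase's implementation:

import (
	"crypto/sha256"
	"encoding/hex"
	"io"
	"os"
	"path/filepath"
)

// hashResultsFolder walks a results folder in lexical order and hashes
// every file's relative path and contents into a single digest - nodes
// that produced identical outputs will produce identical digests.
func hashResultsFolder(folder string) (string, error) {
	h := sha256.New()
	err := filepath.Walk(folder, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		rel, err := filepath.Rel(folder, path)
		if err != nil {
			return err
		}
		// include the relative path so a renamed file changes the digest
		io.WriteString(h, rel)
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(h, f)
		return err
	})
	if err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}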

docker images

A job can now run any docker image as specified by the user - the entrypoint can also be overridden in the same way as docker run (see the example below).
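
For example - assuming the same argument layout as the volume example further down, where the trailing arguments become the container's entrypoint - something like:

bacalhau run ubuntu echo "hello from bacalhau"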

input / output volumes

A job now supports the concept of input and output volumes, and the docker executor implements both.

This means you can specify ipfs CIDs as input paths and also write results to an output volume - as the following example shows:

cid=$(ipfs add -q file.txt) # -q prints just the added CID
bacalhau run \
  -v $cid:/file.txt \
  -o apples:/output_folder \
  ubuntu \
  bash -c 'cat /file.txt > /output_folder/file.txt'

The above job shows off an input volume (-v $cid:/file.txt) and an output volume (-o apples:/output_folder).

Once the job has run on the executor, the contents of stdout and stderr are collected alongside any named output volumes the job has used (in this case apples), and all of these are packaged into the results folder, which is then published to ipfs via the verifier.
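
For the example above, the published results might then look something like this (an illustrative layout - the exact file names are not confirmed here):

results/
├── stdout
├── stderr
└── apples/
    └── file.txt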

test tooling

The test suites have had a lot of work and it's now very easy to add new scenarios to both:

  • executor tests (which focus just on a single job producing the correct results)
  • devstack tests (which focus on the end to end example of multiple nodes using the transport interface to produce correct results)

This is captured in the scenarios folder of the tests - here is an example of how to specify a simple test that checks the results of a job that just runs "cat file.txt":

func CatFileToStdout(t *testing.T) TestCase {
	return TestCase{
		Name: "cat_file_to_stdout",
		// mount a single file containing HELLO_WORLD at SIMPLE_MOUNT_PATH
		SetupStorage: singleFileSetupStorageWithData(
			t,
			HELLO_WORLD,
			SIMPLE_MOUNT_PATH,
		),
		// check that the job's stdout matches HELLO_WORLD
		ResultsChecker: singleFileResultsChecker(
			t,
			STDOUT,
			HELLO_WORLD,
			ExpectedModeEquals,
			1,
		),
		// the job itself: run "cat <mount path>" inside an ubuntu image
		GetJobSpec: func() types.JobSpecVm {
			return types.JobSpecVm{
				Image: "ubuntu:latest",
				Entrypoint: []string{
					"cat",
					SIMPLE_MOUNT_PATH,
				},
			}
		},
	}
}

Methods like singleFileSetupStorageWithData and singleFileResultsChecker can easily be swapped for more complicated use cases (such as entireFolderSetupStorage).

next

  • Docs for bacalhau.org
  • DevOps to get our first 10 nodes deployed to GCP
  • Bootstrapping list
  • Manual QA
  • Implement Python FaaS in beta