introduces the notion of forfeited job #49

polus-arcticus · 2024-06-05T18:37:57Z

Currently, all errors that arise from the execution layer are treated as errors belonging to the requested computation. After all, one cannot guarantee execution of a faulty program. However, without assumptions of network consistency, errors can just as well originate with the resource provider: hardware can break, data centers can lose power, etc.

Under the current paradigm, the job creator needs to wait for the job's timeouts to expire, find the job ID, and send a new tx to recover the funds, and there is no coophive recover command yet. With long timeout periods for long jobs (say, a couple of months), it would be much nicer if the resource provider could just say, 'hey, I can't run your job anymore, here is a refund',

or if the job creator could say, 'hey, thanks for running so far, but I don't need it anymore, let's cancel now and prorate the computation completed.'

	"DealNegotiating",
	"DealAgreed",

	"DealForfeited", // This new state can be invoked while in the DealAgreed phase, but not after

	"ResultsSubmitted",
	"ResultsAccepted",
	"ResultsChecked",
	"MediationAccepted",
	"MediationRejected",
	"TimeoutSubmitResults",
	"TimeoutJudgeResults",
	"TimeoutMediateResults",

On forfeit, the payment is refunded to the job creator and the resource provider receives its collateral back.

lukemarsden · 2024-06-06T15:12:07Z

Hmm, this is an interesting idea. However, I think a lot of our current problems would be solved by simply catching errors better and returning them to the user in the results bundle. Also, interpreting errors in the results bundle in the CLI and presenting them as actual errors. Maybe we could do that within the context of the current smart contracts etc, rather than changing the protocol itself?

lukemarsden · 2024-06-06T15:13:05Z

I guess what I described is complicated by the fact that if bacalhau fails to run the job, we don't have a CID to return. Maybe we need another field in the result type to include an error message instead of a CID?

polus-arcticus · 2024-06-06T17:14:52Z

> our current problems would be solved by simply catching errors better and returning them to the user in the results bundle
> what I described is complicated by the fact that if bacalhau fails to run the job, we don't have a CID to return.

Indeed, the error in question is here: https://github.com/CoopHive/coophive/blob/2e88aedce706158c4cb46176e07e2c2a2746cc1c/pkg/resourceprovider/controller.go#L422-L425

I suppose the question is how to handle two kinds of errors: those that occur on the resource provider's machine but outside bacalhau, and those that occur on the machine and inside bacalhau. Both show up at that permalink without any indication of which is which. One pipes stderr to the CID fine; the other produces a blank string.

> Maybe we need another field in the result type to include an error message instead of a CID?

In the permalink above, I can change the forfeit logic to stop the spinner and show an error along the lines of 'the resource provider had trouble starting your job; your payment and collateral have been returned'.

introduces the notion of forfeited job

2e88aed

polus-arcticus mentioned this pull request Jun 5, 2024

when there's an error running a container, report it #42

Open

polus-arcticus requested a review from mactus13 June 26, 2024 07:59
