Fix bug for running tasks twice #1804

libre-man · 2024-09-12T14:43:30Z

This fixes #1498 by keeping track of the namespaces of a task in an array, instead of just the last one.

libre-man · 2024-09-12T14:53:28Z

I'm not entire sure what the test failure is here, can someone shed some light on that? Locally all tests pass.

It seems like a race condition, but running even master with the race detector on gives many errors.

pd93 · 2024-09-13T14:28:22Z

It's not clear to me what is being fixed here since #1498 is already resolved. Could you please provide an example that reproduces your specific issue. Please also read the discussion in #1655 as a decision was made to not change some behaviour on purpose.

libre-man · 2024-09-13T17:14:17Z

I created this PR before that ticket was closed. The issue here is when we have a structure like this

FileA |
      Includes -> FileB
                | Includes -> FileC
                            | -> Has Task A
      Includes -> FileC
                | -> Has Task A

If now we call Task A through FileA -> FileB -> FileC and through FileA -> FileC we get Task A to run multiple times. This is especially and issue when Task A is not idempotent or not safe to run concurrently. A prime example of this is installing something with npm ci.

pd93 · 2024-09-14T00:35:10Z

Here is a reproduction of your issue as I understand it:

# Taskfile.yml
version: "3"

includes:
  b: ./b.yml
  c: ./c.yml

tasks:
  default:
    deps:
      - abc
      - ac

  abc:
    cmds:
      - task: b:c:task-a

  ac:
    cmds:
      - task: c:task-a

# b.yml
version: "3"

includes:
  c: ./c.yml

# c.yml
version: "3"

tasks:
  task-a:
    run: once
    sources:
      - ./input
    generates:
      - ./output
    cmds:
      - touch output

❯ task -d ./tmp/1804 abc
task: [b:c:task-a] touch output

❯ task -d ./tmp/1804 ac
task: [c:task-a] touch output          # Runs task again

This is currently working as intended (See discussion in #1655), although we have left the door open to discussion. We made the decision that we would not prevent these tasks from running since they have different calling paths. It is possible that the tasks can contain different data etc and it is safer to run a task twice than risk it not running when it should have.

This needs more discussion before it can be merged, but I have left a couple of comments. @vmaerten, @andreynering, do you have any additional thoughts beyond what has been previously discussed? Note that this implementation appears to follow option 3 as described in #1655 i.e. Whether the task is run or not is only affected when run: once is enabled. The behavior does not appear to change if this is omited.

@libre-man please feel free to speak up if you disagree with any of the comments made previously. Also the PR could probably do with a new test that replicates this issue to ensure that we don't get regressions in the future it this gets merged.

pd93 · 2024-09-14T00:14:00Z

task_test.go

+func (sb *SyncBuffer) String() string {
+	sb.mu.Lock()
+	defer sb.mu.Unlock()
+	return sb.buf.String()
+}
+
+func (sb *SyncBuffer) Reset() {
+	sb.mu.Lock()
+	defer sb.mu.Unlock()
+	sb.buf.Reset()
+}
+


I don't think these are necessary. You can just call buff.buf.Reset() like you have done elsewhere. The only operation that needs a mutex is Write().

pd93 · 2024-09-14T00:23:41Z

taskfile/ast/task.go

@@ -219,7 +221,7 @@ func (t *Task) DeepCopy() *Task {
 		Platforms:            deepcopy.Slice(t.Platforms),
 		Location:             t.Location.DeepCopy(),
 		Requires:             t.Requires.DeepCopy(),
-		Namespace:            t.Namespace,
+		Namespace:            append([]string{}, t.Namespace...),


This change isn't necessary.

Suggested change

Namespace: append([]string{}, t.Namespace...),

Namespace: t.Namespace,

But this is not an array, which would mean that a deep copy would need to deep copy the array too right?

Ah yes sorry, you are right. We have deepcopy.Slice() for this.

Suggested change

Namespace: append([]string{}, t.Namespace...),

Namespace: deepcopy.Slice(t.Namespace),

libre-man · 2024-09-14T09:56:38Z

The main issue is that run: once as I see it is used to say "I only want this test to be run once", which it currently doesn't do. I would even say that for run: when_changed the current behaviour of saying that different calling paths are different tasks don't make a lot of sense, however I might be wrong of course. The task, in my opinion, should care where it is called from, only which arguments it gets.

My use case is this: I have a file Taskfile.setup.yml that is imported by many other taskfiles. It is responsible for setting up stuff like Python, pnpm, smithy. Making all those scripts safe to be run concurrently is not really feasible, I would need to use file system locks and it would get way too complicated very quickly. With the current implementation there is no way around this, and it kinda breaks my encapsulation that every task is responsible for setting up their own dependencies.

If the change gets the OK I'll polish the PR with tests too, I'm waiting to do that until the OK so I don't waste any time on that if it won't get merged anyway.

One thing about your reproduction: it also runs the task twice if you would do task --parallel -d ./tmp/1804 abc ac.

erictooth · 2024-10-10T16:57:13Z

Just to add to the discussion, I'm running into what sounds like a similar (the same?) situation where some taskfiles get included by multiple other taskfiles, and despite them having run: once, the same task (pkg_a:build) runs multiple times in parallel:

pkg_b:pkg_a:build
pkg_c:pkg_b:pkg_a:build
pkg_d:pkg_c:pkg_b:pkg_a:build

The only workaround I've been able to think of is renaming the task pkg_a_build and then including the pkg_a taskfile with flatten: true. This causes the task to be properly deduplicated across the chained includes and then it runs only once.

Fix bug for running tasks twice

5163578

Fix race conditions in test file

3ecbc01

pd93 reviewed Sep 14, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bug for running tasks twice #1804

Fix bug for running tasks twice #1804

libre-man commented Sep 12, 2024

libre-man commented Sep 12, 2024 •

edited

Loading

pd93 commented Sep 13, 2024

libre-man commented Sep 13, 2024

pd93 commented Sep 14, 2024

pd93 Sep 14, 2024

pd93 Sep 14, 2024

libre-man Sep 14, 2024

pd93 Sep 16, 2024

libre-man commented Sep 14, 2024 •

edited

Loading

erictooth commented Oct 10, 2024

	Namespace: append([]string{}, t.Namespace...),
	Namespace: t.Namespace,

	Namespace: append([]string{}, t.Namespace...),
	Namespace: deepcopy.Slice(t.Namespace),

Fix bug for running tasks twice #1804

Are you sure you want to change the base?

Fix bug for running tasks twice #1804

Conversation

libre-man commented Sep 12, 2024

libre-man commented Sep 12, 2024 • edited Loading

pd93 commented Sep 13, 2024

libre-man commented Sep 13, 2024

pd93 commented Sep 14, 2024

pd93 Sep 14, 2024

Choose a reason for hiding this comment

pd93 Sep 14, 2024

Choose a reason for hiding this comment

libre-man Sep 14, 2024

Choose a reason for hiding this comment

pd93 Sep 16, 2024

Choose a reason for hiding this comment

libre-man commented Sep 14, 2024 • edited Loading

erictooth commented Oct 10, 2024

libre-man commented Sep 12, 2024 •

edited

Loading

libre-man commented Sep 14, 2024 •

edited

Loading