Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When multiple includes are used sometimes variables randomly get lost #1660

Open
luischre opened this issue May 16, 2024 · 7 comments
Open
Labels
state: needs discussion Issues that require discussion before pursuing.

Comments

@luischre
Copy link

  • Task version: 3.37.2
  • Operating system: Linux
  • Experiments enabled:

Pretty much the same issue that was described here, please find below a pretty similar reproduceable example.

Taskfile.dist.yml:

version: "3"

includes:
  a:
    taskfile: ./a/Taskfile.yml
  b:
    taskfile: ./b/Taskfile.yml
    vars:
      FIRSTITEM: "{{ .FIRSTITEM }}"
      SECONDITEM: "{{ .SECONDITEM }}"
      THIRDITEM: "{{ .THIRDITEM }}"

tasks:
  print-list:
    desc: 'Print the list'
    cmds:
      - for: { var: MYLIST }
        task: print:item
        vars: { MYITEM: ' {{ .ITEM }}' }
    vars:
      MYLIST: >
        {{ .FIRSTITEM }}
        {{ .SECONDITEM }}
        {{ .THIRDITEM }}
        {{ .FIRSTITEMEXTENDED }}
        {{ .SECONDITEMEXTENDED }}
        {{ .THIRDITEMEXTENDED }}

  print:item:
    desc: 'Print the item'
    label: "print:item-{{ .MYITEM }}"
    cmds:
      - echo "Current item {{ .MYITEM }}"
    var:
      MYITEM: '{{ .MYITEM }}'

./a/Taskfile.yml

version: "3"

vars:
  FIRSTITEM: "FIRST"
  SECONDITEM: "SECOND"
  THIRDITEM: "THIRD"

./b/Taskfile.yml

version: "3"

vars:
  FIRSTITEMEXTENDED: "{{ .FIRSTITEM }}EXTENDED"
  SECONDITEMEXTENDED: "{{ .SECONDITEM }}EXTENDED"
  THIRDITEMEXTENDED: "{{ .THIRDITEM }}EXTENDED"

testTask.sh

#!/bin/bash

for i in {1..1000}; do
    task print-list
done

Executing this with ./testTask.sh | grep -o FIRSTEXTENDED | wc -l returns 1000 for version 3.36.0 it only return 817 for version 3.37.2, also visible in the print output in the excerpt below:

task: [print:item- SECOND] echo "Current item  SECOND"
task: [print:item- THIRD] echo "Current item  THIRD"
task: [print:item- FIRSTEXTENDED] echo "Current item  FIRSTEXTENDED"
task: [print:item- SECONDEXTENDED] echo "Current item  SECONDEXTENDED"
task: [print:item- THIRDEXTENDED] echo "Current item  THIRDEXTENDED"
task: [print:item- FIRST] echo "Current item  FIRST"
task: [print:item- SECOND] echo "Current item  SECOND"
task: [print:item- THIRD] echo "Current item  THIRD"
task: [print:item- EXTENDED] echo "Current item  EXTENDED"
task: [print:item- EXTENDED] echo "Current item  EXTENDED"
task: [print:item- EXTENDED] echo "Current item  EXTENDED"
task: [print:item- FIRST] echo "Current item  FIRST"
task: [print:item- SECOND] echo "Current item  SECOND"
task: [print:item- THIRD] echo "Current item  THIRD"
task: [print:item- FIRSTEXTENDED] echo "Current item  FIRSTEXTENDED"
task: [print:item- SECONDEXTENDED] echo "Current item  SECONDEXTENDED"
task: [print:item- THIRDEXTENDED] echo "Current item  THIRDEXTENDED"
@task-bot task-bot added the state: needs triage Waiting to be triaged by a maintainer. label May 16, 2024
@pd93
Copy link
Member

pd93 commented May 16, 2024

This is an interesting issue... I'll be perfectly honest - I didn't even know it was possible to define variables in an included Taskfile, use them in a sibling Taskfile and then use the result in the parent Taskfile. I'm curious what your real-world use case is for this as it's the first time I'm seeing it.

This is not quite the same issue as #1646 (which was 100% a bug), though the behaviour looks similar. A bit of background as to why this is happening... Since 3.37.0, when you run Task, all your Taskfiles are read into a DAG. This DAG is then traversed, and all nodes are merged down into a single ast.Taskfile structure. The problem is that the order of traversal is non-deterministic and synchronous. i.e. in your example, sometimes a is evaluated before b and sometimes b is evaluated before a (The latter causes your problem).

Sidenote: while there are plans to remove AST merging, this won't solve this issue as traversal order is still non-deterministic.

To solve this is not particularly challenging. We simply need to store the order in which includes were defined in the graph and use a StableTopologicalSort instead of a TopologicalSort in our Merge function. However, we would also have to remove the goroutines so that files are parsed one-at-a-time. This will remove some of the performance benefits added by the DAG which is unfortunate.

My question for @andreynering is this: Do we want to support this kind of behaviour? It seems odd to me that variables inside Taskfiles aren't scoped. Most developers would expect variables inside an inner scope to not be available in the outer scope. Task doesn't seem to respect this. The reason I asked @luischre for their use case is because I want to know if it serves a genuine purpose, or if this is just as easily solved by moving the variables into the parent Taskfile where they will be evaluated in the expected order.

cc @vmaerten

@pd93 pd93 added state: needs discussion Issues that require discussion before pursuing. and removed state: needs triage Waiting to be triaged by a maintainer. labels May 16, 2024
@andreynering
Copy link
Member

Users should avoid having conflicting variable names for now, and we should attempt to make it Taskfile scoped in the future.

I'm not 100% sure how things worked before DAG. It's possible that we made it scoped already, but I think they weren't.

@vmaerten
Copy link
Member

I agree with you @andreynering but scoped variables would be a breaking changes right ?

I think some people (a lot ?) use the behavior that variable are not scoped, espacially top from the bottom.
For example, a var declared in the root Taskfile, used in the included Taskfile

Maybe we should write it in the doc, something like to avoid unwanted behavior, try to avoid having conflicting variable names

@luischre
Copy link
Author

Thanks for the quick reactions!

Our use case is as follows: We have the top level taskfile and we have environment specific adjustments in the included sub level taskfiles. While taskfile a is rather generic, taskfile b is environment specific and relies on certain values from taskfile a.

We did so to keep the top level taskfile as structured and readable as possible.

With regards to:

Maybe we should write it in the doc, something like to avoid unwanted behavior, try to avoid having conflicting variable names

This is not really the problem in my case, is it? While scoping is not disabled, it's rather something like avoid dependencies between included files I would say.

And with regards to the proposal of scoping them in the future:

Users should avoid having conflicting variable names for now, and we should attempt to make it Taskfile scoped in the future.

How would we keep our taskfiles as structured as possible? Is there any way to get return values from included files to use them later on?

@pd93
Copy link
Member

pd93 commented May 20, 2024

I'm not 100% sure how things worked before DAG. It's possible that we made it scoped already, but I think they weren't.

@andreynering Taskfile variables were not scoped previously. It seems like variables could be used anywhere as long as they are defined before they are read according to the order of includes and vars. Since the DAG changes made includes unordered, this is breaking.

scoped variables would be a breaking changes right?

@vmaerten Yes, I believe so. A little bit of me was hoping this was not a common use case (I suspect it is not since there has only been 1 report so far), but regardless, we shouldn't break things for our users.

I think some people (a lot ?) use the behavior that variable are not scoped, espacially top from the bottom.
For example, a var declared in the root Taskfile, used in the included Taskfile

Using a variable defined in a parent Taskfile in a child Taskfile, doesn't violate scoping rules in the same way and is not affected here as parent files are always parsed before child files. The DAG changes do not affect this behaviour.

The breaking change is only that child Taskfiles are no longer parsed in the order in which they are defined which means that variables in one child that reference variables defined in a sibling might not work as intended.

Maybe we should write it in the doc, something like to avoid unwanted behavior, try to avoid having conflicting variable names

I think, for now, we need to fix this breaking change and we can have a discussion about a long-term fix somewhere else (probably #1030).

Our use case is as follows: We have the top level taskfile and we have environment specific adjustments in the included sub level taskfiles. While taskfile a is rather generic, taskfile b is environment specific and relies on certain values from taskfile a.

We did so to keep the top level taskfile as structured and readable as possible.

@luischre Thanks for the info, it's helpful to understand context when discussing these things as users often have a different perspective to us.

I suspect there is probably another way to go about solving your problem, though its difficult to say with confidence without seeing your setup. Certainly, in the long-term, Task is likely to adopt scoped variables in one form or another and we would need to accommodate users like yourself with other solutions.

@trulede
Copy link

trulede commented May 25, 2024

@luischre you may find dotenv a better way to manage your variables. It would seem that you could control which .ENV files are loaded with the assistance of variables (the example shows ENV however I suspet VARS would work too) and therefore get a very concise main Taskfile as well as very structured variable mapping.

https://taskfile.dev/usage/#env-files

The handling of Taskfile vars seems a little bit flakey (to me). I'm somewhat surprised that an imported Taskfile can modify variables in the importing Taskfile (not desired either BTW). Perhaps I would expect them to be scoped in the same way that tasks are? At least in that way, multiple Taskfiles could directly import a particular Taskfile and access the vars of that Taskfile. For instance:

includes:
  a: ./a # has vars FOO: "FUBAR"

vars:
  FOO: "BAR"

tasks:
  one:
    cmds:
      - echo {{.a:FOO}}  # FUBAR
      - echo {{.FOO}}  # BAR

@ex-nihil
Copy link

ex-nihil commented Jan 2, 2025

It seems like variables could be used anywhere as long as they are defined before they are read according to the order of includes and vars. Since the DAG changes made includes unordered, this is breaking.

While it seems dependant on order, it doesn't seem consistent to the ordering.


root

env:
  TEST: 'parent'

includes: 
  child1: ./child1
  child2: ./child2

child1

env:
  TEST: 'child1'

tasks:
    run:
    cmds:
      - echo "$TEST"

child2

env:
  TEST: 'child2'

tasks:
    run:
    cmds:
      - echo "$TEST"

When I run the any of the tasks from the root directory, it's random which of the child env's are used (using version 3.40.1)

❯ task child1:run
child1
❯ task child1:run
child2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
state: needs discussion Issues that require discussion before pursuing.
Projects
None yet
Development

No branches or pull requests

7 participants