Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider refactoring Bats to avoid pipelines, subshells #79

Closed
mbland opened this issue Jan 1, 2017 · 1 comment
Closed

Consider refactoring Bats to avoid pipelines, subshells #79

mbland opened this issue Jan 1, 2017 · 1 comment
Assignees
Milestone

Comments

@mbland
Copy link
Owner

mbland commented Jan 1, 2017

As mentioned in #76, fork() isn't native to Windows and requires a expensive hack to emulate. While the go-script-bash framework itself avoids subshells and pipelines pretty thoroughly and itself runs fairly quickly everywhere, Bats is written in more traditional Bash that doesn't shy away from either. This makes the current suite of tests on Windows take on the order of 30min, when they take on the order of five or less on Linux and macOS systems. This is consistent across Git for Windows, Cygwin, MSYS2 (which is the basis for Git for Windows), and even the Windows Subsystem for Linux now included in Windows 10.

Also, knowing from experience that folks often won't write tests if they're slow to run—especially if it's too slow to experiment effectively and achieve the flow state necessary for deep learning—the inherent slowness of the Bats framework may turn off devs on Windows in particular. Since I'm hoping go-script-bash will prove not only useful for writing ./go scripts and other Bash apps, but also for encouraging thorough and effective Bats testing as well (with the lib/bats library helping out), the slowness may become a significant obstacle to this goal, especially for Windows users.

And there's a huge opportunity here, given the prevalence of Git for Windows and the Windows Subsystem for Linux! See also:

Consequently, I'm considering forking sstephenson/bats and seeing if there are any significant performance wins to be had without substantially complicating the code. On top of being a potentially huge positive factor for Windows, it will likely reap benefits for the other platforms as well.

@mbland mbland self-assigned this Jan 1, 2017
mbland added a commit that referenced this issue Jan 1, 2017
While starting to work on #59, and thinking about #79, I thought about
whether I could dispense with the `fold` command altogether and whether
it would have an impact on Windows performance in particular.

The upshot is, the difference isn't dramatic, but since there's
definitely a big performance hit and I'd prefer depending on as few
external commands and subshells in general, I'm going to go ahead and
commit the change. The nice feature is that the output will always be
folded to the terminal width, since it no longer depends on the
presence of `fold`.

After implementing the folding algorithm first on macOS 10.12.2, I ran the
following command (with `/bin` for Bash 3.2, and `/usr/local/bin` for Bash
4.4.5) to produce a timing comparison, since `./go help builtins` produces a
fair amount of text printed with `@go.printf`:

  $ time PATH="/bin:$PATH" bash ./go help builtins

The running times didn't appear impacted at all when `COLUMNS == 161`. When
`COLUMNS == 40`, the new version was very very marginally slower, O(0.004s):

  Bash 3.2.57(1)-release, new implementation without `fold`
  real    0m0.062s
  user    0m0.046s
  sys     0m0.014s

  Bash 3.2.57(1)-release, new implementation without `fold`
  real    0m0.066s
  user    0m0.051s
  sys     0m0.013s

  Bash 4.4.5(1)-release, old implementation with `fold`
  real    0m0.058s
  user    0m0.041s
  sys     0m0.013s

  Bash 4.4.5(1)-release, new implementation without `fold`
  real    0m0.062s
  user    0m0.045s
  sys     0m0.013s

On Windows 10, the results weren't super dramatic, but were generally an
improvement for most Bash environments:

  Bash 4.3.46(2)-release, Git Bash from Command Prompt,
  old implementation with `fold`, COLUMNS=161
  real    0m0.156s
  user    0m0.030s
  sys     0m0.106s

  Bash 4.3.46(2)-release, Git Bash from Command Prompt,
  new implementation with `fold`, COLUMNS=161
  real    0m0.156s
  user    0m0.030s
  sys     0m0.075s

  Bash 4.3.46(2)-release, Git Bash from Command Prompt,
  old implementation with `fold`, COLUMNS=40
  real    0m0.236s
  user    0m0.045s
  sys     0m0.090s

  Bash 4.3.46(2)-release, Git Bash from Command Prompt,
  new implementation with `fold`, COLUMNS=40
  real    0m0.231s
  user    0m0.046s
  sys     0m0.153s

  Bash 4.3.46(2)-release, Git Bash,
  old implementation with `fold`, COLUMNS=161
  real    0m0.148s
  user    0m0.030s
  sys     0m0.075s

  Bash 4.3.46(2)-release, Git Bash,
  new implementation with `fold`, COLUMNS=161
  real    0m0.164s
  user    0m0.046s
  sys     0m0.107s

  Bash 4.3.46(2)-release, Git Bash,
  old implementation with `fold`, COLUMNS=40
  real    0m0.202s
  user    0m0.030s
  sys     0m0.137s

  Bash 4.3.46(2)-release, Git Bash,
  new implementation with `fold`, COLUMNS=40
  real    0m0.194s
  user    0m0.045s
  sys     0m0.091s

  Bash 4.3.48(8)-release, Cygwin Bash,
  old implementation with `fold`, COLUMNS=161
  real    0m0.135s
  user    0m0.030s
  sys     0m0.030s

  Bash 4.3.48(8)-release, Cygwin Bash,
  new implementation without `fold`, COLUMNS=161
  real    0m0.123s
  user    0m0.060s
  sys     0m0.015s

  Bash 4.3.48(8)-release, Cygwin Bash,
  old implementation with `fold`, COLUMNS=40
  real    0m0.189s
  user    0m0.076s
  sys     0m0.091s

  Bash 4.3.48(8)-release, Cygwin Bash,
  new implementation without `fold`, COLUMNS=40
  real    0m0.181s
  user    0m0.076s
  sys     0m0.046s

  Bash 4.3.11(1)-release, Windows Subsystem for Linux,
  old implementation with `fold`, COLUMNS=161
  real    0m0.093s
  user    0m0.031s
  sys     0m0.063s

  Bash 4.3.11(1)-release, Windows Subsystem for Linux,
  new implementation without `fold`, COLUMNS=161
  real    0m0.078s
  user    0m0.016s
  sys     0m0.063s

  Bash 4.3.11(1)-release, Windows Subsystem for Linux,
  old implementation with `fold`, COLUMNS=40
  real    0m0.155s
  user    0m0.016s
  sys     0m0.125s

  Bash 4.3.11(1)-release, Windows Subsystem for Linux,
  new implementation without `fold`, COLUMNS=40
  real    0m0.141s
  user    0m0.016s
  sys     0m0.109s
@mbland mbland modified the milestone: v1.4.0 Jan 16, 2017
@mbland
Copy link
Owner Author

mbland commented Feb 16, 2017

This is more or less closed by #165; see that PR and sstephenson/bats#210 for the details. There's possibly a little more efficiency to be gained from eliminating subshells and pipelines during the startup phase (there is still a bit of a noticeable delay when starting to run the entire suite on Windows), but the performance gain isn't likely to be as profound as from these changes.

@mbland mbland closed this as completed Feb 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant