Skip to content

Latest commit

 

History

History
536 lines (395 loc) · 35.4 KB

04-HistoryVisualization.adoc

File metadata and controls

536 lines (395 loc) · 35.4 KB

History Visualization

In this chapter you’ll learn about visualizing the history of a Git repository in varying formats by learning the following topics:

  • How to filter git log output commits by various criteria

  • How to format git log output to display the information you care about

  • How to find why and when a line in a file was changed, and who changed it using git blame

  • How to identify which commit caused a particular bug using git bisect

When working with a Git repository on large, long-running software projects you’ll sometimes want to dig through the history to identify old versions of code, work out why and by whom changes were made, analyze the changes to identify why a bug is occurring, for example. You can do this to a limited extent using the commands you’ve already learned (git log and git diff) and extend this with two more we’ll cover in this chapter: git blame and git bisect.

Let’s start by learning how to optimize our use of git log to only list only particular commits.

List only certain commits

Sometimes when examining history, you’ll want to filter the commits that are displayed based on some of their metadata. Perhaps you’re tracking down a commit that you can remember was made on a rough date, by a particular person or with a particular word in its commit message. You could do this manually, but sometimes there are too many commits in the history to scan through the git log or gitx output in a timely fashion. For these cases, the git log command has various flags and arguments that can be used to filter which commits are shown in its output.

Let’s start by trying to find a commit by author, date, and commit message simultaneously.

Problem

You want to list the commits authored by Mike McQuaid, after November 10, 2013, with the string "file." in their message.

Solution

  1. Change to the directory containing your repository; for example cd /Users/mike/GitInPracticeRedux/.

  2. Run git log --author "Mike McQuaid" --after "Nov 10 2013" --grep 'file\.' and, if necessary, q to exit. The output should resemble the following:

Filtered log output
# git log --author "Mike McQuaid" --after "Nov 10 2013" --grep 'file\.'

commit 06b5eb58b62ba8bbbeee258705c636ca7ac20b49 (1)
Author: Mike McQuaid <[email protected]>
Date:   Thu Nov 28 15:49:30 2013 +0000

    Remove unfavourable review file.

commit fcd8f6e957a03061cdf411851fe38034a44c97ab
Author: Mike McQuaid <[email protected]>
Date:   Thu Nov 28 15:48:24 2013 +0000

    Add first review temporary file.

commit c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e (2)
Author: Mike McQuaid <[email protected]>
Date:   Thu Nov 28 15:39:38 2013 +0000

    Rename book file to first part file.
  1. most recent commit

  2. least recent commit

The filtered log output is the essentially the same as the git log output you saw in 01-LocalGit.adoc, but with only the commits that are matched by the specified arguments.

From the arguments provided to the log command:

  • --author specifies a regular expression that matches the contents of the author. In the previous case, it was searching for the author string Mike McQuaid <[email protected]> and found a match as the string started with the requested 'Mike McQuaid'.

  • --after (or --since) specifies that the only commits shown should be those that were made after the specified date. These dates can be in any format that Git recognizes such as Nov 10 2013, 2014-01-30, midnight or yesterday.

  • --grep specifies a regular expression that matches the contents of the commit message. file\. was used rather than file. to escape the . character.

Note
What are regular expressions?
Regular expressions are search patterns that are typically used to match these patterns inside of strings. There are multiple variants of regular expressions, and Git uses the POSIX regular expression format. Git uses them for various filtering operations such as filtering by commit message or author, as we saw previously. Characters either have literal meanings, such as a, which will always match a lowercase 'a' character, or they can have special meanings, such as ., which will match any character. For example, file. will match the strings files or filed. To turn a character with a special meaning into one with a literal meaning, you can escape them with a backslash--file\. will match the string file. but not the string files. I’m not going to cover regular expressions in detail because they’re beyond the scope of this book. Don’t worry if you don’t fully understand them; what you’ve seen here should be more than enough to work with Git. If you wish to learn more about them, I recommend the Wikipedia Regular Expressions page: http://en.wikipedia.org/wiki/Regular_expression.

You’ve shown a subset of commits filtered by author, date and commit message.

Discussion

git log can take:

  • A --max-count (or -n) argument to limit the number of commits that are shown in the log output. I tend to use this often when I know that I only care about something in, say, the last 10 commits and don’t want to scroll through more output than that.

  • A --reverse argument to show the commits in ascending chronological order (oldest commit first)

  • A --before (or --until) argument, which will only show commits before the given date. This is the reverse of --after.

  • A --merges flag (or --min-parents=2), which will only show merge commits--commits that have at least two parents. If you adopt a branch-heavy workflow with Git then this will be useful in identifying which branches were merged and when.

git show

git show is a command similar to git log, but shows a single commit. It also defaults to showing what was changed in that commit. Remember from 01-LocalGit.adoc that git log has a --patch (or -p) argument to show what was changed by each commit in its output.

Show commit output
# git show HEAD^

commit 06b5eb58b62ba8bbbeee258705c636ca7ac20b49 (1)
Author: Mike McQuaid <[email protected]>
Date:   Thu Nov 28 15:49:30 2013 +0000

    Remove unfavourable review file.

diff --git a/GitInPracticeReviews.tmp b/GitInPracticeReviews.tmp (2)
deleted file mode 100644
index ebb69c3..0000000
--- a/GitInPracticeReviews.tmp
+++ /dev/null
@@ -1 +0,0 @@
-Git Sandwich
  1. commit information

  2. commit diff

From the show commit output:

  • "commit information (1)" shows all the same information expected in git log output, but only ever shows a single commit.

  • "commit diff (2)" shows the changes that were made in that commit. It’s the equivalent of typing git diff HEAD^..HEAD--the difference between the previous commit and the one before it.

The git show HEAD^ output is equivalent to git log --max-count=1 --patch HEAD^.

List commits with different formatting

The default git log output format is helpful, but takes a minimum of six lines of output to display each commit. It displays the commit SHA-1, author name and email, commit date, and the full commit message (each additional line of which adds a line to the git log output). Sometimes you’ll want to display more information and sometimes you’ll want to display less. You may even have a personal preference on how the output is presented that doesn’t match how it currently is.

Thankfully git log has some powerful formatting features with varied, sensible supplied options and the ability to completely customize the output to meet your needs.

Note
Why are commits structured like emails?
Remember in 01-LocalGit.adoc I mentioned that commits are structured like emails? This is because Git was initially created for use by the Linux kernel project, which has a high-traffic mailing list. People frequently send commits (known as "patches") to the mailing list. Previously there was an implicit format that people used to turn a requested change into an email for the mailing list, but Git can convert commits to and from an email format to facilitate this. Commands such as git format-patch, git send-mail, and git am (an abbreviation for "apply mailbox") can work directly with email files to convert them to/from Git commits. This is particularly useful for open source projects where everyone can access the Git repository but fewer people have write access to it. In this case, someone could send me an email that contains all the metadata of a commit using one of these commands. Nowadays typically this will be done with a GitHub pull request instead (which we’ll cover in chapter 11).

Let’s display some commits in an email-style format.

Problem

You want to list the last two commits in an email format with the eldest displayed first.

Solution

  1. Change to the directory containing your repository; for example, cd /Users/mike/GitInPracticeRedux/.

  2. Run git log --format=email --reverse --max-count 2' and, if necessary, q to exit. The output should resemble the following:

Email formatted log output
# git log --format=email --reverse --max-count 2

From 06b5eb58b62ba8bbbeee258705c636ca7ac20b49 Mon Sep 17 00:00:00 2001 (1)
From: Mike McQuaid <[email protected]> (2)
Date: Thu, 28 Nov 2013 15:49:30 +0000 (3)
Subject: [PATCH] Remove unfavourable review file. (4)


From 36640a59af951a26e0793f8eb0f4cc8e4c030167 Mon Sep 17 00:00:00 2001
From: Mike McQuaid <[email protected]>
Date: Thu, 28 Nov 2013 15:57:43 +0000
Subject: [PATCH] Ignore .tmp files.
  1. unix mailbox date

  2. commit author

  3. commit date

  4. commit subject

From the email formatted log output:

  • "unix mailbox date (1)" can be safely ignored. The first part is the SHA-1 hash for the commit. The log output is generated in the Unix mbox (short for "mailbox") format. The second, date part is not affected by the commit date or contents but is a special value used to indicate that this was outputted from Git rather than taken from real Unix mbox.

  • "commit author (2)" is the author of the commit. This is one of the reasons why Git stores a name and email address for authors and in commits; it eases the transition to email format. A commit can be seen as an email sent by the author of the commit requesting a change be made.

  • "commit date (3)" is the date on which the commit was made. This also sets the date for the email in its headers.

  • "commit subject (4)" is the first line of the commit message prefixed with "[PATCH]". This is another reason to structure your commit messages like emails (as mentioned in 01-LocalGit.adoc).

If there’s more than one line in a commit message then the other lines will be shown as the message body. Remember if you use the --patch (or -p) argument then git log output will also include the changes made in the commit. With this argument provided, each outputted git log entry will contain the commit and all the metadata necessary to convert it to or from an email.

Discussion

If you specify the --patch (or -p) flag to git log then you can also format the diff output by specifying flags for git diff too. Recall word diffs from 01-LocalGit.adoc. git log --patch --word-diff will show the word diff (rather than unified diff) for each log entry.

git log can take a --date flag, which takes various parameters to display the output dates in different formats. For example, --date=relative displays all dates relative to the current date--6 weeks ago and --date=short displays only the date, such as 2013-11-28. Also iso (or iso8601), rfc (or rfc2822), raw, local, and default formats are available but I won’t detail them all in this book.

The --format (or --pretty) flag can take various parameters such as email that we’ve seen in this example, medium which is the default if no format was specified, or oneline, short, full, fuller, or raw. I won’t show every format in this book but please compare and contrast them on your local machine. Different formats are better used in different situations depending on how much of their displayed information you care about at that time.

You may have noticed the "full" output contains details about an author and a committer and the "fuller" output additionally contains details of the author date and commit date.

Fuller log snippet
# git log --format=fuller

commit 334181a038e812050051776b69f0a80187abbeed
Author:     BrewTestBot <[email protected]>
AuthorDate: Thu Jan 9 23:48:16 2014 +0000
Commit:     Mike McQuaid <[email protected]>
CommitDate: Fri Jan 10 08:19:50 2014 +0000

    rust: add 0.9 bottle.

...

This snippet shows a single commit from Homebrew, an open-source project accessible at https://github.com/Homebrew/homebrew. This was used because in the GitInPracticeRedux repository all the previous commits will have the same author and committer, author date, and commit date.

Note
Why do commits have an author and committer?
This fuller commit output shows that for a commit, there are two recorded actions: the original author of the commit and the committer (the person who added this commit to the repository). These two attributes are both set at git commit time. If they’re both set at once then why are they separate values? Remember that we’ve seen repeatedly that commits are like emails, and can be formatted as emails and sent to others. If I have a public repository on GitHub then other users can clone my repository but can’t commit to it.

In these cases they may send me commits through a pull request (which will be discussed later in 10-GitHubPullRequests.adoc) or by email. If I want include these in my repository, the separation between committing and authoring means I can then include these commits, and Git stores the person who, for example, made the code changes and the person who added these changes to the repository (hopefully after reviewing them). This means you can keep the original attribution for the person who did the work but still record the person who added the commit to the repository and (hopefully) reviewed it. This is particularly useful in open source software; with other tools such as Subversion, if you don’t have commit access to a repository, the best attribution you could hope for would be something like "Thanks to Mike McQuaid for this commit!" in the commit message.

In Subversion the equivalent git blame command is svn blame. It also has an alias called svn praise. In Git there’s no such alias by default (but 07-PersonalizingGit.adoc will later show you how to create one yourself). I’m sure there’s a joke to be made about the fact that Subversion offers praise and blame equally but Git offers only blame!

Custom output format

If none of the git log output formats meets your needs, you can create your own custom formats using a format string. The format string uses placeholders to fill in various attributes per commit.

Let’s try and create a more prose-like format for git log:

Custom prose log format
# git log --format="%ar %an did: %s"

6 weeks ago Mike McQuaid did: Ignore .tmp files.
6 weeks ago Mike McQuaid did: Remove unfavourable review file.
6 weeks ago Mike McQuaid did: Add first review temporary file.
6 weeks ago Mike McQuaid did: Rename book file to first part file.
9 weeks ago Mike McQuaid did: Start Chapter 2.
3 months ago Mike McQuaid did: Joke rejected by editor!
3 months ago Mike McQuaid did: Improve joke comic timing.
3 months ago Mike McQuaid did: Add opening joke. Funny?
3 months ago Mike McQuaid did: Initial commit of book.

Here we’ve specified the format string with %ar %an did: %s. In this format string:

  • %ar is the relative format date on which the commit was authored.

  • %an is the name of the author of the commit.

  • did: is text that’s displayed the same in every commit and isn’t a placeholder.

  • %s is the commit message subject (the first line).

You can see the complete list of these placeholders in git log --help. There are too many for me to detail them all in this book. The large number of placeholders should mean that you can customize git log output into almost any format.

Release logs: git shortlog

git shortlog shows the output of git log in a format that’s typically used for open source software release announcements. It displays commits grouped by author with one commit subject per line.

Short log output
# git shortlog HEAD~6..HEAD

Mike McQuaid (9):  (1)
      Joke rejected by editor! (2)
      Start Chapter 2.
      Rename book file to first part file.
      Add first review temporary file.
      Remove unfavourable review file.
      Ignore .tmp files.
  1. commit author

  2. commit message

From the short log output:

  • "commit author (1)" shows the name of the author of the following commits and how many commits they’ve made.

  • "commit subject (2)" shows the first line of the commit message.

The commit range (HEAD~6..HEAD) is optional but typically you’d want to use one to create a software release announcement for any version after the first.

The ultimate log output

As mentioned previously often the git log output is too verbose or doesn’t display all the information you wish to query in a compact format. It’s also not obvious from the output how local or remote branches relate to the output.

I have a selection of format options I refer to as my "ultimate log output". Let’s look at the output with these options:

Graph log output
# git log --oneline --graph --decorate

* 36640a5 (HEAD, origin/master, origin/HEAD, master) Ignore .tmp files.
* 06b5eb5 Remove unfavourable review file.
* fcd8f6e Add first review temporary file.
* c6eed66 Rename book file to first part file.
* ac14a50 Start Chapter 2.
* 07fc4c3 Joke rejected by editor!
* 85a5db1 Improve joke comic timing.
* 6b437c7 Add opening joke. Funny?
* 6576b68 Initial commit of book.

This output format displays each commit on a single line. The line begins with a branch graph indicator (which I will explain shortly), follows with the short SHA-1 (which is useful for quickly copying-and-pasting), the branches, tags (introduced in 05-AdvancedBranching.adoc), HEAD that points to this commit in parentheses and ends with the commit subject.

As you may have noticed, this format is quite similar to that of the first two columns of GitX:

04 GitXGraph
Figure 1. GitX graph output

The GitInPracticeRedux repository doesn’t currently have any merge commits. Let’s see what the graph log output looks like with some of them.

Graph log merge commit snippet
# git log --oneline --graph --decorate

*   129cce6 (origin/master, origin/HEAD, master) Merge branch 'testing'
|\
| * a86067a (origin/testing, testing) testing branch commit
* | 1a36bd6 master branch commit

...

Here you can see the branch graph indicator becoming more useful. Like the graphical tools we’ve seen in 01-LocalGit.adoc, this displays branch merges and the commits on different branches using ASCII symbols to draw lines. The * means a commit that was made. Each "line" follows a single branch. Reading from the bottom up, we can see from the preceding listing that a commit was made on the master branch, a commit was made on the testing branch, and then the testing branch was merged into master. Both testing and master branches remain (haven’t been deleted) and both have been pushed to their respective remote branches. All this from just three lines of ASCII output. Hopefully you can see why I love this presentation. As typing git log --oneline --graph --decorate is unwieldy, we’ll see later in 07-PersonalizingGit.adoc how to shorten this using an alias to something like git l.

Show who last changed each line of a file: git blame

I’m sure all developers have been in a situation where they’ve seen some line of code in a file and wonder why it is was written that way. As long as the file is stored in a Git repository, it’s easy to query who, when, and why (given a good commit message was used) a certain change is made.

You could do this by using git diff or git log --patch, but neither of these tools are optimized for this particular use case; they both usually require reading through a lot of information you aren’t interested in to find the information you are.

Instead let’s see how to use the command designed specifically for this use case: git blame.

Problem

You wish to show the commit, person, and date in which each line of GitInPractice.asciidoc was changed.

Solution

  1. Change to the directory containing your repository; for example, cd /Users/mike/GitInPracticeRedux/.

  2. Run git blame --date=short 01-IntroducingGitInPractice.asciidoc. The output should resemble the following:

Blame output
# git blame --date=short 01-IntroducingGitInPractice.asciidoc

^6576b68 GitInPractice.asciidoc (Mike McQuaid 2013-09-29 1)
 = Git In Practice
6b437c77 GitInPractice.asciidoc (Mike McQuaid 2013-09-29 2)
 == Chapter 1
07fc4c3c GitInPractice.asciidoc (Mike McQuaid 2013-10-11 3)
 // TODO: think of funny first line that editor will approve.
ac14a504 GitInPractice.asciidoc (Mike McQuaid 2013-11-09 4)
 == Chapter 2
ac14a504 GitInPractice.asciidoc (Mike McQuaid 2013-11-09 5)
 // TODO: write two chapters

Firstly, note that the output shows GitInPractice.asciidoc rather than 01-IntroducingGitInPractice.asciidoc. This is because the filename has been changed since these changes were made. git blame is only showing changes to lines in the file and ignoring that the file was renamed. This is useful, as it means you don’t lose all blame data whenever you rename a file.

From the blame output:

  • --date=short is used to display only the date (not the time). This accepts the same formats as the --date flag for git log. This was used in the preceding listing to make it more readable, as git blame lines tend to be very long.

  • The ^ (caret) prefix on the first line indicates that this line was inserted in the initial commit.

  • Each line contains the short SHA-1, filename (if the line was changed when the file had a different name), parenthesized name, date, line number, and finally the line contents itself. For example, in commit 6b437c77 on September 29, 2013, Mike McQuaid added the == Chapter 1 line to GitInPractice.asciidoc (although the file is now named 01-IntroducingGitInPractice.asciidoc).

You’ve shown who changed each line of a file, in which commit, and when the commit was made.

Discussion

git blame has a --show-email (or -e) flag that can show the email address of the author instead of the name.

You can use the -w flag to ignore whitespace changes when finding where the line changes came from. Sometimes people will fix stuff like indentation or whitespace on a line, which makes no functional difference to the code in most programming languages. In these cases, you want to ignore whitespace changes so you can look at the changes that actually affect program behavior.

The -s flag hide the author name and date from in the output (and takes precedence over --show-email/-e). This can be useful for displaying a more concise output format and instead looking up this information by passing the SHA-1 to git show at a later point.

If the -L flag is specified and followed with a line range, for example -L 40,60, then only the lines in that range will be shown. This can be useful if you know already what subset of the file you care about and don’t want to have to search through it again in the git blame output.

Find which commit caused a particular bug: git bisect

The only thing worse than finding a bug in software and having to fix it is having to fix the same bug multiple times. A bug that was found, fixed, and has appeared again is typically known as a regression.

The traditional workflow for finding regressions is fairly painful. You typically will keep checking out older and older revisions in the version control history until you find a commit in which the bug wasn’t present, check out newer and newer revisions until you find where it happens again, and repeat the process to narrow it down. It’s typically a tedious exercise, which is made worse by your having to fix the same problem again.

Thankfully Git has a useful tool that makes this process much easier for you: git bisect. This uses a binary search algorithm to identify the problematic commit as quickly as possible; effectively automated the search backwards and forwards through history that I explained earlier.

The git bisect command takes good and bad arguments that you use to tell it that a particular commit didn’t have the bug (good) or did have the bug (bad). As it assumes that the bug does disappear and reappear multiple times but occurred once, it can make the assumption that the commit that caused a particular bug is first one chronologically that contains that bug. It uses this assumption, records the good and bad commits, and uses this to narrow down the commits each time. For example, if it was bisecting between commits from Monday (good) to Friday (bad), if a commit on Wednesday was known to be good then it can narrow the search down to Monday, Tuesday, or Wednesday. This halving of the search space each time is known as a binary search because it makes a binary decision each time: was the bad commit before or after this one.

For a simple example let’s try to find out which commit renamed a particular file (without manually looking through the history).

Problem

You wish to locate the commit that renamed GitInPractice.asciidoc to 01-IntroducingGitInPractice.asciidoc.

Solution

  1. Change to the directory containing your repository; for example, cd /Users/mike/GitInPracticeRedux/.

  2. Run git bisect start. There will be no output.

  3. Run git bisect bad. There will be no output.

  4. Run git bisect good 6576b6 where 6576b6 is the SHA-1 of any commit that you know was before the rename. The output should resemble First good bisect output.

  5. Check the names of the files in the directory by running ls .asciidoc.

  6. When the .asciidoc file is named GitInPractice.asciidoc, run git bisect good to indicate the file hasn’t been renamed yet. When the .asciidoc file is named 01-IntroducingGitInPractice.asciidoc run git bisect bad to indicate the file has been renamed. The output should be similar each time. No other parameters are required to git bisect good or git bisect bad; they will automatically check out the next revision to be checked when they’re run.

  7. Eventually the first bad commit will be found. The output should resemble Final bad bisect output.

  8. Run git bisect reset. The output should resemble Listing Bisect log output.

First good bisect output
# git bisect good

Bisecting: 3 revisions left to test after this (roughly 2 steps) (1)
[ac14a50465f37cfb038bdecd1293eb4c1d98a2ee] Start Chapter 2. (2)
  1. steps remaining

  2. new commit

From the good bisect output:

  • "steps remaining (1)" shows how many revisions remain untested and, using the binary search algorithm, roughly how many more git bisect invocations remain until you find the problematic commit.

  • "new commit (2)" shows the new commit SHA-1 that git bisect has checked out for examining whether this commit is "good" (the bug isn’t present) or "bad" (the bug is present).

Final bad bisect output
# git bisect bad

c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e is the first bad commit (1)
commit c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e (2)
Author: Mike McQuaid <[email protected]>
Date:   Thu Nov 28 15:39:38 2013 +0000

    Rename book file to first part file.

:000000 100644 0000000000000000000000000000000000000000
 5e02125ebbc8384e8217d4370251268e867f8f03 A
 01-IntroducingGitInPractice.asciidoc (3)
:100644 000000 5e02125ebbc8384e8217d4370251268e867f8f03
 0000000000000000000000000000000000000000 D (4)
 GitInPractice.asciidoc
  1. Bisect result

  2. Commit information

  3. New object metadata

  4. Old object metadata

From the final bisect output:

  • "Bisect result (1)" shows the commit that has been identified to cause the bug or, in this case, the rename. This matches the commit message here, so this is a slightly silly example, but typically this will allow you to then examine these changes and identify what in this commit caused the regression.

  • "Commit information (2)" shows the git show information for this commit.

  • "New object metadata (3)" shows the old and new file mode and SHA-1 for the new object (after renaming).

  • "Old object metadata (4)" shows the old and new file mode and SHA-1 for the old object (before renaming).

04 GitXBisect
Figure 2. GitX bisect output before git bisect reset

From GitX bisect output before git bisect reset, you can see that git bisect creates new, temporary refs (they’re removed by git bisect reset) as it is working. These indicate the commits that were marked by git bisect bad and git bisect good while working through the history. The refs/bisect/bad ref points to the final, bad commit that was detected.

You’ve located the commit that renamed GitInPractice.asciidoc.

Discussion

Each time git bisect good, git bisect bad, or git bisect reset is run, Git will check out the relevant next commit for examination. As a result, it’s important to ensure that all outstanding changes have been committed (or stashed) before you use git bisect.

Table 1. bisect binary search performance
Total commits Max checked commits

10

6

100

13

1000

19

As you can see from the table, as the number of commits increases, the max number of commits that need to be checked increases much more slowly. This algorithm means that you can quickly navigate through a huge numbers of commits with git bisect without too many steps.

If you wish to examine the steps that you followed in a git bisect operation, then you can run git bisect log:

Bisect log output
# git bisect log

git bisect start (1)
# bad: [36640a59af951a26e0793f8eb0f4cc8e4c030167] (2)
 Ignore .tmp files. (3)
git bisect bad 36640a59af951a26e0793f8eb0f4cc8e4c030167
# good: [6576b6803e947b29e7d3b4870477ae283409ba71]
 Initial commit of book.
git bisect good 6576b6803e947b29e7d3b4870477ae283409ba71
# good: [ac14a50465f37cfb038bdecd1293eb4c1d98a2ee]
 Start Chapter 2.
git bisect good ac14a50465f37cfb038bdecd1293eb4c1d98a2ee
# bad: [fcd8f6e957a03061cdf411851fe38034a44c97ab]
 Add first review temporary file.
git bisect bad fcd8f6e957a03061cdf411851fe38034a44c97ab
# bad: [c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e]
 Rename book file to first part file.
git bisect bad c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e
# first bad commit: [c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e] (4)
 Rename book file to first part file.
  1. Bisect command

  2. Commit SHA-1

  3. Commit subject

  4. Bisect result

From the bisect log output:

  • "Bisect command (1)" shows the git bisect command that you invoked at this step.

  • "Commit SHA-1 (2)" shows the status and SHA-1 of a commit.

  • "Commit subject (3)" shows the commit subject of a commit.

  • "Bisect result (4)" shows the final result of the whole bisect operation.

If you already know that bug has come from particular files or paths in the working tree then you can specify these to git bisect start. For example, if you knew that the changes that caused the regression were in the src/gui directory then you could run git bisect start src/gui. This means that only the commits that changed the contents of this directory will be checked, and this makes things even faster.

If it’s difficult or impossible to tell if a particular commit is good or bad, you can run git bisect skip, which will ignore it. Given there are enough other commits, git bisect will use another to narrow the search instead.

Automating git bisect

Although git bisect is already useful, wouldn’t it be even better if, rather than having to keep typing git bisect good or git bisect bad, it could run automatically and tell you which commit caused the regression? This is possible with git bisect run.

git bisect run is run instead of git bisect good or git bisect bad (after a git bisect start, git bisect good, git bisect bad, and before a git bisect reset) and automates the future runs of git bisect good and git bisect bad. It uses the exit code of a process to identify whether the command was successful. For example, if you run the command ls GitInPractice.asciidoc it returns an exit code of 0 on success (when the file is present) and 1 on failure (when the file is not). Let’s take advantage of this to use it with git bisect run:

Bisect run output
# git bisect start

# git bisect bad

# git bisect good 6576b6

Bisecting: 3 revisions left to test after this (roughly 2 steps)
[ac14a50465f37cfb038bdecd1293eb4c1d98a2ee] Start Chapter 2.

# git bisect run ls GitInPractice.asciidoc

Bisecting: 3 revisions left to test after this (roughly 2 steps) (1)
[ac14a50465f37cfb038bdecd1293eb4c1d98a2ee]
 Start Chapter 2.
running ls GitInPractice.asciidoc
GitInPractice.asciidoc
Bisecting: 1 revision left to test after this (roughly 1 step)
[fcd8f6e957a03061cdf411851fe38034a44c97ab]
 Add first review temporary file.
running ls GitInPractice.asciidoc
ls: GitInPractice.asciidoc: No such file or directory
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e]
 Rename book file to first part file.
running ls GitInPractice.asciidoc
ls: GitInPractice.asciidoc: No such file or directory
c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e is the first bad commit (3)
commit c6eed6681efc8d0bff908e6dbb7d887c4b3fab3e
Author: Mike McQuaid <[email protected]>
Date:   Thu Nov 28 15:39:38 2013 +0000

    Rename book file to first part file.

:000000 100644 0000000000000000000000000000000000000000
 5e02125ebbc8384e8217d4370251268e867f8f03 A
 01-IntroducingGitInPractice.asciidoc
:100644 000000 5e02125ebbc8384e8217d4370251268e867f8f03
 0000000000000000000000000000000000000000 D
 GitInPractice.asciidoc
bisect run success

The output is identical to the git bisect log output or the combined output of all the other git bisect operations. No human intervention is required in the preceding output; it just ran until it reached a result.

A typical case would be writing a unit test that reproduces a regression and using that with git bisect run to quickly test a large number of commits.

Note
How can I stop git bisect from overwriting my test?
As git bisect good and git bisect bad perform a git checkout each time, you need to make sure that the regression test isn’t overwritten by other files or committed after the earliest "bad" commit. The easiest way of doing this is to make a copy of the test in another directory outside the Git working directory, so git bisect run won’t change its contents as it checks out different commits.

Summary

In this chapter you hopefully learned:

  • How to filter git log output by author, date, commit message, and merge commits

  • How to display only a single commit or requested number of commits

  • How to display git log output in various formats

  • How to display commits in an open source release announcement format

  • How to display branching effectively with git log

  • How to show who changed each line of a file, when, why, and what was the original filename

  • How to use git bisect to search quickly (but manually) through the history with git bisect good and git bisect bad to identify regressions

  • How to use git bisect run to search automatically through the history to identify regressions with a test