Beyond simply implementing and testing code, a professional software engineer also documents and clearly explains the reasons behind each change. Like a form of journaling, this practice makes each commit a page in a book that explains how your architecture came into being. Documenting your changes takes a little effort, but you’ll save your team valuable debugging time when those commits help you reason through what you were doing six months to a year ago and beyond. Such documentation applies the mental model of Chesterton’s Fence: a team of any size can use second-order thinking about its Git history to make wise, forward-thinking next steps.
The goal of this article is to explain why commits like this are important, namely because they explain:
Who committed the changes.
What the changes are.
Why the changes are important.
All three should be clear in every Git commit, a practice which eventually becomes instinctual.
Before diving into the anatomy of a quality Git commit, it is worth explaining how the output of the above screenshot was generated. The command line for this is:
git show --stat \
         --pretty=format:"%C(yellow)%h%C(reset) %G? %C(bold blue)%an%C(reset) %s%C(bold cyan)%d%C(reset) %C(green)%cr.%C(reset) %n%n%b%n%N%-%n" \
         0ac0d05f87c5
I mention the above command first because it’s so useful. In fact, I alias git show as ghow so I can get a concise view of what I need on a frequent basis. If you’d like more details on the format used for git show or Git logging in general, take a look at the Git Log Pretty Formats, the Git Log Pretty screencast, or any of the Screencasts on this site that deal with showing/logging of Git commits.
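The article does not show how ghow is defined, so the following is only a sketch of one way to do it: a shell function (a hypothetical definition, not necessarily the author's) that wraps the command above and passes any arguments, such as a commit SHA, through to git show.

```shell
# Hypothetical definition of ghow; the format string is copied from the
# command shown earlier in this article.
ghow() {
  git show --stat \
    --pretty=format:"%C(yellow)%h%C(reset) %G? %C(bold blue)%an%C(reset) %s%C(bold cyan)%d%C(reset) %C(green)%cr.%C(reset) %n%n%b%n%N%-%n" \
    "$@"
}
```

With this in your shell profile, running ghow 0ac0d05f87c5 inside a repository would be equivalent to the longer command, and ghow with no arguments would show the latest commit.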
Returning to the above screenshot, let’s jump straight into what makes a quality commit by diving into all aspects of this commit.
While often ignored, the author of a commit is as vital as the subject and body. After all, the name reveals who committed the changes so we know who to speak to for support if we want to talk about architecture/implementation or get clarity on changes in general. Ideally, a commit’s documentation should provide all of the context necessary so no one has to track down the original author in the first place — assuming the engineer is still part of the team/company! Sometimes this documentation will be the only explanation you ever get.
In the screenshot above, we see my full name, both first and last. At a bare minimum, every author in your repository should have a first and last name with proper capitalization and punctuation. Seriously, in future years, you might not want to be known as the witty or silly handle with which you committed code.
In addition to your full name, a proper email address should be used as well. Although not shown in the screenshot above, the email address should consist of the author’s handle and company domain. Example:
Lastly, an avatar should be associated with the email address via a service like Gravatar or uploaded and associated with the account through whatever service is used to host the Git repository. For several reasons, the avatar should be an accurate representation of who the person is in real life. Sketches, cartoons, abstract art, etc. hinder engineers from identifying the individual when seeking help or wanting to collaborate on work. Real pictures make the commits more authentic too.
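The name and email guidance above can be applied with git config. This is a minimal sketch that uses a throwaway repository and a hypothetical author (the name is borrowed from the trailer example later in this article and the email address is made up), so substitute your own details.

```shell
# Work in a throwaway repository so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
git init --quiet "$repo"

# Hypothetical author details: full name plus a handle@company-domain email.
git -C "$repo" config user.name "Jill Smith"
git -C "$repo" config user.email "jsmith@example.com"

# Read back what new commits in this repository will record as the author.
author_name="$(git -C "$repo" config --get user.name)"
author_email="$(git -C "$repo" config --get user.email)"
printf '%s <%s>\n' "$author_name" "$author_email"
```

Adding --global to the two config commands (and dropping -C) would apply the same identity to every repository for your user account.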
Returning to our example, the subject of the above commit is:
Added cache inspection
What makes this a great subject is the proper use of the Added prefix, used in the past tense. Past tense is important because, once you’ve committed code, the changes are made and therefore already in the past. Plus, our Git history is a view of the past so the tense is right here in the name! The same applies when using commands such as git blame, etc. After years of writing Git commits, I’ve found there are only five prefixes you ever need to use:
For more details, check out the Git Lint Commit Subject Prefix style guide.
Using only these five prefixes provides additional value when assembling project changes and/or release notes, because you can group and alphabetically sort them for distribution. Even better, tools like Milestoner can further automate the process so you have less work to do when communicating changes to your engineering team, stakeholders, and/or customers.
In addition to the commit prefix, the rest of our example’s subject is clear in that we know cache inspection support was added. A Git commit subject is always meant to tell us what is being committed to the code base and nothing more. Sometimes you might see people add story IDs or other forms of identification, but the subject is not the place for this information. Metadata like that should always go at the end of the commit message via trailers, which we’ll discuss shortly.
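A subject prefix rule like this is easy to check mechanically. The sketch below is a plain shell function, not Git Lint itself. Only the Added prefix appears in this article, so the other four prefixes are an assumption based on what I understand Git Lint's defaults to be; adjust the list to match your own style guide.

```shell
# Return success when the subject starts with an allowed past-tense prefix.
# Assumption: this prefix list mirrors Git Lint's defaults; only "Added"
# is confirmed by this article.
valid_subject() {
  case "$1" in
    "Fixed "*|"Added "*|"Updated "*|"Removed "*|"Refactored "*) return 0 ;;
    *) return 1 ;;
  esac
}

valid_subject "Added cache inspection" && echo "ok"
valid_subject "Adds cache inspection [#562]" || echo "rejected"
```

Git passes the path of the message file as the first argument to a commit-msg hook, so a hook could call valid_subject "$(head -n 1 "$1")" and exit non-zero on failure.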
Almost every commit should have a body. The purpose of the body is to support the subject by explaining why the commit is necessary. Here’s only the commit message body from the screenshot above:
Necessary to improve inspection by answering computed environment
settings in shell format. Example:

    <key>=<value>

Additionally, ensures the entire environment is not output when
constructing a new cache.
There are several important qualities worth pointing out here:
Capitalization - Each paragraph is properly capitalized and reads like a page out of a book.
Paragraphs - Each paragraph is devoted to a single idea and uses proper punctuation.
Code Blocks - A code block (e.g. AsciiDoc or Markdown) separates the two paragraphs and illustrates the format of the key=value pair.
72 Character Limit - All paragraphs are word wrapped at 72 columns so you don’t have to scroll horizontally when reading why the commit was made, which is always annoying. This makes reading the commit a more pleasurable experience, especially on smaller screens like mobile devices.
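The 72 column rule from the list above can be checked with standard tools. This is a small sketch using awk; the function name long_lines is hypothetical, and note that some awk implementations count bytes rather than characters, which matters for non-ASCII text.

```shell
# Print every input line longer than 72 columns, prefixed by its line number.
long_lines() {
  awk 'length($0) > 72 { printf "%d: %s\n", NR, $0 }'
}

# Example input: one short line and one 80 character line of zeros.
printf 'short line\n%080d\n' 0 | long_lines
```

Inside a repository, git log -1 --format=%b | long_lines would report overlong lines in the latest commit body, and an empty result means the body is wrapped correctly.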
As mentioned in the Git Lint Style Guide, all commits should be atomic, which means each commit should be the smallest and most concise unit of work possible. As a general rule of thumb, this means only the implementation and test:
lib/xdg/cache.rb           | 22
spec/lib/xdg/cache_spec.rb | 35
Regardless of the above being a Ruby implementation, the concept applies here, too, in that we have both the implementation (cache.rb) and test (cache_spec.rb), which together comprise this Git commit.
When reviewing this code or looking at the history of changes to these files, we’ll always have the implementation and test bound together so we can understand what the code is and why.
Atomic commits improve your workflow too, especially a Git Rebase Workflow. I find detailed commits easier to fix, squash, edit, drop, cherry-pick, etc. while working on a feature branch during the course of development. Large, messy commits are never as easy to work with because they are inherently harder to untangle and manage. Stay small because you’ll be happier, more agile, and more flexible. I promise.
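To illustrate why small commits are easy to fix up, here is a sketch of one common technique, git commit --fixup followed by an autosquash rebase. It runs in a throwaway repository with hypothetical file contents and a hypothetical author, and it is not necessarily the exact workflow the author uses.

```shell
# Throwaway repository with a hypothetical author so the sketch runs anywhere.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo" || exit 1
git config user.name "Jill Smith"
git config user.email "jsmith@example.com"

# Two small, atomic commits.
echo "one" > cache.rb
git add cache.rb
git commit --quiet -m "Added cache inspection"

echo "two" > notes.txt
git add notes.txt
git commit --quiet -m "Added notes"

# A later correction that belongs to the first commit, recorded as a fixup.
echo "one, corrected" > cache.rb
git add cache.rb
git commit --quiet --fixup "HEAD~1"

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# generated todo list unchanged instead of opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase --quiet --interactive --autosquash --root

# Only the two original subjects remain in the history.
git log --format=%s
```

Because each commit touches one concern, the fixup applies without conflicts; the same correction against a large mixed commit is more likely to need manual untangling.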
Git Trailers are where all metadata associated with a commit should go:
Co-Authored-By: Jill Smith <email@example.com>
Tracker: Clubhouse
Issue: 562
This information always goes at the end of a commit message, right after the body, and should never be used in the subject or body. Doing so ensures a commit message remains readable by fellow engineers and keeps the metadata in a format that is useful for post-processing by a computer. Git Lint can aid in this endeavor as well.
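Git ships a command for reading and writing trailers, which is what makes this metadata machine-friendly. The sketch below uses git interpret-trailers with the Tracker and Issue trailers from the example above; the commit message is a shortened version of the one in this article.

```shell
# A commit message (subject plus body) without any trailers yet.
message="Added cache inspection

Necessary to improve inspection by answering computed environment
settings in shell format."

# Append trailers after the body; Git inserts the blank separator line.
with_trailers="$(printf '%s\n' "$message" | git interpret-trailers \
  --trailer "Tracker: Clubhouse" \
  --trailer "Issue: 562")"

printf '%s\n' "$with_trailers"

# Extract only the trailers, for example to feed release note tooling.
printf '%s\n' "$with_trailers" | git interpret-trailers --parse
```

The last command prints just the two trailer lines, which shows why keeping metadata out of the subject and body pays off: a script can pull it out without parsing prose.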