A Guide to git: Three Essential Hidden Concepts

Like many other people, I have struggled with git. It was obviously all very clever, but somehow inexplicably difficult and frustrating to use.

Eventually, I realized that my difficulties stemmed from three misconceptions: areas where git did something different from what I thought it did, or from what I had been led to believe it did.

My three primary insights were:

  • Git is primarily an object store; the version-control functionality is implemented on top (or in terms) of that object store. The object store is separate from the conventional filesystem, yet uses the filesystem’s working directory as a viewer into the state of the object store.

  • As an object store, git arranges its objects in a directed acyclic graph, which is oriented backwards in time: each commit points to its parent or parents, never to its children. Understanding this data structure, in particular its backward directionality, is essential for recovering and restoring older versions of the stored objects, and for managing git’s branching functionality.

  • Read operations involving remote repositories are not direct, but proceed via a local read-only proxy, the so-called “remote-tracking branch”. Understanding the role of this branch, and the necessary message flows between it and both the remote repository and any local development branches, is crucial for avoiding sudden and frustrating failures when working with remote repositories.
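
To make these three points a little more concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not how git is actually implemented: the class ToyRepo and all of its method names are invented for illustration, commits here point directly at a single blob (real commits point at a tree object and carry author and date information), and "fetch" simply copies a dictionary. The one detail taken from git itself is the object ID, the SHA-1 hash of "<kind> <size>\0<payload>", as used in SHA-1 repositories.

```python
import hashlib


class ToyRepo:
    """Toy repository: a content-addressed object store plus named refs."""

    def __init__(self):
        self.objects = {}  # object ID -> raw bytes (header + payload)
        self.refs = {}     # ref name -> commit ID

    def put(self, kind, payload):
        # ID scheme of SHA-1 repositories: hash of "<kind> <size>\0<payload>".
        raw = kind.encode() + b" " + str(len(payload)).encode() + b"\0" + payload
        oid = hashlib.sha1(raw).hexdigest()
        self.objects[oid] = raw
        return oid

    def commit(self, ref, message, content):
        blob = self.put("blob", content)
        parent = self.refs.get(ref)  # None for the very first commit
        # Simplification: a real commit references a tree, not a blob.
        lines = ["blob " + blob]
        if parent:
            lines.append("parent " + parent)
        lines += ["", message]
        oid = self.put("commit", "\n".join(lines).encode())
        self.refs[ref] = oid  # the branch is just a movable pointer
        return oid

    def parent_of(self, commit_id):
        body = self.objects[commit_id].split(b"\0", 1)[1].decode()
        for line in body.split("\n"):
            if line == "":
                break  # end of the header lines, the message follows
            if line.startswith("parent "):
                return line[len("parent "):]
        return None

    def history(self, ref):
        # Walk backwards in time: a commit knows its parent, never its children.
        out, oid = [], self.refs.get(ref)
        while oid:
            out.append(oid)
            oid = self.parent_of(oid)
        return out

    def fetch(self, remote, name="origin", branch="main"):
        # Copy objects and move only the remote-tracking ref.
        self.objects.update(remote.objects)
        self.refs[name + "/" + branch] = remote.refs[branch]

    def fast_forward(self, branch, tracking):
        # Allowed only if the local tip is an ancestor of the tracking ref.
        if branch in self.refs and self.refs[branch] not in self.history(tracking):
            raise ValueError("branches have diverged; a real merge is needed")
        self.refs[branch] = self.refs[tracking]


remote = ToyRepo()
c1 = remote.commit("main", "first", b"hello world\n")

local = ToyRepo()
local.fetch(remote)                        # creates origin/main only
local.fast_forward("main", "origin/main")  # local main now at c1

c2 = remote.commit("main", "second", b"hello again\n")
local.fetch(remote)                        # origin/main moves to c2 ...
stale_main = local.refs["main"]            # ... but local main is still c1
local.fast_forward("main", "origin/main")  # now local main catches up
```

The point of the last few lines is the two-step flow: fetching updates only the proxy (here the ref named "origin/main"), and the local branch moves only when it is explicitly brought up to date from that proxy.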

There is nothing new here: I am merely restating what has been said in the git documentation and by various blogs and tutorials many times before. Nevertheless, somehow all these careful descriptions failed to convey those three points to me.

For this reason, and for my own education, I thought it might be worthwhile to try one more time. You can find the entire 6,500-word guide here. Maybe you’ll find it just a little bit clearer than, or at least different from, other write-ups on the topic.