Patterns from Git

Git's data model is built on copy-on-write immutable objects and efficient diffing.

Pattern	Where	What It Does
Copy-on-Write	`object-file.c`	Content-addressed immutable objects; branches share data, copy only on change
Diff / Patch	`diff.c`, `xdiff/`	Myers' diff algorithm for minimal edit distance between file versions
Bitmask	`read-cache-ll.h`	`CE_*` cache entry flags — staged, valid, intent-to-add
Bloom Filter	`bloom.c`	Changed-path bloom filters for faster `git log -- <path>`
Hash Table	`name-hash.c`	Name hash for fast path and directory lookup in the index, used mainly for case-insensitive matching
LRU Cache	`packfile.c`	Delta base cache: keeps recently unpacked delta bases so delta chains are not re-expanded on every object read
Merkle Tree	`tree.c`	Content-addressed Merkle DAG: every commit, tree, and blob is hashed, so changing one byte changes every hash up to the root
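
The Bitmask row is the easiest of these to show in isolation. The sketch below packs several boolean states into one integer, the way per-entry flags are stored. The flag names and bit positions are hypothetical and chosen for illustration; they are not Git's actual `CE_*` values.

```python
# Hypothetical flag bits for illustration only; not Git's real CE_* constants.
VALID = 1 << 0
INTENT_TO_ADD = 1 << 1
SKIP_WORKTREE = 1 << 2


def set_flag(flags: int, flag: int) -> int:
    """Return flags with the given bit turned on."""
    return flags | flag


def clear_flag(flags: int, flag: int) -> int:
    """Return flags with the given bit turned off."""
    return flags & ~flag


def has_flag(flags: int, flag: int) -> bool:
    """Report whether the given bit is set."""
    return bool(flags & flag)
```

One integer per entry keeps the index compact, and testing a state is a single bitwise AND.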

How They Compose: `git commit`

When you run `git commit`, several patterns work together to create an immutable, verifiable snapshot:

`git commit -m "fix bug"`

Diff / Patch

Git compares the index (staging area) against `HEAD` to determine what the new commit changes. The working tree matters only through what was staged from it beforehand.
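
Whichever two states are compared, a line-level diff comes down to finding a shortest edit script. The following is a minimal sketch of the greedy core of Myers' algorithm that returns only the length of that script (insertions plus deletions). The function name is hypothetical, and this is not Git's `xdiff` code, which adds heuristics and produces the actual hunks.

```python
def shortest_edit_length(a, b):
    """Length of the shortest insert/delete script turning sequence a into b."""
    n, m = len(a), len(b)
    # v[k] holds the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # step down: take an element from b (insertion)
            else:
                x = v[k - 1] + 1    # step right: drop an element of a (deletion)
            y = x - k
            # Follow the diagonal while elements match; these steps cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m
```

Called on two lists of lines, it reports how many lines must be added or removed in total; recovering the script itself requires keeping the per-step state and backtracking.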

Copy-on-Write

Changed files become new blob objects, written when the file is staged with `git add`. Unchanged files are shared by reference: the same content has the same SHA-1, so it is the same object. No data is copied unless it actually changed.
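
A minimal sketch of this sharing, assuming an in-memory dictionary in place of Git's on-disk object database. `ObjectStore` and `snapshot` are hypothetical names; the blob ID is computed the way Git hashes a blob with SHA-1, over a `blob <size>\0` header followed by the content.

```python
import hashlib


def blob_id(data: bytes) -> str:
    """SHA-1 over Git's blob header plus the content."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


class ObjectStore:
    """Hypothetical in-memory stand-in for the object database."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        oid = blob_id(data)
        # Identical content maps to the same key, so it is stored only once.
        self._objects.setdefault(oid, data)
        return oid

    def __len__(self):
        return len(self._objects)


def snapshot(store: ObjectStore, files: dict) -> dict:
    """Map each path to the ID of its content, writing only unseen blobs."""
    return {path: store.put(content) for path, content in files.items()}


store = ObjectStore()
v1 = snapshot(store, {"a.txt": b"one\n", "b.txt": b"two\n"})
v2 = snapshot(store, {"a.txt": b"one\n", "b.txt": b"TWO\n"})
```

After both snapshots the store holds three objects rather than four: `a.txt` is unchanged, so `v1` and `v2` point at the same blob.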

Merkle Tree

Tree objects hash their children. A changed blob changes its parent tree hash, which changes the commit hash. Any tampering anywhere is detectable from the root.
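
The propagation can be sketched with nested dictionaries standing in for directories. The serialization here is a simplified text format chosen for illustration, not Git's binary tree format, and `hash_blob` and `hash_tree` are hypothetical helpers.

```python
import hashlib


def hash_blob(data: bytes) -> str:
    """Hash a file's content (simplified, not Git's blob header)."""
    return hashlib.sha1(b"blob:" + data).hexdigest()


def hash_tree(entries: dict) -> str:
    """Hash a directory from the hashes of its children.

    entries maps a name to bytes (a file) or to a nested dict (a subdirectory).
    """
    lines = []
    for name in sorted(entries):
        child = entries[name]
        if isinstance(child, dict):
            lines.append(f"tree {hash_tree(child)} {name}")
        else:
            lines.append(f"blob {hash_blob(child)} {name}")
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()


before = {"docs": {"readme": b"hello\n"}, "src": {"main.c": b"int x;\n"}}
after = {"docs": {"readme": b"hello\n"}, "src": {"main.c": b"int y;\n"}}
```

Here `hash_tree(before)` and `hash_tree(after)` differ because `src/main.c` changed, while the `docs` subtree hashes to the same value in both and could be shared.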

Bloom Filter

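The commit-graph file can store changed-path Bloom filters. They are written by `git commit-graph write --changed-paths`, not by the commit itself. Later `git log -- <path>` queries can then skip commits that did not touch the path without reading their trees.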
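
A minimal Bloom filter sketch shows why such a filter can rule a commit out but never rule it in. It assumes SHA-256-derived double hashing and arbitrary size parameters as stand-ins; Git's own hashing scheme and settings in `bloom.c` differ, and the class name is hypothetical.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives and no false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive several bit positions from two base hashes (double hashing).
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


changed = BloomFilter()
changed.add("src/main.c")
changed.add("docs/readme")
```

A `False` from `might_contain` lets a path-limited history walk skip the commit outright; a `True` still requires comparing trees, because it may be a false positive.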

Result: a new commit object (immutable, content-addressed)

The core insight is that copy-on-write and Merkle hashing together give Git both space efficiency (shared objects) and integrity verification (tamper-evident hashes). The same content hash that deduplicates an object also verifies it, so neither property is paid for with the other.
