|
Due to scalability issues, Shawn Pearce has originally proposed a new
"reftable" format more than six years ago [1]. Initially, this new
format was implemented in JGit with promising results. Around two years
ago, we have then added the "reftable" library to the Git codebase via
a4bbd13be3 (Merge branch 'hn/reftable', 2021-12-15). With this we have
landed all the low-level code to read and write reftables. Notably
missing though was the integration of this low-level code into the Git
code base in the form of a new ref backend that ties all of this
together.
This gap is now finally closed by introducing a new "reftable" backend
into the Git codebase. This new backend promises to bring some notable
improvements to Git repositories:
- It becomes possible to do truly atomic writes where either all refs
are committed to disk or none are. This was not possible with the
"files" backend because ref updates were split across multiple loose
files.
- The disk space required to store many refs is reduced, both compared
to loose refs and packed-refs. This is enabled both by the reftable
format being a binary format, which is more compact, and by prefix
compression.
- We can ignore filesystem-specific behaviour as ref names are not
encoded via paths anymore. This means there is no need to handle
case sensitivity on Windows systems or Unicode precomposition on
macOS.
- There is no need to rewrite the complete refdb anymore every time a
ref is being deleted like it was the case for packed-refs. This
means that ref deletions are now constant time instead of scaling
linearly with the number of refs.
- We can ignore file/directory conflicts so that it becomes possible
to store both "refs/heads/foo" and "refs/heads/foo/bar".
- Due to this property we can retain reflogs for deleted refs. We have
previously been deleting reflogs together with their refs to avoid
file/directory conflicts, which is not necessary anymore.
- We can properly enumerate all refs. With the "files" backend it is
not easily possible to distinguish between refs and non-refs because
they may live side by side in the gitdir.
Not all of these improvements are realized with the current "reftable"
backend implementation. At this point, the new backend is supposed to be
a drop-in replacement for the "files" backend that is used by basically
all Git repositories nowadays. It strives for 1:1 compatibility, which
means that a user can expect the same behaviour regardless of whether
they use the "reftable" backend or the "files" backend for most of the
part.
Most notably, this means we artificially limit the capabilities of the
"reftable" backend to match the limits of the "files" backend. It is not
possible to create refs that would end up with file/directory conflicts,
we do not retain reflogs, we perform stricter-than-necessary checks.
This is done intentionally due to two main reasons:
- It makes it significantly easier to land the "reftable" backend as
tests behave the same. It would be tough to argue for each and every
single test that doesn't pass with the "reftable" backend.
- It ensures compatibility between repositories that use the "files"
backend and repositories that use the "reftable" backend. Like this,
hosters can migrate their repositories to use the "reftable" backend
without causing issues for clients that use the "files" backend in
their clones.
It is expected that these artificial limitations may eventually go away
in the long term.
Performance-wise things very much depend on the actual workload. The
following benchmarks compare the "files" and "reftable" backends in the
current version:
- Creating N refs in separate transactions shows that the "files"
backend is ~50% faster. This is not surprising given that creating a
ref only requires us to create a single loose ref. The "reftable"
backend will also perform auto compaction on updates. In real-world
workloads we would likely also want to perform pack loose refs,
which would likely change the picture.
Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1)
Time (mean ± σ): 2.1 ms ± 0.3 ms [User: 0.6 ms, System: 1.7 ms]
Range (min … max): 1.8 ms … 4.3 ms 133 runs
Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1)
Time (mean ± σ): 2.7 ms ± 0.1 ms [User: 0.6 ms, System: 2.2 ms]
Range (min … max): 2.4 ms … 2.9 ms 132 runs
Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000)
Time (mean ± σ): 1.975 s ± 0.006 s [User: 0.437 s, System: 1.535 s]
Range (min … max): 1.969 s … 1.980 s 3 runs
Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000)
Time (mean ± σ): 2.611 s ± 0.013 s [User: 0.782 s, System: 1.825 s]
Range (min … max): 2.597 s … 2.622 s 3 runs
Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000)
Time (mean ± σ): 198.442 s ± 0.241 s [User: 43.051 s, System: 155.250 s]
Range (min … max): 198.189 s … 198.670 s 3 runs
Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000)
Time (mean ± σ): 294.509 s ± 4.269 s [User: 104.046 s, System: 190.326 s]
Range (min … max): 290.223 s … 298.761 s 3 runs
- Creating N refs in a single transaction shows that the "files"
backend is significantly slower once we start to write many refs.
The "reftable" backend only needs to update two files, whereas the
"files" backend needs to write one file per ref.
Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1)
Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.4 ms, System: 1.4 ms]
Range (min … max): 1.8 ms … 2.6 ms 151 runs
Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1)
Time (mean ± σ): 2.5 ms ± 0.1 ms [User: 0.7 ms, System: 1.7 ms]
Range (min … max): 2.4 ms … 3.4 ms 148 runs
Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000)
Time (mean ± σ): 152.5 ms ± 5.2 ms [User: 19.1 ms, System: 133.1 ms]
Range (min … max): 148.5 ms … 167.8 ms 15 runs
Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000)
Time (mean ± σ): 58.0 ms ± 2.5 ms [User: 28.4 ms, System: 29.4 ms]
Range (min … max): 56.3 ms … 72.9 ms 40 runs
Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000)
Time (mean ± σ): 152.752 s ± 0.710 s [User: 20.315 s, System: 131.310 s]
Range (min … max): 152.165 s … 153.542 s 3 runs
Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 51.912 s ± 0.127 s [User: 26.483 s, System: 25.424 s]
Range (min … max): 51.769 s … 52.012 s 3 runs
- Deleting a ref in a fully-packed repository shows that the "files"
backend scales with the number of refs. The "reftable" backend has
constant-time deletions.
Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1)
Time (mean ± σ): 1.7 ms ± 0.1 ms [User: 0.4 ms, System: 1.2 ms]
Range (min … max): 1.6 ms … 2.1 ms 316 runs
Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1)
Time (mean ± σ): 1.8 ms ± 0.1 ms [User: 0.4 ms, System: 1.3 ms]
Range (min … max): 1.7 ms … 2.1 ms 294 runs
Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000)
Time (mean ± σ): 2.0 ms ± 0.1 ms [User: 0.5 ms, System: 1.4 ms]
Range (min … max): 1.9 ms … 2.5 ms 287 runs
Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000)
Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.5 ms, System: 1.3 ms]
Range (min … max): 1.8 ms … 2.1 ms 217 runs
Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000)
Time (mean ± σ): 229.8 ms ± 7.9 ms [User: 182.6 ms, System: 46.8 ms]
Range (min … max): 224.6 ms … 245.2 ms 6 runs
Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 2.0 ms ± 0.0 ms [User: 0.6 ms, System: 1.3 ms]
Range (min … max): 2.0 ms … 2.1 ms 3 runs
- Listing all refs shows no significant advantage for either of the
backends. The "files" backend is a bit faster, but not by a
significant margin. When repositories are not packed the "reftable"
backend outperforms the "files" backend because the "reftable"
backend performs auto-compaction.
Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.0 ms 1729 runs
Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 1.8 ms 1816 runs
Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true)
Time (mean ± σ): 4.3 ms ± 0.1 ms [User: 0.9 ms, System: 3.3 ms]
Range (min … max): 4.1 ms … 4.6 ms 645 runs
Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true)
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.0 ms, System: 3.3 ms]
Range (min … max): 4.2 ms … 5.9 ms 643 runs
Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true)
Time (mean ± σ): 2.537 s ± 0.034 s [User: 0.488 s, System: 2.048 s]
Range (min … max): 2.511 s … 2.627 s 10 runs
Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true)
Time (mean ± σ): 2.712 s ± 0.017 s [User: 0.653 s, System: 2.059 s]
Range (min … max): 2.692 s … 2.752 s 10 runs
Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 1.9 ms 1834 runs
Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 2.0 ms 1840 runs
Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false)
Time (mean ± σ): 13.8 ms ± 0.2 ms [User: 2.8 ms, System: 10.8 ms]
Range (min … max): 13.3 ms … 14.5 ms 208 runs
Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false)
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.2 ms, System: 3.3 ms]
Range (min … max): 4.3 ms … 6.2 ms 624 runs
Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false)
Time (mean ± σ): 12.127 s ± 0.129 s [User: 2.675 s, System: 9.451 s]
Range (min … max): 11.965 s … 12.370 s 10 runs
Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false)
Time (mean ± σ): 2.799 s ± 0.022 s [User: 0.735 s, System: 2.063 s]
Range (min … max): 2.769 s … 2.836 s 10 runs
- Printing a single ref shows no real difference between the "files"
and "reftable" backends.
Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1)
Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.4 ms, System: 1.0 ms]
Range (min … max): 1.4 ms … 1.8 ms 1779 runs
Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 2.5 ms 1753 runs
Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000)
Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.3 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 1.9 ms 1840 runs
Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.0 ms 1831 runs
Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.1 ms 1848 runs
Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.1 ms 1762 runs
So overall, performance depends on the usecases. Except for many
sequential writes the "reftable" backend is roughly on par or
significantly faster than the "files" backend though. Given that the
"files" backend has received 18 years of optimizations by now this can
be seen as a win. Furthermore, we can expect that the "reftable" backend
will grow faster over time when attention turns more towards
optimizations.
The complete test suite passes, except for those tests explicitly marked
to require the REFFILES prerequisite. Some tests in t0610 are marked as
failing because they depend on still-in-flight bug fixes. Tests can be
run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT
environment variable to "reftable".
There is a single known conceptual incompatibility with the dumb HTTP
transport. As "info/refs" SHOULD NOT contain the HEAD reference, and
because the "HEAD" file is not valid anymore, it is impossible for the
remote client to figure out the default branch without changing the
protocol. This shortcoming needs to be handled in a subsequent patch
series.
As the reftable library has already been introduced a while ago, this
commit message will not go into the details of how exactly the on-disk
format works. Please refer to our preexisting technical documentation at
Documentation/technical/reftable for this.
[1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/
Original-idea-by: Shawn Pearce <spearce@spearce.org>
Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Normally we try to avoid bumps of the whole-repository
core.repositoryformatversion field. However, it is
unavoidable if we want to safely change certain aspects of
git in a backwards-incompatible way (e.g., modifying the set
of ref tips that we must traverse to generate a list of
unreachable, safe-to-prune objects).
If we were to bump the repository version for every such
change, then any implementation understanding version `X`
would also have to understand `X-1`, `X-2`, and so forth,
even though the incompatibilities may be in orthogonal parts
of the system, and there is otherwise no reason we cannot
implement one without the other (or more importantly, that
the user cannot choose to use one feature without the other,
weighing the tradeoff in compatibility only for that
particular feature).
This patch documents the existing repositoryformatversion
strategy and introduces a new format, "1", which lets a
repository specify that it must run with an arbitrary set of
extensions. This can be used, for example:
- to inform git that the objects should not be pruned based
only on the reachability of the ref tips (e.g, because it
has "clone --shared" children)
- that the refs are stored in a format besides the usual
"refs" and "packed-refs" directories
Because we bump to format "1", and because format "1"
requires that a running git knows about any extensions
mentioned, we know that older versions of the code will not
do something dangerous when confronted with these new
formats.
For example, if the user chooses to use database storage for
refs, they may set the "extensions.refbackend" config to
"db". Older versions of git will not understand format "1"
and bail. Versions of git which understand "1" but do not
know about "refbackend", or which know about "refbackend"
but not about the "db" backend, will refuse to run. This is
annoying, of course, but much better than the alternative of
claiming that there are no refs in the repository, or
writing to a location that other implementations will not
read.
Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that
is meant to be used by users and future extensions to
provide safety with older implementations.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|