| author | brian m. carlson <sandals@crustytoothpaste.net> | 2025-10-09 21:56:21 +0000 |
|---|---|---|
| committer | Junio C Hamano <gitster@pobox.com> | 2025-10-09 17:46:14 -0700 |
| commit | 24d46f86337b79083ffcb0c9f8806a4f82f6b9c8 (patch) | |
| tree | 24c3ba08b8c484d6c0fe62382a2574ddfb40c449 /t/unit-tests/u-hashmap.c | |
| parent | d477892b30b25333badb829190eb349fb671458c (diff) | |
docs: improve ambiguous areas of pack format documentation

It is fair to say that our pack and indexing code is quite complex.
Contributors who wish to work on this code, or authors of other
implementations, would benefit from clear, unambiguous documentation
about how our data formats are structured and encoded and what data is
used in the computation of certain values. Unfortunately, some of this
information is missing, which leads to confusion and frustration.

Let's document some of these details to help clarify things. Specify over
what data CRC32 values are computed and also note which CRC32 algorithm
is used, since Wikipedia mentions at least four 32-bit CRC algorithms
and notes that it's possible to use different bit orderings.
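
To illustrate why naming the variant matters, here is a minimal sketch. It assumes the variant in question is the one implemented by zlib's `crc32()` (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF); the helper name `crc32_bitwise` is hypothetical and exists only to make the parameters visible.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC-32 using the reflected polynomial 0xEDB88320."""
    crc = 0xFFFFFFFF  # initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right (least significant bit first); apply the
            # polynomial whenever the bit shifted out was set.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF  # final XOR


# "123456789" is the conventional check input for CRC algorithms.
check = b"123456789"
print(hex(zlib.crc32(check)))      # 0xcbf43926
print(hex(crc32_bitwise(check)))   # 0xcbf43926
```

A CRC with a different polynomial, bit ordering, or initial value would produce a different result for the same input, which is why the documentation needs to pin the algorithm down.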

In addition, note how we encode objects in the pack. One might be led
to believe that packed objects are always stored with the
"<type> <size>\0" prefix of loose objects, but that is not the case,
although for obvious reasons this data is included in the computation of
the object ID. Explain why this is for the curious reader.
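
As a rough sketch of the object ID computation mentioned above: the hash is taken over the "<type> <size>\0" prefix followed by the object's content, regardless of how the object is stored. The function name `object_id` and its `algo` parameter are illustrative only and not part of Git.

```python
import hashlib


def object_id(obj_type: str, content: bytes, algo: str = "sha1") -> str:
    """Hash "<type> <size>\\0" + content, as is done for loose objects."""
    # The size is the decimal byte length of the content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


# Matches `echo hello | git hash-object --stdin` in a SHA-1 repository.
print(object_id("blob", b"hello\n"))
# ce013625030ba8dba906f756967f9e9ca394464a
```

Passing `algo="sha256"` gives the corresponding value for a SHA-256 repository, since only the hash function changes, not the hashed bytes.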

Finally, indicate what the size field of the packed object represents.
Otherwise, a reader might think that the size of a delta is the size of
the full object or that it might contain the offset or object ID,
neither of which is the case. To avoid confusion, state explicitly that
the values represent uncompressed sizes.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 't/unit-tests/u-hashmap.c')
0 files changed, 0 insertions, 0 deletions
