docs: improve ambiguous areas of pack format documentation - user/sven/git.git

author	brian m. carlson <sandals@crustytoothpaste.net>	2025-10-09 21:56:21 +0000
committer	Junio C Hamano <gitster@pobox.com>	2025-10-09 17:46:14 -0700
commit	24d46f86337b79083ffcb0c9f8806a4f82f6b9c8 (patch)
tree	24c3ba08b8c484d6c0fe62382a2574ddfb40c449 /t/unit-tests
parent	d477892b30b25333badb829190eb349fb671458c (diff)

docs: improve ambiguous areas of pack format documentation

It is fair to say that our pack and indexing code is quite complex. Contributors who wish to work on this code or implementors of other implementations would benefit from clear, unambiguous documentation about how our data formats are structured and encoded and what data is used in the computation of certain values.

Unfortunately, some of this data is missing, which leads to confusion and frustration. Let's document some of this data to help clarify things.

Specify over what data CRC32 values are computed and also note which CRC32 algorithm is used, since Wikipedia mentions at least four 32-bit CRC algorithms and notes that it's possible to use different bit orderings.

In addition, note how we encode objects in the pack. One might be led to believe that packed objects are always stored with the "<type> <size>\0" prefix of loose objects, but that is not the case, although for obvious reasons this data is included in the computation of the object ID. Explain why this is for the curious reader.

Finally, indicate what the size field of the packed object represents. Otherwise, a reader might think that the size of a delta is the size of the full object or that it might contain the offset or object ID, neither of which are the case. Explain clearly, however, that the values represent uncompressed sizes to avoid confusion.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
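
As an illustration of the point about the "<type> <size>\0" prefix, the following is a minimal sketch, not taken from this commit. It assumes a SHA-1 repository and a non-delta blob; the helper names `object_id` and `packed_payload` are hypothetical. It shows that the prefix goes into the object ID computation, while the data that would be compressed for a pack entry is the bare content, with the size field holding the uncompressed length.

```python
import hashlib
import zlib


def object_id(obj_type: str, content: bytes) -> str:
    """Hash "<type> <size>\\0" followed by the content (SHA-1 repository assumed)."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def packed_payload(content: bytes) -> tuple[int, bytes]:
    """Hypothetical helper: size field and compressed data for a non-delta entry.

    The size is the uncompressed length, and the compressed stream holds only
    the content, without the "<type> <size>\\0" prefix.
    """
    return len(content), zlib.compress(content)


content = b"hello world\n"
oid = object_id("blob", content)
size, data = packed_payload(content)

print(oid)                       # matches `git hash-object` for the same bytes
print(size, len(data))           # uncompressed size vs. compressed length
print(zlib.decompress(data) == content)
```

The real pack entry also carries a variable-length type-and-size header in front of the compressed data, which this sketch leaves out.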

Diffstat (limited to 't/unit-tests')

0 files changed, 0 insertions, 0 deletions

