summaryrefslogtreecommitdiff
path: root/builtin/commit.c
diff options
context:
space:
mode:
authorTaylor Blau <me@ttaylorr.com>2023-12-14 17:24:34 -0500
committerJunio C Hamano <gitster@pobox.com>2023-12-14 14:38:09 -0800
commit519e17ff75180c3e94a7c120c3028b926f2a781e (patch)
tree694917ea23d4c476d737868b11f95aac5661120a /builtin/commit.c
parentdbd5c520d2bb168f58bde4ead407d35589142668 (diff)
pack-bitmap: prepare to mark objects from multiple packs for reuse
Now that the pack-objects code is equipped to handle reusing objects from multiple packs, prepare the pack-bitmap code to mark objects from multiple packs as reuse candidates. In order to prepare the pack-bitmap code for this change, remove the same set of assumptions we unwound in previous commits from the helper function `reuse_partial_packfile_from_bitmap_1()`, in preparation for it to be called in a loop over the set of bitmapped packs in a following commit. Most importantly, we can no longer assume that the bit position corresponding to the first object in a given reuse pack candidate is at the beginning of the bitmap itself. For the single pack that this assumption is still true for (in MIDX bitmaps, this is the preferred pack, in single-pack bitmaps it is the pack the bitmap is tied to), we can still use our whole-words optimization. But for all subsequent packs, we can not make use of this optimization, since it assumes that all delta bases are being sent from the same pack, which would break if we are sending OFS_DELTAs down to the client. To understand why, consider two packs, P1 and P2 where: - P1 has object A which is a delta on base B - P2 has its own copy of B, in addition to other objects Suppose that the MIDX which covers P1 and P2 selected its copy of A from P1, but selected its copy of B from P2. Since A is a delta of B, but the base was selected from a different pack, sending the bytes corresponding to A as an OFS_DELTA verbatim from P1 would be incorrect, since we don't guarantee that B is in the same place relative to A in the generated pack as in P1. For now, we detect and reject these cross-pack deltas by searching for the (pack_id, offset) pair for the delta's base object (using the same pack_id as the pack containing the delta'd object) in the MIDX. If we find a match, that means that the MIDX did indeed pick the base object from the same pack, and we are OK to reuse the delta. If we don't find a match, however, that means that the base object was selected from a different pack in the MIDX, and we can let the slower path handle re-delta'ing our candidate object. In the future, there are a couple of other things we could do, namely: - Turn any cross-pack deltas (which are stored as OFS_DELTAs) into REF_DELTAs. We already do this today when reusing an OFS_DELTA without `--delta-base-offset` enabled, so it's not a huge stretch to do the same for cross-pack deltas even when `--delta-base-offset` is enabled. This would work, but would obviously result in larger-than-necessary packs, as we in theory *could* represent these cross-pack deltas by patching an existing OFS_DELTA. But it's not clear how much that would matter in practice. I suspect it would have a lot to do with how you pack your repository in the first place. - Finally, we could patch OFS_DELTAs across packs in a similar fashion as we do today for OFS_DELTAs within a single pack on either side of a gap. This would result in the smallest packs of the three options here, but implementing this would be more involved. At minimum, you'd have to keep the reusable chunks list for all reused packs, not just the one we're currently processing. And you'd have to ensure that any bases which are a part of cross-pack deltas appear before the delta. I think this is possible to do, but would require assembling the reusable chunks list potentially in a different order than they appear in the source packs. For now, let's pursue the simplest approach and reject any cross-pack deltas. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'builtin/commit.c')
0 files changed, 0 insertions, 0 deletions