diff options
author | Taylor Blau <me@ttaylorr.com> | 2022-05-20 15:01:48 -0400 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2022-05-20 13:42:40 -0700 |
commit | aab7bea14fee0f185b26413bf27a1cfefbe0114d (patch) | |
tree | 2a2ae283b57a784aadb7c167a2450d0073e9c39d /commit.c | |
parent | 4b5a808bb906896e07ca147645b33feb81e91c4c (diff) |
t7703: demonstrate object corruption with pack.packSizeLimit
When doing a `--geometric=<d>` repack, `git repack` determines a
splitting point among packs ordered by their object count such that:
- each pack above the split has at least `<d>` times as many objects
as the next-largest pack by object count, and
- the first pack above the split has at least `<d>` times as many
object as the sum of all packs below the split line combined
`git repack` then creates a pack containing all of the objects contained
in packs below the split line by running `git pack-objects
--stdin-packs` underneath. Once packs are moved into place, then any
packs below the split line are removed, since their objects were just
combined into a new pack.
But `git repack` tries to be careful to avoid removing a pack that it
just wrote, by checking:
struct packed_git *p = geometry->pack[i];
if (string_list_has_string(&names, hash_to_hex(p->hash)))
continue;
in the `delete_redundant` and `geometric` conditional towards the end of
`cmd_repack`.
But it's possible to trick `git repack` into not recognizing a pack that
it just wrote when `names` is out-of-order (which violates
`string_list_has_string()`'s assumption that the list is sorted and thus
binary search-able).
When this happens in just the right circumstances, it is possible to
remove a pack that we just wrote, leading to object corruption.
Luckily, this is quite difficult to provoke in practice (for a couple of
reasons):
- we ordinarily write just one pack, so `names` usually contains just
one entry, and is thus sorted
- when we do write more than one pack (e.g., due to `--max-pack-size`)
we have to: (a) write a pack identical to one that already
exists, (b) have that pack be below the split line, and (c) have
the set of packs written by `pack-objects` occur in an order which
tricks `string_list_has_string()`.
Demonstrate the above scenario in a failing test, which causes `git
repack --geometric` to write a pack which occurs below the split line,
_and_ fail to recognize that it wrote that pack.
The following patch will fix this bug.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'commit.c')
0 files changed, 0 insertions, 0 deletions