<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/pack-bitmap.h, branch v2.26.0-rc2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2020-03-02T23:07:18Z</updated>
<entry>
<title>Merge branch 'jk/object-filter-with-bitmap'</title>
<updated>2020-03-02T23:07:18Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-03-02T23:07:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0df82d99dae85dbd4f667e95020a146ea0167975'/>
<id>urn:sha1:0df82d99dae85dbd4f667e95020a146ea0167975</id>
<content type='text'>
The object reachability bitmap machinery and the partial cloning
machinery were not prepared to work well together, because some
object-filtering criteria that partial clones use inherently rely
on object traversal, but the bitmap machinery is an optimization
to bypass that object traversal.  There however are some cases
where they can work together, and they were taught about them.

* jk/object-filter-with-bitmap:
  rev-list --count: comment on the use of count_right++
  pack-objects: support filters with bitmaps
  pack-bitmap: implement BLOB_LIMIT filtering
  pack-bitmap: implement BLOB_NONE filtering
  bitmap: add bitmap_unset() function
  rev-list: use bitmap filters for traversal
  pack-bitmap: basic noop bitmap filter infrastructure
  rev-list: allow commit-only bitmap traversals
  t5310: factor out bitmap traversal comparison
  rev-list: allow bitmaps when counting objects
  rev-list: make --count work with --objects
  rev-list: factor out bitmap-optimized routines
  pack-bitmap: refuse to do a bitmap traversal with pathspecs
  rev-list: fallback to non-bitmap traversal when filtering
  pack-bitmap: fix leak of haves/wants object lists
  pack-bitmap: factor out type iterator initialization
</content>
</entry>
<entry>
<title>Merge branch 'jk/packfile-reuse-cleanup'</title>
<updated>2020-02-14T20:54:19Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2020-02-14T20:54:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=a14aebeac330e6d58f9628a02521ea780daf0a5b'/>
<id>urn:sha1:a14aebeac330e6d58f9628a02521ea780daf0a5b</id>
<content type='text'>
The way "git pack-objects" reuses objects stored in existing pack
to generate its result has been improved.

* jk/packfile-reuse-cleanup:
  pack-bitmap: don't rely on bitmap_git-&gt;reuse_objects
  pack-objects: add checks for duplicate objects
  pack-objects: improve partial packfile reuse
  builtin/pack-objects: introduce obj_is_packed()
  pack-objects: introduce pack.allowPackReuse
  csum-file: introduce hashfile_total()
  pack-bitmap: simplify bitmap_has_oid_in_uninteresting()
  pack-bitmap: uninteresting oid can be outside bitmapped packfile
  pack-bitmap: introduce bitmap_walk_contains()
  ewah/bitmap: introduce bitmap_word_alloc()
  packfile: expose get_delta_base()
  builtin/pack-objects: report reused packfile objects
</content>
</entry>
<entry>
<title>pack-bitmap: basic noop bitmap filter infrastructure</title>
<updated>2020-02-14T18:46:22Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-14T18:22:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=6663ae0a0818aba5d4de289b1a37e1961ad6c367'/>
<id>urn:sha1:6663ae0a0818aba5d4de289b1a37e1961ad6c367</id>
<content type='text'>
Currently you can't use object filters with bitmaps, but we plan to
support at least some filters with bitmaps. Let's introduce some
infrastructure that will help us do that:

  - prepare_bitmap_walk() now accepts a list_objects_filter_options
    parameter (which can be NULL for no filtering; all the current
    callers pass this)

  - we'll bail early if the filter is incompatible with bitmaps (just as
    we would if there were no bitmaps at all). Currently all filters are
    incompatible.

  - we'll filter the resulting bitmap; since there are no supported
    filters yet, this is always a noop.

There should be no behavior change yet, but we'll support some actual
filters in a future patch.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>rev-list: allow commit-only bitmap traversals</title>
<updated>2020-02-14T18:46:22Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-14T18:22:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=4eb707ebd681eb85306071db33ed70186d1642ac'/>
<id>urn:sha1:4eb707ebd681eb85306071db33ed70186d1642ac</id>
<content type='text'>
Ever since we added reachability bitmap support, we've been able to use
it with rev-list to get the full list of objects, like:

  git rev-list --objects --use-bitmap-index --all

But you can't do so without --objects, since we weren't ready to just
show the commits. However, the internals of the bitmap code are mostly
ready for this: they avoid opening up trees when walking to fill in the
bitmaps. We just need to actually pass in the rev_info to
traverse_bitmap_commit_list() so it knows which types to bother
triggering our callback for.

For completeness, the perf test now covers both the existing --objects
case, as well as the new commits-only behavior (the objects one got way
faster when we introduced bitmaps, but obviously isn't improved now).

Here are numbers for linux.git:

  Test                         HEAD^               HEAD
  ------------------------------------------------------------------------
  5310.7: rev-list (commits)   8.29(8.10+0.19)       1.76(1.72+0.04) -78.8%
  5310.8: rev-list (objects)   8.06(7.94+0.12)       8.14(7.94+0.13) +1.0%

That run was cheating a little, as I didn't have any commit-graph in the
repository, and we'd built it by default these days when running git-gc.
Here are numbers with a commit-graph:

  Test                         HEAD^               HEAD
  ------------------------------------------------------------------------
  5310.7: rev-list (commits)   0.70(0.58+0.12)     0.51(0.46+0.04) -27.1%
  5310.8: rev-list (objects)   6.20(6.09+0.10)     6.27(6.16+0.11) +1.1%

Still an improvement, but a lot less impressive.

We could have the perf script remove any commit-graph to show the
out-sized effect, but it probably makes sense to leave it in what would
be a more typical setup.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-objects: improve partial packfile reuse</title>
<updated>2020-01-23T18:51:50Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-12-18T11:25:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=bb514de356cfcfd314f7ac7ae1acfeede3fa4b1f'/>
<id>urn:sha1:bb514de356cfcfd314f7ac7ae1acfeede3fa4b1f</id>
<content type='text'>
The old code to reuse deltas from an existing packfile
just tried to dump a whole segment of the pack verbatim.
That's faster than the traditional way of actually adding
objects to the packing list, but it didn't kick in very
often. This new code is really going for a middle ground:
do _some_ per-object work, but way less than we'd
traditionally do.

The general strategy of the new code is to make a bitmap
of objects from the packfile we'll include, and then
iterate over it, writing out each object exactly as it is
in our on-disk pack, but _not_ adding it to our packlist
(which costs memory, and increases the search space for
deltas).

One complication is that if we're omitting some objects,
we can't set a delta against a base that we're not
sending. So we have to check each object in
try_partial_reuse() to make sure we have its delta.

About performance, in the worst case we might have
interleaved objects that we are sending or not sending,
and we'd have as many chunks as objects. But in practice
we send big chunks.

For instance, packing torvalds/linux on GitHub servers
now reused 6.5M objects, but only needed ~50k chunks.

Helped-by: Jonathan Tan &lt;jonathantanmy@google.com&gt;
Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Christian Couder &lt;chriscool@tuxfamily.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: introduce bitmap_walk_contains()</title>
<updated>2020-01-23T18:51:50Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-12-18T11:25:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=40d18ff8c6b6977eaf8450dc4eeb866ece5298fc'/>
<id>urn:sha1:40d18ff8c6b6977eaf8450dc4eeb866ece5298fc</id>
<content type='text'>
We will use this helper function in a following commit to
tell us if an object is packed.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Christian Couder &lt;chriscool@tuxfamily.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap.h: remove magic number</title>
<updated>2019-09-28T05:04:20Z</updated>
<author>
<name>Denton Liu</name>
<email>liu.denton@gmail.com</email>
</author>
<published>2019-09-25T08:20:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=af26e2a9d265c7ad676e3b178c359a0ff75440a2'/>
<id>urn:sha1:af26e2a9d265c7ad676e3b178c359a0ff75440a2</id>
<content type='text'>
When we ran `make hdr-check` with the following patch

	diff --git a/Makefile b/Makefile
	index f879697ea3..d8df4e316b 100644
	--- a/Makefile
	+++ b/Makefile
	@@ -2773,7 +2773,7 @@ CHK_HDRS = $(filter-out $(EXCEPT_HDRS),$(patsubst ./%,%,$(LIB_H)))
	HCO = $(patsubst %.h,%.hco,$(CHK_HDRS))

	$(HCO): %.hco: %.h FORCE
	-	$(QUIET_HDR)$(CC) -include git-compat-util.h -I. -o /dev/null -c -xc $&lt;
	+	$(QUIET_HDR)$(CC) -include git-compat-util.h -I. -o /dev/null -c -xc $(ALL_CFLAGS) $&lt;

	.PHONY: hdr-check $(HCO)
	hdr-check: $(HCO)

and with `DEVELOPER=1`, we got the following warning on Arch Linux:

	pack-bitmap.h:20:19: error: ‘BITMAP_IDX_SIGNATURE’ defined but not used [-Werror=unused-const-variable=]
	   20 | static const char BITMAP_IDX_SIGNATURE[] = {'B', 'I', 'T', 'M'};
	      |                   ^~~~~~~~~~~~~~~~~~~~
	cc1: all warnings being treated as errors

"Use" the BITMAP_IDX_SIGNATURE variable by making the size of
bitmap_disk_header.magic equal to the size of BITMAP_IDX_SIGNATURE,
thereby eliminating the magic number (4).

An alternative was to simply add MAYBE_UNUSED, however that does not
eliminate the magic number.

Another alternative was to change the definition to

	extern const char BITMAP_IDX_SIGNATURE[4];

However, this design was also not chosen as the static definition allows
us to keep the declaration together for readability along with removing
the magic number.

Signed-off-by: Denton Liu &lt;liu.denton@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: convert khash_sha1 maps into kh_oid_map</title>
<updated>2019-06-20T17:35:05Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-06-20T07:41:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=d2bc62b1fa7f2df247199ed88edff30875ee19bc'/>
<id>urn:sha1:d2bc62b1fa7f2df247199ed88edff30875ee19bc</id>
<content type='text'>
All of the users of our khash_sha1 maps actually have a "struct
object_id". Let's use the more descriptive type.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: switch hash tables to use struct object_id</title>
<updated>2019-04-01T02:57:37Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2019-02-19T00:04:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=3c7714485dc8adc810b6c52058992cfc767dfcb5'/>
<id>urn:sha1:3c7714485dc8adc810b6c52058992cfc767dfcb5</id>
<content type='text'>
Instead of storing unsigned char pointers in the hash tables, switch to
storing instances of struct object_id. Update several internal functions
and one external function to take pointers to struct object_id.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>pack-bitmap: make bitmap header handling hash agnostic</title>
<updated>2019-04-01T02:57:37Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2019-02-19T00:04:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0f4d6cada83dc6bd6b6b24dc0d2b3e6460c645cb'/>
<id>urn:sha1:0f4d6cada83dc6bd6b6b24dc0d2b3e6460c645cb</id>
<content type='text'>
Increase the checksum field in struct bitmap_disk_header to be
GIT_MAX_RAWSZ bytes in length and ensure that we hash the proper number
of bytes out when computing the bitmap checksum.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
