<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/midx.c, branch v2.26.0-rc2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2020-02-24T20:55:42Z</updated>
<entry>
<title>nth_packed_object_oid(): use customary integer return</title>
<updated>2020-02-24T20:55:42Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2020-02-24T04:27:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0763671b8e0b3ef873df13c741a911b809e6813d'/>
<id>urn:sha1:0763671b8e0b3ef873df13c741a911b809e6813d</id>
<content type='text'>
Our nth_packed_object_sha1() function returns NULL for error. So when we
wrapped it with nth_packed_object_oid(), we kept the same semantics. But
it's a bit funny, because the caller actually passes in an out
parameter, and the pointer we return is just that same struct they
passed to us (or NULL).

It's not too terrible, but it does make the interface a little
non-idiomatic. Let's switch to our usual "0 for success, negative for
error" return value. Most callers either don't check it, or are
trivially converted. The one that requires the biggest change is
actually improved, as we can ditch an extra aliased pointer variable.

Since we are changing the interface in a subtle way that the compiler
wouldn't catch, let's also change the name to catch any topics in
flight. We can drop the 'o' and make it nth_packed_object_id(). That's
slightly shorter, but also less redundant since the 'o' stands for
"object" already.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: honor the MIDX_PROGRESS flag in midx_repack</title>
<updated>2019-10-23T03:05:06Z</updated>
<author>
<name>William Baker</name>
<email>William.Baker@microsoft.com</email>
</author>
<published>2019-10-21T18:40:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=64d80e7d52cc2663a44157fc3d49af576ea10192'/>
<id>urn:sha1:64d80e7d52cc2663a44157fc3d49af576ea10192</id>
<content type='text'>
Update midx_repack to only display progress when
the MIDX_PROGRESS flag is set.

Signed-off-by: William Baker &lt;William.Baker@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: honor the MIDX_PROGRESS flag in verify_midx_file</title>
<updated>2019-10-23T03:05:05Z</updated>
<author>
<name>William Baker</name>
<email>William.Baker@microsoft.com</email>
</author>
<published>2019-10-21T18:40:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ad60096d1c82a6e05a01bb33c12cd1070bf01b4f'/>
<id>urn:sha1:ad60096d1c82a6e05a01bb33c12cd1070bf01b4f</id>
<content type='text'>
Update verify_midx_file to only display progress when
the MIDX_PROGRESS flag is set.

Signed-off-by: William Baker &lt;William.Baker@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: add progress to expire_midx_packs</title>
<updated>2019-10-23T03:05:05Z</updated>
<author>
<name>William Baker</name>
<email>William.Baker@microsoft.com</email>
</author>
<published>2019-10-21T18:40:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=8dc18f8937faf542da785b28062731ddfbfee974'/>
<id>urn:sha1:8dc18f8937faf542da785b28062731ddfbfee974</id>
<content type='text'>
Add progress to expire_midx_packs.  Progress is
displayed when the MIDX_PROGRESS flag is set.

Signed-off-by: William Baker &lt;William.Baker@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: add progress to write_midx_file</title>
<updated>2019-10-23T03:05:05Z</updated>
<author>
<name>William Baker</name>
<email>William.Baker@microsoft.com</email>
</author>
<published>2019-10-21T18:39:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=840cef0c70e4c664d814d09f304d4be9a63d11e4'/>
<id>urn:sha1:840cef0c70e4c664d814d09f304d4be9a63d11e4</id>
<content type='text'>
Add progress to write_midx_file.  Progress is displayed
when the MIDX_PROGRESS flag is set.

Signed-off-by: William Baker &lt;William.Baker@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: add MIDX_PROGRESS flag</title>
<updated>2019-10-23T03:05:05Z</updated>
<author>
<name>William Baker</name>
<email>William.Baker@microsoft.com</email>
</author>
<published>2019-10-21T18:39:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=efbc3aee08dfac70d426cca93cc5cfc0f14f8ee7'/>
<id>urn:sha1:efbc3aee08dfac70d426cca93cc5cfc0f14f8ee7</id>
<content type='text'>
Add the MIDX_PROGRESS flag and update the
write|verify|expire|repack functions in midx.h
to accept a flags parameter.  The MIDX_PROGRESS
flag indicates whether the caller of the function
would like progress information to be displayed.
This patch only changes the method prototypes
and does not change the functionality. The
functionality change will be handled by a later patch.

Signed-off-by: William Baker &lt;William.Baker@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: switch to using the_hash_algo</title>
<updated>2019-08-19T22:05:00Z</updated>
<author>
<name>brian m. carlson</name>
<email>sandals@crustytoothpaste.net</email>
</author>
<published>2019-08-18T20:04:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=aaa95dfa051fc517dde0fa31187c744094cd17c5'/>
<id>urn:sha1:aaa95dfa051fc517dde0fa31187c744094cd17c5</id>
<content type='text'>
Instead of hard-coding the hash size, use the_hash_algo to look up the
hash size at runtime.  Remove the #define constant which was used to
hold the hash length, since writing the expression with the_hash_algo
provides enough documentary value on its own.

Signed-off-by: brian m. carlson &lt;sandals@crustytoothpaste.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>midx: implement midx_repack()</title>
<updated>2019-06-11T17:34:40Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-06-10T23:35:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ce1e4a105b4ff2457f2537bc703863175f9195c9'/>
<id>urn:sha1:ce1e4a105b4ff2457f2537bc703863175f9195c9</id>
<content type='text'>
To repack with a non-zero batch-size, first sort all pack-files by
their modified time. Second, walk those pack-files from oldest
to newest, compute their expected size, and add the packs to a list
if they are smaller than the given batch-size. Stop when the total
expected size is at least the batch size.

If the batch size is zero, select all packs in the multi-pack-index.
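
The selection rule above can be sketched in Python. This is an
illustration only, not the C code in midx.c: the Pack record, its
field names, and select_packs() are hypothetical, and the expected
size formula (pack size scaled by the fraction of objects the
multi-pack-index references) is an assumption, since this message
does not spell it out.

```python
from dataclasses import dataclass


@dataclass
class Pack:
    # Hypothetical record; the field names are illustrative only.
    name: str
    mtime: int        # modified time of the pack-file
    size: int         # on-disk size of the pack-file in bytes
    num_objects: int  # objects stored in the pack-file
    referenced: int   # objects the multi-pack-index takes from this pack


def expected_size(pack):
    # Assumed estimate: scale the pack size by the referenced fraction.
    if pack.num_objects == 0:
        return 0
    return pack.size * pack.referenced // pack.num_objects


def select_packs(packs, batch_size):
    # A batch size of zero selects every pack in the multi-pack-index.
    if batch_size == 0:
        return [p.name for p in packs]
    selected = []
    total = 0
    # Walk the packs from oldest to newest.
    for pack in sorted(packs, key=lambda p: p.mtime):
        # Stop once the total expected size is at least the batch size.
        if total >= batch_size:
            break
        size = expected_size(pack)
        # Only packs smaller than the batch size are candidates.
        if size >= batch_size:
            continue
        selected.append(pack.name)
        total += size
    return selected
```

Under these assumptions, a pack whose expected size is at least the
batch size is skipped rather than ending the walk, so a newer small
pack can still join the batch.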

Finally, collect the objects from the multi-pack-index that are in
the selected packs and send them to 'git pack-objects'. Write a new
multi-pack-index that includes the new pack.

Using a batch size of zero is very similar to a standard 'git repack'
command, except that we do not delete the old packs and instead rely
on the new multi-pack-index to prevent new processes from reading the
old packs. This does not disrupt other Git processes that are currently
reading the old packs based on the old multi-pack-index.

While first designing a 'git multi-pack-index repack' operation, I
started by collecting the batches based on the actual size of the
objects instead of the size of the pack-files. This allows repacking
a large pack-file that has very few referenced objects. However, this
came at a significant cost of parsing pack-files instead of simply
reading the multi-pack-index and getting the file information for
the pack-files. The "expected size" version provides similar
behavior, but could skip a pack-file if the average object size is
much larger than the actual size of the referenced objects, or
can create a large pack if the actual size of the referenced objects
is larger than the expected size.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>multi-pack-index: prepare 'repack' subcommand</title>
<updated>2019-06-11T17:34:40Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-06-10T23:35:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=2af890bb28013ae9edf8f5de7873fac849d39c32'/>
<id>urn:sha1:2af890bb28013ae9edf8f5de7873fac849d39c32</id>
<content type='text'>
In an environment where the multi-pack-index is useful, it is due
to many pack-files and an inability to repack the object store
into a single pack-file. However, it is likely that many of these
pack-files are rather small, and could be repacked into a slightly
larger pack-file without too much effort. It may also be important
to ensure the object store is highly available and the repack
operation does not interrupt concurrent git commands.

Introduce a 'repack' subcommand to 'git multi-pack-index' that
takes a '--batch-size' option. The subcommand will inspect the
multi-pack-index for referenced pack-files whose size is smaller
than the batch size, until collecting a list of pack-files whose
sizes sum to larger than the batch size. Then, a new pack-file
will be created containing the objects from those pack-files that
are referenced by the multi-pack-index. The resulting pack is
likely to actually be smaller than the batch size due to
compression and the fact that there may be objects in the pack-
files that have duplicate copies in other pack-files.

The current change introduces the command-line arguments, and we
add a test that ensures we parse these options properly. Since
we specify a small batch size, we will guarantee that future
implementations do not change the list of pack-files.

In addition, we hard-code the modified times of the packs in
the pack directory to ensure the list of packs sorted by modified
time matches the order if sorted by size (ascending). This will
be important in a future test.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>multi-pack-index: implement 'expire' subcommand</title>
<updated>2019-06-11T17:34:40Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-06-10T23:35:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=19575c7c8e60c95c6714e51039ee9a21721cc31d'/>
<id>urn:sha1:19575c7c8e60c95c6714e51039ee9a21721cc31d</id>
<content type='text'>
The 'git multi-pack-index expire' subcommand looks at the existing
multi-pack-index, counts the number of objects referenced in each
pack-file, deletes the pack-files with no referenced objects, and
rewrites the multi-pack-index to no longer reference those packs.
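
The counting step can be sketched in Python. This is an illustration
only, not the C code in midx.c: packs_to_expire() and the
object_to_pack mapping (object ID to the pack-file the
multi-pack-index uses for that object) are hypothetical names.

```python
from collections import Counter


def packs_to_expire(pack_names, object_to_pack):
    # Count how many objects the index references in each pack-file.
    counts = Counter(object_to_pack.values())
    # Packs with no referenced objects are the ones to delete and to
    # drop when the multi-pack-index is rewritten.
    return [name for name in pack_names if counts[name] == 0]
```

For example, if every object is served from a newer pack that covers
two older ones, the two older packs are returned.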

Refactor the write_midx_file() method to call write_midx_internal()
which now takes an existing 'struct multi_pack_index' and a list
of pack-files to drop (as specified by the names of their pack-
indexes). As we write the new multi-pack-index, we drop those
file names from the list of known pack-files.

The expire_midx_packs() method removes the unreferenced pack-files
after carefully closing the packs to avoid open handles.

Test that a new pack-file that covers the contents of two other
pack-files leads to those pack-files being deleted during the
expire subcommand. Be sure to read the multi-pack-index to ensure
it no longer references those packs.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
