<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/commit-graph.h, branch v2.26.0-rc2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2020-02-04T19:36:51Z</updated>
<entry>
<title>commit-graph.h: use odb in 'load_commit_graph_one_fd_st'</title>
<updated>2020-02-04T19:36:51Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-02-03T21:18:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=a7df60cac834cb7f91c3463205c2b6ba5e694200'/>
<id>urn:sha1:a7df60cac834cb7f91c3463205c2b6ba5e694200</id>
<content type='text'>
Apply a similar treatment as in the previous patch to pass a 'struct
object_directory *' through the 'load_commit_graph_one_fd_st'
initializer, too.

This prevents a potential bug where a pointer comparison is made to a
NULL 'g-&gt;odb', which would cause the commit-graph machinery to think
that a pair of commit-graphs belonged to different alternates when in
fact they do not (i.e., in the case of no '--object-dir').

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph.c: remove path normalization, comparison</title>
<updated>2020-02-04T19:36:51Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-02-03T21:18:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ad2dd5bb63b6c2da0091d63463e29ae27e47a893'/>
<id>urn:sha1:ad2dd5bb63b6c2da0091d63463e29ae27e47a893</id>
<content type='text'>
As of the previous patch, all calls to 'commit-graph.c' functions which
perform path normalization (for e.g., 'get_commit_graph_filename()') are
of the form 'ctx-&gt;odb-&gt;path', which is always in normalized form.

Now that there are no callers passing non-normalized paths to these
functions, ensure that future callers are bound by the same restrictions
by making these functions take a 'struct object_directory *' instead of
a 'const char *'. To match, replace all calls with arguments of the form
'ctx-&gt;odb-&gt;path' with 'ctx-&gt;odb' To recover the path, functions that
perform path manipulation simply use 'odb-&gt;path'.

Further, avoid string comparisons with arguments of the form
'odb-&gt;path', and instead prefer raw pointer comparisons, which
accomplish the same effect, but are far less brittle.

This has a pleasant side-effect of making these functions much more
robust to paths that cannot be normalized by 'normalize_path_copy()',
i.e., because they are outside of the current working directory.

For example, prior to this patch, Valgrind reports that the following
uninitialized memory read [1]:

  $ ( cd t &amp;&amp; GIT_DIR=../.git valgrind git rev-parse HEAD^ )

because 'normalize_path_copy()' can't normalize '../.git' (since it's
relative to but above of the current working directory) [2].

By using a 'struct object_directory *' directly,
'get_commit_graph_filename()' does not need to normalize, because all
paths are relative to the current working directory since they are
always read from the '-&gt;path' of an object directory.

[1]: https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net.
[2]: The bug here is that 'get_commit_graph_filename()' returns the
     result of 'normalize_path_copy()' without checking the return
     value.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph.h: store object directory in 'struct commit_graph'</title>
<updated>2020-02-04T19:36:51Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-02-03T21:18:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=13c249924995d504001eb8083fac106041b32f98'/>
<id>urn:sha1:13c249924995d504001eb8083fac106041b32f98</id>
<content type='text'>
In a previous patch, the 'char *object_dir' in 'struct commit_graph' was
replaced with a 'struct object_directory'. This patch applies the same
treatment to 'struct commit_graph', which is another intermediate step
towards getting rid of all path normalization in 'commit-graph.c'.

Instead of taking a 'char *object_dir', functions that construct a
'struct commit_graph' now take a 'struct object_directory *'. Any code
that needs an object directory path use '-&gt;path' instead.

This ensures that all calls to functions that perform path normalization
are given arguments which do not themselves require normalization. This
prepares those functions to drop their normalization entirely, which
will occur in the subsequent patch.

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph.h: store an odb in 'struct write_commit_graph_context'</title>
<updated>2020-02-04T19:36:37Z</updated>
<author>
<name>Taylor Blau</name>
<email>me@ttaylorr.com</email>
</author>
<published>2020-02-04T05:51:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0bd52e27e315a76d6950fa30ce602eef15d90571'/>
<id>urn:sha1:0bd52e27e315a76d6950fa30ce602eef15d90571</id>
<content type='text'>
There are lots of places in 'commit-graph.h' where a function either has
(or almost has) a full 'struct object_directory *', accesses '-&gt;path',
and then throws away the rest of the struct.

This can cause headaches when comparing the locations of object
directories across alternates (e.g., in the case of deciding if two
commit-graph layers can be merged). These paths are normalized with
'normalize_path_copy()' which mitigates some comparison issues, but not
all [1].

Replace usage of 'char *object_dir' with 'odb-&gt;path' by storing a
'struct object_directory *' in the 'write_commit_graph_context'
structure. This is an intermediate step towards getting rid of all path
normalization in 'commit-graph.c'.

Resolving a user-provided '--object-dir' argument now requires that we
compare it to the known alternates for equality.  Prior to this patch,
an unknown '--object-dir' argument would silently exit with status zero.

This can clearly lead to unintended behavior, such as verifying
commit-graphs that aren't in a repository's own object store (or one of
its alternates), or causing a typo to mask a legitimate commit-graph
verification failure. Make this error non-silent by 'die()'-ing when the
given '--object-dir' does not match any known alternate object store.

[1]: In my testing, for example, I can get one side of the commit-graph
code to fill object_dir with "./objects" and the other with just
"objects".

Signed-off-by: Taylor Blau &lt;me@ttaylorr.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>upload-pack: disable commit graph more gently for shallow traversal</title>
<updated>2019-09-12T19:30:08Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2019-09-12T14:44:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=6abada1880156df9c6a0de91fb23716e378e13d7'/>
<id>urn:sha1:6abada1880156df9c6a0de91fb23716e378e13d7</id>
<content type='text'>
When the client has asked for certain shallow options like
"deepen-since", we do a custom rev-list walk that pretends to be
shallow. Before doing so, we have to disable the commit-graph, since it
is not compatible with the shallow view of the repository. That's
handled by 829a321569 (commit-graph: close_commit_graph before shallow
walk, 2018-08-20). That commit literally closes and frees our
repo-&gt;objects-&gt;commit_graph struct.

That creates an interesting problem for commits that have _already_ been
parsed using the commit graph. Their commit-&gt;object.parsed flag is set,
their commit-&gt;graph_pos is set, but their commit-&gt;maybe_tree may still
be NULL. When somebody later calls repo_get_commit_tree(), we see that
we haven't loaded the tree oid yet and try to get it from the commit
graph. But since it has been freed, we segfault!

So the root of the issue is a data dependency between the commit's
lazy-load of the tree oid and the fact that the commit graph can go
away mid-process. How can we resolve it?

There are a couple of general approaches:

  1. The obvious answer is to avoid loading the tree from the graph when
     we see that it's NULL. But then what do we return for the tree oid?
     If we return NULL, our caller in do_traverse() will rightly
     complain that we have no tree. We'd have to fallback to loading the
     actual commit object and re-parsing it. That requires teaching
     parse_commit_buffer() to understand re-parsing (i.e., not starting
     from a clean slate and not leaking any allocated bits like parent
     list pointers).

  2. When we close the commit graph, walk through the set of in-memory
     objects and clear any graph_pos pointers. But this means we also
     have to "unparse" any such commits so that we know they still need
     to open the commit object to fill in their trees. So it's no less
     complicated than (1), and is more expensive (since we clear objects
     we might not later need).

  3. Stop freeing the commit-graph struct. Continue to let it be used
     for lazy-loads of tree oids, but let upload-pack specify that it
     shouldn't be used for further commit parsing.

  4. Push the whole shallow rev-list out to its own sub-process, with
     the commit-graph disabled from the start, giving it a clean memory
     space to work from.

I've chosen (3) here. Options (1) and (2) would work, but are
non-trivial to implement. Option (4) is more expensive, and I'm not sure
how complicated it is (shelling out for the actual rev-list part is
easy, but we do then parse the resulting commits internally, and I'm not
clear which parts need to be handling shallow-ness).

The new test in t5500 triggers this segfault, but see the comments there
for how horribly intimate it has to be with how both upload-pack and
commit graphs work.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: error out on invalid commit oids in 'write --stdin-commits'</title>
<updated>2019-08-05T21:33:39Z</updated>
<author>
<name>SZEDER Gábor</name>
<email>szeder.dev@gmail.com</email>
</author>
<published>2019-08-05T08:02:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=7c5c9b9c57d58273d17dfc3fec3ebdb25077a9de'/>
<id>urn:sha1:7c5c9b9c57d58273d17dfc3fec3ebdb25077a9de</id>
<content type='text'>
While 'git commit-graph write --stdin-commits' expects commit object
ids as input, it accepts and silently skips over any invalid commit
object ids, and still exits with success:

  # nonsense
  $ echo not-a-commit-oid | git commit-graph write --stdin-commits
  $ echo $?
  0
  # sometimes I forgot that refs are not good...
  $ echo HEAD | git commit-graph write --stdin-commits
  $ echo $?
  0
  # valid tree OID, but not a commit OID
  $ git rev-parse HEAD^{tree} | git commit-graph write --stdin-commits
  $ echo $?
  0
  $ ls -l .git/objects/info/commit-graph
  ls: cannot access '.git/objects/info/commit-graph': No such file or directory

Check that all input records are indeed valid commit object ids and
return with error otherwise, the same way '--stdin-packs' handles
invalid input; see e103f7276f (commit-graph: return with errors during
write, 2019-06-12).

Note that it should only return with error when encountering an
invalid commit object id coming from standard input.  However,
'--reachable' uses the same code path to process object ids pointed to
by all refs, and that includes tag object ids as well, which should
still be skipped over.  Therefore add a new flag to 'enum
commit_graph_write_flags' and a corresponding field to 'struct
write_commit_graph_context', so we can differentiate between those two
cases.

Signed-off-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Acked-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: turn a group of write-related macro flags into an enum</title>
<updated>2019-08-05T21:31:24Z</updated>
<author>
<name>SZEDER Gábor</name>
<email>szeder.dev@gmail.com</email>
</author>
<published>2019-08-05T08:02:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=39d88318569188c0544c3c0f8207f2e1b1829e60'/>
<id>urn:sha1:39d88318569188c0544c3c0f8207f2e1b1829e60</id>
<content type='text'>
Signed-off-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Acked-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: verify chains with --shallow mode</title>
<updated>2019-06-20T03:46:26Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-06-18T18:14:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=3da4b609bb14b13672f64af908706462617f53cb'/>
<id>urn:sha1:3da4b609bb14b13672f64af908706462617f53cb</id>
<content type='text'>
If we wrote a commit-graph chain, we only modified the tip file in
the chain. It is valuable to verify what we wrote, but not waste
time checking files we did not write.

Add a '--shallow' option to the 'git commit-graph verify' subcommand
and check that it does not read the base graph in a two-file chain.

Making the verify subcommand read from a chain of commit-graphs takes
some rearranging of the builtin code.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: create options for split files</title>
<updated>2019-06-20T03:46:26Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-06-18T18:14:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=c2bc6e6ab0ade70c475a73d06326e677e70840d2'/>
<id>urn:sha1:c2bc6e6ab0ade70c475a73d06326e677e70840d2</id>
<content type='text'>
The split commit-graph feature is now fully implemented, but needs
some more run-time configurability. Allow direct callers to 'git
commit-graph write --split' to specify the values used in the
merge strategy and the expire time.

Update the documentation to specify these values.

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>commit-graph: allow cross-alternate chains</title>
<updated>2019-06-20T03:46:26Z</updated>
<author>
<name>Derrick Stolee</name>
<email>dstolee@microsoft.com</email>
</author>
<published>2019-06-18T18:14:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=c523035cbd8515d095dc15e9661fd896733bedbc'/>
<id>urn:sha1:c523035cbd8515d095dc15e9661fd896733bedbc</id>
<content type='text'>
In an environment like a fork network, it is helpful to have a
commit-graph chain that spans both the base repo and the fork repo. The
fork is usually a small set of data on top of the large repo, but
sometimes the fork is much larger. For example, git-for-windows/git has
almost double the number of commits as git/git because it rebases its
commits on every major version update.

To allow cross-alternate commit-graph chains, we need a few pieces:

1. When looking for a graph-{hash}.graph file, check all alternates.

2. When merging commit-graph chains, do not merge across alternates.

3. When writing a new commit-graph chain based on a commit-graph file
   in another object directory, do not allow success if the base file
   has of the name "commit-graph" instead of
   "commit-graphs/graph-{hash}.graph".

Signed-off-by: Derrick Stolee &lt;dstolee@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
