<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/tree-diff.c, branch v1.8.0.2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.8.0.2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.8.0.2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2012-08-27T18:54:28Z</updated>
<entry>
<title>Merge branch 'jk/maint-null-in-trees'</title>
<updated>2012-08-27T18:54:28Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2012-08-27T18:54:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=3b753148b636be9dc821feebf85cd7f1739b07a1'/>
<id>urn:sha1:3b753148b636be9dc821feebf85cd7f1739b07a1</id>
<content type='text'>
We do not want a link to 0{40} object stored anywhere in our objects.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
</content>
</entry>
<entry>
<title>diff_setup_done(): return void</title>
<updated>2012-08-03T19:11:07Z</updated>
<author>
<name>Thomas Rast</name>
<email>trast@student.ethz.ch</email>
</author>
<published>2012-08-03T12:16:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=28452655af988094792483a51d188c58137760cd'/>
<id>urn:sha1:28452655af988094792483a51d188c58137760cd</id>
<content type='text'>
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09).  The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.

Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.

Note that the function can still die().

Signed-off-by: Thomas Rast &lt;trast@student.ethz.ch&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff: do not use null sha1 as a sentinel value</title>
<updated>2012-07-29T22:04:32Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2012-07-28T15:03:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=e54501004abbd20fa8d813c1e5b82c3b50bb9361'/>
<id>urn:sha1:e54501004abbd20fa8d813c1e5b82c3b50bb9361</id>
<content type='text'>
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>use custom rename score during --follow</title>
<updated>2011-12-16T20:33:49Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2011-12-16T11:27:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=dd98d88be7a2f3782e938701102a96cb09d284da'/>
<id>urn:sha1:dd98d88be7a2f3782e938701102a96cb09d284da</id>
<content type='text'>
If you provide a custom rename score on the command line,
like:

  git log -M50 --follow foo.c

it is completely ignored, and there is no way to --follow
with a looser rename score. Instead, let's use the same
rename score that will be used for generating diffs. This is
convenient, and mirrors what we do with the break-score.

You can see an example of it being useful in git.git:

  $ git log --oneline --summary --follow \
	    Documentation/technical/api-string-list.txt
  86d4b52 string-list: Add API to remove an item from an unsorted list
  1d2f80f string_list: Fix argument order for string_list_append
  e242148 string-list: add unsorted_string_list_lookup()
  0dda1d1 Fix two leftovers from path_list-&gt;string_list
  c455c87 Rename path_list to string_list
   create mode 100644 Documentation/technical/api-string-list.txt

  $ git log --oneline --summary -M40 --follow \
	  Documentation/technical/api-string-list.txt
  86d4b52 string-list: Add API to remove an item from an unsorted list
  1d2f80f string_list: Fix argument order for string_list_append
  e242148 string-list: add unsorted_string_list_lookup()
  0dda1d1 Fix two leftovers from path_list-&gt;string_list
  c455c87 Rename path_list to string_list
   rename Documentation/technical/{api-path-list.txt =&gt; api-string-list.txt} (47%)
  328a475 path-list documentation: document all functions and data structures
  530e741 Start preparing the API documents.
   create mode 100644 Documentation/technical/api-path-list.txt

You could have two separate rename scores, one for following
and one for diff. But almost nobody is going to want that,
and it would just be unnecessarily confusing. Besides which,
we re-use the diff results from try_to_follow_renames for
the actual diff output, which means having them as separate
scores is actively wrong. E.g., with the current code, you
get:

  $ git log --oneline --diff-filter=R --name-status \
            -M90 --follow git.spec.in
  27dedf0 GIT 0.99.9j aka 1.0rc3
  R084    git-core.spec.in        git.spec.in
  f85639c Rename the RPM from "git" to "git-core"
  R098    git.spec.in     git-core.spec.in

The first one should not be considered a rename by the -M
score we gave, but we print it anyway, since we blindly
re-use the diff information from the follow (which uses the
default score). So this could also be considered simply a
bug-fix, as with the current code "-M" is completely ignored
when using "--follow".

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree_entry_interesting(): give meaningful names to return values</title>
<updated>2011-10-27T18:38:24Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2011-10-24T06:36:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=d688cf07b15b664a2164c3d92bcb5e8265400a2b'/>
<id>urn:sha1:d688cf07b15b664a2164c3d92bcb5e8265400a2b</id>
<content type='text'>
It is a basic code hygiene to avoid magic constants that are unnamed.
Besides, this helps extending the value later on for "interesting, but
cannot decide if the entry truely matches yet" (ie. prefix matches)

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>tree-walk.c: do not leak internal structure in tree_entry_len()</title>
<updated>2011-10-27T18:08:26Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2011-10-24T06:36:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0de1633783685e9fb1943551217cdda7edbd245b'/>
<id>urn:sha1:0de1633783685e9fb1943551217cdda7edbd245b</id>
<content type='text'>
tree_entry_len() does not simply take two random arguments and return
a tree length. The two pointers must point to a tree item structure,
or struct name_entry. Passing random pointers will return incorrect
value.

Force callers to pass struct name_entry instead of two pointers (with
hope that they don't manually construct struct name_entry themselves)

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/diff-not-so-quick'</title>
<updated>2011-06-06T18:40:14Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-06-06T18:40:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=456a4c08b8d8ddefda939014c15877ace3e3f499'/>
<id>urn:sha1:456a4c08b8d8ddefda939014c15877ace3e3f499</id>
<content type='text'>
* jk/diff-not-so-quick:
  diff: futureproof "stop feeding the backend early" logic
  diff_tree: disable QUICK optimization with diff filter

Conflicts:
	diff.c
</content>
</entry>
<entry>
<title>diff: futureproof "stop feeding the backend early" logic</title>
<updated>2011-05-31T16:21:36Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-05-31T16:14:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=28b9264dd6cbadcef8b3e48c24ffcb2893b668b3'/>
<id>urn:sha1:28b9264dd6cbadcef8b3e48c24ffcb2893b668b3</id>
<content type='text'>
Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree() that
has the stop-early optimization. We may later add other types of diffcore
transformation that require to look at the whole result like diff-filter
does, and having the logic in a single place is essential for longer term
maintainability.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff_tree: disable QUICK optimization with diff filter</title>
<updated>2011-05-31T16:20:31Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2011-05-31T15:33:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=af7b41c923677ff9291bab56ec7069922e37453b'/>
<id>urn:sha1:af7b41c923677ff9291bab56ec7069922e37453b</id>
<content type='text'>
We stop looking for changes early with QUICK, so our diff
queue contains only a subset of the changes. However, we
don't apply diff filters until later; it will appear at that
point as though there are no changes matching our filter,
when in reality we simply didn't keep looking for changes
long enough.

Commit 2cfe8a6 (diff --quiet: disable optimization when
--diff-filter=X is used, 2011-03-16) fixes this in some
cases by disabling the optimization when a filter is
present. However, it only tweaked run_diff_files, missing
the similar case in diff_tree. Thus the fix worked only for
diffing the working tree and index, but not between trees.

Noticed by Yasushi SHOJI.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'nd/struct-pathspec'</title>
<updated>2011-05-06T17:50:06Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-05-06T17:50:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=1273738f05dedc63865085deb5d1fb8816ff809c'/>
<id>urn:sha1:1273738f05dedc63865085deb5d1fb8816ff809c</id>
<content type='text'>
* nd/struct-pathspec:
  pathspec: rename per-item field has_wildcard to use_wildcard
  Improve tree_entry_interesting() handling code
  Convert read_tree{,_recursive} to support struct pathspec
  Reimplement read_tree_recursive() using tree_entry_interesting()
</content>
</entry>
</feed>
