<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/diff-lib.c, branch v1.6.4.5</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.6.4.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.6.4.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2009-06-21T04:47:30Z</updated>
<entry>
<title>Merge branch 'jc/cache-tree'</title>
<updated>2009-06-21T04:47:30Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2009-06-21T04:47:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=c28a17f270a51a4ed5e432e83c0ed962361a37c9'/>
<id>urn:sha1:c28a17f270a51a4ed5e432e83c0ed962361a37c9</id>
<content type='text'>
* jc/cache-tree:
  Avoid "diff-index --cached" optimization under --find-copies-harder
  Optimize "diff-index --cached" using cache-tree
  t4007: modernize the style
  cache-tree.c::cache_tree_find(): simplify internal API
  write-tree --ignore-cache-tree
</content>
</entry>
<entry>
<title>Avoid "diff-index --cached" optimization under --find-copies-harder</title>
<updated>2009-05-25T18:35:29Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2009-05-23T06:14:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=a0919ced8a5efe938cf97c74a0f851cbbe00aaf6'/>
<id>urn:sha1:a0919ced8a5efe938cf97c74a0f851cbbe00aaf6</id>
<content type='text'>
When find-copies-harder is in effect, the diff frontends are expected to
feed all paths, not just changed paths, to the diffcore, so that copy
sources can be picked up.  In such a case, not descending into subtrees
using the cache-tree information is simply wrong.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Optimize "diff-index --cached" using cache-tree</title>
<updated>2009-05-25T18:35:29Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2009-05-20T22:57:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=b65982b60876c8f5f4d3b2898d5174f4812552b1'/>
<id>urn:sha1:b65982b60876c8f5f4d3b2898d5174f4812552b1</id>
<content type='text'>
When running "diff-index --cached" after making a change to only a small
portion of the index, there is no point unpacking unchanged subtrees into
the index recursively, only to find that all entries match anyway.  Tweak
unpack_trees() logic that is used to read in the tree object to catch the
case where the tree entry we are looking at matches the index as a whole
by looking at the cache-tree.

As an exercise, after modifying a few paths in the kernel tree, here are
a few numbers on my Athlon 64X2 3800+:

    (without patch, hot cache)
    $ /usr/bin/time git diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.07user 0.02system 0:00.09elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+9407minor)pagefaults 0swaps

    (with patch, hot cache)
    $ /usr/bin/time ../git.git/git-diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.02user 0.00system 0:00.02elapsed 103%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+2446minor)pagefaults 0swaps

Cold cache numbers are very impressive, but it does not matter very much
in practice:

    (without patch, cold cache)
    $ su root sh -c 'echo 3 &gt;/proc/sys/vm/drop_caches'
    $ /usr/bin/time git diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.06user 0.17system 0:10.26elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
    247032inputs+0outputs (1172major+8237minor)pagefaults 0swaps

    (with patch, cold cache)
    $ su root sh -c 'echo 3 &gt;/proc/sys/vm/drop_caches'
    $ /usr/bin/time ../git.git/git-diff --cached --raw
    :100644 100644 b57e1f5... e69de29... M  Makefile
    :100644 000000 8c86b72... 0000000... D  arch/x86/Makefile
    :000000 100644 0000000... e69de29... A  arche
    0.02user 0.01system 0:01.01elapsed 3%CPU (0avgtext+0avgdata 0maxresident)k
    18440inputs+0outputs (79major+2369minor)pagefaults 0swaps

This of course helps "git status" as well.

    (without patch, hot cache)
    $ /usr/bin/time ../git.git/git-status &gt;/dev/null
    0.17user 0.18system 0:00.35elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+10970minor)pagefaults 0swaps

    (with patch, hot cache)
    $ /usr/bin/time ../git.git/git-status &gt;/dev/null
    0.10user 0.16system 0:00.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+3921minor)pagefaults 0swaps

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'lt/maint-diff-reduce-lstat'</title>
<updated>2009-05-23T08:40:33Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2009-05-23T08:40:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=3ed24211d48d9a297b3784347acbf7e0265a58f3'/>
<id>urn:sha1:3ed24211d48d9a297b3784347acbf7e0265a58f3</id>
<content type='text'>
* lt/maint-diff-reduce-lstat:
  Teach 'git checkout' to preload the index contents
  Avoid unnecessary 'lstat()' calls in 'get_stat_data()'
</content>
</entry>
<entry>
<title>Avoid unnecessary 'lstat()' calls in 'get_stat_data()'</title>
<updated>2009-05-10T03:42:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2009-05-09T21:57:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=658dd48c8572d0db49719cbef6605d384621d87c'/>
<id>urn:sha1:658dd48c8572d0db49719cbef6605d384621d87c</id>
<content type='text'>
When we ask get_stat_data() to get the mode and size of an index entry,
we can avoid the lstat() call if we have marked the index entry as being
uptodate due to earlier lstat() calls.

This avoids a lot of unnecessary lstat() calls in eg 'git checkout',
where the last phase shows the differences to the working tree
(requiring a diff), but earlier phases have already verified the index.

On the kernel repo (with a fast machine and everything cached), this
changes timings of a nul 'git checkout' from

 - Before (best of ten):

	0.14user 0.05system 0:00.19elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+13237minor)pagefaults 0swaps

 - After
	0.11user 0.03system 0:00.15elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+13235minor)pagefaults 0swaps

so it can obviously be noticeable, although equally obviously it's not a
show-stopper on this particular machine. The difference is likely larger
on slower machines, or with operating systems that don't do as good a job
of name caching.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'kb/checkout-optim'</title>
<updated>2009-03-18T01:54:31Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2009-03-18T01:54:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=a9bfe813094cf2c8ea0e30c3196070c868fb294c'/>
<id>urn:sha1:a9bfe813094cf2c8ea0e30c3196070c868fb294c</id>
<content type='text'>
* kb/checkout-optim:
  Revert "lstat_cache(): print a warning if doing ping-pong between cache types"
  checkout bugfix: use stat.mtime instead of stat.ctime in two places
  Makefile: Set compiler switch for USE_NSEC
  Create USE_ST_TIMESPEC and turn it on for Darwin
  Not all systems use st_[cm]tim field for ns resolution file timestamp
  Record ns-timestamps if possible, but do not use it without USE_NSEC
  write_index(): update index_state-&gt;timestamp after flushing to disk
  verify_uptodate(): add ce_uptodate(ce) test
  make USE_NSEC work as expected
  fix compile error when USE_NSEC is defined
  check_updates(): effective removal of cache entries marked CE_REMOVE
  lstat_cache(): print a warning if doing ping-pong between cache types
  show_patch_diff(): remove a call to fstat()
  write_entry(): use fstat() instead of lstat() when file is open
  write_entry(): cleanup of some duplicated code
  create_directories(): remove some memcpy() and strchr() calls
  unlink_entry(): introduce schedule_dir_for_removal()
  lstat_cache(): swap func(length, string) into func(string, length)
  lstat_cache(): generalise longest_match_lstat_cache()
  lstat_cache(): small cleanup and optimisation
</content>
</entry>
<entry>
<title>Generalize and libify index_is_dirty() to index_differs_from(...)</title>
<updated>2009-02-11T06:25:39Z</updated>
<author>
<name>Stephan Beyer</name>
<email>s-beyer@gmx.net</email>
</author>
<published>2009-02-10T14:30:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=75f3ff2eeaba820b37016f464b6d1078cb6260e2'/>
<id>urn:sha1:75f3ff2eeaba820b37016f464b6d1078cb6260e2</id>
<content type='text'>
index_is_dirty() in builtin-revert.c checks if the index is dirty.
This patch generalizes this function to check if the index differs
from a revision, i.e. the former index_is_dirty() behavior can now be
achieved by index_differs_from("HEAD", 0).

The second argument "diff_flags" allows to set further diff option
flags like DIFF_OPT_IGNORE_SUBMODULES. See DIFF_OPT_* macros in diff.h
for a list.

index_differs_from() seems to be useful for more than builtin-revert.c,
so it is moved into diff-lib.c and also used in builtin-commit.c.

Yet to mention:

 - "rev.abbrev = 0;" can be safely removed.
   This has no impact on performance or functioning of neither
   setup_revisions() nor run_diff_index().

 - rev.pending.objects is free()d because this fixes a leak.
   (Also see 295dd2ad "Fix memory leak in traverse_commit_list")

Mentored-by: Daniel Barkalow &lt;barkalow@iabervon.org&gt;
Mentored-by: Christian Couder &lt;chriscool@tuxfamily.org&gt;
Signed-off-by: Stephan Beyer &lt;s-beyer@gmx.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>lstat_cache(): swap func(length, string) into func(string, length)</title>
<updated>2009-02-10T04:59:26Z</updated>
<author>
<name>Kjetil Barvik</name>
<email>barvik@broadpark.no</email>
</author>
<published>2009-02-09T20:54:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=571998921d8fd4ee674545406aabb86987921252'/>
<id>urn:sha1:571998921d8fd4ee674545406aabb86987921252</id>
<content type='text'>
Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.

Also, add a note about this fact into the coding guidelines.

Signed-off-by: Kjetil Barvik &lt;barvik@broadpark.no&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Cleanup of unused symcache variable inside diff-lib.c</title>
<updated>2009-01-11T23:56:55Z</updated>
<author>
<name>Kjetil Barvik</name>
<email>barvik@broadpark.no</email>
</author>
<published>2009-01-11T18:36:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ff7e6aad6d995a52e88cfeaae10cf8a768ff2358'/>
<id>urn:sha1:ff7e6aad6d995a52e88cfeaae10cf8a768ff2358</id>
<content type='text'>
Commit c40641b77b0274186fd1b327d5dc3246f814aaaf, 'Optimize
symlink/directory detection' by Linus Torvalds, removed the 'char
*symcache' parameter to the has_symlink_leading_path() function.  This
made all variables currently named 'symcache' inside diff-lib.c
unnecessary.

This also let us throw away the 'struct oneway_unpack_data', and
instead directly use the 'struct rev_info *revs' member, which
was the only member left after removal of the 'symcache[] array'
member.  The 'struct oneway_unpack_data' was introduced by the
following commit:

  948dd346  "diff-files: careful when inspecting work tree items"

Impact: cleanup
        PATH_MAX bytes less memory stack usage in some cases

Signed-off-by: Kjetil Barvik &lt;barvik@broadpark.no&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diff: vary default prefix depending on what are compared</title>
<updated>2008-08-31T03:53:24Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2008-08-19T03:08:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=a5a818ee4877e4458e8e6741a03ac3b19941d58a'/>
<id>urn:sha1:a5a818ee4877e4458e8e6741a03ac3b19941d58a</id>
<content type='text'>
With a new configuration "diff.mnemonicprefix", "git diff" shows the
differences between various combinations of preimage and postimage trees
with prefixes different from the standard "a/" and "b/".  Hopefully this
will make the distinction stand out for some people.

    "git diff" compares the (i)ndex and the (w)ork tree;
    "git diff HEAD" compares a (c)ommit and the (w)ork tree;
    "git diff --cached" compares a (c)ommit and the (i)ndex;
    "git-diff HEAD:file1 file2" compares an (o)bject and a (w)ork tree entity;
    "git diff --no-index a b" compares two non-git things (1) and (2).

Because these mnemonics now have meanings, they are swapped when reverse
diff is in effect and this feature is enabled.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
