<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/diffcore.h, branch v1.5.4.4</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.5.4.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.5.4.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2007-10-27T06:18:06Z</updated>
<entry>
<title>copy vs rename detection: avoid unnecessary O(n*m) loops</title>
<updated>2007-10-27T06:18:06Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2007-10-25T18:20:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=644797119d7a3b7a043a51a9cccd8758f8451f91'/>
<id>urn:sha1:644797119d7a3b7a043a51a9cccd8758f8451f91</id>
<content type='text'>
The core rename detection had some rather stupid code to check if a
pathname was used by a later modification or rename, which basically
walked the whole pathname space for all renames for each rename, in
order to tell whether it was a pure rename (no remaining users) or
should be considered a copy (other users of the source file remaining).

That's really silly, since we can just keep a count of users around, and
replace all those complex and expensive loops with just testing that
simple counter (but this all depends on the previous commit that shared
the diff_filespec data structure by using a separate reference count).

Note that the reference count is not the same as the rename count: they
behave otherwise rather similarly, but the reference count is tied to
the allocation (and decremented at de-allocation, so that when it turns
zero we can get rid of the memory), while the rename count is tied to
the renames and is decremented when we find a rename (so that when it
turns zero we know that it was a rename, not a copy).

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Ref-count the filespecs used by diffcore</title>
<updated>2007-10-27T06:18:05Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2007-10-25T18:19:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=9fb88419ba85e641006c80db53620423f37f1c93'/>
<id>urn:sha1:9fb88419ba85e641006c80db53620423f37f1c93</id>
<content type='text'>
Rather than copy the filespecs when introducing new versions of them
(for rename or copy detection), use a refcount and increment the count
when reusing the diff_filespec.

This avoids unnecessary allocations, but the real reason behind this is
a future enhancement: we will want to track shared data across the
copy/rename detection.  In order to efficiently notice when a filespec
is used by a rename, the rename machinery wants to keep track of a
rename usage count which is shared across all different users of the
filespec.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>rename diff_free_filespec_data_large() to diff_free_filespec_blob()</title>
<updated>2007-10-03T04:02:09Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2007-10-03T04:01:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=8ae92e63895000ff9b12046325ae381f3c17d414'/>
<id>urn:sha1:8ae92e63895000ff9b12046325ae381f3c17d414</id>
<content type='text'>
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diffcore-rename: cache file deltas</title>
<updated>2007-10-03T04:02:03Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2007-09-25T19:29:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=eede7b7d110e2c354235d7a3f6c8f1644b5120e5'/>
<id>urn:sha1:eede7b7d110e2c354235d7a3f6c8f1644b5120e5</id>
<content type='text'>
We find rename candidates by computing a fingerprint hash of
each file, and then comparing those fingerprints. There are
inherently O(n^2) comparisons, so it pays in CPU time to
hoist the (rather expensive) computation of the fingerprint
out of that loop (or to cache it once we have computed it once).

Previously, we didn't keep the filespec information around
because then we had the potential to consume a great deal of
memory. However, instead of keeping all of the filespec
data, we can instead just keep the fingerprint.

This patch implements and uses diff_free_filespec_data_large
to accomplish that goal. We also have to change
estimate_similarity not to needlessly repopulate the
filespec data when we already have the hash.

Practical tests showed 4.5x speedup for a 10% memory usage
increase.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Fix configuration syntax to specify customized hunk header patterns.</title>
<updated>2007-07-07T08:49:58Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2007-07-07T08:49:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=e0e324a4dc18a4341e1320a7cfac9733d81f8b0b'/>
<id>urn:sha1:e0e324a4dc18a4341e1320a7cfac9733d81f8b0b</id>
<content type='text'>
This updates the hunk header customization syntax.  The special
case 'funcname' attribute is gone.

You assign the name of the type of contents to path's "diff"
attribute as a string value in .gitattributes like this:

	*.java diff=java
	*.perl diff=perl
	*.doc diff=doc

If you supply "diff.&lt;name&gt;.funcname" variable via the
configuration mechanism (e.g. in $HOME/.gitconfig), the value is
used as the regexp set to find the line to use for the hunk
header (the variable is called "funcname" because such a line
typically is the one that has the name of the function in
programming language source text).

If there is no such configuration, built-in default is used, if
any.  Currently there are two default patterns: default and java.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Per-path attribute based hunk header selection.</title>
<updated>2007-07-06T08:20:47Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2007-07-06T07:45:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=f258475a6ede3617ae768b69e33f78cbab8312de'/>
<id>urn:sha1:f258475a6ede3617ae768b69e33f78cbab8312de</id>
<content type='text'>
This makes"diff -p" hunk headers customizable via gitattributes mechanism.
It is based on Johannes's earlier patch that allowed to define a single
regexp to be used for everything.

The mechanism to arrive at the regexp that is used to define hunk header
is the same as other use of gitattributes.  You assign an attribute, funcname
(because "diff -p" typically uses the name of the function the patch is about
as the hunk header), a simple string value.  This can be one of the names of
built-in pattern (currently, "java" is defined) or a custom pattern name, to
be looked up from the configuration file.

  (in .gitattributes)
  *.java   funcname=java
  *.perl   funcname=perl

  (in .git/config)
  [funcname]
    java = ... # ugly and complicated regexp to override the built-in one.
    perl = ... # another ugly and complicated regexp to define a new one.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Introduce diff_filespec_is_binary()</title>
<updated>2007-07-06T07:21:41Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2007-07-06T07:18:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=29a3eefde111f6a24292163c4308f00ab3572627'/>
<id>urn:sha1:29a3eefde111f6a24292163c4308f00ab3572627</id>
<content type='text'>
This replaces an explicit initialization of filespec-&gt;is_binary
field used for rename/break followed by direct access to that
field with a wrapper function that lazily iniaitlizes and
accesses the field.  We would add more attribute accesses for
the use of diff routines, and it would be better to make this
abstraction earlier.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diffcore_filespec: add is_binary</title>
<updated>2007-07-01T03:51:31Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2007-06-29T05:56:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=706098af6b53403b5ea3db9216fb99afbe06f52d'/>
<id>urn:sha1:706098af6b53403b5ea3db9216fb99afbe06f52d</id>
<content type='text'>
diffcore-break and diffcore-rename would want to behave slightly
differently depending on the binary-ness of the data, so add one
bit to the filespec, as the structure is now passed down to
diffcore_count_changes() function.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>diffcore_count_changes: pass diffcore_filespec</title>
<updated>2007-07-01T03:51:31Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2007-06-29T05:54:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=d8c3d03a0b7f10977dd508a5a965a417b7f1b065'/>
<id>urn:sha1:d8c3d03a0b7f10977dd508a5a965a417b7f1b065</id>
<content type='text'>
We may want to use richer information on the data we are dealing
with in this function, so instead of passing a buffer address
and length, just pass the diffcore_filespec structure.  Existing
callers always call this function with parameters taken from a
filespec anyway, so there is no functionality changes.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Make macros to prevent double-inclusion in headers consistent.</title>
<updated>2007-04-29T09:05:11Z</updated>
<author>
<name>Junio C Hamano</name>
<email>junkio@cox.net</email>
</author>
<published>2007-04-29T06:38:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=e0173ad9fcabfcabe54d65be2c8b6602f61079e6'/>
<id>urn:sha1:e0173ad9fcabfcabe54d65be2c8b6602f61079e6</id>
<content type='text'>
Signed-off-by: Junio C Hamano &lt;junkio@cox.net&gt;
</content>
</entry>
</feed>
