<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/userdiff.c, branch v1.7.7.4</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.7.7.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.7.7.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2011-08-04T22:53:21Z</updated>
<entry>
<title>Rename git_checkattr() to git_check_attr()</title>
<updated>2011-08-04T22:53:21Z</updated>
<author>
<name>Michael Haggerty</name>
<email>mhagger@alum.mit.edu</email>
</author>
<published>2011-08-04T04:36:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=d932f4eb9fc88e93040635d291f6c195ea2e4ee3'/>
<id>urn:sha1:d932f4eb9fc88e93040635d291f6c195ea2e4ee3</id>
<content type='text'>
Suggested by: Junio Hamano &lt;gitster@pobox.com&gt;

Signed-off-by: Michael Haggerty &lt;mhagger@alum.mit.edu&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/combine-diff-binary-etc'</title>
<updated>2011-06-30T00:03:10Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-06-30T00:03:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=dbae1a13363c45d3f77061f61ca73a0c3ed045cf'/>
<id>urn:sha1:dbae1a13363c45d3f77061f61ca73a0c3ed045cf</id>
<content type='text'>
* jk/combine-diff-binary-etc:
  combine-diff: respect textconv attributes
  refactor get_textconv to not require diff_filespec
  combine-diff: handle binary files as binary
  combine-diff: calculate mode_differs earlier
  combine-diff: split header printing into its own function
</content>
</entry>
<entry>
<title>refactor get_textconv to not require diff_filespec</title>
<updated>2011-05-23T22:46:02Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2011-05-23T20:30:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=3813e69031d2df2702f50b9649fa2e40ea11e558'/>
<id>urn:sha1:3813e69031d2df2702f50b9649fa2e40ea11e558</id>
<content type='text'>
This function actually does two things:

  1. Load the userdiff driver for the filespec.

  2. Decide whether the driver has a textconv component, and
     initialize the textconv cache if applicable.

Only part (1) requires the filespec object, and some callers
may not have a filespec at all. So let's split them it into
two functions, and put part (2) with the userdiff code,
which is a better fit.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>userdiff/perl: tighten BEGIN/END block pattern to reject here-doc delimiters</title>
<updated>2011-05-23T18:39:13Z</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2011-05-22T17:29:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=f143d9c695cd4c3e86069c536fa0dff04fc93e93'/>
<id>urn:sha1:f143d9c695cd4c3e86069c536fa0dff04fc93e93</id>
<content type='text'>
A naive method of treating BEGIN/END blocks with a brace on the second
line as diff/grep funcname context involves also matching unrelated
lines that consist of all-caps letters:

	sub foo {
		print &lt;&lt;'EOF'
	text goes here
	...
	EOF
		... rest of foo ...
	}

That's not so great, because it means that "git diff" and "git grep
--show-function" would write "=EOF" or "@@ EOF" as context instead of
a more useful reminder like "@@ sub foo {".

To avoid this, tighten the pattern to only match the special block
names that perl accepts (namely BEGIN, END, INIT, CHECK, UNITCHECK,
AUTOLOAD, and DESTROY).  The list is taken from perl's toke.c.

Suggested-by: Jakub Narebski &lt;jnareb@gmail.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>userdiff/perl: catch sub with brace on second line</title>
<updated>2011-05-22T05:29:32Z</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2011-05-21T19:38:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ea2ca4497bdb716977a3e2526780635cb6bac513'/>
<id>urn:sha1:ea2ca4497bdb716977a3e2526780635cb6bac513</id>
<content type='text'>
Accept

	sub foo
	{
	}

as an alternative to a more common style that introduces perl
functions with a brace on the first line (and likewise for BEGIN/END
blocks).  The new regex is a little hairy to avoid matching

	# forward declaration
	sub foo;

while continuing to match "sub foo($;@) {" and

	sub foo { # This routine is interesting;
		# in fact, the lines below explain how...

While at it, pay attention to Perl 5.14's "package foo {" syntax as an
alternative to the traditional "package foo;".

Requested-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>userdiff/perl: match full line of POD headers</title>
<updated>2011-05-22T05:29:32Z</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2011-05-21T19:35:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=12f0967a8a1e3c11c678de181f77d1c7883b37cf'/>
<id>urn:sha1:12f0967a8a1e3c11c678de181f77d1c7883b37cf</id>
<content type='text'>
The builtin perl userdiff driver is not greedy enough about catching
POD header lines.  Capture the whole line, so instead of just
declaring that we are in some "@@ =head1" section, diff/grep output
can explain that the enclosing section is about "@@ =head1 OPTIONS".

Reported-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>userdiff/perl: anchor "sub" and "package" patterns on the left</title>
<updated>2011-05-22T05:29:31Z</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2011-05-21T19:29:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=f12c66b9bb851aa7350d40370e6adf78535c5930'/>
<id>urn:sha1:f12c66b9bb851aa7350d40370e6adf78535c5930</id>
<content type='text'>
The userdiff funcname mechanism has no concept of nested scopes ---
instead, "git diff" and "git grep --show-function" simply label the
diff header with the most recent matching line.  Unfortunately that
means text following a subroutine in a POD section:

	=head1 DESCRIPTION

	You might use this facility like so:

		sub example {
			foo;
		}

	Now, having said that, let's say more about the facility.
	Blah blah blah ... etc etc.

gets the subroutine name instead of the POD header in its diff/grep
funcname header, making it harder to get oriented when reading a
diff without enough context.

The fix is simple: anchor the funcname syntax to the left margin so
nested subroutines and packages like this won't get picked up.  (The
builtin C++ funcname pattern already does the same thing.)  This means
the userdiff driver will misparse the idiom

	{
		my $static;
		sub foo {
			... use $static ...
		}
	}

but I think that's worth it; we can revisit this later if the userdiff
mechanism learns to keep track of the beginning and end of nested
scopes.

Reported-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'tr/diff-words-test'</title>
<updated>2011-02-10T00:41:17Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-02-10T00:41:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=1bb4abeff7e541ccfd7c87f5369b090829346e68'/>
<id>urn:sha1:1bb4abeff7e541ccfd7c87f5369b090829346e68</id>
<content type='text'>
* tr/diff-words-test:
  t4034 (diff --word-diff): add a minimum Perl drier test vector
  t4034 (diff --word-diff): style suggestions
  userdiff: simplify word-diff safeguard
  t4034: bulk verify builtin word regex sanity
</content>
</entry>
<entry>
<title>Merge branch 'as/userdiff-pascal'</title>
<updated>2011-01-24T18:54:12Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-01-24T18:54:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=342953a6860915b4b66dfb24147c46a191342daf'/>
<id>urn:sha1:342953a6860915b4b66dfb24147c46a191342daf</id>
<content type='text'>
* as/userdiff-pascal:
  userdiff: match Pascal class methods
</content>
</entry>
<entry>
<title>userdiff: simplify word-diff safeguard</title>
<updated>2011-01-18T16:59:39Z</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2011-01-11T21:48:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=664d44ee7fb18bdfdd66a1be760c7ee1bbe911c6'/>
<id>urn:sha1:664d44ee7fb18bdfdd66a1be760c7ee1bbe911c6</id>
<content type='text'>
git's diff-words support has a detail that can be a little dangerous:
any text not matched by a given language's tokenization pattern is
treated as whitespace and changes in such text would go unnoticed.
Therefore each of the built-in regexes allows a special token type
consisting of a single non-whitespace character [^[:space:]].

To make sure UTF-8 sequences remain human readable, the builtin
regexes also have a special token type for runs of bytes with the high
bit set.  In English, non-ASCII characters are usually isolated so
this is analogous to the [^[:space:]] pattern, except it matches a
single _multibyte_ character despite use of the C locale.

Unfortunately it is easy to make typos or forget entirely to include
these catch-all token types when adding support for new languages (see
v1.7.3.5~16, userdiff: fix typo in ruby and python word regexes,
2010-12-18).  Avoid this by including them automatically within the
PATTERNS and IPATTERN macros.

While at it, change the UTF-8 sequence token type to match exactly one
non-ASCII multi-byte character, rather than an arbitrary run of them.

Suggested-by: Thomas Rast &lt;trast@student.ethz.ch&gt;
Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
