user/sven/git.git/userdiff.c, branch v1.7.7.4

Rename git_checkattr() to git_check_attr()

2011-08-04T22:53:21Z

Suggested by: Junio Hamano Signed-off-by: Michael Haggerty Signed-off-by: Junio C Hamano

Merge branch 'jk/combine-diff-binary-etc'

2011-06-30T00:03:10Z

* jk/combine-diff-binary-etc: combine-diff: respect textconv attributes refactor get_textconv to not require diff_filespec combine-diff: handle binary files as binary combine-diff: calculate mode_differs earlier combine-diff: split header printing into its own function

refactor get_textconv to not require diff_filespec

2011-05-23T22:46:02Z

This function actually does two things: 1. Load the userdiff driver for the filespec. 2. Decide whether the driver has a textconv component, and initialize the textconv cache if applicable. Only part (1) requires the filespec object, and some callers may not have a filespec at all. So let's split them it into two functions, and put part (2) with the userdiff code, which is a better fit. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano

userdiff/perl: tighten BEGIN/END block pattern to reject here-doc delimiters

2011-05-23T18:39:13Z

A naive method of treating BEGIN/END blocks with a brace on the second line as diff/grep funcname context involves also matching unrelated lines that consist of all-caps letters: sub foo { print <<'EOF' text goes here ... EOF ... rest of foo ... } That's not so great, because it means that "git diff" and "git grep --show-function" would write "=EOF" or "@@ EOF" as context instead of a more useful reminder like "@@ sub foo {". To avoid this, tighten the pattern to only match the special block names that perl accepts (namely BEGIN, END, INIT, CHECK, UNITCHECK, AUTOLOAD, and DESTROY). The list is taken from perl's toke.c. Suggested-by: Jakub Narebski Signed-off-by: Jonathan Nieder Signed-off-by: Junio C Hamano

userdiff/perl: catch sub with brace on second line

2011-05-22T05:29:32Z

Accept sub foo { } as an alternative to a more common style that introduces perl functions with a brace on the first line (and likewise for BEGIN/END blocks). The new regex is a little hairy to avoid matching # forward declaration sub foo; while continuing to match "sub foo($;@) {" and sub foo { # This routine is interesting; # in fact, the lines below explain how... While at it, pay attention to Perl 5.14's "package foo {" syntax as an alternative to the traditional "package foo;". Requested-by: Ævar Arnfjörð Bjarmason Signed-off-by: Jonathan Nieder Signed-off-by: Junio C Hamano

userdiff/perl: match full line of POD headers

2011-05-22T05:29:32Z

The builtin perl userdiff driver is not greedy enough about catching POD header lines. Capture the whole line, so instead of just declaring that we are in some "@@ =head1" section, diff/grep output can explain that the enclosing section is about "@@ =head1 OPTIONS". Reported-by: Ævar Arnfjörð Bjarmason Signed-off-by: Jonathan Nieder Signed-off-by: Junio C Hamano

userdiff/perl: anchor "sub" and "package" patterns on the left

2011-05-22T05:29:31Z

The userdiff funcname mechanism has no concept of nested scopes --- instead, "git diff" and "git grep --show-function" simply label the diff header with the most recent matching line. Unfortunately that means text following a subroutine in a POD section: =head1 DESCRIPTION You might use this facility like so: sub example { foo; } Now, having said that, let's say more about the facility. Blah blah blah ... etc etc. gets the subroutine name instead of the POD header in its diff/grep funcname header, making it harder to get oriented when reading a diff without enough context. The fix is simple: anchor the funcname syntax to the left margin so nested subroutines and packages like this won't get picked up. (The builtin C++ funcname pattern already does the same thing.) This means the userdiff driver will misparse the idiom { my $static; sub foo { ... use $static ... } } but I think that's worth it; we can revisit this later if the userdiff mechanism learns to keep track of the beginning and end of nested scopes. Reported-by: Ævar Arnfjörð Bjarmason Signed-off-by: Jonathan Nieder Signed-off-by: Junio C Hamano

Merge branch 'tr/diff-words-test'

2011-02-10T00:41:17Z

* tr/diff-words-test: t4034 (diff --word-diff): add a minimum Perl drier test vector t4034 (diff --word-diff): style suggestions userdiff: simplify word-diff safeguard t4034: bulk verify builtin word regex sanity

Merge branch 'as/userdiff-pascal'

2011-01-24T18:54:12Z

* as/userdiff-pascal: userdiff: match Pascal class methods

userdiff: simplify word-diff safeguard

2011-01-18T16:59:39Z

git's diff-words support has a detail that can be a little dangerous: any text not matched by a given language's tokenization pattern is treated as whitespace and changes in such text would go unnoticed. Therefore each of the built-in regexes allows a special token type consisting of a single non-whitespace character [^[:space:]]. To make sure UTF-8 sequences remain human readable, the builtin regexes also have a special token type for runs of bytes with the high bit set. In English, non-ASCII characters are usually isolated so this is analogous to the [^[:space:]] pattern, except it matches a single _multibyte_ character despite use of the C locale. Unfortunately it is easy to make typos or forget entirely to include these catch-all token types when adding support for new languages (see v1.7.3.5~16, userdiff: fix typo in ruby and python word regexes, 2010-12-18). Avoid this by including them automatically within the PATTERNS and IPATTERN macros. While at it, change the UTF-8 sequence token type to match exactly one non-ASCII multi-byte character, rather than an arbitrary run of them. Suggested-by: Thomas Rast Signed-off-by: Jonathan Nieder Signed-off-by: Junio C Hamano