<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/usage.c, branch v2.26.0-rc2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.26.0-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2019-11-02T06:20:21Z</updated>
<entry>
<title>vreportf(): avoid relying on stdio buffering</title>
<updated>2019-11-02T06:20:21Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2019-10-30T10:44:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=116d1fa6c693e13321dc4c6abe256ca7878e55a5'/>
<id>urn:sha1:116d1fa6c693e13321dc4c6abe256ca7878e55a5</id>
<content type='text'>
The MSVC runtime behavior differs from glibc's with respect to
`fprintf(stderr, ...)` in that the former writes out the message
character by character.

In t5516, this leads to a funny problem where a `git fetch` process as
well as the `git upload-pack` process spawned by it _both_ call `die()`
at the same time. The output can look like this:

	fatal: git uploadfata-lp: raemcokte :error:  upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
	8e86a771eda009872d6ab2886

Let's avoid this predicament altogether by rendering the entire message,
including the prefix and the trailing newline, into the buffer we
already have (and which is still fixed size) and then write it out via
`write_in_full()`.

We still clip the message to at most 4095 characters.

The history of `vreportf()` with regard to this issue includes the
following commits:

d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleaving
389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncation
625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncation
f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
                        chars before sending to stderr
9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
                        This is seemingly related to d048a96e.
137a0d0e (2007-11-19) - Addresses out-of-order for display()
34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband()
                        to support UTF-8 emulation
eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulation,
                        so it's safe to use xwrite() again
5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again

Note that we print nothing if the `vsnprintf()` call failed to render
the error message; There is little we can do in that case, and it should
not happen anyway.

The process may have written to `stderr` and there may be something left
in the buffer kept in the stdio layer. Call `fflush(stderr)` before
writing the message we prepare in this function.

Helped-by: Jeff King &lt;peff@peff.net&gt;
Helped-by: Alexandr Miloslavskiy &lt;alexandr.miloslavskiy@syntevo.com&gt;
Helped-by: SZEDER Gábor &lt;szeder.dev@gmail.com&gt;
Helped-by: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>trace2: create new combined trace facility</title>
<updated>2019-02-22T23:27:59Z</updated>
<author>
<name>Jeff Hostetler</name>
<email>jeffhost@microsoft.com</email>
</author>
<published>2019-02-22T22:25:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ee4512ed481a126dadd33334186e0e759d7f2f47'/>
<id>urn:sha1:ee4512ed481a126dadd33334186e0e759d7f2f47</id>
<content type='text'>
Create a new unified tracing facility for git.  The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.

In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.

Trace2 defines 3 output targets.  These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT".  These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).

* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
  summary data.

* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
  It extends the output with columns for the command process, thread,
  repo, absolute and relative elapsed times.  It reports events for
  child process start/stop, thread start/stop, and per-thread function
  nesting.

* GIT_TR2_EVENT is a new structured format. It writes event data as a
  series of JSON records.

Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.

Signed-off-by: Jeff Hostetler &lt;jeffhost@microsoft.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jk/snprintf-truncation'</title>
<updated>2018-05-30T12:51:28Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2018-05-30T12:51:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=7c3d15fe3113cf48db60656eedd152c46f47bf6b'/>
<id>urn:sha1:7c3d15fe3113cf48db60656eedd152c46f47bf6b</id>
<content type='text'>
Avoid unchecked snprintf() to make future code auditing easier.

* jk/snprintf-truncation:
  fmt_with_err: add a comment that truncation is OK
  shorten_unambiguous_ref: use xsnprintf
  fsmonitor: use internal argv_array of struct child_process
  log_write_email_headers: use strbufs
  http: use strbufs instead of fixed buffers
</content>
</entry>
<entry>
<title>fmt_with_err: add a comment that truncation is OK</title>
<updated>2018-05-21T00:59:14Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2018-05-19T01:58:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ac4896f007a624c12feda866aeb4abe8a1394e39'/>
<id>urn:sha1:ac4896f007a624c12feda866aeb4abe8a1394e39</id>
<content type='text'>
Functions like die_errno() use fmt_with_err() to combine the
caller-provided format with the strerror() string. We use a
fixed stack buffer because we're already handling an error
and don't have any way to report another one. Our buffer
should generally be big enough to fit this, but if it's not,
truncation is our best option. Let's add a comment to that
effect, so that anybody auditing the code for truncation
bugs knows that this is fine.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>test-tool: help verifying BUG() code paths</title>
<updated>2018-05-06T10:06:13Z</updated>
<author>
<name>Johannes Schindelin</name>
<email>johannes.schindelin@gmx.de</email>
</author>
<published>2018-05-02T09:38:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=a86303cb5d5772364a3a5080d97be6f1a577be4c'/>
<id>urn:sha1:a86303cb5d5772364a3a5080d97be6f1a577be4c</id>
<content type='text'>
When we call BUG(), we signal via SIGABRT that something bad happened,
dumping cores if so configured. In some setups these coredumps are
redirected to some central place such as /proc/sys/kernel/core_pattern,
which is a good thing.

However, when we try to verify in our test suite that bugs are caught in
certain code paths, we do *not* want to clutter such a central place
with unnecessary coredumps.

So let's special-case the test helpers (which we use to verify such code
paths) so that the BUG() calls will *not* call abort() but exit with a
special-purpose exit code instead.

Helped-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Johannes Schindelin &lt;johannes.schindelin@gmx.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>add UNLEAK annotation for reducing leak false positives</title>
<updated>2017-09-08T06:43:17Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-09-08T06:38:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0e5bba53af7d6f911e99589d736cdf06badad0fe'/>
<id>urn:sha1:0e5bba53af7d6f911e99589d736cdf06badad0fe</id>
<content type='text'>
It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.

This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.

Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the
     callstack during the allocation.

  2. Which blocks were left un-freed at the end of the
     program (and which are unreachable, but more on that
     later).

Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak.  So imagine you have code like this:

  int cmd_foo(...)
  {
	/* this allocates some memory */
	char *p = some_function();
	printf("%s", p);
	return 0;
  }

You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.

So you can say "ignore the callstack when main calls
some_function".  That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.

What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.

However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run.  Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).

Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.

In other words, you can do:

  int cmd_foo(...)
  {
	char *p = some_function();
	printf("%s", p);
	UNLEAK(p);
	return 0;
  }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy
example, yes. But UNLEAK()'s byte-copying strategy has
several advantages over actually freeing the memory:

  1. It's recursive across structures. In many cases our "p"
     is not just a pointer, but a complex struct whose
     fields may have been allocated by a sub-function. And
     in some cases (e.g., dir_struct) we don't even have a
     function which knows how to free all of the struct
     members.

     By marking the struct itself as reachable, that confers
     reachability on any pointers it contains (including those
     found in embedded structs, or reachable by walking
     heap blocks recursively.

  2. It works on cases where we're not sure if the value is
     allocated or not. For example:

       char *p = argc &gt; 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not
     freeing any memory. In the case that we're pointing to
     argv here, the reachability checker will just ignore
     our bytes.

  3. Likewise, it works even if the variable has _already_
     been freed. We're just copying the pointer bytes. If
     the block has been freed, the leak-checker will skip
     over those bytes as uninteresting.

  4. Because it's not actually freeing memory, you can
     UNLEAK() before we are finished accessing the variable.
     This is helpful in cases like this:

       char *p = some_function();
       return another_function(p);

     Writing this with free() requires:

       int ret;
       char *p = some_function();
       ret = another_function(p);
       free(p);
       return ret;

     But with unleak we can just write:

       char *p = some_function();
       UNLEAK(p);
       return another_function(p);

This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak.  In
normal builds it's a noop, so we pay no runtime cost.

It also adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.

Note the case in commit.c which actually converts a
strbuf_release() into an UNLEAK. This code was already
non-leaky, but the free didn't do anything useful, since
we're exiting. Converting it to an annotation means that
non-leak-checking builds pay no runtime cost. The cost is
minimal enough that it's probably not worth going on a
crusade to convert these kinds of frees to UNLEAKS. I did it
here for consistency with the "sb" leak (though it would
have been equally correct to go the other way, and turn them
both into strbuf_release() calls).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>die(): stop hiding errors due to overzealous recursion guard</title>
<updated>2017-06-21T21:09:13Z</updated>
<author>
<name>Ævar Arnfjörð Bjarmason</name>
<email>avarab@gmail.com</email>
</author>
<published>2017-06-21T20:47:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=2d3c02f5db620af7169b651ff4afe625df3af156'/>
<id>urn:sha1:2d3c02f5db620af7169b651ff4afe625df3af156</id>
<content type='text'>
Change the recursion limit for the default die routine from a *very*
low 1 to 1024. This ensures that infinite recursions are broken, but
doesn't lose the meaningful error messages under threaded execution
where threads concurrently start to die.

The intent of the existing code, as explained in commit
cd163d4b4e ("usage.c: detect recursion in die routines and bail out
immediately", 2012-11-14), is to break infinite recursion in cases
where the die routine itself calls die(), and would thus infinitely
recurse.

However, doing that very aggressively by immediately printing out
"recursion detected in die handler" if we've already called die() once
means that threaded invocations of git can end up only printing out
the "recursion detected" error, while hiding the meaningful error.

An example of this is running a threaded grep which dies on execution
against pretty much any repo, git.git will do:

    git grep -P --threads=8 '(*LIMIT_MATCH=1)-?-?-?---$'

With the current version of git this will print some combination of
multiple PCRE failures that caused the abort and multiple "recursion
detected", some invocations will print out multiple "recursion
detected" errors with no PCRE error at all!

Before this change, running the above grep command 1000 times against
git.git[1] and taking the top 20 results will on my system yield the
following distribution of actual errors ("E") and recursion
errors ("R"):

    322 E R
    306 E
    116 E R R
     65 R R
     54 R E
     49 E E
     44 R
     15 E R R R
      9 R R R
      7 R E R
      5 R R E
      3 E R R R R
      2 E E R
      1 R R R R
      1 R R R E
      1 R E R R

The exact results are obviously random and system-dependent, but this
shows the race condition in this code. Some small part of the time
we're about to print out the actual error ("E") but another thread's
recursion error beats us to it, and sometimes we print out nothing but
the recursion error.

With this change we get, now with "W" to mean the new warning being
emitted indicating that we've called die() many times:

    502 E
    160 E W E
    120 E E
     53 E W
     35 E W E E
     34 W E E
     29 W E E E
     16 E E W
     16 E E E
     11 W E E E E
      7 E E W E
      4 W E
      3 W W E E
      2 E W E E E
      1 W W E
      1 W E W E
      1 E W W E E E
      1 E W W E E
      1 E W W E
      1 E W E E W

Which still sucks a bit, due to a still present race-condition in this
code we're sometimes going to print out several errors still, or
several warnings, or two duplicate errors without the warning.

But we will never have a case where we completely hide the actual
error as we do now.

Now, git-grep could make use of the pluggable error facility added in
commit c19a490e37 ("usage: allow pluggable die-recursion checks",
2013-04-16). There's other threaded code that calls set_die_routine()
or set_die_is_recursing_routine().

But this is about fixing the general die() behavior with threading
when we don't have such a custom routine yet. Right now the common
case is not an infinite recursion in the handler, but us losing error
messages by default because we're overly paranoid about our recursion
check.

So let's just set the recursion limit to a number higher than the
number of threads we're ever likely to spawn. Now we won't lose
errors, and if we have a recursing die handler we'll still die within
microseconds.

There are race conditions in this code itself, in particular the
"dying" variable is not thread mutexed, so we e.g. won't be dying at
exactly 1024, or for that matter even be able to accurately test
"dying == 2", see the cases where we print out more than one "W"
above.

But that doesn't really matter, for the recursion guard we just need
to die "soon", not at exactly 1024 calls, and for printing the correct
error and only one warning most of the time in the face of threaded
death this is good enough and a net improvement on the current code.

1. for i in {1..1000}; do git grep -P --threads=8 '(*LIMIT_MATCH=1)-?-?-?---$' 2&gt;&amp;1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/E/; s/^warning.*/W/' | tr '\n' ' '; echo; done | sort | uniq -c | sort -nr | head -n 20

Signed-off-by: Ævar Arnfjörð Bjarmason &lt;avarab@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'bw/forking-and-threading' into maint</title>
<updated>2017-06-13T20:27:00Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2017-06-13T20:26:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=e350625b68cf747e4933239735ceb111f4375a83'/>
<id>urn:sha1:e350625b68cf747e4933239735ceb111f4375a83</id>
<content type='text'>
The "run-command" API implementation has been made more robust
against dead-locking in a threaded environment.

* bw/forking-and-threading:
  usage.c: drop set_error_handle()
  run-command: restrict PATH search to executable files
  run-command: expose is_executable function
  run-command: block signals between fork and execve
  run-command: add note about forking and threading
  run-command: handle dup2 and close errors in child
  run-command: eliminate calls to error handling functions in child
  run-command: don't die in child when duping /dev/null
  run-command: prepare child environment before forking
  string-list: add string_list_remove function
  run-command: use the async-signal-safe execv instead of execvp
  run-command: prepare command before forking
  t0061: run_command executes scripts without a #! line
  t5550: use write_script to generate post-update hook
</content>
</entry>
<entry>
<title>usage: add NORETURN to BUG() function definitions</title>
<updated>2017-05-22T02:00:53Z</updated>
<author>
<name>Ramsay Jones</name>
<email>ramsay@ramsayjones.plus.com</email>
</author>
<published>2017-05-21T22:25:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=3d7dd2d3b6ccc8903a37cffe3a2f39cf1be21c86'/>
<id>urn:sha1:3d7dd2d3b6ccc8903a37cffe3a2f39cf1be21c86</id>
<content type='text'>
Commit d8193743e0 ("usage.c: add BUG() function", 12-05-2017) added the
BUG() functions and macros as a replacement for calls to die("BUG: ..").
The use of NORETURN on the declarations (in git-compat-util.h) and the
lack of NORETURN on the function definitions, however, leads sparse to
complain thus:

      SP usage.c
  usage.c:220:6: error: symbol 'BUG_fl' redeclared with different type
  (originally declared at git-compat-util.h:1074) - different modifiers

In order to suppress the sparse error, add the NORETURN to the function
definitions.

Signed-off-by: Ramsay Jones &lt;ramsay@ramsayjones.plus.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>usage.c: drop set_error_handle()</title>
<updated>2017-05-15T04:00:25Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2017-05-13T03:48:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=e3f43ce765c38f4be94239d07c8c3c596780c514'/>
<id>urn:sha1:e3f43ce765c38f4be94239d07c8c3c596780c514</id>
<content type='text'>
The set_error_handle() function was introduced by 3b331e926
(vreportf: report to arbitrary filehandles, 2015-08-11) so
that run-command could send post-fork, pre-exec errors to
the parent's original stderr.

That use went away in 79319b194 (run-command: eliminate
calls to error handling functions in child, 2017-04-19),
which pushes all of the error reporting to the parent.
This leaves no callers of set_error_handle(). As we're not
likely to add any new ones, let's drop it.

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Acked-by: Brandon Williams &lt;bmwill@google.com&gt;
Reviewed-by: Ramsay Jones &lt;ramsay@ramsayjones.plus.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
