<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/convert.c, branch v2.7.0-rc2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.7.0-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v2.7.0-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2015-09-25T17:18:18Z</updated>
<entry>
<title>convert trivial sprintf / strcpy calls to xsnprintf</title>
<updated>2015-09-25T17:18:18Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2015-09-24T21:06:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=5096d4909f9b13c7a650d9dbb7c9702ea7413566'/>
<id>urn:sha1:5096d4909f9b13c7a650d9dbb7c9702ea7413566</id>
<content type='text'>
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.

However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>filter_buffer_or_fd(): ignore EPIPE</title>
<updated>2015-05-20T17:19:12Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2015-05-19T18:08:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=0c4dd67a048b39470b9b95912e4912fecc405a85'/>
<id>urn:sha1:0c4dd67a048b39470b9b95912e4912fecc405a85</id>
<content type='text'>
We are explicitly ignoring SIGPIPE, as we fully expect that the
filter program may not read our output fully.  Ignore EPIPE that
may come from writing to it as well.

A new test was stolen from Jeff's suggestion.

Helped-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sp/stream-clean-filter'</title>
<updated>2014-10-08T20:05:32Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2014-10-08T20:05:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=f0d89001750a27c1c447b2eb3149b998521fa52c'/>
<id>urn:sha1:f0d89001750a27c1c447b2eb3149b998521fa52c</id>
<content type='text'>
When running a required clean filter, we do not have to mmap the
original before feeding the filter.  Instead, stream the file
contents directly to the filter and process its output.

* sp/stream-clean-filter:
  sha1_file: don't convert off_t to size_t too early to avoid potential die()
  convert: stream from fd to required clean filter to reduce used address space
  copy_fd(): do not close the input file descriptor
  mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size
  memory_limit: use git_env_ulong() to parse GIT_ALLOC_LIMIT
  config.c: add git_env_ulong() to parse environment variable
  convert: drop arguments other than 'path' from would_convert_to_git()
</content>
</entry>
<entry>
<title>convert: stream from fd to required clean filter to reduce used address space</title>
<updated>2014-08-28T17:25:15Z</updated>
<author>
<name>Steffen Prohaska</name>
<email>prohaska@zib.de</email>
</author>
<published>2014-08-26T15:23:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=9035d75a2be9d80d82676504d69553245017f6d4'/>
<id>urn:sha1:9035d75a2be9d80d82676504d69553245017f6d4</id>
<content type='text'>
The data is streamed to the filter process anyway.  Better avoid mapping
the file if possible.  This is especially useful if a clean filter
reduces the size, for example if it computes a sha1 for binary data,
like git media.  The file size that the previous implementation could
handle was limited by the available address space; large files for
example could not be handled with (32-bit) msysgit.  The new
implementation can filter files of any size as long as the filter output
is small enough.

The new code path is only taken if the filter is required.  The filter
consumes data directly from the fd.  If it fails, the original data is
not immediately available.  The condition can easily be handled as
a fatal error, which is expected for a required filter anyway.

If the filter was not required, the condition would need to be handled
in a different way, like seeking to 0 and reading the data.  But this
would require more restructuring of the code and is probably not worth
it.  The obvious approach of falling back to reading all data would not
help achieving the main purpose of this patch, which is to handle large
files with limited address space.  If reading all data is an option, we
can simply take the old code path right away and mmap the entire file.

The environment variable GIT_MMAP_LIMIT, which has been introduced in
a previous commit is used to test that the expected code path is taken.
A related test that exercises required filters is modified to verify
that the data actually has been modified on its way from the file system
to the object store.

Signed-off-by: Steffen Prohaska &lt;prohaska@zib.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>run-command: introduce CHILD_PROCESS_INIT</title>
<updated>2014-08-20T16:53:37Z</updated>
<author>
<name>René Scharfe</name>
<email>l.s.r@web.de</email>
</author>
<published>2014-08-19T19:09:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=d3180279322c7450a47decf8833de47f444ca93f'/>
<id>urn:sha1:d3180279322c7450a47decf8833de47f444ca93f</id>
<content type='text'>
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt &lt;j6t@kdbg.org&gt;
Signed-off-by: Rene Scharfe &lt;l.s.r@web.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>use skip_prefix to avoid magic numbers</title>
<updated>2014-06-20T17:44:45Z</updated>
<author>
<name>Jeff King</name>
<email>peff@peff.net</email>
</author>
<published>2014-06-18T19:47:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=ae021d87911da4328157273df24779892cb51277'/>
<id>urn:sha1:ae021d87911da4328157273df24779892cb51277</id>
<content type='text'>
It's a common idiom to match a prefix and then skip past it
with a magic number, like:

  if (starts_with(foo, "bar"))
	  foo += 3;

This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes.  We can use skip_prefix to avoid the magic
numbers here.

Note that some of these conversions could be much shorter.
For example:

  if (starts_with(arg, "--foo=")) {
	  bar = arg + 6;
	  continue;
  }

could become:

  if (skip_prefix(arg, "--foo=", &amp;bar))
	  continue;

However, I have left it as:

  if (skip_prefix(arg, "--foo=", &amp;v)) {
	  bar = v;
	  continue;
  }

to visually match nearby cases which need to actually
process the string. Like:

  if (skip_prefix(arg, "--foo=", &amp;v)) {
	  bar = atoi(v);
	  continue;
  }

Signed-off-by: Jeff King &lt;peff@peff.net&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>replace {pre,suf}fixcmp() with {starts,ends}_with()</title>
<updated>2013-12-05T22:13:21Z</updated>
<author>
<name>Christian Couder</name>
<email>chriscool@tuxfamily.org</email>
</author>
<published>2013-11-30T20:55:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=59556548230e617b837343c2c07e357e688e2ca4'/>
<id>urn:sha1:59556548230e617b837343c2c07e357e688e2ca4</id>
<content type='text'>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder &lt;chriscool@tuxfamily.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>typofix: in-code comments</title>
<updated>2013-07-22T23:06:49Z</updated>
<author>
<name>Ondřej Bílka</name>
<email>neleai@seznam.cz</email>
</author>
<published>2013-07-22T21:02:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=749f763dbbe4dbcc4082f02bf98bfc1a09427c6f'/>
<id>urn:sha1:749f763dbbe4dbcc4082f02bf98bfc1a09427c6f</id>
<content type='text'>
Signed-off-by: Ondřej Bílka &lt;neleai@seznam.cz&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'lf/read-blob-data-from-index'</title>
<updated>2013-04-22T01:39:45Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2013-04-22T01:39:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=4b35b007a6bd8b76bd37589fa397c12265935029'/>
<id>urn:sha1:4b35b007a6bd8b76bd37589fa397c12265935029</id>
<content type='text'>
Reduce duplicated code between convert.c and attr.c.

* lf/read-blob-data-from-index:
  convert.c: remove duplicate code
  read_blob_data_from_index(): optionally return the size of blob data
  attr.c: extract read_index_data() as read_blob_data_from_index()
</content>
</entry>
<entry>
<title>convert.c: remove duplicate code</title>
<updated>2013-04-17T16:52:33Z</updated>
<author>
<name>Lukas Fleischer</name>
<email>git@cryptocrack.de</email>
</author>
<published>2013-04-13T13:28:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=4982fd78f61db69504f91151f94fce9914d610b6'/>
<id>urn:sha1:4982fd78f61db69504f91151f94fce9914d610b6</id>
<content type='text'>
The has_cr_in_index() function is an almost 1:1 copy of
read_blob_data_from_index() with some additions.  Use the
latter instead of using copy-pasted code.

Signed-off-by: Lukas Fleischer &lt;git@cryptocrack.de&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
