<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/list-objects.c, branch v1.8.0.2</title>
<subtitle>Git
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.8.0.2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.8.0.2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2011-10-27T18:38:24Z</updated>
<entry>
<title>tree_entry_interesting(): give meaningful names to return values</title>
<updated>2011-10-27T18:38:24Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2011-10-24T06:36:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=d688cf07b15b664a2164c3d92bcb5e8265400a2b'/>
<id>urn:sha1:d688cf07b15b664a2164c3d92bcb5e8265400a2b</id>
<content type='text'>
It is a basic code hygiene to avoid magic constants that are unnamed.
Besides, this helps extending the value later on for "interesting, but
cannot decide if the entry truely matches yet" (ie. prefix matches)

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>list-objects: pass callback data to show_objects()</title>
<updated>2011-09-01T22:46:12Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-09-01T22:43:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=4947367267cbcd0ca528711b2393613e2e817878'/>
<id>urn:sha1:4947367267cbcd0ca528711b2393613e2e817878</id>
<content type='text'>
The traverse_commit_list() API takes two callback functions, one to show
commit objects, and the other to show other kinds of objects. Even though
the former has a callback data parameter, so that the callback does not
have to rely on global state, the latter does not.

Give the show_objects() callback the same callback data parameter.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'nd/struct-pathspec'</title>
<updated>2011-05-06T17:50:06Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-05-06T17:50:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=1273738f05dedc63865085deb5d1fb8816ff809c'/>
<id>urn:sha1:1273738f05dedc63865085deb5d1fb8816ff809c</id>
<content type='text'>
* nd/struct-pathspec:
  pathspec: rename per-item field has_wildcard to use_wildcard
  Improve tree_entry_interesting() handling code
  Convert read_tree{,_recursive} to support struct pathspec
  Reimplement read_tree_recursive() using tree_entry_interesting()
</content>
</entry>
<entry>
<title>Improve tree_entry_interesting() handling code</title>
<updated>2011-03-25T16:20:33Z</updated>
<author>
<name>Nguyễn Thái Ngọc Duy</name>
<email>pclouds@gmail.com</email>
</author>
<published>2011-03-25T09:34:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=97d0b74a49f0c81c3f9673c1a17721ac0624c3df'/>
<id>urn:sha1:97d0b74a49f0c81c3f9673c1a17721ac0624c3df</id>
<content type='text'>
t_e_i() can return -1 or 2 to early shortcut a search. Current code
may use up to two variables to handle it. One for saving return value
from t_e_i temporarily, one for saving return code 2.

The second variable is not needed. If we make sure the first variable
does not change until the next t_e_i() call, then we can do something
like this:

int ret = 0;

while (...) {
	if (ret != 2) {
		ret = t_e_i();
		if (ret &lt; 0) /* no longer interesting */
			break;
		if (ret == 0) /* skip this round */
			continue;
	}
	/* ret &gt; 0, interesting */
}

Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jc/maint-rev-list-culled-boundary'</title>
<updated>2011-03-23T04:37:59Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-03-23T04:37:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=4b28cd9f2fb755cdfe1b97130e84644ae932988c'/>
<id>urn:sha1:4b28cd9f2fb755cdfe1b97130e84644ae932988c</id>
<content type='text'>
* jc/maint-rev-list-culled-boundary:
  list-objects.c: don't add an unparsed NULL as a pending tree

Conflicts:
	list-objects.c
</content>
</entry>
<entry>
<title>list-objects.c: don't add an unparsed NULL as a pending tree</title>
<updated>2011-03-14T19:33:58Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2011-03-14T19:29:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=6e7d0efa9058aa61147d9d18b33bd046055584e4'/>
<id>urn:sha1:6e7d0efa9058aa61147d9d18b33bd046055584e4</id>
<content type='text'>
"git rev-list --first-parent --boundary $commit^..$commit" segfaults on a
merge commit since 8d2dfc4 (process_{tree,blob}: show objects without
buffering, 2009-04-10), as it tried to dereference a commit that was
discarded as UNINTERESTING without being parsed (hence lacking "tree").

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Make rev-list --objects work together with pathspecs</title>
<updated>2011-02-03T22:10:46Z</updated>
<author>
<name>Elijah Newren</name>
<email>newren@gmail.com</email>
</author>
<published>2010-12-17T13:26:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=cc5fa2fdafa186d6e1e310828b22d0028e1fb51b'/>
<id>urn:sha1:cc5fa2fdafa186d6e1e310828b22d0028e1fb51b</id>
<content type='text'>
When traversing commits, the selection of commits would heed the list of
pathspecs passed, but subsequent walking of the trees of those commits
would not.  This resulted in 'rev-list --objects HEAD -- &lt;paths&gt;'
displaying objects at unwanted paths.

Have process_tree() call tree_entry_interesting() to determine which paths
are interesting and should be walked.

Naturally, this change can provide a large speedup when paths are specified
together with --objects, since many tree entries are now correctly ignored.
Interestingly, though, this change also gives me a small (~1%) but
repeatable speedup even when no paths are specified with --objects.

Signed-off-by: Elijah Newren &lt;newren@gmail.com&gt;
Signed-off-by: Nguyễn Thái Ngọc Duy &lt;pclouds@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'lt/pack-object-memuse'</title>
<updated>2009-04-18T21:46:17Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2009-04-18T21:46:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=9824a388e53ba0951e38f246038fa0ef6fda3397'/>
<id>urn:sha1:9824a388e53ba0951e38f246038fa0ef6fda3397</id>
<content type='text'>
* lt/pack-object-memuse:
  show_object(): push path_name() call further down
  process_{tree,blob}: show objects without buffering

Conflicts:
	builtin-pack-objects.c
	builtin-rev-list.c
	list-objects.c
	list-objects.h
	upload-pack.c
</content>
</entry>
<entry>
<title>process_{tree,blob}: show objects without buffering</title>
<updated>2009-04-13T00:28:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2009-04-11T00:27:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=8d2dfc49b199c7da6faefd7993630f24bd37fee0'/>
<id>urn:sha1:8d2dfc49b199c7da6faefd7993630f24bd37fee0</id>
<content type='text'>
Here's a less trivial thing, and slightly more dubious one.

I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.

Now, there are possible downsides to this:

 - the "buffer using object_array" _can_ in theory result in at least
   better I-cache usage (two tight loops rather than one more spread out
   one). I don't think this is a real issue, but in theory..

 - this _does_ change the order of the objects printed. Instead of doing a
   "process_tree(revs, commit-&gt;tree, &amp;objects, NULL, "");" in the loop
   over the commits (which puts all the root trees _first_ in the object
   list, this patch just adds them to the list of pending objects, and
   then we'll traverse them in that order (and thus show each root tree
   object together with the objects we discover under it)

   I _think_ the new ordering actually makes more sense, but the object
   ordering is actually a subtle thing when it comes to packing
   efficiency, so any change in order is going to have implications for
   packing. Good or bad, I dunno.

 - There may be some reason why we did it that odd way with the object
   array, that I have simply forgotten.

Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.

This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...

Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.

Or maybe there _used_ to be a reason, and no longer is.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>show_object(): push path_name() call further down</title>
<updated>2009-04-13T00:28:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2009-04-11T01:15:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=cf2ab916afa4231f7e9db31796e7c0f712ff6ad1'/>
<id>urn:sha1:cf2ab916afa4231f7e9db31796e7c0f712ff6ad1</id>
<content type='text'>
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow

 - more clarity into who "owns" the name (ie now when we free the name in
   the show_object callback, it's because we generated it ourselves by
   calling path_name())

 - not calling path_name() at all, either because we don't care about the
   name in the first place, or because we are actually happy walking the
   linked list of "struct name_path *" and the last component.

Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.

Why? We use that name for two things:
 - add_preferred_base_object(), which actually _wants_ to traverse the
   path, and now does it by looking for '/' characters!
 - for 'name_hash()', which only cares about the last 16 characters of a
   name, so again, generating the full name seems to be just unnecessary
   work.

Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.

This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
