<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/git.git/read-cache.c, branch v1.7.3.5</title>
<subtitle>Git</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.7.3.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/atom?h=v1.7.3.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/'/>
<updated>2010-08-11T16:57:43Z</updated>
<entry>
<title>core: Stop leaking ondisk_cache_entrys</title>
<updated>2010-08-11T16:57:43Z</updated>
<author>
<name>Jonathan Nieder</name>
<email>jrnieder@gmail.com</email>
</author>
<published>2010-08-10T03:28:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=59efba64ac144a8838a35ae687b8c5bb6cd43363'/>
<id>urn:sha1:59efba64ac144a8838a35ae687b8c5bb6cd43363</id>
<content type='text'>
Noticed with valgrind.

Signed-off-by: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Correct spelling of 'REUC' extension</title>
<updated>2010-02-02T17:54:34Z</updated>
<author>
<name>Shawn O. Pearce</name>
<email>spearce@spearce.org</email>
</author>
<published>2010-02-02T15:33:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=b659b49bb0ee8995ddc4730e1796866baccc39be'/>
<id>urn:sha1:b659b49bb0ee8995ddc4730e1796866baccc39be</id>
<content type='text'>
The new dircache extension CACHE_EXT_RESOLVE_UNDO, whose value is
0x52455543, is actually the ASCII sequence 'REUC', not the ASCII
sequence 'REUN'.
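
The correspondence between the constant and its ASCII spelling can be
checked directly.  A minimal sketch in Python, used here only for
illustration; ext_signature is a hypothetical helper, not a git function:

```python
# Decode a 4-byte index extension signature from its 32-bit big-endian value.
# `ext_signature` is a hypothetical helper for illustration, not part of git.
def ext_signature(value):
    return value.to_bytes(4, "big").decode("ascii")

print(ext_signature(0x52455543))            # REUC
print(hex(int.from_bytes(b"REUN", "big")))  # 0x5245554e
```

The last byte is what differs: 0x43 is 'C', while 'N' would be 0x4e.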

Signed-off-by: Shawn O. Pearce &lt;spearce@spearce.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Make ce_uptodate() trustworthy again</title>
<updated>2010-01-24T08:15:29Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-24T08:10:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=125fd98434ce773de45c4a40927c222ec5c43ae1'/>
<id>urn:sha1:125fd98434ce773de45c4a40927c222ec5c43ae1</id>
<content type='text'>
The rule has always been that a cache entry that is ce_uptodate(ce)
means that we already have checked the work tree entity and we know
there is no change in the work tree compared to the index, and nobody
should have to double check.  Note that false ce_uptodate(ce) does not
mean it is known to be dirty---it only means we don't know if it is
clean.

There are a few codepaths (refresh-index and preload-index are among
them) that mark a cache entry as up-to-date based solely on the return
value from ie_match_stat(); this function uses lstat() to see if the
work tree entity has been touched, and for a submodule entry, if its
HEAD points at the same commit as the commit recorded in the index of
the superproject (a submodule that is not even cloned is considered
clean).

A submodule is no longer considered unmodified merely because its HEAD
matches the index of the superproject these days, in order to prevent
people from forgetting to commit in the submodule and updating the
superproject index with the new submodule commit, before committing the
state in the superproject.  However, the patch to do so didn't update
the codepath that marks cache entries up-to-date based on the updated
definition and instead worked around it by saying "we don't trust the
return value of ce_uptodate() for submodules."

This makes ce_uptodate() trustworthy again by not marking submodule
entries up-to-date.
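
The rule can be sketched roughly as follows.  This is an illustrative
Python model, not git's C code: Entry, refresh and the stat_matches field
are hypothetical names, and stat_matches merely stands in for a "clean"
answer from ie_match_stat().  Only the gitlink mode value 0160000 and the
idea of never marking submodule entries up-to-date come from git itself.

```python
from dataclasses import dataclass

S_IFGITLINK = 0o160000  # mode git records for submodule (gitlink) entries


@dataclass
class Entry:
    path: str
    mode: int
    stat_matches: bool      # stand-in for a clean ie_match_stat() result
    uptodate: bool = False  # False means "unknown", not "dirty"


def is_gitlink(mode):
    # Gitlink entries carry no permission bits, so equality is enough here.
    return mode == S_IFGITLINK


def refresh(entries):
    for e in entries:
        # A stat match alone says nothing about a submodule's work tree,
        # so submodule entries are left in the "unknown" state.
        if e.stat_matches and not is_gitlink(e.mode):
            e.uptodate = True
    return entries


entries = refresh([
    Entry("Makefile", 0o100644, True),
    Entry("sub", S_IFGITLINK, True),
])
print([(e.path, e.uptodate) for e in entries])
# [('Makefile', True), ('sub', False)]
```

Callers that see uptodate set can then skip the work tree check without a
special case for submodules.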

The next step _could_ be to introduce a few "in-core" flag bits to
cache_entry structure to record "this entry is _known_ to be dirty",
call is_submodule_modified() from ie_match_stat(), and use these new
bits to avoid running this rather expensive check more than once, but
that can be a separate patch.

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Remove diff machinery dependency from read-cache</title>
<updated>2010-01-22T01:05:13Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-01-21T19:37:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=fb7d3f32b283a3847e6f151a06794abd14ffd81b'/>
<id>urn:sha1:fb7d3f32b283a3847e6f151a06794abd14ffd81b</id>
<content type='text'>
Exal Sibeaz pointed out that some git files are way too big, and that
add_files_to_cache() brings in all the diff machinery to any git binary
that needs the basic git SHA1 object operations from read-cache.c. Which
is pretty much all of them.

It's doubly silly, since add_files_to_cache() is only used by builtin
programs (add, checkout and commit), so it's fairly easily fixed by just
moving the thing to builtin-add.c, and avoiding the dependency entirely.

I initially argued to Exal that it would probably be best to try to depend
on smart compilers and linkers, but after spending some time trying to
make -ffunction-sections work and giving up, I think Exal was right, and
the fix is to just do some trivial cleanups like this.

This trivial cleanup results in pretty stunning file size differences.
The diff machinery really is mostly used by just the builtin programs, and
you have things like these trivial before-and-after numbers:

  -rwxr-xr-x 1 torvalds torvalds 1727420 2010-01-21 10:53 git-hash-object
  -rwxrwxr-x 1 torvalds torvalds  940265 2010-01-21 11:16 git-hash-object

Now, I'm not saying that 940kB is good either, but that's mostly all the
debug information - you can see the real code with 'size':

   text	   data	    bss	    dec	    hex	filename
 418675	   3920	 127408	 550003	  86473	git-hash-object (before)
 230650	   2288	 111728	 344666	  5425a	git-hash-object (after)

ie we have a nice 45% size reduction from this trivial cleanup.

It's not just that one file either. I get:

	[torvalds@nehalem git]$ du -s /home/torvalds/libexec/git-core
	45640	/home/torvalds/libexec/git-core (before)
	33508	/home/torvalds/libexec/git-core (after)

so we're talking 12MB of diskspace here.
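
The before-and-after numbers above can be re-derived with a little
arithmetic.  A quick Python sketch; the variable names are illustrative,
the numbers are copied from the 'size' and 'du -s' output in this message,
and the du unit is assumed to be 1K blocks:

```python
# Numbers copied from the `size` and `du -s` output quoted in this message.
text_before, text_after = 418675, 230650
du_before_kb, du_after_kb = 45640, 33508  # assumed to be 1K blocks

text_saving = 100 * (text_before - text_after) / text_before
du_saving_kb = du_before_kb - du_after_kb

print(f"text segment shrinks by {text_saving:.1f}%")      # 44.9%
print(f"libexec/git-core shrinks by {du_saving_kb} KB")   # 12132 KB
```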

(Of course, stripping all the binaries brings the 33MB down to 9MB, so the
whole debug information thing is still the bulk of it all, but that's a
separate issue entirely)

Now, I'm sure there are other things we should do, and changing our
compiler flags from -O2 to -Os would bring the text size down by an
additional almost 20%, but this thing Exal pointed out seems to be some
good low-hanging fruit.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'jc/cache-unmerge'</title>
<updated>2010-01-20T22:46:35Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-20T22:44:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=6751e0471df1bdc4a1d5e5a3929a531c74e95aeb'/>
<id>urn:sha1:6751e0471df1bdc4a1d5e5a3929a531c74e95aeb</id>
<content type='text'>
* jc/cache-unmerge:
  rerere forget path: forget recorded resolution
  rerere: refactor rerere logic to make it independent from I/O
  rerere: remove silly 1024-byte line limit
  resolve-undo: teach "update-index --unresolve" to use resolve-undo info
  resolve-undo: "checkout -m path" uses resolve-undo information
  resolve-undo: allow plumbing to clear the information
  resolve-undo: basic tests
  resolve-undo: record resolved conflicts in a new index extension section
  builtin-merge.c: use standard active_cache macros

Conflicts:
	builtin-ls-files.c
	builtin-merge.c
	builtin-rerere.c
</content>
</entry>
<entry>
<title>Merge branch 'jc/symbol-static'</title>
<updated>2010-01-20T22:37:25Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-20T22:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=56eb8b43ebdb38683ff5ce395d7b5e080d402b5b'/>
<id>urn:sha1:56eb8b43ebdb38683ff5ce395d7b5e080d402b5b</id>
<content type='text'>
* jc/symbol-static:
  date.c: mark file-local function static
  Replace parse_blob() with an explanatory comment
  symlinks.c: remove unused functions
  object.c: remove unused functions
  strbuf.c: remove unused function
  sha1_file.c: remove unused function
  mailmap.c: remove unused function
  utf8.c: mark file-local function static
  submodule.c: mark file-local function static
  quote.c: mark file-local function static
  remote-curl.c: mark file-local function static
  read-cache.c: mark file-local functions static
  parse-options.c: mark file-local function static
  entry.c: mark file-local function static
  http.c: mark file-local functions static
  pretty.c: mark file-local function static
  builtin-rev-list.c: mark file-local function static
  bisect.c: mark file-local function static
</content>
</entry>
<entry>
<title>Merge branch 'cc/reset-more'</title>
<updated>2010-01-13T19:58:56Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-13T19:58:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=dc96c5ee703fb7265619b1ecb2b5f2c5ab3ef40d'/>
<id>urn:sha1:dc96c5ee703fb7265619b1ecb2b5f2c5ab3ef40d</id>
<content type='text'>
* cc/reset-more:
  t7111: check that reset options work as described in the tables
  Documentation: reset: add some missing tables
  Fix bit assignment for CE_CONFLICTED
  "reset --merge": fix unmerged case
  reset: use "unpack_trees()" directly instead of "git read-tree"
  reset: add a few tests for "git reset --merge"
  Documentation: reset: add some tables to describe the different options
  reset: improve mixed reset error message when in a bare repo
</content>
</entry>
<entry>
<title>Merge branch 'nd/sparse'</title>
<updated>2010-01-13T19:58:34Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-13T19:58:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=73d66323ac78c750ba42fef23b1cb8fd2110e023'/>
<id>urn:sha1:73d66323ac78c750ba42fef23b1cb8fd2110e023</id>
<content type='text'>
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
</content>
</entry>
<entry>
<title>read-cache.c: mark file-local functions static</title>
<updated>2010-01-12T09:06:08Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-12T06:29:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=87b29e5a5ab02f10505fca567d027b57d2a9314e'/>
<id>urn:sha1:87b29e5a5ab02f10505fca567d027b57d2a9314e</id>
<content type='text'>
Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
<entry>
<title>"reset --merge": fix unmerged case</title>
<updated>2010-01-04T00:01:05Z</updated>
<author>
<name>Junio C Hamano</name>
<email>gitster@pobox.com</email>
</author>
<published>2010-01-01T07:04:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/git.git/commit/?id=e11d7b5969704cd5ce39d053414d905bb203886b'/>
<id>urn:sha1:e11d7b5969704cd5ce39d053414d905bb203886b</id>
<content type='text'>
Commit 9e8ecea (Add 'merge' mode to 'git reset', 2008-12-01) disallowed
"git reset --merge" when there were unmerged entries.  But the intent was
for unmerged entries to be reset as if --hard (instead of --merge) had
been used.  This makes sense because all "mergy" operations make sure that
any path involved in the merge does not have local modifications before
starting, so resetting such a path away won't lose any information.

The previous commit changed the behavior of --merge to accept resetting
unmerged entries if they are reset to a different state than HEAD, but it
did not reset the changes in the work tree, leaving the conflict markers
in the resulting file in the work tree.

Fix it by doing three things:

 - Update the documentation to match the wish of original "reset --merge"
   better, namely, "An unmerged entry is a sign that the path didn't have
   any local modification and can be safely reset to whatever the new
   HEAD records";

 - Update read_index_unmerged(), which reads the index file into the cache
   while dropping any higher-stage entries down to stage #0, not to copy
   the object name from the higher stage entry.  The code used to take the
   object name from a higher-stage entry ("base" if you happened to have
   stage #1, or "ours" if both sides added, etc.), which essentially meant
   that you are getting random results depending on what the merge did.

   The _only_ reason we want to keep a previously unmerged entry in the
   index at stage #0 is so that we don't forget the fact that we have a
   corresponding file in the work tree, in order to be able to remove it
   when the tree we are resetting to does not have the path.  In order to
   differentiate such an entry from an ordinary cache entry, the cache
   entry added by read_index_unmerged() is marked as CE_CONFLICTED.

 - Update merged_entry() and deleted_entry() so that they pay attention to
   cache entries marked as CE_CONFLICTED.  They are previously unmerged
   entries, and the files in the work tree that correspond to them are
   reset away by oneway_merge() to the version from the tree we are
   resetting to.
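
The handling of unmerged entries described in the list above can be
illustrated with a toy model.  The Python below is only a sketch under
stated assumptions: entries are plain tuples, this read_index_unmerged is
a simplified stand-in for the C function of the same name, and NULL_SHA1
and the conflicted field are illustrative choices.  It is not git's
implementation.

```python
from collections import namedtuple

# Toy cache entry: stage 0 is merged; stages 1-3 are base/ours/theirs.
Entry = namedtuple("Entry", "path stage sha1 conflicted")
NULL_SHA1 = "0" * 40  # placeholder object name (assumption for this sketch)


def read_index_unmerged(index):
    """Collapse each unmerged path to a single stage-0 entry.

    The object name of the higher-stage entries is deliberately NOT
    copied; the surviving entry only records that the path exists in
    the work tree and is flagged as conflicted.
    """
    result, seen = [], set()
    for e in index:
        if e.stage == 0:
            result.append(e)
        elif e.path not in seen:
            seen.add(e.path)
            result.append(Entry(e.path, 0, NULL_SHA1, True))
    return result


index = [
    Entry("a.c", 0, "1" * 40, False),
    Entry("b.c", 1, "2" * 40, False),
    Entry("b.c", 2, "3" * 40, False),
    Entry("b.c", 3, "4" * 40, False),
]
for e in read_index_unmerged(index):
    print(e.path, e.stage, e.conflicted)
# a.c 0 False
# b.c 0 True
```

Because the flagged entry carries no meaningful object name, later code
has to look at the flag rather than the name to decide what to do with
the path.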

Signed-off-by: Junio C Hamano &lt;gitster@pobox.com&gt;
</content>
</entry>
</feed>
