summaryrefslogtreecommitdiff
path: root/refs/files-backend.c
AgeCommit message (Collapse)Author
2017-10-30files-backend: don't rewrite the `packed-refs` file unnecessarilyMichael Haggerty
Even when we are deleting references, we needn't overwrite the `packed-refs` file if the references that we are deleting only exist as loose references. Implement this optimization as follows: * Add a function `is_packed_transaction_needed()`, which checks whether a given packed-refs transaction actually needs to be carried out (i.e., it returns false if the transaction obviously wouldn't have any effect). This function must be called while holding the `packed-refs` lock to avoid races. * Change `files_transaction_prepare()` to check whether the packed-refs transaction is actually needed. If not, squelch it, but continue holding the `packed-refs` lock until the end of the transaction to avoid races. This fixes a mild regression caused by dc39e09942 (files_ref_store: use a transaction to update packed refs, 2017-09-08). Before that commit, unnecessary rewrites of `packed-refs` were suppressed by `repack_without_refs()`. But the transaction-based writing introduced by that commit didn't perform that optimization. Note that the pre-dc39e09942 code still had to *read* the whole `packed-refs` file to determine that the rewrite could be skipped, so the performance for the cases that the write could be elided was `O(N)` in the number of packed references both before and after dc39e09942. But after that commit the constant factor increased. This commit reimplements the optimization of eliding unnecessary `packed-refs` rewrites. That, plus the fact that since cfa2e29c34 (packed_ref_store: get rid of the `ref_cache` entirely, 2017-03-17) we don't necessarily have to read the whole `packed-refs` file at all, means that deletes of one or a few loose references can now be done with `O(n lg N)` effort, where `n` is the number of loose references being deleted and `N` is the total number of packed references. This commit fixes two tests in t1409. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-28Merge branch 'bc/object-id' into baseMichael Haggerty
2017-10-26Merge branch 'mh/ref-locking-fix'Junio C Hamano
Transactions to update multiple references that involves a deletion was quite broken in an error codepath and did not abort everything correctly. * mh/ref-locking-fix: files_transaction_prepare(): fix handling of ref lock failure t1404: add a bunch of tests of D/F conflicts
2017-10-25files_transaction_prepare(): fix handling of ref lock failureMichael Haggerty
Since dc39e09942 (files_ref_store: use a transaction to update packed refs, 2017-09-08), failure to lock a reference has been handled incorrectly by `files_transaction_prepare()`. If `lock_ref_for_update()` fails in the lock-acquisition loop of that function, it sets `ret` then breaks out of that loop. Prior to dc39e09942, that was OK, because the only thing following the loop was the cleanup code. But dc39e09942 added another blurb of code between the loop and the cleanup. That blurb sometimes resets `ret` to zero, making the cleanup code think that the locking was successful. Specifically, whenever * One or more reference deletions have been processed successfully in the lock-acquisition loop. (Processing the first such reference causes a packed-ref transaction to be initialized.) * Then `lock_ref_for_update()` fails for a subsequent reference. Such a failure can happen for a number of reasons, such as the old SHA-1 not being correct, lock contention, etc. This causes a `break` out of the lock-acquisition loop. * The `packed-refs` lock is acquired successfully and `ref_transaction_prepare()` succeeds for the packed-ref transaction. This has the effect of resetting `ret` back to 0, and making the cleanup code think that lock acquisition was successful. In that case, any reference updates that were processed prior to breaking out of the loop would be carried out (loose and packed), but the reference that couldn't be locked and any subsequent references would silently be ignored. This can easily cause data loss if, for example, the user was trying to push a new name for an existing branch while deleting the old name. After the push, the branch could be left unreachable, and could even subsequently be garbage-collected. This problem was noticed in the context of deleting one reference and creating another in a single transaction, when the two references D/F conflict with each other, like git update-ref --stdin <<EOF delete refs/foo create refs/foo/bar HEAD EOF This triggers the above bug because the deletion is processed successfully for `refs/foo`, then the D/F conflict causes `lock_ref_for_update()` to fail when `refs/foo/bar` is processed. In this case the transaction *should* fail, but instead it causes `refs/foo` to be deleted without creating `refs/foo`. This could easily result in data loss. The fix is simple: instead of just breaking out of the loop, jump directly to the cleanup code. This fixes some tests in t1404 that were added in the previous commit. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs/files-backend: convert static functions to object_idbrian m. carlson
Convert several static functions to take pointers to struct object_id. Change the relevant parameters to write_packed_entry to be const, as we don't modify them. Rename lock_ref_sha1_basic to lock_ref_oid_basic to reflect its new argument. Update the docstring for verify lock to account for the new parameter name, and note additionally that the old_oid may be NULL. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: convert read_raw_ref backends to struct object_idbrian m. carlson
Convert the unsigned char * parameter to struct object_id * for files_read_raw_ref and packed_read_raw_ref. Update the documentation. Switch from using get_sha1_hex and a hard-coded 40 to using parse_oid_hex. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: convert resolve_ref_unsafe to struct object_idbrian m. carlson
Convert resolve_ref_unsafe to take a pointer to struct object_id by converting one remaining caller to use struct object_id, removing the temporary NULL pointer check in expand_ref, converting the declaration and definition, and applying the following semantic patch: @@ expression E1, E2, E3, E4; @@ - resolve_ref_unsafe(E1, E2, E3.hash, E4) + resolve_ref_unsafe(E1, E2, &E3, E4) @@ expression E1, E2, E3, E4; @@ - resolve_ref_unsafe(E1, E2, E3->hash, E4) + resolve_ref_unsafe(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: convert reflog_expire parameter to struct object_idbrian m. carlson
reflog_expire already used struct object_id internally, but it did not take it as a parameter. Adjust the parameter (and the callers) to pass a pointer to struct object_id instead of a pointer to unsigned char. Remove the temporary inserted earlier as it is no longer required. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: convert read_ref and read_ref_full to object_idbrian m. carlson
All but two of the call sites already have parameters using the hash parameter of struct object_id, so convert them to take a pointer to the struct directly. Also convert refs_read_refs_full, the underlying implementation. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: update ref transactions to use struct object_idbrian m. carlson
Update the ref transaction code to use struct object_id. Remove one NULL pointer check which was previously inserted around a dereference; since we now pass a pointer to struct object_id directly through, the code we're calling handles this for us. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs: convert delete_ref and refs_delete_ref to struct object_idbrian m. carlson
Convert delete_ref and refs_delete_ref to take a pointer to struct object_id. Update the documentation accordingly, including referring to null_oid in lowercase, as it is not a #define constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16refs/files-backend: convert struct ref_to_prune to object_idbrian m. carlson
Change the member of this struct to be a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-05Merge branch 'rs/resolve-ref-optional-result'Junio C Hamano
Code clean-up. * rs/resolve-ref-optional-result: refs: pass NULL to resolve_refdup() if hash is not needed refs: pass NULL to refs_resolve_refdup() if hash is not needed
2017-10-03Merge branch 'mh/mmap-packed-refs'Junio C Hamano
Operations that do not touch (majority of) packed refs have been optimized by making accesses to packed-refs file lazy; we no longer pre-parse everything, and an access to a single ref in the packed-refs does not touch majority of irrelevant refs, either. * mh/mmap-packed-refs: (21 commits) packed-backend.c: rename a bunch of things and update comments mmapped_ref_iterator: inline into `packed_ref_iterator` ref_cache: remove support for storing peeled values packed_ref_store: get rid of the `ref_cache` entirely ref_store: implement `refs_peel_ref()` generically packed_read_raw_ref(): read the reference from the mmapped buffer packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator` read_packed_refs(): ensure that references are ordered when read packed_ref_cache: keep the `packed-refs` file mmapped if possible packed-backend.c: reorder some definitions mmapped_ref_iterator_advance(): no peeled value for broken refs mmapped_ref_iterator: add iterator over a packed-refs file packed_ref_cache: remember the file-wide peeling state read_packed_refs(): read references with minimal copying read_packed_refs(): make parsing of the header line more robust read_packed_refs(): only check for a header at the top of the file read_packed_refs(): use mmap to read the `packed-refs` file die_unterminated_line(), die_invalid_line(): new functions packed_ref_cache: add a backlink to the associated `packed_ref_store` prefix_ref_iterator: break when we leave the prefix ...
2017-10-03Merge branch 'jk/read-in-full'Junio C Hamano
Code clean-up to prevent future mistakes by copying and pasting code that checks the result of read_in_full() function. * jk/read-in-full: worktree: check the result of read_in_full() worktree: use xsize_t to access file size distinguish error versus short read from read_in_full() avoid looking at errno for short read_in_full() returns prefer "!=" when checking read_in_full() result notes-merge: drop dead zero-write code files-backend: prefer "0" for write_in_full() error check
2017-10-03Merge branch 'sd/branch-copy'Junio C Hamano
"git branch" learned "-c/-C" to create a new branch by copying an existing one. * sd/branch-copy: branch: fix "copy" to never touch HEAD branch: add a --copy (-c) option to go with --move (-m) branch: add test for -m renaming multiple config sections config: create a function to format section headers
2017-10-01refs: pass NULL to refs_resolve_refdup() if hash is not neededRené Scharfe
This gets us rid of a write-only variable. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-28Merge branch 'rs/resolve-ref-optional-result'Junio C Hamano
Code clean-up. * rs/resolve-ref-optional-result: refs: pass NULL to resolve_ref_unsafe() if hash is not needed refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
2017-09-26files-backend: prefer "0" for write_in_full() error checkJeff King
Commit 06f46f237a (avoid "write_in_full(fd, buf, len) != len" pattern, 2017-09-13) converted this callsite from: write_in_full(...) != 1 to write_in_full(...) < 0 But during the conflict resolution in c50424a6f0 (Merge branch 'jk/write-in-full-fix', 2017-09-25), this morphed into write_in_full(...) < 1 This behaves as we want, but we prefer to avoid modeling the "less than length" error-check which can be subtly buggy, as shown in efacf609c8 (config: avoid "write_in_full(fd, buf, len) < len" pattern, 2017-09-13). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25ref_store: implement `refs_peel_ref()` genericallyMichael Haggerty
We're about to stop storing packed refs in a `ref_cache`. That means that the only way we have left to optimize `peel_ref()` is by checking whether the reference being peeled is the one currently being iterated over (in `current_ref_iter`), and if so, using `ref_iterator_peel()`. But this can be done generically; it doesn't have to be implemented per-backend. So implement `refs_peel_ref()` in `refs.c` and remove the `peel_ref()` method from the refs API. This removes the last callers of a couple of functions, so delete them. More cleanup to come... Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25Merge branch 'jk/write-in-full-fix'Junio C Hamano
Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-24refs: pass NULL to refs_resolve_ref_unsafe() if hash is not neededRené Scharfe
This allows us to get rid of two write-only variables, one of them being a SHA1 buffer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-19Merge branch 'mh/packed-ref-transactions'Junio C Hamano
Implement transactional update to the packed-ref representation of references. * mh/packed-ref-transactions: files_transaction_finish(): delete reflogs before references packed-backend: rip out some now-unused code files_ref_store: use a transaction to update packed refs t1404: demonstrate two problems with reference transactions files_initial_transaction_commit(): use a transaction for packed refs prune_refs(): also free the linked list files_pack_refs(): use a reference transaction to write packed refs packed_delete_refs(): implement method packed_ref_store: implement reference transactions struct ref_transaction: add a place for backends to store data packed-backend: don't adjust the reference count on lock/unlock
2017-09-19Merge branch 'jk/incore-lockfile-removal'Junio C Hamano
The long-standing rule that an in-core lockfile instance, once it is used, must not be freed, has been lifted and the lockfile and tempfile APIs have been updated to reduce the chance of programming errors. * jk/incore-lockfile-removal: stop leaking lock structs in some simple cases ref_lock: stop leaking lock_files lockfile: update lifetime requirements in documentation tempfile: auto-allocate tempfiles on heap tempfile: remove deactivated list entries tempfile: use list.h for linked list tempfile: release deactivated strbufs instead of resetting tempfile: robustify cleanup handler tempfile: factor out deactivation tempfile: factor out activation tempfile: replace die("BUG") with BUG() tempfile: handle NULL tempfile pointers gracefully tempfile: prefer is_tempfile_active to bare access lockfile: do not rollback lock on failed close tempfile: do not delete tempfile on failed close always check return value of close_tempfile verify_signed_buffer: prefer close_tempfile() to close() setup_temporary_shallow: move tempfile struct into function setup_temporary_shallow: avoid using inactive tempfile write_index_as_tree: cleanup tempfile on error
2017-09-19Merge branch 'nd/prune-in-worktree'Junio C Hamano
"git gc" and friends when multiple worktrees are used off of a single repository did not consider the index and per-worktree refs of other worktrees as the root for reachability traversal, making objects that are in use only in other worktrees to be subject to garbage collection. * nd/prune-in-worktree: refs.c: reindent get_submodule_ref_store() refs.c: remove fallback-to-main-store code get_submodule_ref_store() rev-list: expose and document --single-worktree revision.c: --reflog add HEAD reflog from all worktrees files-backend: make reflog iterator go through per-worktree reflog revision.c: --all adds HEAD from all worktrees refs: remove dead for_each_*_submodule() refs.c: move for_each_remote_ref_submodule() to submodule.c revision.c: use refs_for_each*() instead of for_each_*_submodule() refs: add refs_head_ref() refs: move submodule slash stripping code to get_submodule_ref_store refs.c: refactor get_submodule_ref_store(), share common free block revision.c: --indexed-objects add objects from all worktrees revision.c: refactor add_index_objects_to_pending() refs.c: use is_dir_sep() in resolve_gitlink_ref() revision.h: new flag in struct rev_info wrt. worktree-related refs
2017-09-19Merge branch 'ma/split-symref-update-fix'Junio C Hamano
A leakfix. * ma/split-symref-update-fix: refs/files-backend: add `refname`, not "HEAD", to list refs/files-backend: correct return value in lock_ref_for_update refs/files-backend: fix memory leak in lock_ref_for_update refs/files-backend: add longer-scoped copy of string to list
2017-09-14ref_iterator: keep track of whether the iterator output is orderedMichael Haggerty
References are iterated over in order by refname, but reflogs are not. Some consumers of reference iteration care about the difference. Teach each `ref_iterator` to keep track of whether its output is ordered. `overlay_ref_iterator` is one of the picky consumers. Add a sanity check in `overlay_ref_iterator_begin()` to verify that its inputs are ordered. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14convert less-trivial versions of "write_in_full() != len"Jeff King
The prior commit converted many sites to check the return value of write_in_full() for negativity, rather than a mismatch with the input length. This patch covers similar cases, but where the return value is stored in an intermediate variable. These should get the same treatment, but they need to be reviewed more carefully since it would be a bug if the return value is stored in an unsigned type (which indeed, it is in one of the cases). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14avoid "write_in_full(fd, buf, len) != len" patternJeff King
The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to f6aa66cb95 (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is signed (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In d34cf19b89 (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many byte will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10refs/files-backend: add `refname`, not "HEAD", to listMartin Ågren
An earlier patch rewrote `split_symref_update()` to add a copy of a string to a string list instead of adding the original string. That was so that the original string could be freed in a later patch, but it is also conceptually cleaner, since now all calls to `string_list_insert()` and `string_list_append()` add `update->refname`. --- Except a literal "HEAD" is added in `split_head_update()`. Restructure `split_head_update()` in the same way as the earlier patch did for `split_symref_update()`. This does not correct any practical problem, but makes things conceptually cleaner. The downside is a call to `string_list_has_string()`, which should be relatively cheap. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10refs/files-backend: correct return value in lock_ref_for_updateMartin Ågren
In one code path we return a literal -1 and not a symbolic constant. The value -1 would be interpreted as TRANSACTION_NAME_CONFLICT, which is wrong. Use TRANSACTION_GENERIC_ERROR instead (that is the only other return value we have to choose from). Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10refs/files-backend: fix memory leak in lock_ref_for_updateMartin Ågren
After the previous patch, none of the functions we call hold on to `referent.buf`, so we can safely release the string buffer before returning. Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10refs/files-backend: add longer-scoped copy of string to listMartin Ågren
split_symref_update() receives a string-pointer `referent` and adds it to the list of `affected_refnames`. The list simply holds on to the pointers it is given, it does not copy the strings and it does not ever free them. The `referent` string in split_symref_update() belongs to a string buffer in the caller. After we return, the string will be leaked. In the next patch, we want to properly release the string buffer in the caller, but we can't safely do so until we've made sure that `affected_refnames` will not be holding on to a pointer to the string. We could configure the list to handle its own resources, but it would mean some alloc/free-churning. The list is already handling other strings (through other code paths) which we do not need to worry about, and we'd be memory-churning those strings too, completely unnecessary. Observe that split_symref_update() creates a `new_update`-object through ref_transaction_add_update(), after which `new_update->refname` is a copy of `referent`. The difference is, this copy will be freed, and it will be freed *after* `affected_refnames` has been cleared. Rearrange the handling of `referent`, so that we don't add it directly to `affected_refnames`. Instead, first just check whether `referent` exists in the string list, and later add `new_update->refname`. Helped-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09files_transaction_finish(): delete reflogs before referencesMichael Haggerty
If the deletion steps unexpectedly fail, it is less bad to leave a reference without its reflog than it is to leave a reflog without its reference, since the latter is an invalid repository state. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09files_ref_store: use a transaction to update packed refsMichael Haggerty
When processing a `files_ref_store` transaction, it is sometimes necessary to delete some references from the "packed-refs" file. Do that using a reference transaction conducted against the `packed_ref_store`. This change further decouples `files_ref_store` from `packed_ref_store`. It also fixes multiple problems, including the two revealed by test cases added in the previous commit. First, the old code didn't obtain the `packed-refs` lock until `files_transaction_finish()`. This means that a failure to acquire the `packed-refs` lock (e.g., due to contention with another process) wasn't detected until it was too late (problems like this are supposed to be detected in the "prepare" phase). The new code acquires the `packed-refs` lock in `files_transaction_prepare()`, the same stage of the processing when the loose reference locks are being acquired, removing another reason why the "prepare" phase might succeed and the "finish" phase might nevertheless fail. Second, the old code deleted the loose version of a reference before deleting any packed version of the same reference. This left a moment when another process might think that the packed version of the reference is current, which is incorrect. (Even worse, the packed version of the reference can be arbitrarily old, and might even point at an object that has since been garbage-collected.) Third, if a reference deletion fails to acquire the `packed-refs` lock altogether, then the old code might leave the repository in the incorrect state (possibly corrupt) described in the previous paragraph. Now we activate the new "packed-refs" file (sans any references that are being deleted) *before* deleting the corresponding loose references. But we hold the "packed-refs" lock until after the loose references have been finalized, thus preventing a simultaneous "pack-refs" process from packing the loose version of the reference in the time gap, which would otherwise defeat our attempt to delete it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09files_initial_transaction_commit(): use a transaction for packed refsMichael Haggerty
Use a `packed_ref_store` transaction in the implementation of `files_initial_transaction_commit()` rather than using internal features of the packed ref store. This further decouples `files_ref_store` from `packed_ref_store`. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09prune_refs(): also free the linked listMichael Haggerty
At least since v1.7, the elements of the `refs_to_prune` linked list have been leaked. Fix the leak by teaching `prune_refs()` to free the list elements as it processes them. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09files_pack_refs(): use a reference transaction to write packed refsMichael Haggerty
Now that the packed reference store supports transactions, we can use a transaction to write the packed versions of references that we want to pack. This decreases the coupling between `files_ref_store` and `packed_ref_store`. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09packed_delete_refs(): implement methodMichael Haggerty
Implement `packed_delete_refs()` using a reference transaction. This means that `files_delete_refs()` can use `refs_delete_refs()` instead of `repack_without_refs()` to delete any packed references, decreasing the coupling between the classes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06ref_lock: stop leaking lock_filesJeff King
Since the tempfile code recently relaxed the rule that tempfile structs (and thus locks) need to hang around forever, we no longer have to leak our lock_file structs. In fact, we don't even need to heap-allocate them anymore, since their lifetime can just match that of the surrounding ref_lock (and if we forget to delete a lock, the effect is the same as before: it will eventually go away at program exit). Note that there is a check in unlock_ref() to only rollback a lock file if it has been allocated. We don't need that check anymore; we zero the ref_lock (and thus the lock_file), so at worst we pass a NULL pointer to delete_tempfile(), which considers that a noop. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06tempfile: auto-allocate tempfiles on heapJeff King
The previous commit taught the tempfile code to give up ownership over tempfiles that have been renamed or deleted. That makes it possible to use a stack variable like this: struct tempfile t; create_tempfile(&t, ...); ... if (!err) rename_tempfile(&t, ...); else delete_tempfile(&t); But doing it this way has a high potential for creating memory errors. The tempfile we pass to create_tempfile() ends up on a global linked list, and it's not safe for it to go out of scope until we've called one of those two deactivation functions. Imagine that we add an early return from the function that forgets to call delete_tempfile(). With a static or heap tempfile variable, the worst case is that the tempfile hangs around until the program exits (and some functions like setup_shallow_temporary rely on this intentionally, creating a tempfile and then leaving it for later cleanup). But with a stack variable as above, this is a serious memory error: the variable goes out of scope and may be filled with garbage by the time the tempfile code looks at it. Let's see if we can make it harder to get this wrong. Since many callers need to allocate arbitrary numbers of tempfiles, we can't rely on static storage as a general solution. So we need to turn to the heap. We could just ask all callers to pass us a heap variable, but that puts the burden on them to call free() at the right time. Instead, let's have the tempfile code handle the heap allocation _and_ the deallocation (when the tempfile is deactivated and removed from the list). This changes the return value of all of the creation functions. For the cleanup functions (delete and rename), we'll add one extra bit of safety: instead of taking a tempfile pointer, we'll take a pointer-to-pointer and set it to NULL after freeing the object. This makes it safe to double-call functions like delete_tempfile(), as the second call treats the NULL input as a noop. Several callsites follow this pattern. The resulting patch does have a fair bit of noise, as each caller needs to be converted to handle: 1. Storing a pointer instead of the struct itself. 2. Passing the pointer instead of taking the struct address. 3. Handling a "struct tempfile *" return instead of a file descriptor. We could play games to make this less noisy. For example, by defining the tempfile like this: struct tempfile { struct heap_allocated_part_of_tempfile { int fd; ...etc } *actual_data; } Callers would continue to have a "struct tempfile", and it would be "active" only when the inner pointer was non-NULL. But that just makes things more awkward in the long run. There aren't that many callers, so we can simply bite the bullet and adjust all of them. And the compiler makes it easy for us to find them all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06lockfile: do not rollback lock on failed closeJeff King
Since the lockfile code is based on the tempfile code, it has some of the same problems, including that close_lock_file() erases the tempfile's filename buf, making it hard for the caller to write a good error message. In practice this comes up less for lockfiles than for straight tempfiles, since we usually just report the refname. But there is at least one buggy case in write_ref_to_lockfile(). Besides, given the coupling between the lockfile and tempfile modules, it's less confusing if their close() functions have the same semantics. Just as the previous commit did for close_tempfile(), let's teach close_lock_file() and its wrapper close_ref() not to rollback on error. And just as before, we'll give them new "gently" names to catch any new callers that are added. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-26Merge branch 'mh/ref-lock-entry'Junio C Hamano
The code to acquire a lock on a reference (e.g. while accepting a push from a client) used to immediately fail when the reference is already locked---now it waits for a very short while and retries, which can make it succeed if the lock holder was holding it during a read-only operation. * mh/ref-lock-entry: refs: retry acquiring reference locks for 100ms
2017-08-24files-backend: make reflog iterator go through per-worktree reflogNguyễn Thái Ngọc Duy
refs/bisect is unfortunately per-worktree, so we need to look in per-worktree logs/refs/bisect in addition to per-repo logs/refs. The current iterator only goes through per-repo logs/refs. Use merge iterator to walk two ref stores at the same time and pick per-worktree refs from the right iterator. PS. Note the unsorted order of for_each_reflog in the test. This is supposed to be OK, for now. If we enforce order on for_each_reflog() then some more work will be required. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23refs: retry acquiring reference locks for 100msMichael Haggerty
The philosophy of reference locking has been, "if another process is changing a reference, then whatever I'm trying to do to it will probably fail anyway because my old-SHA-1 value is probably no longer current". But this argument falls down if the other process has locked the reference to do something that doesn't actually change the value of the reference, such as `pack-refs` or `reflog expire`. There actually *is* a decent chance that a planned reference update will still be able to go through after the other process has released the lock. So when trying to lock an individual reference (e.g., when creating "refs/heads/master.lock"), if it is already locked, then retry the lock acquisition for approximately 100 ms before giving up. This should eliminate some unnecessary lock conflicts without wasting a lot of time. Add a configuration setting, `core.filesRefLockTimeout`, to allow this setting to be tweaked. Note: the function `get_files_ref_lock_timeout_ms()` cannot be private to the files backend because it is also used by `write_pseudoref()` and `delete_pseudoref()`, which are defined in `refs.c` so that they can be used by other reference backends. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-22Merge branch 'mh/packed-ref-store'Junio C Hamano
The "ref-store" code reorganization continues. * mh/packed-ref-store: (32 commits) files-backend: cheapen refname_available check when locking refs packed_ref_store: handle a packed-refs file that is a symlink read_packed_refs(): die if `packed-refs` contains bogus data t3210: add some tests of bogus packed-refs file contents repack_without_refs(): don't lock or unlock the packed refs commit_packed_refs(): remove call to `packed_refs_unlock()` clear_packed_ref_cache(): don't protest if the lock is held packed_refs_unlock(), packed_refs_is_locked(): new functions packed_refs_lock(): report errors via a `struct strbuf *err` packed_refs_lock(): function renamed from lock_packed_refs() commit_packed_refs(): use a staging file separate from the lockfile commit_packed_refs(): report errors rather than dying packed_ref_store: make class into a subclass of `ref_store` packed-backend: new module for handling packed references packed_read_raw_ref(): new function, replacing `resolve_packed_ref()` packed_ref_store: support iteration packed_peel_ref(): new function, extracted from `files_peel_ref()` repack_without_refs(): take a `packed_ref_store *` parameter get_packed_ref(): take a `packed_ref_store *` parameter rollback_packed_refs(): take a `packed_ref_store *` parameter ...
2017-08-17files-backend: cheapen refname_available check when locking refsMichael Haggerty
When locking references in preparation for updating them, we need to check that none of the newly added references D/F conflict with existing references (e.g., we don't allow `refs/foo` to be added if `refs/foo/bar` already exists, or vice versa). Prior to 524a9fdb51 (refs_verify_refname_available(): use function in more places, 2017-04-16), conflicts with existing loose references were checked by looking directly in the filesystem, and then conflicts with existing packed references were checked by running `verify_refname_available_dir()` against the packed-refs cache. But that commit changed the final check to call `refs_verify_refname_available()` against the *whole* files ref-store, including both loose and packed references, with the following comment: > This means that those callsites now check for conflicts with all > references rather than just packed refs, but the performance cost > shouldn't be significant (and will be regained later). That comment turned out to be too sanguine. User s@kazlauskas.me reported that fetches involving a very large number of references in neighboring directories were slowed down by that change. The problem is that when fetching, each reference is updated individually, within its own reference transaction. This is done because some reference updates might succeed even though others fail. But every time a reference update transaction is finished, `clear_loose_ref_cache()` is called. So when it is time to update the next reference, part of the loose ref cache has to be repopulated for the `refs_verify_refname_available()` call. If the references are all in neighboring directories, then the cost of repopulating the reference cache increases with the number of references, resulting in O(N²) effort. The comment above also claims that the performance cost "will be regained later". The idea was that once the packed-refs were finished being split out into a separate ref-store, we could limit the `refs_verify_refname_available()` call to the packed references again. That is what we do now. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03repack_without_refs(): don't lock or unlock the packed refsMichael Haggerty
Change `repack_without_refs()` to expect the packed-refs lock to be held already, and not to release the lock before returning. Change the callers to deal with lock management. This change makes it possible for callers to hold the packed-refs lock for a longer span of time, a possibility that will eventually make it possible to fix some longstanding races. The only semantic change here is that `repack_without_refs()` used to forget to release the lock in the `if (!removed)` exit path. That omission is now fixed. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26Merge branch 'mh/packed-ref-store-prep'Junio C Hamano
Bugfix for a topic that is (only) in 'master'. * mh/packed-ref-store-prep: for_each_bisect_ref(): don't trim refnames lock_packed_refs(): fix cache validity check
2017-06-24Merge branch 'ab/free-and-null'Junio C Hamano
A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL