summaryrefslogtreecommitdiff
path: root/refs/files-backend.c
AgeCommit message (Collapse)Author
7 daysMerge branch 'kn/maintenance-is-needed'Junio C Hamano
"git maintenance" command learned "is-needed" subcommand to tell if it is necessary to perform various maintenance tasks. * kn/maintenance-is-needed: maintenance: add 'is-needed' subcommand maintenance: add checking logic in `pack_refs_condition()` refs: add a `optimize_required` field to `struct ref_storage_be` reftable/stack: add function to check if optimization is required reftable/stack: return stack segments directly
9 daysMerge branch 'kn/refs-optim-cleanup'Junio C Hamano
Code clean-up. * kn/refs-optim-cleanup: t/pack-refs-tests: move the 'test_done' to callees refs: rename 'pack_refs_opts' to 'refs_optimize_opts' refs: move to using the '.optimize' functions
9 daysMerge branch 'ps/ref-peeled-tags'Junio C Hamano
Some ref backend storage can hold not just the object name of an annotated tag, but the object name of the object the tag points at. The code to handle this information has been streamlined. * ps/ref-peeled-tags: t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn`
2025-11-10refs: add a `optimize_required` field to `struct ref_storage_be`Karthik Nayak
To allow users of the refs namespace to check if the reference backend requires optimization, add a new field `optimize_required` field to `struct ref_storage_be`. This field is of type `optimize_required_fn` which is also introduced in this commit. Modify the debug, files, packed and reftable backend to implement this field. A following commit will expose this via 'git pack-refs' and 'git refs optimize'. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Acked-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-06Merge branch 'pk/reflog-migrate-message-fix'Junio C Hamano
Message fix. * pk/reflog-migrate-message-fix: refs: add missing space in messages
2025-11-05refs: add missing space in messagesPeter Krefting
Signed-off-by: Peter Krefting <peter@softwolves.pp.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04Merge branch 'kn/refs-optim-cleanup' into kn/maintenance-is-neededJunio C Hamano
* kn/refs-optim-cleanup: t/pack-refs-tests: move the 'test_done' to callees refs: rename 'pack_refs_opts' to 'refs_optimize_opts' refs: move to using the '.optimize' functions
2025-11-04Merge branch 'ps/ref-peeled-tags' into kn/maintenance-is-neededJunio C Hamano
* ps/ref-peeled-tags: (23 commits) t7004: do not chdir around in the main process ref-filter: fix stale parsed objects ref-filter: parse objects on demand ref-filter: detect broken tags when dereferencing them refs: don't store peeled object IDs for invalid tags object: add flag to `peel_object()` to verify object type refs: drop infrastructure to peel via iterators refs: drop `current_ref_iter` hack builtin/show-ref: convert to use `reference_get_peeled_oid()` ref-filter: propagate peeled object ID upload-pack: convert to use `reference_get_peeled_oid()` refs: expose peeled object ID via the iterator refs: refactor reference status flags refs: fully reset `struct ref_iterator::ref` on iteration refs: introduce `.ref` field for the base iterator refs: introduce wrapper struct for `each_ref_fn` builtin/repo: add progress meter for structure stats builtin/repo: add keyvalue and nul format for structure stats builtin/repo: add object counts in structure output builtin/repo: introduce structure subcommand ...
2025-11-04refs: rename 'pack_refs_opts' to 'refs_optimize_opts'Karthik Nayak
The previous commit removed all references to 'pack_refs()' within the refs subsystem. Continue this cleanup by also renaming 'pack_refs_opts' to 'refs_optimize_opts' and the respective flags accordingly. Keeping the naming consistent will make the code easier to maintain. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: move to using the '.optimize' functionsKarthik Nayak
The `struct ref_store` variable exposes two ways to optimize a reftable backend: 1. pack_refs 2. optimize The former was specific to the 'files' + 'packed' refs backend. The latter is more generic and covers all backends. While the naming is different, both of these functions perform the same functionality. Consolidate this code to only maintain the 'optimize' functions. Do this by modifying the backends so that they exclusively implement the `optimize` callback, only. All users of the refs subsystem already use the 'optimize' function so there is no changes needed on the callee side. Finally, cleanup all references to the 'pack_refs' field of the structure and code around it. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: drop infrastructure to peel via iteratorsPatrick Steinhardt
Now that the peeled object ID gets propagated via the `struct reference` there is no need anymore to call into the reference iterator itself to dereference an object. Remove this infrastructure. Most of the changes are straight-forward deletions of code. There is one exception though in `refs/packed-backend.c::write_with_updates()`. Here we stop peeling the iterator and instead just pass the peeled object ID of that iterator directly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce `.ref` field for the base iteratorPatrick Steinhardt
The base iterator has a couple of fields that tracks the name, target, object ID and flags for the current reference. Due to this design we have to create a new `struct reference` whenever we want to hand over that reference to the callback function, which is tedious and not very efficient. Convert the structure to instead contain a `struct reference` as member. This member is expected to be populated by the implementations of the iterator and is handed over to the callback directly. While at it, simplify `should_pack_ref()` to take a `struct reference` directly instead of passing its respective fields. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-04refs: introduce wrapper struct for `each_ref_fn`Patrick Steinhardt
The `each_ref_fn` callback function type is used across our code base for several different functions that iterate through reference. There's a bunch of callbacks implementing this type, which makes any changes to the callback signature extremely noisy. An example of the required churn is e8207717f1 (refs: add referent to each_ref_fn, 2024-08-09): adding a single argument required us to change 48 files. It was already proposed back then [1] that we might want to introduce a wrapper structure to alleviate the pain going forward. While this of course requires the same kind of global refactoring as just introducing a new parameter, it at least allows us to more change the callback type afterwards by just extending the wrapper structure. One counterargument to this refactoring is that it makes the structure more opaque. While it is obvious which callsites need to be fixed up when we change the function type, it's not obvious anymore once we use a structure. That being said, we only have a handful of sites that actually need to populate this wrapper structure: our ref backends, "refs/iterator.c" as well as very few sites that invoke the iterator callback functions directly. Introduce this wrapper structure so that we can adapt the iterator interfaces more readily. [1]: <ZmarVcF5JjsZx0dl@tanuki> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-30Merge branch 'ps/symlink-symref-deprecation'Junio C Hamano
"Symlink symref" has been added to the list of things that will disappear at Git 3.0 boundary. * ps/symlink-symref-deprecation: refs/files: deprecate writing symrefs as symbolic links
2025-10-26Merge branch 'js/unreachable-workaround-for-no-symlink-head' into maint-2.51Junio C Hamano
Code clean-up. * js/unreachable-workaround-for-no-symlink-head: refs: forbid clang to complain about unreachable code
2025-10-20Merge branch 'js/unreachable-workaround-for-no-symlink-head'Junio C Hamano
Code clean-up. * js/unreachable-workaround-for-no-symlink-head: refs: forbid clang to complain about unreachable code
2025-10-15Merge branch 'kn/refs-files-case-insensitive' into maint-2.51Junio C Hamano
Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems
2025-10-15Merge branch 'ps/reflog-migrate-fixes' into maint-2.51Junio C Hamano
"git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-10-15refs/files: deprecate writing symrefs as symbolic linksPatrick Steinhardt
The "files" backend has the ability to store symbolic refs as symbolic links, which can be configured via "core.preferSymlinkRefs". This feature stems back from the early days: the initial implementation of symbolic refs used symlinks exclusively. The symref format was only introduced in 9b143c6e15 (Teach update-ref about a symbolic ref stored in a textfile., 2005-09-25) and made the default in 9f0bb90d16 (core.prefersymlinkrefs: use symlinks for .git/HEAD, 2006-05-02). This is all about 20 years ago, and there are no known reasons nowadays why one would want to use symlinks instead of symrefs. Mark the feature for deprecation in Git 3.0. Note that this only deprecates _writing_ symrefs as symbolic links. Reading such symrefs is still supported for now. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-13Merge branch 'kn/reftable-consistency-checks'Junio C Hamano
The reftable backend learned to sanity check its on-disk data more carefully. * kn/reftable-consistency-checks: refs/reftable: add fsck check for checking the table name reftable: add code to facilitate consistency checks fsck: order 'fsck_msg_type' alphabetically Documentation/fsck-msgids: remove duplicate msg id reftable: check for trailing newline in 'tables.list' refs: move consistency check msg to generic layer refs: remove unused headers
2025-10-09refs: forbid clang to complain about unreachable codeJohannes Schindelin
When `NO_SYMLINK_HEAD` is defined, `create_ref_symlink()` is hard-coded as `(-1)`, and as a consequence the condition `!create_ref_symlink()` always evaluates to false, rendering any code guarded by that condition unreachable. Therefore, clang is _technically_ correct when it complains about unreachable code. It does completely miss the fact that this is okay because on _other_ platforms, where `NO_SYMLINK_HEAD` is not defined, the code isn't unreachable at all. Let's use the same trick as in 82e79c63642c (git-compat-util: add NOT_CONSTANT macro and use it in atfork_prepare(), 2025-03-17) to appease clang while at the same time keeping the `-Wunreachable` flag to potentially find _actually_ unreachable code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-07refs: move consistency check msg to generic layerKarthik Nayak
The files-backend prints a message before the consistency checks run. Move this to the generic layer so both the files and reftable backend can benefit from this message. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-07refs: remove unused headersKarthik Nayak
In the 'refs/' namespace, some of the included header files are not needed, let's remove them. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-10-02Merge branch 'ms/refs-optimize'Junio C Hamano
"git refs optimize" is added for not very well explained reason despite it does the same thing as "git pack-refs"... * ms/refs-optimize: t: add test for git refs optimize subcommand t0601: refactor tests to be shareable builtin/refs: add optimize subcommand doc: pack-refs: factor out common options builtin/pack-refs: factor out core logic into a shared library builtin/pack-refs: convert to use the generic refs_optimize() API reftable-backend: implement 'optimize' action files-backend: implement 'optimize' action refs: add a generic 'optimize' API
2025-09-29Merge branch 'kn/refs-files-case-insensitive'Junio C Hamano
Deal more gracefully with directory / file conflicts when the files backend is used for ref storage, by failing only the ones that are involved in the conflict while allowing others. * kn/refs-files-case-insensitive: refs/files: handle D/F conflicts during locking refs/files: handle F/D conflicts in case-insensitive FS refs/files: use correct error type when lock exists refs/files: catch conflicts on case-insensitive file-systems
2025-09-19files-backend: implement 'optimize' actionMeet Soni
With the generic `refs_optimize()` API now in place, provide the first implementation for the 'files' reference backend. This makes the new API functional for existing repositories and serves as the foundation for migrating user-facing commands to the new architecture. The implementation simply calls the existing `files_pack_refs()` function, as 'packing' is the method used to optimize the files-based reference store. Wire up the new `files_optimize()` function to the `optimize` slot in the files backend's virtual table. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Meet Soni <meetsoni3017@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: handle D/F conflicts during lockingKarthik Nayak
The previous commit added the necessary validation and checks for F/D conflicts in the files backend when working on case insensitive systems. There is still a possibility for D/F conflicts. This is a different from the F/D since for F/D conflicts, there would not be a conflict during the lock creation phase: refs/heads/foo.lock refs/heads/foo/bar.lock However there would be a conflict when the locks are committed, since we cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of conflicts are checked and resolved in `refs_verify_refnames_available()`, so the previous commit ensured that for case-insensitive filesystems, we would lowercase the inputs to that function. For D/F conflicts, there is a conflict during the lock creation phase itself: refs/heads/foo/bar.lock refs/heads/foo.lock As in `lock_raw_ref()` after creating the lock, we also check for D/F conflicts. This can occur in case-insensitive filesystems when trying to fetch case-conflicted references like: refs/heads/Foo/new refs/heads/foo D/F conflicts can also occur in case-sensitive filesystems, when the repository already contains a directory with a lock file 'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This doesn't concern directories containing garbage files as those are handled on a higher level. To fix this, simply categorize the error as a name conflict. Also remove this reference from the list of valid refnames for availability checks. By categorizing the error and removing it from the list of valid references, batched updates now knows to reject such reference updates and apply the other reference updates. Fix a small typo in `ref_transaction_maybe_set_rejected()` while here. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: handle F/D conflicts in case-insensitive FSKarthik Nayak
When using the files-backend on case-insensitive filesystems, there is possibility of hitting F/D conflicts when creating references within a single transaction, such as: - 'refs/heads/foo' - 'refs/heads/Foo/bar' Ideally such conflicts are caught in `refs_verify_refnames_available()` which is responsible for checking F/D conflicts within a given transaction. This utility function is shared across the reference backends. As such, it doesn't consider the issues of using a case-insensitive file system, which only affects the files-backend. While one solution would be to make the function aware of such issues, this feels like leaking implementation details of file-backend specific issues into the utility function. So opt for the more simpler option, of lowercasing all references sent to this function when on a case-insensitive filesystem and operating on the files-backend. To do this, simply use a `struct strbuf` to convert the refname to lowercase and append it to the list of refnames to be checked. Since we use a `struct strbuf` and the memory is cleared right after, make sure that the string list duplicates all provided string. Without this change, the user would simply be left with a repository with '.lock' files which were created in the 'prepare' phase of the transaction, as the 'commit' phase would simply abort and not do the necessary cleanup. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: use correct error type when lock existsKarthik Nayak
When fetching references into a repository, if a lock for a particular reference exists, then `lock_raw_ref()` throws: - REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict because the transaction contains conflicting references while being on a case-insensitive filesystem. - REF_TRANSACTION_ERROR_GENERIC: for all other errors. The latter causes the entire set of batched updates to fail, even in case sensitive filessystems. Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This allows batched updates to reject the individual update which conflicts with the existing file, while updating the rest of the references. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-17refs/files: catch conflicts on case-insensitive file-systemsKarthik Nayak
During the 'prepare' phase of a reference transaction in the files backend, we create the lock files for references to be created. When using batched updates on case-insensitive filesystems, the entire batched updates would be aborted if there are conflicting names such as: refs/heads/Foo refs/heads/foo This affects all commands which were migrated to use batched updates in Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before that, reference updates would be applied serially with one transaction used per update. When users fetched multiple references on case-insensitive systems, subsequent references would simply overwrite any earlier references. So when fetching: refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6 refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 The user would simply end up with: refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 This is buggy behavior since the user is never informed about the overrides performed and missing references. Nevertheless, the user is left with a working repository with a subset of the references. Since Git 2.51, in such situations fetches would simply fail without updating any references. Which is also buggy behavior and worse off since the user is left without any references. The error is triggered in `lock_raw_ref()` where the files backend attempts to create a lock file. When a lock file already exists the function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens, the entire batched updates, not individual operation, is aborted as if it were in a transaction. Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to aid the batched update mechanism to simply reject such errors. The change only affects batched updates since batched updates will reject individual updates with non-generic errors. So specifically this would only affect: 1. git fetch 2. git receive-pack 3. git update-ref --batch-updates This bubbles the error type up to `files_transaction_prepare()` which tries to lock each reference update. So if the locking fails, we check if the rejection type can be ignored, which is done by calling `ref_transaction_maybe_set_rejected()`. As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT', the specific reference update would simply be rejected, while other updates in the transaction would continue to be applied. This allows partial application of references in case-insensitive filesystems when fetching colliding references. While the earlier implementation allowed the last reference to be applied overriding the initial references, this change would allow the first reference to be applied while rejecting consequent collisions. This should be an okay compromise since with the files backend, there is no scenario possible where we would retain all colliding references. Let's also be more proactive and notify users on case-insensitive filesystems about such problems by providing a brief about the issue while also recommending using the reftable backend, which doesn't have the same issue. Reported-by: Joe Drew <joe.drew@indexexchange.com> Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-29Merge branch 'jk/no-clobber-dangling-symref-with-fetch'Junio C Hamano
"git fetch" can clobber a symref that is dangling when the remote-tracking HEAD is set to auto update, which has been corrected. * jk/no-clobber-dangling-symref-with-fetch: refs: do not clobber dangling symrefs t5510: prefer "git -C" to subshell for followRemoteHEAD tests t5510: stop changing top-level working directory t5510: make confusing config cleanup more explicit
2025-08-21Merge branch 'ps/remote-rename-fix'Junio C Hamano
"git remote rename origin upstream" failed to move origin/HEAD to upstream/HEAD when origin/HEAD is unborn and performed other renames extremely inefficiently, which has been corrected. * ps/remote-rename-fix: builtin/remote: only iterate through refs that are to be renamed builtin/remote: rework how remote refs get renamed builtin/remote: determine whether refs need renaming early on builtin/remote: fix sign comparison warnings refs: simplify logic when migrating reflog entries refs: pass refname when invoking reflog entry callback
2025-08-21Merge branch 'ps/reflog-migrate-fixes'Junio C Hamano
"git refs migrate" to migrate the reflog entries from a refs backend to another had a handful of bugs squashed. * ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-08-19refs: do not clobber dangling symrefsJeff King
When given an expected "before" state, the ref-writing code will avoid overwriting any ref that does not match that expected state. We use the null oid as a sentinel value for "nothing should exist", and likewise that is the sentinel value we get when trying to read a ref that does not exist. But there's one corner case where this is ambiguous: dangling symrefs. Trying to read them will yield the null oid, but there is potentially something of value there: the dangling symref itself. For a normal recursive write, this is OK. Imagine we have a symref "FOO_HEAD" that points to a ref "refs/heads/bar" that does not exist, and we try to write to it with a create operation like: oid=$(git rev-parse HEAD) ;# or whatever git symbolic-ref FOO_HEAD refs/heads/bar echo "create FOO_HEAD $oid" | git update-ref --stdin The attempt to resolve FOO_HEAD will actually resolve "bar", yielding the null oid. That matches our expectation, and the write proceeds. This is correct, because we are not writing FOO_HEAD at all, but writing its destination "bar", which in fact does not exist. But what if the operation asked not to dereference symrefs? Like this: echo "create FOO_HEAD $oid" | git update-ref --no-deref --stdin Resolving FOO_HEAD would still result in a null oid, and the write will proceed. But it will overwrite FOO_HEAD itself, removing the fact that it ever pointed to "bar". This case is a little esoteric; we are clobbering a symref with a no-deref write of a regular ref value. But the same problem occurs when writing symrefs. For example: echo "symref-create FOO_HEAD refs/heads/other" | git update-ref --no-deref --stdin The "create" operation asked us to create FOO_HEAD only if it did not exist. But we silently overwrite the existing value. You can trigger this without using update-ref via the fetch followRemoteHEAD code. In "create" mode, it should not overwrite an existing value. But if you manually create a symref pointing to a value that does not yet exist (either via symbolic-ref or with "remote add -m"), create mode will happily overwrite it. Instead, we should detect this case and refuse to write. The correct specification to overwrite FOO_HEAD in this case is to provide an expected target ref value, like: echo "symref-update FOO_HEAD refs/heads/other ref refs/heads/bar" | git update-ref --no-deref --stdin Note that the non-symref "update" directive does not allow you to do this (you can only specify an oid). This is a weakness in the update-ref interface, and you'd have to overwrite unconditionally, like: echo "update FOO_HEAD $oid" | git update-ref --no-deref --stdin Likewise other symref operations like symref-delete do not accept the "ref" keyword. You should be able to do: echo "symref-delete FOO_HEAD ref refs/heads/bar" but cannot (and can only delete unconditionally). This patch doesn't address those gaps. We may want to do so in a future patch for completeness, but it's not clear if anybody actually wants to perform those operations. The symref update case (specifically, via followRemoteHEAD) is what I ran into in the wild. The code for the fix is relatively straight-forward given the discussion above. But note that we have to implement it independently for the files and reftable backends. The "old oid" checks happen as part of the locking process, which is implemented separately for each system. We may want to factor this out somehow, but it's beyond the scope of this patch. (Another curiosity is that the messages in the reftable code are marked for translation, but the ones in the files backend are not. I followed local convention in each case, but we may want to harmonize this at some point). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs: pass refname when invoking reflog entry callbackPatrick Steinhardt
With `refs_for_each_reflog_ent()` callers can iterate through all the reflog entries for a given reference. The callback that is being invoked for each such entry does not receive the name of the reference that we are currently iterating through. This isn't really a limiting factor, as callers can simply pass the name via the callback data. But this layout sometimes does make for a bit of an awkward calling pattern. One example: when iterating through all reflogs, and for each reflog we iterate through all refnames, we have to do some extra book keeping to track which reference name we are currently yielding reflog entries for. Change the signature of the callback function so that the reference name of the reflog gets passed through to it. Adapt callers accordingly and start using the new parameter in trivial cases. The next commit will refactor the reference migration logic to make use of this parameter so that we can simplify its logic a bit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06Merge branch 'ps/reflog-migrate-fixes' into ps/remote-rename-fixJunio C Hamano
* ps/reflog-migrate-fixes: refs: fix invalid old object IDs when migrating reflogs refs: stop unsetting REF_HAVE_OLD for log-only updates refs/files: detect race when generating reflog entry for HEAD refs: fix identity for migrated reflogs ident: fix type of string length parameter builtin/reflog: implement subcommand to write new entries refs: export `ref_transaction_update_reflog()` builtin/reflog: improve grouping of subcommands Documentation/git-reflog: convert to use synopsis type
2025-08-06refs: fix invalid old object IDs when migrating reflogsPatrick Steinhardt
When migrating reflog entries between different storage formats we end up with invalid old object IDs for the migrated entries: instead of writing the old object ID of the to-be-migrated entry, we end up with the all-zeroes object ID. The root cause of this issue is that we don't know to use the old object ID provided by the caller. Instead, we manually resolve the old object ID by resolving the current value of its matching reference. But as that reference does not yet exist in the target ref storage we always end up resolving it to all-zeroes. This issue got unnoticed as there is no user-facing command that would even show the old object ID. While `git log -g` knows to show the new object ID, we don't have any formatting directive to show the old object ID. Fix the bug by introducing a new flag `REF_LOG_USE_PROVIDED_OIDS`. If set, backends are instructed to use the old and new object IDs provided by the caller, without doing any manual resolving. Set this flag in `ref_transaction_update_reflog()`. Amend our tests in t1460-refs-migrate to use our test tool to read reflog entries. This test tool prints out both old and new object ID of each reflog entry, which fixes the test gap. Furthermore it also prints the full identity used to write the reflog, which provides test coverage for the previous commit in this patch series that fixed the identity for migrated reflogs. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs: stop unsetting REF_HAVE_OLD for log-only updatesPatrick Steinhardt
The `REF_HAVE_OLD` flag indicates whether a given ref update has its old object ID set. If so, the value of that field is used to verify whether the current state of the reference matches this expected state. It is thus an important part of mitigating races with a concurrent process that updates the same set of references. When writing reflogs though we explicitly unset that flag. This is a sensible thing to do: the old state of reflog entry updates may not necessarily match the current on-disk state of its accompanying ref, but it's only intended to signal what old object ID we want to write into the new reflog entry. For example when migrating refs we end up writing many reflog entries for a single reference, and most likely those reflog entries will have many different old object IDs. But unsetting this flag also removes a useful signal, namely that the caller _did_ provide an old object ID for a given reflog entry. This signal will become useful in a subsequent commit, where we add a new flag that tells the transaction to use the provided old and new object IDs to write a reflog entry. The `REF_HAVE_OLD` flag is then used as a signal to verify that the caller really did provide an old object ID. Stop unsetting the flag so that we can use it as this described signal in a subsequent commit. Skip checking the old object ID for log-only updates so that we don't expect it to match the current on-disk state. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-06refs/files: detect race when generating reflog entry for HEADPatrick Steinhardt
When updating a reference that is being pointed to HEAD we don't only write a reflog message for that particular reference, but also generate one for HEAD. This logic is handled by `split_head_update()`, where we: 1. Verify that the condition actually triggered. This is done by reading HEAD at the start of the transaction so that we can then check whether a given reference update refers to its target. 2. Queue a new log-only update for HEAD in case it did. But the logic is unfortunately not free of races, as we do not lock the HEAD reference after we have read its target. This can lead to the following two scenarios: - HEAD gets concurrently updated to point to one of the references we have already processed. This causes us not writing a reflog message even though we should have done so. - HEAD gets concurrently updated to no longer point to a reference anymore that we have already processed. This causes us to write a reflog message even though we should _not_ have done so. Improve the situation by introducing a new `REF_LOG_VIA_SPLIT` flag that is specific to the "files" backend. If set, we will double check that the HEAD reference still points to the reference that we are creating the reflog entry for after we have locked HEAD. Furthermore, instead of manually resolving the old object ID of that entry, we now use the same old state as for the parent update. If we detect such a racy update we abort the transaction. This is a bit heavy-handed: the user didn't even ask us to write a reflog update for "HEAD", so it might be surprising if we abort the transaction. That being said: - Normal users wouldn't typically hit this case as we only hit the relevant code when committing to a branch that is being pointed to by "HEAD" directly. Commands like git-commit(1) typically commit to "HEAD" itself though. - Scripted users that use git-update-ref(1) and related plumbing commands are unlikely to hit this case either, as they would have to update the pointed-to-branch at the same as "HEAD" is being updated, which is an exceedingly rare event. The alternative would be to instead drop the log-only update completely, but that would require more logic that is hard to verify without adding infrastructure specific for such a test. So we rather do the pragmatic thing and don't worry too much about an edge case that is very unlikely to happen. Unfortunately, this change only helps with the second race. We cannot reliably plug the first race without locking the HEAD reference at the start of the transaction. Locking HEAD unconditionally would effectively serialize all writes though, and that doesn't seem like an option. Also, double checking its value at the end of the transaction is not an option either, as its target may have flip-flopped during the transaction. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-08-03Merge branch 'kn/for-each-ref-skip'Junio C Hamano
"git for-each-ref" learns "--start-after" option to help applications that want to page its output. * kn/for-each-ref-skip: ref-cache: set prefix_state when seeking for-each-ref: introduce a '--start-after' option ref-filter: remove unnecessary else clause refs: selectively set prefix in the seek functions ref-cache: remove unused function 'find_ref_entry()' refs: expose `ref_iterator` via 'refs.h'
2025-07-16Merge branch 'ps/refs-files-remove-empty-parent'Junio C Hamano
When a ref creation at refs/heads/foo/bar fails, the files backend now removes refs/heads/foo/ if the directory is otherwise not used. * ps/refs-files-remove-empty-parent: refs/files: remove empty parent dirs when ref creation fails
2025-07-15refs: selectively set prefix in the seek functionsKarthik Nayak
The ref iterator exposes a `ref_iterator_seek()` function. The name suggests that this would seek the iterator to a specific reference in some ways similar to how `fseek()` works for the filesystem. However, the function actually sets the prefix for refs iteration. So further iteration would only yield references which match the particular prefix. This is a bit confusing. Let's add a 'flags' field to the function, which when set with the 'REF_ITERATOR_SEEK_SET_PREFIX' flag, will set the prefix for the iteration in-line with the existing behavior. Otherwise, the reference backends will simply seek to the specified reference and clears any previously set prefix. This allows users to start iteration from a specific reference. In the packed and reftable backend, since references are available in a sorted list, the changes are simply setting the prefix if needed. The changes on the files-backend are a little more involved, since the files backend uses the 'ref-cache' mechanism. We move out the existing logic within `cache_ref_iterator_seek()` to `cache_ref_iterator_set_prefix()` which is called when the 'REF_ITERATOR_SEEK_SET_PREFIX' flag is set. We then parse the provided seek string and set the required levels and their indexes to ensure that seeking is possible. Helped-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-08Merge branch 'kn/fetch-push-bulk-ref-update'Junio C Hamano
"git push" and "git fetch" are taught to update refs in batches to gain performance. * kn/fetch-push-bulk-ref-update: receive-pack: handle reference deletions separately refs/files: skip updates with errors in batched updates receive-pack: use batched reference updates send-pack: fix memory leak around duplicate refs fetch: use batched reference updates refs: add function to translate errors to strings
2025-07-08refs/files: remove empty parent dirs when ref creation failsPatrick Steinhardt
When creating a new reference in the "files" backend we first create the directory hierarchy for that reference, then create the lockfile for that reference, and finally rename the lockfile into place. When the transaction gets aborted we prune the lockfile, but we don't clean up the directory hierarchy that we may have created for the lockfile. In some egde cases this can lead to lots of empty directories being cluttered in the ".git/refs" directory that really serve no purpose at all. We know to prune such empty directories when packing refs, but that only patches over the issue. Improve this by removing empty parents when cleaning up still-locked references in `files_transaction_cleanup()`. This function is also called when preparing or committing the transaction, so this change also helps when not explicitly aborting the transaction. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-25refs/files: skip updates with errors in batched updatesKarthik Nayak
The commit 23fc8e4f61 (refs: implement batch reference update support, 2025-04-08) introduced support for batched reference updates. This allows users to batch updates together, while allowing some of the updates to fail. Under the hood, batched updates use the reference transaction mechanism. Each update which fails is marked as such. Any failed updates must be skipped over in the rest of the code, as they wouldn't apply any more. In two of the loops within 'files_transaction_finish()' of the files backend, the failed updates aren't skipped over. This can cause a SEGFAULT otherwise. Add the missing skips and a test to validate the same. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-03Merge branch 'sj/ref-contents-check-fix'Junio C Hamano
"git verify-refs" (and hence "git fsck --reference") started erroring out in a repository in which secondary worktrees were prepared with Git 2.43 or lower. * sj/ref-contents-check-fix: fsck: ignore missing "refs" directory for linked worktrees
2025-06-02fsck: ignore missing "refs" directory for linked worktreesshejialuo
"git refs verify" doesn't work if there are worktrees created on Git v2.43.0 or older versions. These versions don't automatically create the "refs" directory, causing the error: error: cannot open directory .git/worktrees/<worktree name>/refs: No such file or directory Since 8f4c00de95 (builtin/worktree: create refdb via ref backend, 2024-01-08), we automatically create the "refs" directory for new worktrees. And in 7c78d819e6 (ref: support multiple worktrees check for refs, 2024-11-20), we assume that all linked worktrees have this directory and would wrongly report an error to the user, thus introducing compatibility issue. Check for ENOENT errno before reporting directory access errors for linked worktrees to maintain backward compatibility. Reported-by: Kristoffer Haugsbakk <code@khaugsbakk.name> Signed-off-by: shejialuo <shejialuo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/object-file-cleanup'Junio C Hamano
Code clean-up. * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-16Merge branch 'kn/non-transactional-batch-updates'Junio C Hamano
Updating multiple references have only been possible in all-or-none fashion with transactions, but it can be more efficient to batch multiple updates even when some of them are allowed to fail in a best-effort manner. A new "best effort batches of updates" mode has been introduced. * kn/non-transactional-batch-updates: update-ref: add --batch-updates flag for stdin mode refs: support rejection in batch updates during F/D checks refs: implement batch reference update support refs: introduce enum-based transaction error types refs/reftable: extract code from the transaction preparation refs/files: remove duplicate duplicates check refs: move duplicate refname update check to generic layer refs/files: remove redundant check in split_symref_update()
2025-04-15Merge branch 'ps/object-wo-the-repository'Junio C Hamano
The object layer has been updated to take an explicit repository instance as a parameter in more code paths. * ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`