summaryrefslogtreecommitdiff
path: root/merge-ort.c
AgeCommit message (Collapse)Author
13 daysconfig: drop `git_config_get_bool()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_bool()`. All callsites are adjusted so that they use `repo_config_get_bool(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysconfig: drop `git_config_get_int()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_int()`. All callsites are adjusted so that they use `repo_config_get_int(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysconfig: drop `git_config_get_string()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config_get_string()`. All callsites are adjusted so that they use `repo_config_get_string(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
13 daysconfig: drop `git_config()` wrapperPatrick Steinhardt
In 036876a1067 (config: hide functions using `the_repository` by default, 2024-08-13) we have moved around a bunch of functions in the config subsystem that depend on `the_repository`. Those function have been converted into mere wrappers around their equivalent function that takes in a repository as parameter, and the intent was that we'll eventually remove those wrappers to make the dependency on the global repository variable explicit at the callsite. Follow through with that intent and remove `git_config()`. All callsites are adjusted so that they use `repo_config(the_repository, ...)` instead. While some callsites might already have a repository available, this mechanical conversion is the exact same as the current situation and thus cannot cause any regression. Those sites should eventually be cleaned up in a later patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `repo_read_object_file()`Patrick Steinhardt
Rename `repo_read_object_file()` to `odb_read_object()` to match other functions related to the object database and our modern coding guidelines. Introduce a compatibility wrapper so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-16merge-ort: add a new mergeability_only optionElijah Newren
Git Forges may be interested in whether two branches can be merged while not being interested in what the resulting merge tree is nor which files conflicted. For such cases, add a new mergeability_only option. This option allows the merge machinery to, in the "outer layer" of the merge: * exit upon first[-ish] conflict * avoid (not prevent) writing merged blobs/trees to the object store I have a number of qualifiers there, so let me explain each: "outer layer": Note that since the recursive merge of merge bases (corresponding to call_depth > 0) can conflict without the outer final merge (corresponding to call_depth == 0) conflicting, we can't short-circuit nor avoid writing merged blobs/trees to the object store during those inner merges. "first-ish conflict": The current patch only exits early from process_entries() on the first conflict it detects, but conflicts could have been detected in a previous function call, namely detect_and_process_renames(). However: * conflicts detected by detect_and_process_renames() are quite rare conflict types * the detection would still come after regular rename detection (which is the expensive part of detect_and_process_renames()), so it is not saving us much in computation time given that process_entries() directly follows detect_and_process_renames() * [this overlaps with the next bullet point] process_entries() is the place where virtually all object writing occurs (object writing is sometimes more of a concern for Forges than computation time), so exiting early here isn't saving us much in object writes either * the code changes needed to handle an earlier exit are slightly more invasive in detect_and_process_renames() than for process_entries(). Given the rareness of the even earlier conflicts, the limited savings we'd get from exiting even earlier, and in an attempt to keep this patch simpler, we don't guarantee that we actually exit on the first conflict detected. We can always revisit this decision later if we decide that a further micro-optimization to exit slightly earlier in rare cases is worthwhile. "avoid (not prevent) writing objects": The detect_and_process_renames() call can also write objects to the object store, when rename/rename conflicts involve one (or more) files that have also been modified on both sides. Because of this alternate call path leading to handle_content_merges(), our "early exit" does not prevent writing objects entirely, even within the "outer layer" (i.e. even within call_depth == 0). I figure that's fine though, since we're already writing objects for the inner merges (i.e. for call_depth > 0), which are likely going to represent vastly more objects than files involved in rename/rename+modify/modify cases in the outer merge, on average. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/object-file-cleanup'Junio C Hamano
Code clean-up. * ps/object-file-cleanup: object-store: merge "object-store-ll.h" and "object-store.h" object-store: remove global array of cached objects object: split out functions relating to object store subsystem object-file: drop `index_blob_stream()` object-file: split up concerns of `HASH_*` flags object-file: split out functions relating to object store subsystem object-file: move `xmmap()` into "wrapper.c" object-file: move `git_open_cloexec()` to "compat/open.c" object-file: move `safe_create_leading_directories()` into "path.c" object-file: move `mkdir_in_gitdir()` into "path.c"
2025-04-17Merge branch 'en/merge-recursive-debug'Junio C Hamano
Remove remnants of the recursive merge strategy backend, which was superseded by the ort merge strategy. * en/merge-recursive-debug: builtin/{merge,rebase,revert}: remove GIT_TEST_MERGE_ALGORITHM tests: remove GIT_TEST_MERGE_ALGORITHM and test_expect_merge_algorithm merge-recursive.[ch]: thoroughly debug these merge, sequencer: switch recursive merges over to ort sequencer: switch non-recursive merges over to ort merge-ort: enable diff-algorithms other than histogram builtin/merge-recursive: switch to using merge_ort_generic() checkout: replace merge_trees() with merge_ort_nonrecursive()
2025-04-15Merge branch 'ps/object-wo-the-repository'Junio C Hamano
The object layer has been updated to take an explicit repository instance as a parameter in more code paths. * ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-15object-file: split out functions relating to object store subsystemPatrick Steinhardt
While we have the "object-store.h" header, most of the functionality for object stores is actually hosted in "object-file.c". This makes it hard to find relevant functions and causes us to mix up concerns. Split out functions relating to the object store subsystem into a new "object-store.c" file. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'ps/object-wo-the-repository' into ps/object-file-cleanupJunio C Hamano
* ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-04-08merge-recursive.[ch]: thoroughly debug theseElijah Newren
As a wise man once told me, "Deleted code is debugged code!" So, move the functions that are shared between merge-recursive and merge-ort from the former to the latter, and then debug the remainder of merge-recursive.[ch]. Joking aside, merge-ort was always intended to replace merge-recursive. It has numerous advantages over merge-recursive (operates much faster, can operate without a worktree or index, and fixes a number of known bugs and suboptimal merges). Since we have now replaced all callers of merge-recursive with equivalent functions from merge-ort, move the shared functions from the former to the latter, and delete the remainder of merge-recursive.[ch]. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08merge-ort: enable diff-algorithms other than histogramElijah Newren
The ort merge strategy has always used the histogram diff algorithm. The recursive merge strategy, in contrast, defaults to the myers diff algorithm, while allowing it to be changed. Change the ort merge strategy to allow different diff algorithms, by removing the hard coded value in merge_start() and instead just making it a default in init_merge_options(). Technically, this also changes the default diff algorithm for the recursive backend too, but we're going to remove the final callers of the recursive backend in the next two commits. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'en/assert-wo-side-effects'Junio C Hamano
Ensure what we write in assert() does not have side effects, and introduce ASSERT() macro to mark those that cannot be mechanically checked for lack of side effects. * en/assert-wo-side-effects: treewide: replace assert() with ASSERT() in special cases ci: add build checking for side-effects in assert() calls git-compat-util: introduce ASSERT() macro
2025-03-29Merge branch 'en/random-cleanups'Junio C Hamano
Miscellaneous code clean-ups. * en/random-cleanups: merge-ort: remove extraneous word in comment merge-ort: fix accidental strset<->strintmap t7615: be more explicit about diff algorithm used t6423: fix a comment that accidentally reversed two commits stash: remove merge-recursive.h include
2025-03-29Merge branch 'en/merge-ort-prepare-to-remove-recursive'Junio C Hamano
First step of deprecating and removing merge-recursive. * en/merge-ort-prepare-to-remove-recursive: am: switch from merge_recursive_generic() to merge_ort_generic() merge-ort: fix merge.directoryRenames=false t3650: document bug when directory renames are turned off merge-ort: support having merge verbosity be set to 0 merge-ort: allow rename detection to be disabled merge-ort: add new merge_ort_generic() function
2025-03-21treewide: replace assert() with ASSERT() in special casesElijah Newren
When the compiler/linker cannot verify that an assert() invocation is free of side effects for us (e.g. because the assertion includes some kind of function call), replace the use of assert() with ASSERT(). Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18merge-ort: fix merge.directoryRenames=falseElijah Newren
There are two issues here. First, when merge.directoryRenames is set to false, there are a few code paths that should be turned off. I missed one; collect_renames() was still doing some directory rename detection logic unconditionally. It ended up not having much effect because get_provisional_directory_renames() was skipped earlier and not setting up renames->dir_renames, but the code should still be skipped. Second, the larger issue is that sometimes we get a cached_pair rename from a previous commit being replayed mapping A->B, but in a subsequent commit but collect_merge_info() doesn't even recurse into the directory containing B because there are no source pairings for that rename that are relevant; we can merge that commit fine without knowing the rename. But since the cached renames are added to the normal renames, when we go to process it and find that B is not part of opt->priv->paths, we hit the assertion error process_renames: Assertion `newinfo && ~newinfo->merged.clean` failed. I think we could fix this at the beginning of detect_regular_renames() by pruning from cached_pairs any entry whose destination isn't in opt->priv->paths, but it's suboptimal in that we'd kind of like the cached_pair to be restored afterwards so that it can help the subsequent commit, but more importantly since it sits at the intersection of the caching renames optimization and the relevant renames optimization, and the trivial directory resolution optimization, and I don't currently have Documentation/technical/remembering-renames.txt fully paged in, I'm not sure if that's a full solution or a bandaid for the current testcase. However, since the remembering renames optimization was the weakest of the set, and the optimization is far less important when directory rename detection is off (as that implies far fewer potential renames), let's just use a bigger hammer to ensure this special case is fixed: turn off the rename caching. We do the same thing already when we encounter rename/rename(1to1) cases (as per `git grep -3 disabling.the.optimization`, though it uses a slightly different triggering mechanism since it's trying to affect the next time that merge_check_renames_reusable() is called), and I think it makes sense to do the same here. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18merge-ort: allow rename detection to be disabledElijah Newren
When merge-ort was written, I did not at first allow rename detection to be disabled, because I suspected that most folks disabling rename detection were doing so solely for performance reasons. Since I put a lot of working into providing dramatic speedups for rename detection performance as used by the merge machinery, I wanted to know if there were still real world repositories where rename detection was problematic from a performance perspective. We have had years now to collect such information, and while we never received one, waiting longer with the option disabled seems unlikely to help surface such issues at this point. Also, there has been at least one request to allow rename detection to be disabled for behavioral rather than performance reasons (see the thread including https://lore.kernel.org/git/CABPp-BG-Nx6SCxxkGXn_Fwd2wseifMFND8eddvWxiZVZk0zRaA@mail.gmail.com/ ), so let's start heeding the config and command line settings. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-18merge-ort: add new merge_ort_generic() functionElijah Newren
merge-recursive.[ch] have three entry points: * merge_trees() * merge_recursive() * merge_recursive_generic() merge-ort*.[ch] only has equivalents for the first two. Add an equivalent for the final entry point, so we can switch callers to use it and remove merge-recursive.[ch]. While porting it over, finally fix the issue with the label for the ancestor (used when merge.conflictStyle=diff3 as a conflict label). merge-recursive.c has traditionally not allowed callers to set that label, but I have found that problematic for years. (Side note: This function was initially part of the merge-ort rewrite, but reviewers questioned the ancestor label funnyness which I was never really happy with anyway. It resulted in me jettisoning it and hoping at the time that I would eventually be able to force the existing callers to use some other API. That worked with `git stash`, as per 874cf2a60444 (stash: apply stash using 'merge_ort_nonrecursive()', 2022-05-10), but this API is the most reasonable one for `git am` and `git merge-recursive`, if we can just allow them some freedom over the ancestor label.) The merge_recursive_generic() function did not know whether it was being invoked by `git stash`, `git merge-recursive`, or `git am`, and the choice of meaningful ancestor label, when there is a unique ancestor, varies for these different callers: * git am: ancestor is a constructed "fake ancestor" that user knows nothing about and has no access to. (And is different than the normal thing we mean by a "virtual merge base" which is the merging of merge bases.) * git merge-recursive: ancestor might be a tree, but at least it was one specified by the user (if they invoked merge-recursive directly) * git stash: ancestor was the commit serving as the stash base Thus, using a label like "constructed merge base" (as merge_recursive_generic() does) presupposes that `git am` is the only caller; it is incorrect for other callers. This label has thrown me off more than once. Allow the caller to override when there is a unique merge base. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17merge-ort: remove extraneous word in commentElijah Newren
"is was" -> "was" Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-17merge-ort: fix accidental strset<->strintmapElijah Newren
Both strset_for_each_entry and strintmap_for_each_entry are macros that evaluate to the same thing, so they are technically interchangeable. However, the intent is that we use the one matching the variable type we are passing. Unfortunately, I somehow mistakenly got one of these wrong in 7bee6c100431 (merge-ort: avoid recursing into directories when we don't need to, 2021-07-16) -- possibly related to the fact that relevant_sources was initially a strset and later refactored into a strintmap. Correct which macro we use. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10hash: stop depending on `the_repository` in `null_oid()`Patrick Steinhardt
The `null_oid()` function returns the object ID that only consists of zeroes. Naturally, this ID also depends on the hash algorithm used, as the number of zeroes is different between SHA1 and SHA256. Consequently, the function returns the hash-algorithm-specific null object ID. This is currently done by depending on `the_hash_algo`, which implicitly makes us depend on `the_repository`. Refactor the function to instead pass in the hash algorithm for which we want to retrieve the null object ID. Adapt callsites accordingly by passing in `the_repository`, thus bubbling up the dependency on that global variable by one layer. There are a couple of trivial exceptions for subsystems that already got rid of `the_repository`. These subsystems instead use the repository that is available via the calling context: - "builtin/grep.c" - "grep.c" - "refs/debug.c" There are also two non-trivial exceptions: - "diff-no-index.c": Here we know that we may not have a repository initialized at all, so we cannot rely on `the_repository`. Instead, we adapt `diff_no_index()` to get a `struct git_hash_algo` as parameter. The only caller is located in "builtin/diff.c", where we know to call `repo_set_hash_algo()` in case we're running outside of a Git repository. Consequently, it is fine to continue passing `the_repository->hash_algo` even in this case. - "builtin/ls-files.c": There is an in-flight patch series that drops `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic conflict because we use `null_oid()` in `show_submodule()`. The value is passed to `repo_submodule_init()`, which may use the object ID to resolve a tree-ish in the superproject from which we want to read the submodule config. As such, the object ID should refer to an object in the superproject, and consequently we need to use its hash algorithm. This means that we could in theory just not bother about this edge case at all and just use `the_repository` in "diff-no-index.c". But doing so would feel misdesigned. Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in "hash.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-06merge-ort: fix slightly overzealous assertion for rename-to-selfElijah Newren
merge-ort has a number of sanity checks on the file it is processing in process_renames(). One of these sanity checks was slightly overzealous because it indirectly assumed that a renamed file always ended up at a different path than where it started. That is normally an entirely fair assumption, but directory rename detection can make things interesting. As a quick refresher, if one side of history renames directory A/ -> B/, and the other side of history adds new files to A/, then directory rename detection notices and suggests moving those new files to B/. A similar thing is done for paths renamed into A/, causing them to be transitively renamed into B/. But, if the file originally came from B/, then this can end up causing a file to be renamed back to itself. It turns out the rest of the code following this assertion handled the case fine; the assertion was just an extra sanity check, not a rigid precondition. Therefore, simply adjust the assertion to pass under this special case as well. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30diff: improve lifecycle management of diff queuesPatrick Steinhardt
The lifecycle management of diff queues is somewhat confusing: - For most of the part this can be attributed to `DIFF_QUEUE_CLEAR()`, which does not release any memory but rather initializes the queue, only. This is in contrast to our common naming schema, where "clearing" means that we release underlying memory and then re-initialize the data structure such that it is ready to use. - A second offender is `diff_free_queue()`, which does not free the queue structure itself. It is rather a release-style function. Refactor the code to make things less confusing. `DIFF_QUEUE_CLEAR()` is replaced by `DIFF_QUEUE_INIT` and `diff_queue_init()`, while `diff_free_queue()` is replaced by `diff_queue_release()`. While on it, adapt callsites where we call `DIFF_QUEUE_CLEAR()` with the intent to release underlying memory to instead call `diff_queue_clear()` to fix memory leaks. This memory leak is exposed by t4211, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25Merge branch 'ak/typofix-2.46-maint'Junio C Hamano
Typofix. * ak/typofix-2.46-maint: upload-pack: fix a typo sideband: fix a typo setup: fix a typo run-command: fix a typo revision: fix a typo refs: fix typos rebase: fix a typo read-cache-ll: fix a typo pretty: fix a typo object-file: fix a typo merge-ort: fix typos merge-ll: fix a typo http: fix a typo gpg-interface: fix a typo git-p4: fix typos git-instaweb: fix a typo fsmonitor-settings: fix a typo diffcore-rename: fix typos config.mak.dev: fix a typo
2024-09-19merge-ort: fix typosAndrew Kreimer
Fix typos in comments. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-05merge-ort: fix two leaks when handling directory rename modificationsPatrick Steinhardt
There are two leaks in `apply_directory_rename_modifications()`: - We do not release the `dirs_to_insert` string list. - We do not release some `conflict_info` we put into the `opt->priv->paths` string map. The former is trivial to fix. The latter is a bit less straight forward: the `util` pointer of the string map may sometimes point to data that has been allocated via `CALLOC()`, while at other times it may point to data that has been allocated via a `mem_pool`. It very much seems like an oversight that we didn't also allocate the conflict info in this code path via the memory pool, though. So let's fix that, which will also plug the memory leak for us. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14merge-ort: unconditionally release attributes indexPatrick Steinhardt
We conditionally release the index used for reading gitattributes in merge-ort based on whether or the index has been populated. This check uses `cache_nr` as a condition. This isn't sufficient though, as the variable may be zero even when some other parts of the index have been populated. This leads to memory leaks when sparse checkouts are in use, as we may not end up releasing the sparse checkout patterns. Fix this issue by unconditionally releasing the index. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-16Merge branch 'en/ort-inner-merge-error-fix'Junio C Hamano
The "ort" merge backend saw one bugfix for a crash that happens when inner merge gets killed, and assorted code clean-ups. * en/ort-inner-merge-error-fix: merge-ort: fix missing early return merge-ort: convert more error() cases to path_msg() merge-ort: upon merge abort, only show messages causing the abort merge-ort: loosen commented requirements merge-ort: clearer propagation of failure-to-function from merge_submodule merge-ort: fix type of local 'clean' var in handle_content_merge () merge-ort: maintain expected invariant for priv member merge-ort: extract handling of priv member into reusable function
2024-07-08Merge branch 'ps/leakfixes-more'Junio C Hamano
More memory leaks have been plugged. * ps/leakfixes-more: (29 commits) builtin/blame: fix leaking ignore revs files builtin/blame: fix leaking prefixed paths blame: fix leaking data for blame scoreboards line-range: plug leaking find functions merge: fix leaking merge bases builtin/merge: fix leaking `struct cmdnames` in `get_strategy()` sequencer: fix memory leaks in `make_script_with_merges()` builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()` apply: fix leaking string in `match_fragment()` sequencer: fix leaking string buffer in `commit_staged_changes()` commit: fix leaking parents when calling `commit_tree_extended()` config: fix leaking "core.notesref" variable rerere: fix various trivial leaks builtin/stash: fix leak in `show_stash()` revision: free diff options builtin/log: fix leaking commit list in git-cherry(1) merge-recursive: fix memory leak when finalizing merge builtin/merge-recursive: fix leaking object ID bases builtin/difftool: plug memory leaks in `run_dir_diff()` object-name: free leaking object contexts ...
2024-07-06merge-ort: fix missing early returnElijah Newren
One of the conversions in f19b9165 (merge-ort: convert more error() cases to path_msg(), 2024-06-19) accidentally lost the early return. Restore it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: convert more error() cases to path_msg()Elijah Newren
merge_submodule() stores errors using path_msg(), whereas other call sites make use of the error() function. This is inconsistent, and moving towards path_msg() seems more friendly for libification efforts since it will allow the caller to determine whether the error messages need to be printed. Note that this deferred handling of error messages changes the error message in a recursive merge from error: failed to execute internal merge to From inner merge: error: failed to execute internal merge which provides a little more information about the error which may be useful. Since the recursive merge strategy still only shows the older error, we had to adjust the new testcase introduced a few commits ago to just search for the older message somewhere in the output. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: upon merge abort, only show messages causing the abortElijah Newren
When something goes wrong enough that we need to abort early and not even attempt merging the remaining files, it probably does not make sense to report conflicts messages for the subset of files we processed before hitting the fatal error. Instead, only show the messages associated with paths where we hit the fatal error. Also, print these messages to stderr rather than stdout. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: loosen commented requirementsElijah Newren
The comment above type_short_descriptions claimed that the order had to match what was found in the conflict_info_and_types enum. Since type_short_descriptions uses designated initializers, the order should not actually matter; I am guessing that positional initializers may have been under consideration when that comment was added, but the comment was not updated when designated initializers were chosen. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: clearer propagation of failure-to-function from merge_submoduleElijah Newren
The 'clean' member variable is somewhat of a tri-state (1 = clean, 0 = conflicted, -1 = failure-to-determine), but we often like to think of it as binary (ignoring the possibility of a negative value) and use constructs like '!clean' to reflect this. However, these constructs can make codepaths more difficult to understand, unless we handle the negative case early and return pre-emptively; do that in handle_content_merge() to make the code a bit easier to read. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: fix type of local 'clean' var in handle_content_merge ()Elijah Newren
handle_content_merge() returns an int. Every caller of handle_content_merge() expects an int. However, we declare a local variable 'clean' that we use for the return value to be unsigned. To make matters worse, we also assign 'clean' the return value of merge_submodule() in one codepath, which is defined to return an int. It seems that the only reason to have 'clean' be unsigned was to allow a cutesy bit manipulation operation to be well-defined. Fix the type of the 'clean' local in handle_content_merge(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: maintain expected invariant for priv memberElijah Newren
The calling convention for the merge machinery is One call to init_merge_options() One or more calls to merge_incore_[non]recursive() One call to merge_finalize() (possibly indirectly via merge_switch_to_result()) Both merge_switch_to_result() and merge_finalize() expect opt->priv == NULL && result->priv != NULL which is supposed to be set up by our move_opt_priv_to_result_priv() function. However, two codepaths dealing with error cases did not execute this necessary logic, which could result in assertion failures (or, if assertions were compiled out, could result in segfaults). Fix the oversight and add a test that would have caught one of these problems. While at it, also tighten an existing test for a non-recursive merge to verify that it fails with appropriate status. Most merge tests in the testsuite check either for success or conflicts; those testing for neither are rare and it is good to ensure they support the invariant assumed by builtin/merge.c in this comment: /* * The backend exits with 1 when conflicts are * left to be resolved, with 2 when it does not * handle the given merge at all. */ So, explicitly check for the exit status of 2 in these cases. Reported-by: Matt Cree <matt.cree@gearset.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20merge-ort: extract handling of priv member into reusable functionElijah Newren
In preparation for a subsequent commit which will ensure we do not forget to maintain our invariants for the priv member in error codepaths, extract the necessary functionality out into a separate function. This change is cosmetic at this point, and introduces no changes beyond an extra assertion sanity check. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11merge: fix leaking merge basesPatrick Steinhardt
When calling either the recursive or the ORT merge machineries we need to provide a list of merge bases. The ownership of that parameter is then implicitly transferred to the callee, which is somewhat fishy. Furthermore, that list may leak in some cases where the merge machinery runs into an error, thus causing a memory leak. Refactor the code such that we stop transferring ownership. Instead, the merge machinery will now create its own local copies of the passed in list as required if they need to modify the list. Free the list at the callsites as required. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-01Merge branch 'pw/checkout-conflict-errorfix'Junio C Hamano
"git checkout --conflict=bad" reported a bad conflictStyle as if it were given to a configuration variable; it has been corrected to report that the command line option is bad. * pw/checkout-conflict-errorfix: checkout: fix interaction between --conflict and --merge checkout: cleanup --conflict=<style> parsing merge options: add a conflict style member merge-ll: introduce LL_MERGE_OPTIONS_INIT xdiff-interface: refactor parsing of merge.conflictstyle
2024-03-28Merge branch 'eb/hash-transition'Junio C Hamano
Work to support a repository that work with both SHA-1 and SHA-256 hash algorithms has started. * eb/hash-transition: (30 commits) t1016-compatObjectFormat: add tests to verify the conversion between objects t1006: test oid compatibility with cat-file t1006: rename sha1 to oid test-lib: compute the compatibility hash so tests may use it builtin/ls-tree: let the oid determine the output algorithm object-file: handle compat objects in check_object_signature tree-walk: init_tree_desc take an oid to get the hash algorithm builtin/cat-file: let the oid determine the output algorithm rev-parse: add an --output-object-format parameter repository: implement extensions.compatObjectFormat object-file: update object_info_extended to reencode objects object-file-convert: convert commits that embed signed tags object-file-convert: convert commit objects when writing object-file-convert: don't leak when converting tag objects object-file-convert: convert tag objects when writing object-file-convert: add a function to convert trees between algorithms object: factor out parse_mode out of fast-import and tree-walk into in object.h cache: add a function to read an OID of a specific algorithm tag: sign both hashes commit: export add_header_signature to support handling signatures on tags ...
2024-03-14merge options: add a conflict style memberPhillip Wood
Add a conflict_style member to `struct merge_options` and `struct ll_merge_options` to allow callers to override the default conflict style. This will be used in the next commit. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14merge-ll: introduce LL_MERGE_OPTIONS_INITPhillip Wood
Introduce a macro to initialize `struct ll_merge_options` in preparation for the next commit that will add a new member that needs to be initialized to a non-zero value. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11Merge branch 'js/merge-base-with-missing-commit'Junio C Hamano
Make sure failure return from merge_bases_many() is properly caught. * js/merge-base-with-missing-commit: merge-ort/merge-recursive: do report errors in `merge_submodule()` merge-recursive: prepare for `merge_submodule()` to report errors commit-reach(repo_get_merge_bases_many_dirty): pass on errors commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors commit-reach(get_octopus_merge_bases): pass on "missing commits" errors commit-reach(repo_get_merge_bases): pass on "missing commits" errors commit-reach(get_merge_bases_many_0): pass on "missing commits" errors commit-reach(merge_bases_many): pass on "missing commits" errors commit-reach(paint_down_to_common): start reporting errors commit-reach(paint_down_to_common): prepare for handling shallow commits commit-reach(repo_in_merge_bases_many): report missing commits commit-reach(repo_in_merge_bases_many): optionally expect missing commits commit-reach(paint_down_to_common): plug two memory leaks
2024-03-09merge-ort/merge-recursive: do report errors in `merge_submodule()`Johannes Schindelin
In 24876ebf68b (commit-reach(repo_in_merge_bases_many): report missing commits, 2024-02-28), I taught `merge_submodule()` to handle errors reported by `repo_in_merge_bases_many()`. However, those errors were not passed through to the callers. That was unintentional, and this commit remedies that. Note that `find_first_merges()` can now also return -1 (because it passes through that return value from `repo_in_merge_bases()`), and this commit also adds the forgotten handling for that scenario. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>