summaryrefslogtreecommitdiff
path: root/diff.c
AgeCommit message (Collapse)Author
2025-07-01odb: rename `oid_object_info()`Patrick Steinhardt
Rename `oid_object_info()` to `odb_read_object_info()` as well as their `_extended()` variant to match other functions related to the object database and our modern coding guidelines. Introduce compatibility wrappers so that any in-flight topics will continue to compile. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-01object-store: rename files to "odb.{c,h}"Patrick Steinhardt
In the preceding commits we have renamed the structures contained in "object-store.h" to `struct object_database` and `struct odb_backend`. As such, the code files "object-store.{c,h}" are confusingly named now. Rename them to "odb.{c,h}" accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-08Merge branch 'js/diff-codeql-false-positive-workaround'Junio C Hamano
Work around false positive given by CodeQL. * js/diff-codeql-false-positive-workaround: diff: check range before dereferencing an array element
2025-04-29diff: check range before dereferencing an array elementJohannes Schindelin
Before accessing an array element at a given index, it should be verified that the index is within the desired bounds, not afterwards, otherwise it may not make sense to even access the array element in the first place. This is the point of CodeQL's `cpp/offset-use-before-range-check` rule. This CodeQL rule unfortunately is also triggered by the `fill_es_indent_data()` code, even though the condition `off < len - 1` does not even need to guarantee that the offset is in bounds (`s` points to a NUL-terminated string, for which `s[off] == '\r'` would fail before running out of bounds). Let's work around this rare false positive to help us use an otherwise mostly useful tool is a worthy thing to do. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-24Merge branch 'ps/parse-options-integers'Junio C Hamano
Update parse-options API to catch mistakes to pass address of an integral variable of a wrong type/size. * ps/parse-options-integers: parse-options: detect mismatches in integer signedness parse-options: introduce precision handling for `OPTION_UNSIGNED` parse-options: introduce precision handling for `OPTION_INTEGER` parse-options: rename `OPT_MAGNITUDE()` to `OPT_UNSIGNED()` parse-options: support unit factors in `OPT_INTEGER()` global: use designated initializers for options parse: fix off-by-one for minimum signed values
2025-04-17global: use designated initializers for optionsPatrick Steinhardt
While we expose macros for most of our different option types understood by the "parse-options" subsystem, not every combination of fields that has one as that would otherwise quickly lead to an explosion of macros. Instead, we just initialize structures manually for those variants of fields that don't have a macro. Callsites that open-code these structure initialization don't use designated initializers though and instead just provide values for each of the fields that they want to initialize. This has three significant downsides: - Callsites need to specify all values up to the last field that they care about. This often includes fields that should simply be left at their default zero-initialized state, which adds distraction. - Any reader not deeply familiar with the layout of the structure has a hard time figuring out what the respective initializers mean. - Reordering or introducing new fields in the middle of the structure is impossible without adapting all callsites. Convert all sites to instead use designated initializers, which we have started using in our codebase quite a while ago. This allows us to skip any default-initialized fields, gives the reader context by specifying the field names and allows us to reorder or introduce new fields where we want to. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-15object-store: merge "object-store-ll.h" and "object-store.h"Patrick Steinhardt
The "object-store-ll.h" header has been introduced to keep transitive header dependendcies and compile times at bay. Now that we have created a new "object-store.c" file though we can easily move the last remaining additional bit of "object-store.h", the `odb_path_map`, out of the header. Do so. As the "object-store.h" header is now equivalent to its low-level alternative we drop the latter and inline it into the former. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-08Merge branch 'ps/object-wo-the-repository' into ps/object-file-cleanupJunio C Hamano
* ps/object-wo-the-repository: hash: stop depending on `the_repository` in `null_oid()` hash: fix "-Wsign-compare" warnings object-file: split out logic regarding hash algorithms delta-islands: stop depending on `the_repository` object-file-convert: stop depending on `the_repository` pack-bitmap-write: stop depending on `the_repository` pack-revindex: stop depending on `the_repository` pack-check: stop depending on `the_repository` environment: move access to "core.bigFileThreshold" into repo settings pack-write: stop depending on `the_repository` and `the_hash_algo` object: stop depending on `the_repository` csum-file: stop depending on `the_repository`
2025-03-10hash: stop depending on `the_repository` in `null_oid()`Patrick Steinhardt
The `null_oid()` function returns the object ID that only consists of zeroes. Naturally, this ID also depends on the hash algorithm used, as the number of zeroes is different between SHA1 and SHA256. Consequently, the function returns the hash-algorithm-specific null object ID. This is currently done by depending on `the_hash_algo`, which implicitly makes us depend on `the_repository`. Refactor the function to instead pass in the hash algorithm for which we want to retrieve the null object ID. Adapt callsites accordingly by passing in `the_repository`, thus bubbling up the dependency on that global variable by one layer. There are a couple of trivial exceptions for subsystems that already got rid of `the_repository`. These subsystems instead use the repository that is available via the calling context: - "builtin/grep.c" - "grep.c" - "refs/debug.c" There are also two non-trivial exceptions: - "diff-no-index.c": Here we know that we may not have a repository initialized at all, so we cannot rely on `the_repository`. Instead, we adapt `diff_no_index()` to get a `struct git_hash_algo` as parameter. The only caller is located in "builtin/diff.c", where we know to call `repo_set_hash_algo()` in case we're running outside of a Git repository. Consequently, it is fine to continue passing `the_repository->hash_algo` even in this case. - "builtin/ls-files.c": There is an in-flight patch series that drops `USE_THE_REPOSITORY_VARIABLE` in this file, which causes a semantic conflict because we use `null_oid()` in `show_submodule()`. The value is passed to `repo_submodule_init()`, which may use the object ID to resolve a tree-ish in the superproject from which we want to read the submodule config. As such, the object ID should refer to an object in the superproject, and consequently we need to use its hash algorithm. This means that we could in theory just not bother about this edge case at all and just use `the_repository` in "diff-no-index.c". But doing so would feel misdesigned. Remove the `USE_THE_REPOSITORY_VARIABLE` preprocessor define in "hash.c". Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10environment: move access to "core.bigFileThreshold" into repo settingsPatrick Steinhardt
The "core.bigFileThreshold" setting is stored in a global variable and populated via `git_default_core_config()`. This may cause issues in the case where one is handling multiple different repositories in a single process with different values for that config key, as we may or may not see the correct value in that case. Furthermore, global state blocks our path towards libification. Refactor the code so that we instead store the value in `struct repo_settings`, where the value is computed as-needed and cached. Note that this change requires us to adapt one test in t1050 that verifies that we die when parsing an invalid "core.bigFileThreshold" value. The exercised Git command doesn't use the value at all, and thus it won't hit the new code path that parses the value. This is addressed by using git-hash-object(1) instead, which does read the value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03diff: add option to skip resolving diff statusesJustin Tobler
By default, `diffcore_std()` resolves the statuses for queued diff file pairs by calling `diff_resolve_rename_copy()`. If status information is already manually set, invoking `diffcore_std()` may change the status value. Introduce the `skip_resolving_statuses` diff option that prevents `diffcore_std()` from resolving file pair statuses when enabled. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-03diff: return diff_filepair from diff queue helpersJustin Tobler
The `diff_addremove()` and `diff_change()` functions set up and queue diffs, but do not return the `diff_filepair` added to the queue. In a subsequent commit, modifications to `diff_filepair` need to occur in certain cases after being queued. Since the existing `diff_addremove()` and `diff_change()` are also used for callbacks in `diff_options` as types `add_remove_fn_t` and `change_fn_t`, modifying the existing function signatures requires further changes. The diff options for pruning use `file_add_remove()` and `file_change()` where file pairs do not even get queued. Thus, separate functions are implemented instead. Split out the queuing operations into `diff_queue_addremove()` and `diff_queue_change()` which also return a handle to the queued `diff_filepair`. Both `diff_addremove()` and `diff_change()` are reimplemented as thin wrappers around the new functions. Signed-off-by: Justin Tobler <jltobler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-25Merge branch 'bc/diff-reject-empty-arg-to-pickaxe'Junio C Hamano
The -G/-S options to the "diff" family of commands caused us to hit a BUG() when they get no values; they have been corrected. * bc/diff-reject-empty-arg-to-pickaxe: diff: don't crash with empty argument to -G or -S
2025-02-18diff: don't crash with empty argument to -G or -Sbrian m. carlson
The pickaxe options, -G and -S, need either a regex or a string to look through the history for. An empty value isn't very useful since it would either match everything or nothing, and what's worse, we presently crash with a BUG like so when the user provides one: BUG: diffcore-pickaxe.c:241: should have needle under -G or -S Since it's not very nice of us to crash and this wouldn't do anything useful anyway, let's simply inform the user that they must provide a non-empty argument and exit with an error if they provide an empty one instead. Reported-by: Jared Van Bortel <cebtenzzre@gmail.com> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31global: adapt callers to use generic hash context helpersPatrick Steinhardt
Adapt callers to use generic hash context helpers instead of using the hash algorithm to update them. This makes the callsites easier to reason about and removes the possibility that the wrong hash algorithm is used to update the hash context's state. And as a nice side effect this also gets rid of a bunch of users of `the_hash_algo`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-31hash: stop typedeffing the hash contextPatrick Steinhardt
We generally avoid using `typedef` in the Git codebase. One exception though is the `git_hash_ctx`, likely because it used to be a union rather than a struct until the preceding commit refactored it. But now that it is a normal `struct` there isn't really a need for a typedef anymore. Drop the typedef and adapt all callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-23Merge branch 'ps/build-sign-compare'Junio C Hamano
Start working to make the codebase buildable with -Wsign-compare. * ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-18pager: stop using `the_repository`Patrick Steinhardt
Stop using `the_repository` in the "pager" subsystem by passing in a repository when setting up the pager and when configuring it. Adjust callers accordingly by using `the_repository`. While there may be some callers that have a repository available in their context, this trivial conversion allows for easier verification and bubbles up the use of `the_repository` by one level. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-18Merge branch 'ps/build-sign-compare' into ps/the-repositoryJunio C Hamano
* ps/build-sign-compare: t/helper: don't depend on implicit wraparound scalar: address -Wsign-compare warnings builtin/patch-id: fix type of `get_one_patchid()` builtin/blame: fix type of `length` variable when emitting object ID gpg-interface: address -Wsign-comparison warnings daemon: fix type of `max_connections` daemon: fix loops that have mismatching integer types global: trivial conversions to fix `-Wsign-compare` warnings pkt-line: fix -Wsign-compare warning on 32 bit platform csum-file: fix -Wsign-compare warning on 32-bit platform diff.h: fix index used to loop through unsigned integer config.mak.dev: drop `-Wno-sign-compare` global: mark code units that generate warnings with `-Wsign-compare` compat/win32: fix -Wsign-compare warning in "wWinMain()" compat/regex: explicitly ignore "-Wsign-compare" warnings git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-06global: mark code units that generate warnings with `-Wsign-compare`Patrick Steinhardt
Mark code units that generate warnings with `-Wsign-compare`. This allows for a structured approach to get rid of all such warnings over time in a way that can be easily measured. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04packfile: pass down repository to `has_object[_kept]_pack`Karthik Nayak
The functions `has_object[_kept]_pack` currently rely on the global variable `the_repository`. To eliminate global variable usage in `packfile.c`, we should progressively shift the dependency on the_repository to higher layers. Let's remove its usage from these functions and any related ones. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-10Merge branch 'jk/output-prefix-cleanup'Junio C Hamano
Code clean-up. * jk/output-prefix-cleanup: diff: store graph prefix buf in git_graph struct diff: return line_prefix directly when possible diff: return const char from output_prefix callback diff: drop line_prefix_length field line-log: use diff_line_prefix() instead of custom helper
2024-10-03diff: return const char from output_prefix callbackJeff King
The diff_options structure has an output_prefix callback for returning a prefix string, but it does so by returning a pointer to a strbuf. This makes the interface awkward. There's no reason the callback should need to use a strbuf, and it creates questions about whether the ownership of the resulting buffer should be transferred to the caller (it should not be, but a recent attempt to clean up this code led to a double-free in some cases). The one advantage we get is that the strbuf contains a ptr/len pair, so we could in theory have a prefix with embedded NULs. But we can observe that none of the existing callbacks would ever produce such a NUL (they are usually just indentation or graph symbols, and even the "--line-prefix" option takes a NUL-terminated string). And anyway, only one caller (the one in log_tree_diff_flush) actually looks at the strbuf length. In every other case we use a helper function which discards the length and just returns the NUL-terminated string. So let's just have the callback return a "const char *" pointer. It's up to the callbacks themselves if they want to use a strbuf under the hood. And now the caller in log_tree_diff_flush() can just use the helper function along with everybody else. That lets us even simplify out the function pointer check, since the helper returns an empty string (technically this does mean we'll sometimes issue an empty fputs() call, but I don't think this code path is hot enough to care about that). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-03diff: drop line_prefix_length fieldJeff King
The diff_options structure holds a line_prefix string and an associated length. But the length is always just the strlen() of the NUL-terminated string. Let's simplify the code by just storing the string pointer and assuming it is NUL-terminated when we use it. This will cause us to compute the string length in a few extra spots, but I don't think any of these are particularly hot code paths. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30diff: improve lifecycle management of diff queuesPatrick Steinhardt
The lifecycle management of diff queues is somewhat confusing: - For most of the part this can be attributed to `DIFF_QUEUE_CLEAR()`, which does not release any memory but rather initializes the queue, only. This is in contrast to our common naming schema, where "clearing" means that we release underlying memory and then re-initialize the data structure such that it is ready to use. - A second offender is `diff_free_queue()`, which does not free the queue structure itself. It is rather a release-style function. Refactor the code to make things less confusing. `DIFF_QUEUE_CLEAR()` is replaced by `DIFF_QUEUE_INIT` and `diff_queue_init()`, while `diff_free_queue()` is replaced by `diff_queue_release()`. While on it, adapt callsites where we call `DIFF_QUEUE_CLEAR()` with the intent to release underlying memory to instead call `diff_queue_clear()` to fix memory leaks. This memory leak is exposed by t4211, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-30Merge branch 'ps/leakfixes-part-7' into ps/leakfixes-part-8Junio C Hamano
* ps/leakfixes-part-7: (23 commits) diffcore-break: fix leaking filespecs when merging broken pairs revision: fix leaking parents when simplifying commits builtin/maintenance: fix leak in `get_schedule_cmd()` builtin/maintenance: fix leaking config string promisor-remote: fix leaking partial clone filter grep: fix leaking grep pattern submodule: fix leaking submodule ODB paths trace2: destroy context stored in thread-local storage builtin/difftool: plug several trivial memory leaks builtin/repack: fix leaking configuration diffcore-order: fix leaking buffer when parsing orderfiles parse-options: free previous value of `OPTION_FILENAME` diff: fix leaking orderfile option builtin/pull: fix leaking "ff" option dir: fix off by one errors for ignored and untracked entries builtin/submodule--helper: fix leaking remote ref on errors t/helper: fix leaking subrepo in nested submodule config helper builtin/submodule--helper: fix leaking error buffer builtin/submodule--helper: clear child process when not running it submodule: fix leaking update strategy ...
2024-09-27diff: fix leaking orderfile optionPatrick Steinhardt
The `orderfile` diff option is being assigned via `OPT_FILENAME()`, which assigns an allocated string to the variable. We never free it though, causing a memory leak. Change the type of the string to `char *` and free it to plug the leak. This also requires us to use `xstrdup()` to assign the global config to it in case it is set. This leak is being hit in t7621, but plugging it alone does not make the test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25Merge branch 'rs/diff-exit-code-binary'Junio C Hamano
"git diff --exit-code" ignored modified binary files, which has been corrected. * rs/diff-exit-code-binary: diff: report modified binary files as changes in builtin_diff()
2024-09-23diff: report modified binary files as changes in builtin_diff()René Scharfe
The diff machinery has two ways to detect changes to set the exit code: Just comparing hashes and comparing blob contents. The latter is needed if certain changes have to be ignored, e.g. with --ignore-space-change or --ignore-matching-lines. It's enabled by the diff_options flag diff_from_contents. The code for handling binary files added by 1aaf69e669 (diff: shortcut for diff'ing two binary SHA-1 objects, 2014-08-16) always uses a quick hash-only comparison, even if the slow way is taken. We need it to report a hash difference as a change for the purpose of setting the exit code, though, but it never did. Fix that. d7b97b7185 (diff: let external diffs report that changes are uninteresting, 2024-06-09) set diff_from_contents if external diff programs are allowed. This is the default e.g. for git diff, and so that change exposed the inconsistency much more widely. Reported-by: Kohei Shibata <shiba200712@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-16Merge branch 'jc/range-diff-lazy-setup'Junio C Hamano
Code clean-up. * jc/range-diff-lazy-setup: remerge-diff: clean up temporary objdir at a central place remerge-diff: lazily prepare temporary objdir on demand
2024-09-13Merge branch 'rs/diff-exit-code-fix'Junio C Hamano
In a few corner cases "git diff --exit-code" failed to report "changes" (e.g., renamed without any content change), which has been corrected. * rs/diff-exit-code-fix: diff: report dirty submodules as changes in builtin_diff() diff: report copies and renames as changes in run_diff_cmd()
2024-09-08diff: report dirty submodules as changes in builtin_diff()René Scharfe
The diff machinery has two ways to detect changes to set the exit code: Just comparing hashes and comparing blob contents. The latter is needed if certain changes have to be ignored, e.g. with --ignore-space-change or --ignore-matching-lines. It's enabled by the diff_options flag diff_from_contents. The slower mode as never considered submodules (and subrepos) as changes with --submodule=diff or --submodule=log, which is inconsistent with --submodule=short (the default). Fix it. d7b97b7185 (diff: let external diffs report that changes are uninteresting, 2024-06-09) set diff_from_contents if external diff programs are allowed. This is the default e.g. for git diff, and so that change exposed the inconsistency much more widely. Reported-by: David Hull <david.hull@friendbuy.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-08diff: report copies and renames as changes in run_diff_cmd()René Scharfe
The diff machinery has two ways to detect changes to set the exit code: Just comparing hashes and comparing blob contents. The latter is needed if certain changes have to be ignored, e.g. with --ignore-space-change or --ignore-matching-lines. It's enabled by the diff_options flag diff_from_contents. The slower mode has never considered copies and renames to be changes, which is inconsistent with the quicker one. Fix it. Even if we ignore the file contents (because it's empty or contains only ignored lines), there's still the meta data change of adding or changing a filename, so we need to report it in the exit code. d7b97b7185 (diff: let external diffs report that changes are uninteresting, 2024-06-09) set diff_from_contents if external diff programs are allowed. This is the default e.g. for git diff, and so that change exposed the inconsistency much more widely. Reported-by: Jorge Luis Martinez Gomez <jol@jol.dev> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14diff: free state populated via optionsPatrick Steinhardt
The `objfind` and `anchors` members of `struct diff_options` are populated via option parsing, but are never freed in `diff_free()`. Fix this to plug those memory leaks. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14diff: fix leak when parsing invalid ignore regex optionPatrick Steinhardt
When parsing invalid ignore regexes passed via the `-I` option we don't free already-allocated memory, leading to a memory leak. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09remerge-diff: clean up temporary objdir at a central placeJunio C Hamano
After running a diff between two things, or a series of diffs while walking the history, the diff computation is concluded by a call to diff_result_code() to extract the exit status of the diff machinery. The function can work on "struct diffopt", but all the callers historically and currently pass "struct diffopt" that is embedded in the "struct rev_info" that is used to hold the remerge_diff bit and the remerge_objdir variable that points at the temporary object directory in use. Redefine diff_result_code() to take the whole "struct rev_info" to give it an access to these members related to remerge-diff, so that it can get rid of the temporary object directory for any and all callers that used the feature. We can lose the equivalent code to do so from the code paths for individual commands, diff-tree, diff, and log. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08Merge branch 'ps/leakfixes-more'Junio C Hamano
More memory leaks have been plugged. * ps/leakfixes-more: (29 commits) builtin/blame: fix leaking ignore revs files builtin/blame: fix leaking prefixed paths blame: fix leaking data for blame scoreboards line-range: plug leaking find functions merge: fix leaking merge bases builtin/merge: fix leaking `struct cmdnames` in `get_strategy()` sequencer: fix memory leaks in `make_script_with_merges()` builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()` apply: fix leaking string in `match_fragment()` sequencer: fix leaking string buffer in `commit_staged_changes()` commit: fix leaking parents when calling `commit_tree_extended()` config: fix leaking "core.notesref" variable rerere: fix various trivial leaks builtin/stash: fix leak in `show_stash()` revision: free diff options builtin/log: fix leaking commit list in git-cherry(1) merge-recursive: fix memory leak when finalizing merge builtin/merge-recursive: fix leaking object ID bases builtin/difftool: plug memory leaks in `run_dir_diff()` object-name: free leaking object contexts ...
2024-07-02Merge branch 'rs/diff-color-moved-w-no-ext-diff-fix'Junio C Hamano
"git diff --no-ext-diff" when diff.external is configured ignored the "--color-moved" option. * rs/diff-color-moved-w-no-ext-diff-fix: diff: allow --color-moved with --no-ext-diff
2024-07-02Merge branch 'ps/use-the-repository'Junio C Hamano
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help transition the codebase to rely less on the availability of the singleton the_repository instance. * ps/use-the-repository: hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE` t/helper: remove dependency on `the_repository` in "proc-receive" t/helper: fix segfault in "oid-array" command without repository t/helper: use correct object hash in partial-clone helper compat/fsmonitor: fix socket path in networked SHA256 repos replace-object: use hash algorithm from passed-in repository protocol-caps: use hash algorithm from passed-in repository oidset: pass hash algorithm when parsing file http-fetch: don't crash when parsing packfile without a repo hash-ll: merge with "hash.h" refs: avoid include cycle with "repository.h" global: introduce `USE_THE_REPOSITORY_VARIABLE` macro hash: require hash algorithm in `empty_tree_oid_hex()` hash: require hash algorithm in `is_empty_{blob,tree}_oid()` hash: make `is_null_oid()` independent of `the_repository` hash: convert `oidcmp()` and `oideq()` to compare whole hash global: ensure that object IDs are always padded hash: require hash algorithm in `oidread()` and `oidclr()` hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-06-24diff: allow --color-moved with --no-ext-diffRené Scharfe
We ignore the option --color-moved if an external diff program is configured, presumably because its overhead is unnecessary in that case. Respect the option if we don't actually use the external diff, though. Reported-by: lolligerhans@gmx.de Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20Merge branch 'rs/diff-exit-code-with-external-diff'Junio C Hamano
"git diff --exit-code --ext-diff" learned to take the exit status of the external diff driver into account when deciding the exit status of the overall "git diff" invocation when configured to do so. * rs/diff-exit-code-with-external-diff: diff: let external diffs report that changes are uninteresting userdiff: add and use struct external_diff t4020: test exit code with external diffs
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14hash: require hash algorithm in `oidread()` and `oidclr()`Patrick Steinhardt
Both `oidread()` and `oidclr()` use `the_repository` to derive the hash function that shall be used. Require callers to pass in the hash algorithm to get rid of this implicit dependency. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-11revision: free diff optionsPatrick Steinhardt
There is a todo comment in `release_revisions()` that mentions that we need to free the diff options, which was added via 54c8a7c379 (revisions API: add a TODO for diff_free(&revs->diffopt), 2022-04-14). Releasing the diff options wasn't quite feasible at that time because some call sites rely on its contents to remain even after the revisions have been released. In fact, there really only are a couple of callsites that misbehave here: - `cmd_shortlog()` releases the revisions, but continues to access its file pointer. - `do_diff_cache()` creates a shallow copy of `struct diff_options`, but does not set the `no_free` member. Consequently, we end up releasing resources of the caller-provided diff options. - `diff_free()` and friends do not play nice when being called multiple times as they don't unset data structures that they have just released. Fix all of those cases and enable the call to `diff_free()`, which plugs a bunch of memory leaks. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10diff: let external diffs report that changes are uninterestingRené Scharfe
The options --exit-code and --quiet instruct git diff to indicate whether it found any significant changes by exiting with code 1 if it did and 0 if there were none. Currently this doesn't work if external diff programs are involved, as we have no way to learn what they found. Add that ability in the form of the new configuration options diff.trustExitCode and diff.<driver>.trustExitCode and the environment variable GIT_EXTERNAL_DIFF_TRUST_EXIT_CODE. They pair with the config options diff.external and diff.<driver>.command and the environment variable GIT_EXTERNAL_DIFF, respectively. The new options are off by default, keeping the old behavior. Enabling them indicates that the external diff returns exit code 1 if it finds significant changes and 0 if it doesn't, like diff(1). The name of the new options is taken from the git difftool and mergetool options of similar purpose. (There they enable passing on the exit code of a diff tool and to infer whether a merge done by a merge tool is successful.) The new feature sets the diff flag diff_from_contents in diff_setup_done() if we need the exit code and are allowed to call external diffs. This disables the optimization that avoids calling the program with --quiet. Add it back by skipping the call if the external diff is not able to report empty diffs. We can only do that check after evaluating the file-specific attributes in run_external_diff(). If we do run the external diff with --quiet, send its output to /dev/null. I considered checking the output of the external diff to check whether its empty. It was added as 11be65cfa4 (diff: fix --exit-code with external diff, 2024-05-05) and quickly reverted, as it does not work with external diffs that do not write to stdout. There's no reason why a graphical diff tool would even need to write anything there at all. I also considered using a non-zero exit code for empty diffs, which could be done without adding new configuration options. We'd need to disable the optimization that allows git diff --quiet to skip calling external diffs, though -- that might be quite surprising if graphical diff programs are involved. And assigning the opposite meaning of the exit codes compared to diff(1) and git diff --exit-code to the external diff can cause unnecessary confusion. Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-10userdiff: add and use struct external_diffRené Scharfe
Wrap the string specifying the external diff command in a new struct to simplify adding attributes, which the next patch will do. Make sure external_diff() still returns NULL if neither the environment variable GIT_EXTERNAL_DIFF nor the configuration option diff.external is set, to continue allowing its use in a boolean context. Use a designated initializer for the default builtin userdiff driver to adjust to the type change of the second struct member. Spelling out only the non-zero members improves readability as a nice side-effect. No functional change intended. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07diff: cast string constant in `fill_textconv()`Patrick Steinhardt
The `fill_textconv()` function is responsible for converting an input file with a textconv driver, which is then passed to the caller. Weirdly though, the function also handles the case where there is no textconv driver at all. In that case, it will return either the contents of the populated filespec, or an empty string if the filespec is invalid. These two cases have differing memory ownership semantics. When there is a textconv driver, then the result is an allocated string. Otherwise, the result is either a string constant or owned by the filespec struct. All callers are in fact aware of this weirdness and only end up freeing the output buffer when they had a textconv driver. Ideally, we'd split up this interface to only perform the conversion via the textconv driver, and BUG in case the caller didn't provide one. This would make memory ownership semantics much more straight forward. For now though, let's simply cast the empty string constant to `char *` to avoid a warning with `-Wwrite-strings`. This is equivalent to the same cast that we already have in `fill_mmfile()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-07global: improve const correctness when assigning string constantsPatrick Steinhardt
We're about to enable `-Wwrite-strings`, which changes the type of string constants to `const char[]`. Fix various sites where we assign such constants to non-const variables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27config: clarify memory ownership in `git_config_string()`Patrick Steinhardt
The out parameter of `git_config_string()` is a `const char **` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char **`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27diff: refactor code to clarify memory ownership of prefixesPatrick Steinhardt
The source and destination prefixes are tracked in a `const char *` array, but may at times contain allocated strings. The result is that those strings may be leaking because we never free them. Refactor the code to always store allocated strings in those variables, freeing them as required. This requires us to handle the default values a bit different compared to before. But given that there is only a single callsite where we use the variables to `struct diff_options` it's easy to handle the defaults there. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>