summaryrefslogtreecommitdiff
path: root/diff.c
AgeCommit message (Collapse)Author
2024-05-27config: clarify memory ownership in `git_config_pathname()`Patrick Steinhardt
The out parameter of `git_config_pathname()` is a `const char **` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char **`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-16Merge branch 'rs/external-diff-with-exit-code'Junio C Hamano
* rs/external-diff-with-exit-code: Revert "diff: fix --exit-code with external diff"
2024-05-16Revert "diff: fix --exit-code with external diff"Junio C Hamano
This reverts commit 11be65cfa43416219e85384a3a80d672b65b76ba, per original author's request to come up with a better strategy.
2024-05-15Merge branch 'rs/external-diff-with-exit-code'Junio C Hamano
The "--exit-code" option of "git diff" command learned to work with the "--ext-diff" option. * rs/external-diff-with-exit-code: diff: fix --exit-code with external diff diff: report unmerged paths as changes in run_diff_cmd()
2024-05-06diff: fix --exit-code with external diffRené Scharfe
You can ask the diff machinery to let the exit code indicate whether there are changes, e.g. with --exit-code. It as two ways to calculate that bit: The quick one assumes blobs with different hashes have different content, and the more elaborate way actually compares the contents, possibly applying transformations like ignoring whitespace. Always use the slower path by setting the flag diff_from_contents, because any of the files could have an external diff driver set via an attribute, which might consider binary differences irrelevant, like e.g. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06diff: report unmerged paths as changes in run_diff_cmd()René Scharfe
You can ask the diff machinery to let the exit code indicate whether there are changes, e.g. with --quiet. It as two ways to calculate that bit: The quick one assumes blobs with different hashes have different content, and the more elaborate way actually compares the contents, possibly applying transformations like ignoring whitespace. The quick way considers an unmerged file to be a change and reports exit code 1, which makes sense. The slower path uses the struct diff_options member found_changes to indicate whether the blobs differ even with the transformations applied. It's not set for unmerged files, though, resulting in exit code 0. Set found_changes in run_diff_cmd() for unmerged files, for a consistent exit code of 1 if there's an unmerged file, regardless of whether whitespace is ignored. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-15diff: add diff.srcPrefix and diff.dstPrefix configuration variablesPeter Hutterer
Allow the default prefixes "a/" and "b/" to be tweaked by the diff.srcPrefix and diff.dstPrefix configuration variables. Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-15Merge branch 'jx/dirstat-parseopt-help'Junio C Hamano
The mark-up of diff options has been updated to help translators. * jx/dirstat-parseopt-help: diff: mark param1 and param2 as placeholders
2024-02-14diff: mark param1 and param2 as placeholdersJiang Xin
Some l10n translators translated the parameters "files", "param1" and "param2" in the following message: "synonym for --dirstat=files,param1,param2..." Translating "param1" and "param2" is OK, but changing the parameter "files" is wrong. The parameters that are not meant to be used verbatim should be marked as placeholders, but the verbatim parameter not marked as a placeholder should be left as is. This change is a complement for commit 51e846e673 (doc: enforce placeholders in documentation, 2023-12-25). With the help of Jean-Noël,some parameter combinations in one placeholder (e.g. "<param1,param2>...") are splited into seperate placeholders. Helped-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-13Merge branch 'jk/diff-external-with-no-index' into maint-2.43Junio C Hamano
"git diff --no-index file1 file2" segfaulted while invoking the external diff driver, which has been corrected. * jk/diff-external-with-no-index: diff: handle NULL meta-info when spawning external diff
2024-02-08Merge branch 'en/header-cleanup' into maint-2.43Junio C Hamano
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2024-02-06Merge branch 'jk/diff-external-with-no-index'Junio C Hamano
"git diff --no-index file1 file2" segfaulted while invoking the external diff driver, which has been corrected. * jk/diff-external-with-no-index: diff: handle NULL meta-info when spawning external diff
2024-01-29diff: handle NULL meta-info when spawning external diffJeff King
Running this: $ touch foo bar $ chmod +x foo $ git -c diff.external=echo diff --ext-diff --no-index foo bar results in a segfault. The issue is that run_diff_cmd() passes a NULL "xfrm_msg" variable to run_external_diff(), which feeds it to strvec_push(), causing the segfault. The bug dates back to 82fbf269b9 (run_external_diff: use an argv_array for the command line, 2014-04-19), though it mostly only ever worked accidentally. Before then, we just stuck the NULL pointer into a "const char **" array, so our NULL ended up acting as an extra end-of-argv sentinel (which was OK, because it was the last thing in the array). Curiously, though, this is only a problem with --no-index. We set up xfrm_msg by calling fill_metainfo(). This result may be empty, or may have text like "index 1234..5678\n", "rename from foo\nrename from bar\n", etc. In run_external_diff(), we only look at xfrm_msg if the "other" variable is not NULL. That variable is set when the paths of the two sides of the diff pair aren't the same (in which case the destination path becomes "other"). So normally it would kick in only for a rename, in which case xfrm_msg should not be NULL (it would have the rename information in it). But with a "--no-index" of two blobs, we of course have two different pathnames, and thus end up with a non-NULL "other" filename (which is always just a repeat of the file2-name), but possibly a NULL xfrm_msg. So how to fix it? I have a feeling that --no-index always passing "other" to the external diff command is probably a bug. There was no rename, and the name is always redundant with existing information we pass (and this may even cause us to pass a useless "xfrm_msg" that contains an "index 1234..5678" line). So one option would be to change that behavior. We don't seem to have ever documented the "other" or "xfrm_msg" parameters for external diffs. But I'm not sure what fallout we might have from changing that behavior now. So this patch takes the less-risky option, and simply teaches run_external_diff() to avoid passing xfrm_msg when it's NULL. That makes it agnostic to whether "other" and "xfrm_msg" always come as a pair. It fixes the segfault now, and if we want to change the --no-index "other" behavior on top, it will handle that, too. Reported-by: Wilfred Hughes <me@wilfred.me.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08Merge branch 'en/header-cleanup'Junio C Hamano
Remove unused header "#include". * en/header-cleanup: treewide: remove unnecessary includes in source files treewide: add direct includes currently only pulled in transitively trace2/tr2_tls.h: remove unnecessary include submodule-config.h: remove unnecessary include pkt-line.h: remove unnecessary include line-log.h: remove unnecessary include http.h: remove unnecessary include fsmonitor--daemon.h: remove unnecessary includes blame.h: remove unnecessary includes archive.h: remove unnecessary include treewide: remove unnecessary includes in source files treewide: remove unnecessary includes from header files
2023-12-26treewide: remove unnecessary includes in source filesElijah Newren
Each of these were checked with gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09diff: give more detailed messages for bogus diff.* configJeff King
The config callbacks for a few diff.* variables simply return -1 when we encounter an error. The message you get mentions the offending location, like: fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7 but is vague about "bad" (as it must be, since the message comes from the generic config code). Most callbacks add their own messages here, so let's do the same. E.g.: error: unknown value for config 'diff.algorithm': foo fatal: bad config variable 'diff.algorithm' in file '.git/config' at line 7 I've written the string in a way that should be reusable for translators, and matches another similar message in transport.c (there doesn't yet seem to be a popular generic message to reuse here, so hopefully this will get the ball rolling). Note that in the case of diff.algorithm, our parse_algorithm_value() helper does detect a NULL value string. But it's still worth detecting it ourselves here, since we can give a more specific error message (and which is the usual one for unexpected implicit-bool values). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09config: handle NULL value when parsing non-boolsJeff King
When the config parser sees an "implicit" bool like: [core] someVariable it passes NULL to the config callback. Any callback code which expects a string must check for NULL. This usually happens via helpers like git_config_string(), etc, but some custom code forgets to do so and will segfault. These are all fairly vanilla cases where the solution is just the usual pattern of: if (!value) return config_error_nonbool(var); though note that in a few cases we have to split initializers like: int some_var = initializer(); into: int some_var; if (!value) return config_error_nonbool(var); some_var = initializer(); There are still some broken instances after this patch, which I'll address on their own in individual patches after this one. Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02Merge branch 'jk/diff-result-code-cleanup' into maint-2.42Junio C Hamano
"git diff --no-such-option" and other corner cases around the exit status of the "diff" command has been corrected. * jk/diff-result-code-cleanup: diff: drop useless "status" parameter from diff_result_code() diff: drop useless return values in git-diff helpers diff: drop useless return from run_diff_{files,index} functions diff: die when failing to read index in git-diff builtin diff: show usage for unknown builtin_diff_files() options diff-files: avoid negative exit value diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
2023-11-02Merge branch 'jc/diff-exit-code-with-w-fixes' into maint-2.42Junio C Hamano
"git diff -w --exit-code" with various options did not work correctly, which is being addressed. * jc/diff-exit-code-with-w-fixes: diff: the -w option breaks --exit-code for --raw and other output modes t4040: remove test that succeeded for a wrong reason diff: teach "--stat -w --exit-code" to notice differences diff: mode-only change should be noticed by "--patch -w --exit-code" diff: move the fallback "--exit-code" code down
2023-09-29diff --stat: set the width defaults in a helper functionDragan Simic
Extract the commonly used initialization of the --stat-width=<width>, --stat-name-width=<width> and --stat-graph-with=<width> parameters to their internal default values into a helper function, to avoid repeating the same initialization code in a few places. Add a couple of tests to additionally cover existing configuration options diff.statNameWidth=<width> and diff.statGraphWidth=<width> when used by git-merge to generate --stat outputs. This closes the gap that existed previously in the --stat tests, and reduces the chances for having any regressions introduced by this commit. While there, perform a small bunch of minor wording tweaks in the improved unit test, to improve its test-level consistency a bit. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-18diff --stat: add config option to limit filename widthDragan Simic
Add new configuration option diff.statNameWidth=<width> that is equivalent to the command-line option --stat-name-width=<width>, but it is ignored by format-patch. This follows the logic established by the already existing configuration option diff.statGraphWidth=<width>. Limiting the widths of names and graphs in the --stat output makes sense for interactive work on wide terminals with many columns, hence the support for these configuration options. They don't affect format-patch because it already adheres to the traditional 80-column standard. Update the documentation and add more tests to cover new configuration option diff.statNameWidth=<width>. While there, perform a few minor code and whitespace cleanups here and there, as spotted. Signed-off-by: Dragan Simic <dsimic@manjaro.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-01Merge branch 'jk/diff-result-code-cleanup'Junio C Hamano
"git diff --no-such-option" and other corner cases around the exit status of the "diff" command has been corrected. * jk/diff-result-code-cleanup: diff: drop useless "status" parameter from diff_result_code() diff: drop useless return values in git-diff helpers diff: drop useless return from run_diff_{files,index} functions diff: die when failing to read index in git-diff builtin diff: show usage for unknown builtin_diff_files() options diff-files: avoid negative exit value diff: spell DIFF_INDEX_CACHED out when calling run_diff_index()
2023-08-30Merge branch 'jc/diff-exit-code-with-w-fixes'Junio C Hamano
"git diff -w --exit-code" with various options did not work correctly, which is being addressed. * jc/diff-exit-code-with-w-fixes: diff: the -w option breaks --exit-code for --raw and other output modes t4040: remove test that succeeded for a wrong reason diff: teach "--stat -w --exit-code" to notice differences diff: mode-only change should be noticed by "--patch -w --exit-code" diff: move the fallback "--exit-code" code down
2023-08-21diff: the -w option breaks --exit-code for --raw and other output modesJunio C Hamano
The output from "--raw", "--name-status", and "--name-only" modes in "git diff" does depend on and does not reflect how certain different contents are considered equal, unlike "--patch" and "--stat" output modes do, when used with options like "-w" (another way of thinking about it is that it is not like we recompute the hash of the blob after removing all whitespaces to show "git diff --raw -w" output). But the fact that "--raw" and friends ignore "-w" is not a good excuse for "diff --raw -w --exit-code" to also ignore the request to report the differences with its exit status. When run without "-w", "git diff --exit-code --raw" does report with its exit status the differences as requested, and we should do the same when run with "-w", too. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-21diff: drop useless "status" parameter from diff_result_code()Jeff King
Many programs use diff_result_code() to get a user-visible program exit code from a diff result (e.g., checking opts.found_changes if --exit-code was requested). This function also takes a "status" parameter, which seems at first glance that it could be used to propagate an error encountered when computing the diff. But it doesn't work that way: - negative values are passed through as-is, but are not appropriate as program exit codes - when --exit-code or --check is in effect, we _ignore_ the passed-in status completely. So a failed diff which did not have a chance to set opts.found_changes would erroneously report "success, no changes" instead of propagating the error. After recent cleanups, neither of these bugs is possible to trigger, as every caller just passes in "0". So rather than fixing them, we can simply drop the useless parameter instead. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18diff: teach "--stat -w --exit-code" to notice differencesJunio C Hamano
When options like "-w" is used while "--exit-code" option is in effect, instead of the usual "do we have any filepair whose preimage and postimage have different <mode,object>?" check, we need to compare the contents of the blobs, taking into account that certain changes are considered no-op. With the previous step, we taught "--patch" codepath to set the .found_changes bit correctly, even for a change that only affects the mode and not object. The "--stat" codepath, however, did not set the .found_changes bit at all. This lead to $ git diff --stat -w --exit-code for a change that does have an output to exit with status 0. Set the bit by inspecting the list of paths the diffstat output is given for (a mode-only change will still appear as a "0-line added 0-line deleted" change) to fix it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18diff: mode-only change should be noticed by "--patch -w --exit-code"Junio C Hamano
The codepath to notice the content-level changes, taking certain no-op changes like "ignore whitespace" into account, forgot that a mode-only change is still a change. This resulted in $ git diff --patch --exit-code -w to exit with status 0 even when there is such a mode-only change, breaking both "--patch" and "--quiet" output formats. Teach the builtin_diff() codepath that creation and deletion as well as mode changes are all interesting changes. Note that the test specifically checks removal of an empty file, because if there is anything in the preimage (i.e. the removed file is not empty), the removal would still trigger textual patch output and the codepath for that does update .found_changes bit to report that it found an interesting change. We need to make sure that the .found_changes bit is set even without triggering textual patch output. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-18diff: move the fallback "--exit-code" code downJunio C Hamano
When "--exit-code" is asked and the code cannot just answer by comparing the object names on both sides but needs to inspect and compare the contents, there are two ways that the result is found out. Some output modes, like "--stat" and "--patch", inherently have to inspect the contents in order to show the differences in the way they do. The codepaths for these modes set the .found_changes bit as they compute what to show. However, other output modes do not need to inspect the contents to show the differences in the way they do. The most notable example is "--quiet", which does not need to compute any output to show. When they are asked to report "--exit-code", they run the codepaths for the "--patch" output with their output redirected to "/dev/null", only to set the .found_changes bit. Currently, this fallback invocation of "--patch" output is done after the "--stat" output format and its friends and before the "--patch" and internal callback logic. Move it to the end of the sequence to clarify the fallback status of this code block. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-17Merge branch 'cw/compat-util-header-cleanup'Junio C Hamano
Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header
2023-07-06Merge branch 'gc/config-context'Junio C Hamano
Reduce reliance on a global state in the config reading API. * gc/config-context: config: pass source to config_parser_event_fn_t config: add kvi.path, use it to evaluate includes config.c: remove config_reader from configsets config: pass kvi to die_bad_number() trace2: plumb config kvi config.c: pass ctx with CLI config config: pass ctx with config files config.c: pass ctx in configsets config: add ctx arg to config_fn_t urlmatch.h: use config_fn_t type config: inline git_color_default_config
2023-07-06Merge branch 'pb/complete-diff-options'Junio C Hamano
Completion updates. * pb/complete-diff-options: (24 commits) diff.c: mention completion above add_diff_options completion: complete --remerge-diff completion: complete --diff-merges, its options and --no-diff-merges completion: move --pickaxe-{all,regex} to __git_diff_common_options completion: complete --ws-error-highlight completion: complete --unified completion: complete --output-indicator-{context,new,old} completion: complete --output completion: complete --no-stat completion: complete --no-relative completion: complete --line-prefix completion: complete --ita-invisible-in-index and --ita-visible-in-index completion: complete --irreversible-delete completion: complete --ignore-matching-lines completion: complete --function-context completion: complete --find-renames completion: complete --find-object completion: complete --find-copies completion: complete --default-prefix completion: complete --compact-summary ...
2023-07-05git-compat-util: move alloc macros to git-compat-util.hCalvin Wan
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05treewide: remove unnecessary includes for wrapper.hCalvin Wan
Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: pass kvi to die_bad_number()Glen Choo
Plumb "struct key_value_info" through all code paths that end in die_bad_number(), which lets us remove the helper functions that read analogous values from "struct config_reader". As a result, nothing reads config_reader.config_kvi any more, so remove that too. In config.c, this requires changing the signature of git_configset_get_value() to 'return' "kvi" in an out parameter so that git_configset_get_<type>() can pass it to git_config_<type>(). Only numeric types will use "kvi", so for non-numeric types (e.g. git_configset_get_string()), pass NULL to indicate that the out parameter isn't needed. Outside of config.c, config callbacks now need to pass "ctx->kvi" to any of the git_config_<type>() functions that parse a config string into a number type. Included is a .cocci patch to make that refactor. The only exceptional case is builtin/config.c, where git_config_<type>() is called outside of a config callback (namely, on user-provided input), so config source information has never been available. In this case, die_bad_number() defaults to a generic, but perfectly descriptive message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure not to change the message. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28config: add ctx arg to config_fn_tGlen Choo
Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-26diff.c: mention completion above add_diff_optionsPhilippe Blain
Add a comment on top of add_diff_options, where common diff options are listed, mentioning __git_diff_common_options in the completion script, in the hope that contributors update it when they add new diff flags. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21merge-ll: rename from ll-mergeElijah Newren
A long term (but rather minor) pet-peeve of mine was the name ll-merge.[ch]. I thought it made it harder to realize what stuff was related to merging when I was working on the merge machinery and trying to improve it. Further, back in d1cbe1e6d8a ("hash-ll.h: split out of hash.h to remove dependency on repository.h", 2023-04-22), we have split the portions of hash.h that do not depend upon repository.h into a "hash-ll.h" (due to the recommendation to use "ll" for "low-level" in its name[1], but which I used as a suffix precisely because of my distaste for "ll-merge"). When we discussed adding additional "*-ll.h" files, a request was made that we use "ll" consistently as either a prefix or a suffix. Since it is already in use as both a prefix and a suffix, the only way to do so is to rename some files. Besides my distaste for the ll-merge.[ch] name, let me also note that the files ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h would have essentially nothing to do with each other and make no sense to group. But giving them the common "ll-" prefix would group them. Using "-ll" as a suffix thus seems just much more logical to me. Rename ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to ensure we get a more logical grouping of files. [1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21read-cache*.h: move declarations for read-cache.c functions from cache.hElijah Newren
For the functions defined in read-cache.c, move their declarations from cache.h to a new header, read-cache-ll.h. Also move some related inline functions from cache.h to read-cache.h. The purpose of the read-cache-ll.h/read-cache.h split is that about 70% of the sites don't need the inline functions and the extra headers they include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-20Merge branch 'jk/log-follow-with-non-literal-pathspec'Junio C Hamano
"git [-c log.follow=true] log [--follow] ':(glob)f**'" used to barf. * jk/log-follow-with-non-literal-pathspec: diff: detect pathspec magic not supported by --follow diff: factor out --follow pathspec check pathspec: factor out magic-to-name function
2023-06-13Merge branch 'jc/diff-s-with-other-options'Junio C Hamano
The "-s" (silent, squelch) option of the "diff" family of commands did not interact with other options that specify the output format well. This has been cleaned up so that it will clear all the formatting options given before. * jc/diff-s-with-other-options: diff: fix interaction between the "-s" option and other options
2023-06-03diff: detect pathspec magic not supported by --followJeff King
The --follow code doesn't handle most forms of pathspec magic. We check that no unexpected ones have made it to try_to_follow_renames() with a runtime GUARD_PATHSPEC() check, which gives behavior like this: $ git log --follow ':(icase)makefile' >/dev/null BUG: tree-diff.c:596: unsupported magic 10 Aborted The same is true of ":(glob)", ":(attr)", and so on. It's good that we notice the problem rather than continuing and producing a wrong answer. But there are two non-ideal things: 1. The idea of GUARD_PATHSPEC() is to catch programming errors where low-level code gets unexpected pathspecs. We'd usually try to catch unsupported pathspecs by passing a magic_mask to parse_pathspec(), which would give the user a much better message like: pathspec magic not supported by this command: 'icase' That doesn't happen here because git-log usually _does_ support all types of pathspec magic, and so it passes "0" for the mask (this call actually happens in setup_revisions()). It needs to distinguish the normal case from the "--follow" one but currently doesn't. 2. In addition to --follow, we have the log.follow config option. When that is set, we try to turn on --follow mode only when there is a single pathspec (since --follow doesn't handle anything else). But really, that ought to be expanded to "use --follow when the pathspec supports it". Otherwise, we'd complain any time you use an exotic pathspec: $ git config log.follow true $ git log ':(icase)makefile' >/dev/null BUG: tree-diff.c:596: unsupported magic 10 Aborted We should instead just avoid enabling follow mode if it's not supported by this particular invocation. This patch expands our diff_check_follow_pathspec() function to cover pathspec magic, solving both problems. A few final notes: - we could also solve (1) by passing the appropriate mask to parse_pathspec(). But that's not great for two reasons. One is that the error message is less precise. It says "magic not supported by this command", but really it is not the command, but rather the --follow option which is the problem. The second is that it always calls die(). But for our log.follow code, we want to speculatively ask "is this pathspec OK?" and just get a boolean result. - This is obviously the right thing to do for ':(icase)' and most other magic options. But ':(glob)' is a bit odd here. The --follow code doesn't support wildcards, but we allow them anyway. From try_to_follow_renames(): #if 0 /* * We should reject wildcards as well. Unfortunately we * haven't got a reliable way to detect that 'foo\*bar' in * fact has no wildcards. nowildcard_len is merely a hint for * optimization. Let it slip for now until wildmatch is taught * about dry-run mode and returns wildcard info. */ if (opt->pathspec.has_wildcard) BUG("wildcards are not supported"); #endif So something like "git log --follow 'Make*'" is already doing the wrong thing, since ":(glob)" behavior is already the default (it is used only to countermand an earlier --noglob-pathspecs). So we _could_ loosen the guard to allow :(glob), since it just behaves the same as pathspecs do by default. But it seems like a backwards step to do so. It already doesn't work (it hits the BUG() case currently), and given that the user took an explicit step to say "this pathspec should glob", it is reasonable for us to say "no, --follow does not support globbing" (or in the case of log.follow, avoid turning on follow mode). Which is what happens after this patch. - The set of allowed pathspec magic is obviously the same as in GUARD_PATHSPEC(). We could perhaps factor these out to avoid repetition. The point of having separate masks and GUARD calls is that we don't necessarily know which parsed pathspecs will be used where. But in this case, the two are heavily correlated. Still, there may be some value in keeping them separate; it would make anyone think twice about adding new magic to the list in diff_check_follow_pathspec(). They'd need to touch try_to_follow_renames() as well, which is the code that would actually need to be updated to handle more exotic pathspecs. - The documentation for log.follow says that it enables --follow "...when a single <path> is given". We could possibly expand that to say "with no unsupported pathspec magic", but that raises the question of documenting which magic is supported. I think the existing wording of "single <path>" sufficiently encompasses the idea (the forbidden magic is stuff that might match multiple entries), and the spirit remains the same. Reported-by: Jim Pryor <dubiousjim@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-03diff: factor out --follow pathspec checkJeff King
In --follow mode, we require exactly one pathspec. We check this condition in two places: - in diff_setup_done(), we complain if --follow is used with an inapropriate pathspec - in git-log's revision "tweak" function, we enable log.follow only if the pathspec allows it The duplication isn't a big deal right now, since the logic is so simple. But in preparation for it becoming more complex, let's pull it into a shared function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-15Merge branch 'jc/dirstat-plug-leaks'Junio C Hamano
"git diff --dirstat" leaked memory, which has been plugged. * jc/dirstat-plug-leaks: diff: plug leaks in dirstat diff: refactor common tail part of dirstat computation
2023-05-09Merge branch 'en/header-split-cache-h-part-2'Junio C Hamano
More header clean-up. * en/header-split-cache-h-part-2: (22 commits) reftable: ensure git-compat-util.h is the first (indirect) include diff.h: reduce unnecessary includes object-store.h: reduce unnecessary includes commit.h: reduce unnecessary includes fsmonitor: reduce includes of cache.h cache.h: remove unnecessary headers treewide: remove cache.h inclusion due to previous changes cache,tree: move basic name compare functions from read-cache to tree cache,tree: move cmp_cache_name_compare from tree.[ch] to read-cache.c hash-ll.h: split out of hash.h to remove dependency on repository.h tree-diff.c: move S_DIFFTREE_IFXMIN_NEQ define from cache.h dir.h: move DTYPE defines from cache.h versioncmp.h: move declarations for versioncmp.c functions from cache.h ws.h: move declarations for ws.c functions from cache.h match-trees.h: move declarations for match-trees.c functions from cache.h pkt-line.h: move declarations for pkt-line.c functions from cache.h base85.h: move declarations for base85.c functions from cache.h copy.h: move declarations for copy.c functions from cache.h server-info.h: move declarations for server-info.c functions from cache.h packfile.h: move pack_window and pack_entry from cache.h ...
2023-05-05diff: fix interaction between the "-s" option and other optionsJunio C Hamano
Sergey Organov noticed and reported "--patch --no-patch --raw" behaves differently from just "--raw". It turns out that there are a few interesting bugs in the implementation and documentation. * First, the documentation for "--no-patch" was unclear that it could be read to mean "--no-patch" countermands an earlier "--patch" but not other things. The intention of "--no-patch" ever since it was introduced at d09cd15d (diff: allow --no-patch as synonym for -s, 2013-07-16) was to serve as a synonym for "-s", so "--raw --patch --no-patch" should have produced no output, but it can be (mis)read to allow showing only "--raw" output. * Then the interaction between "-s" and other format options were poorly implemented. Modern versions of Git uses one bit each to represent formatting options like "--patch", "--stat" in a single output_format word, but for historical reasons, "-s" also is represented as another bit in the same word. This allows two interesting bugs to happen, and we have both X-<. (1) After setting a format bit, then setting NO_OUTPUT with "-s", the code to process another "--<format>" option drops the NO_OUTPUT bit to allow output to be shown again. However, the code to handle "-s" only set NO_OUTPUT without unsetting format bits set earlier, so the earlier format bit got revealed upon seeing the second "--<format>" option. This is the problem Sergey observed. (2) After setting NO_OUTPUT with "-s", code to process "--<format>" option can forget to unset NO_OUTPUT, leaving the command still silent. It is tempting to change the meaning of "--no-patch" to mean "disable only the patch format output" and reimplement "-s" as "not showing anything", but it would be an end-user visible change in behavior. Let's fix the interactions of these bits to first make "-s" work as intended. The fix is conceptually very simple. * Whenever we set DIFF_FORMAT_FOO because we saw the "--foo" option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is given), we make sure we drop DIFF_FORMAT_NO_OUTPUT. We forgot to do so in some of the options and caused (2) above. * When processing "-s" option, we should not just set DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits. We didn't do so and retained format bits set by options previously seen, causing (1) above. It is even more tempting to lose NO_OUTPUT bit and instead take output_format word being 0 as its replacement, but that would break the mechanism "git show" uses to default to "--patch" output, where the distinction between telling the command to be silent with "-s" and having no output format specified on the command line matters, and an explicit output format given on the command line should not be "combined" with the default "--patch" format. So, while we cannot lose the NO_OUTPUT bit, as a follow-up work, we may want to replace it with OPTION_GIVEN bit, and * make "--patch", "--raw", etc. set DIFF_FORMAT_$format bit and DIFF_FORMAT_OPTION_GIVEN bit on for each format. "--no-raw", etc. will set off DIFF_FORMAT_$format bit but still record the fact that we saw an option from the command line by setting DIFF_FORMAT_OPTION_GIVEN bit. * make "-s" (and its synonym "--no-patch") clear all other bits and set only the DIFF_FORMAT_OPTION_GIVEN bit on. which I suspect would make the code much cleaner without breaking any end-user expectations. Once that is in place, transitioning "--no-patch" to mean the counterpart of "--patch", just like "--no-raw" only defeats an earlier "--raw", would be quite simple at the code level. The social cost of migrating the end-user expectations might be too great for it to be worth, but at least the "GIVEN" bit clean-up alone may be worth it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05diff: plug leaks in dirstatJunio C Hamano
The array of dirstat_file contained in the dirstat_dir structure is not freed after the processing ends. Unfortunately, the member that points at the array, .files, is incremented as the gather_dirstat() function recursively walks it, and this needs to be plugged by remembering the beginning of the array before gather_dirstat() mucks with it and freeing it after we are done. We can mark t4047 as leak-free. t4000, which is marked as leak-free, now can exercise dirstat in it, which will happen next. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-05diff: refactor common tail part of dirstat computationJunio C Hamano
This will become useful when we plug leaks in these two functions. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-02Merge branch 'tb/ban-strtok'Junio C Hamano
Mark strtok() and strtok_r() to be banned. * tb/ban-strtok: banned.h: mark `strtok()` and `strtok_r()` as banned t/helper/test-json-writer.c: avoid using `strtok()` t/helper/test-oidmap.c: avoid using `strtok()` t/helper/test-hashmap.c: avoid using `strtok()` string-list: introduce `string_list_setlen()` string-list: multi-delimiter `string_list_split_in_place()`