summaryrefslogtreecommitdiff
path: root/environment.c
AgeCommit message (Collapse)Author
13 daysconfig: move Git config parsing into "environment.c"Patrick Steinhardt
In "config.c" we host both the business logic to read and write config files as well as the logic to parse specific Git-related variables. On the one hand this is mixing concerns, but even more importantly it means that we cannot easily remove the dependency on `the_repository` in our config parsing logic. Move the logic into "environment.c". This file is a grab bag of all kinds of global state already, so it is quite a good fit. Furthermore, it also hosts most of the global variables that we're parsing the config values into, making this an even better fit. Note that there is one hidden change: in `parse_fsync_components()` we use an `int` to iterate through `ARRAY_SIZE(fsync_component_names)`. But as -Wsign-compare warnings are enabled in this file this causes a compiler warning. The issue is fixed by using a `size_t` instead. This change allows us to drop the `USE_THE_REPOSITORY_VARIABLE` declaration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-07-07repository: move 'repository_format_precious_objects' to repo scopeAyush Chandekar
The 'extensions.preciousObjects' setting when set true, prevents operations that might drop objects from the object storage. This setting is populated in the global variable 'repository_format_precious_objects'. Move this global variable to repo scope by adding it to 'struct repository and also refactor all the occurences accordingly. This change is part of an ongoing effort to eliminate global variables, improve modularity and help libify the codebase. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-06-10environment: remove the global variable 'core_preload_index'Ayush Chandekar
The global variable 'core_preload_index' is used in a single function named 'preload_index()' in "preload-index.c". Move its declaration inside that function, removing unnecessary global state. This change is part of an ongoing effort to eliminate global variables, improve modularity and help libify the codebase. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com> Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-05-05Merge branch 'js/windows-arm64'Junio C Hamano
Update to arm64 Windows port. * js/windows-arm64: max_tree_depth: lower it for clangarm64 on Windows mingw(arm64): do move the `/etc/git*` location msvc: do handle builds on Windows/ARM64 mingw: do not use nedmalloc on Windows/ARM64 config.mak.uname: add support for clangarm64 bswap.h: add support for built-in bswap functions
2025-04-29Merge branch 'as/typofix-in-env-h-header'Junio C Hamano
Typofix. * as/typofix-in-env-h-header: environment: fix typo: 'setup_git_directory_gently'
2025-04-23max_tree_depth: lower it for clangarm64 on WindowsJohannes Schindelin
Just as in b64d78ad02ca (max_tree_depth: lower it for MSVC to avoid stack overflows, 2023-11-01), I encountered the same problem with the clang builds on Windows/ARM64. The symptom is an exit code 127 when t6700 tries to verify that `git archive big` fails. This exit code is reserved on Unix/Linux to mean "command not found". Unfortunately in this case, it is the fall-back chosen by Cygwin's `pinfo::status_exit()` method when encountering the NSTATUS `STATUS_STACK_OVERFLOW`, see https://github.com/cygwin/cygwin/blob/cygwin-3.6.1/winsup/cygwin/pinfo.cc#L171 I verified manually that the stack overflow always happens somewhere around tree depth 1403, therefore 1280 should be a safe bound in these instances. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-04-18environment: fix typo: 'setup_git_directory_gently'Abhijeet Sonar
Above the declaration of git_work_tree_cfg, we have: /* This is set by setup_git_dir_gently() and/or git_default_config() */ char *git_work_tree_cfg; It can be verified that there is no function called 'setup_git_dir_gently' by running grep on the codebase: $ grep -R setup_git_dir_gently . ./environment.c:/* This is set by setup_git_dir_gently() and/or git_default_config() */ The comment, introduced in e90fdc39b6 (Clean up work-tree handling), is the only occurrence of the name 'setup_git_dir_gently'. It probably meant 'setup_git_directory_gently' as that is a name of a real function in setup.c. Correct it. Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-10environment: move access to "core.bigFileThreshold" into repo settingsPatrick Steinhardt
The "core.bigFileThreshold" setting is stored in a global variable and populated via `git_default_core_config()`. This may cause issues in the case where one is handling multiple different repositories in a single process with different values for that config key, as we may or may not see the correct value in that case. Furthermore, global state blocks our path towards libification. Refactor the code so that we instead store the value in `struct repo_settings`, where the value is computed as-needed and cached. Note that this change requires us to adapt one test in t1050 that verifies that we die when parsing an invalid "core.bigFileThreshold" value. The exercised Git command doesn't use the value at all, and thus it won't hit the new code path that parses the value. This is addressed by using git-hash-object(1) instead, which does read the value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-03-05Merge branch 'ps/path-sans-the-repository'Junio C Hamano
The path.[ch] API takes an explicit repository parameter passed throughout the callchain, instead of relying on the_repository singleton instance. * ps/path-sans-the-repository: path: adjust last remaining users of `the_repository` environment: move access to "core.sharedRepository" into repo settings environment: move access to "core.hooksPath" into repo settings repo-settings: introduce function to clear struct path: drop `git_path()` in favor of `repo_git_path()` rerere: let `rerere_path()` write paths into a caller-provided buffer path: drop `git_common_path()` in favor of `repo_common_path()` worktree: return allocated string from `get_worktree_git_dir()` path: drop `git_path_buf()` in favor of `repo_git_path_replace()` path: drop `git_pathdup()` in favor of `repo_git_path()` path: drop unused `strbuf_git_path()` function path: refactor `repo_submodule_path()` family of functions submodule: refactor `submodule_to_gitdir()` to accept a repo path: refactor `repo_worktree_path()` family of functions path: refactor `repo_git_path()` family of functions path: refactor `repo_common_path()` family of functions
2025-02-28environment: move access to "core.sharedRepository" into repo settingsPatrick Steinhardt
Similar as with the preceding commit, we track "core.sharedRepository" via a pair of global variables. Move them into `struct repo_settings` so that we can instead track them per-repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-02-28environment: move access to "core.hooksPath" into repo settingsPatrick Steinhardt
The "core.hooksPath" setting is stored in a global variable and populated via the `git_default_core_config`. This may cause issues in the case where one is handling multiple different repositories in a single process with different values for that config key, as we may or may not see the correct value in that case. Furthermore, global state blocks our path towards libification. Refactor the code so that we instead store the value in `struct repo_settings`. The value is computed as-needed and cached. The result should be functionally the same as there aren't ever any code paths where we'd execute hooks outside the context of a repository. Note that this requires us to change the passed-in repository in the `repo_git_path()` family of functions to be non-constant, as we call `adjust_git_path()` there. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-28git-compat-util: move include of "compat/zlib.h" into "git-zlib.h"Patrick Steinhardt
We include "compat/zlib.h" in "git-compat-util.h", which is unnecessarily broad given that we only have a small handful of files that use the zlib library. Move the header into "git-zlib.h" instead and adapt users of zlib to include that header. One exception is the reftable library, as we don't want to use the Git-specific wrapper of zlib there, so we include "compat/zlib.h" instead. Furthermore, we move the include into "reftable/system.h" so that users of the library other than Git can wire up zlib themselves. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04config: make `packed_git_(limit|window_size)` non-global variablesKarthik Nayak
The variables `packed_git_window_size` and `packed_git_limit` are global config variables used in the `packfile.c` file. Since it is only used in this file, let's change it from being a global config variable to a local variable for the subsystem. With this, we rid `packfile.c` from all global variable usage and this means we can also remove the `USE_THE_REPOSITORY_VARIABLE` guard from the file. Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-04config: make `delta_base_cache_limit` a non-global variableKarthik Nayak
The `delta_base_cache_limit` variable is a global config variable used by multiple subsystems. Let's make this non-global, by adding this variable independently to the subsystems where it is used. First, add the setting to the `repo_settings` struct, this provides access to the config in places where the repository is available. Use this in `packfile.c`. In `index-pack.c` we add it to the `pack_idx_option` struct and its constructor. While the repository struct is available here, it may not be set because `git index-pack` can be used without a repository. In `gc.c` add it to the `gc_config` struct and also the constructor function. The gc functions currently do not have direct access to a repository struct. These changes are made to remove the usage of `delta_base_cache_limit` as a global variable in `packfile.c`. This brings us one step closer to removing the `USE_THE_REPOSITORY_VARIABLE` definition in `packfile.c` which we complete in the next patch. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: stop storing "core.notesRef" globallyPatrick Steinhardt
Stop storing the "core.notesRef" config value globally. Instead, retrieve the value in `default_notes_ref()`. The code is never called in a hot loop anyway, so doing this on every invocation should be perfectly fine. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: stop storing "core.warnAmbiguousRefs" globallyPatrick Steinhardt
Same as the preceding commits, storing the "core.warnAmbiguousRefs" value globally is misdesigned as this setting may be set per repository. Move the logic into the repo-settings subsystem. The usual pattern here is that users are expected to call `prepare_repo_settings()` before they access the settings themselves. This seems somewhat fragile though, as it is easy to miss and leads to somewhat ugly code patterns at the call sites. Instead, introduce a new function that encapsulates this logic for us. This also allows us to change how exactly the lazy initialization works in the future, e.g. by only partially initializing values as requested by the caller. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: stop storing "core.preferSymlinkRefs" globallyPatrick Steinhardt
Same as the preceding commit, storing the "core.preferSymlinkRefs" value globally is misdesigned as this setting may be set per repository. There is only a single user of this value anyway, namely the "files" backend. So let's just remove the global variable and read the value of this setting when initializing the backend. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: stop storing "core.logAllRefUpdates" globallyPatrick Steinhardt
The value of "core.logAllRefUpdates" is being stored in the global variable `log_all_ref_updates`. This design is somewhat aged nowadays, where it is entirely possible to access multiple repositories in the same process which all have different values for this setting. So using a single global variable to track it is plain wrong. Remove the global variable. Instead, we now provide a new function part of the repo-settings subsystem that parses the value for a specific repository. While that may require us to read the value multiple times, we work around this by reading it once when the ref backends are set up and caching the value there. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: move `set_git_dir()` and related into setup layerPatrick Steinhardt
The functions `set_git_dir()` and friends are used to set up repositories. As such, they are quite clearly part of the setup subsystem, but still live in "environment.c". Move them over, which also helps to get rid of dependencies on `the_repository` in the environment subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_namespace()` self-containedPatrick Steinhardt
The logic to set up and retrieve `git_namespace` is distributed across different functions which communicate with each other via a global environment variable. This is rather pointless though, as the value is always derived from an environment variable, and this environment variable does not change after we have parsed global options. Convert the function to be fully self-contained such that it lazily populates once called. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: move object database functions into object layerPatrick Steinhardt
The `odb_mkstemp()` and `odb_pack_keep()` functions are quite clearly tied to the object store, but regardless of that they are located in "environment.c". Move them over, which also helps to get rid of dependencies on `the_repository` in the environment subsystem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_work_tree()` accept a repositoryPatrick Steinhardt
The `get_git_work_tree()` function retrieves the path of the work tree of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_graft_file()` accept a repositoryPatrick Steinhardt
The `get_graft_file()` function retrieves the path to the graft file of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_index_file()` accept a repositoryPatrick Steinhardt
The `get_index_file()` function retrieves the path to the index file of `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_object_directory()` accept a repositoryPatrick Steinhardt
The `get_object_directory()` function retrieves the path to the object directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_common_dir()` accept a repositoryPatrick Steinhardt
The `get_git_common_dir()` function retrieves the path to the common directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-12environment: make `get_git_dir()` accept a repositoryPatrick Steinhardt
The `get_git_dir()` function retrieves the path to the Git directory for `the_repository`. Make it accept a `struct repository` such that it can work on arbitrary repositories and make it part of the repository subsystem. This reduces our reliance on `the_repository` and clarifies scope. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14config: fix leaking comment character configPatrick Steinhardt
When the comment line character has been specified multiple times in the configuration, then `git_default_core_config()` will cause a memory leak because it unconditionally copies the string into `comment_line_str` without free'ing the previous value. In fact, it can't easily free the value in the first place because it may contain a string constant. Refactor the code such that we track allocated comment character strings via a separate non-constant variable `comment_line_str_to_free`. Adapt sites that set `comment_line_str` to set both and free the old value that was stored in `comment_line_str_to_free`. This memory leak is being hit in t3404. As there are still other memory leaks in that file we cannot yet mark it as passing with leak checking enabled. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14global: introduce `USE_THE_REPOSITORY_VARIABLE` macroPatrick Steinhardt
Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27config: clarify memory ownership in `git_config_string()`Patrick Steinhardt
The out parameter of `git_config_string()` is a `const char **` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char **`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27convert: refactor code to clarify ownership of check_roundtrip_encodingPatrick Steinhardt
The `check_roundtrip_encoding` variable is tracked in a `const char *` even though it may contain allocated strings at times. The result is that those strings may be leaking because we never free them. Refactor the code to always store allocated strings in this variable. The default value is handled in `check_roundtrip()` now, which is the only user of the variable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27config: clarify memory ownership in `git_config_pathname()`Patrick Steinhardt
The out parameter of `git_config_pathname()` is a `const char **` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char **`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05Merge branch 'jk/core-comment-string'Junio C Hamano
core.commentChar used to be limited to a single byte, but has been updated to allow an arbitrary multi-byte sequence. * jk/core-comment-string: config: add core.commentString config: allow multi-byte core.commentChar environment: drop comment_line_char compatibility macro wt-status: drop custom comment-char stringification sequencer: handle multi-byte comment characters when writing todo list find multi-byte comment chars in unterminated buffers find multi-byte comment chars in NUL-terminated strings prefer comment_line_str to comment_line_char for printing strbuf: accept a comment string for strbuf_add_commented_lines() strbuf: accept a comment string for strbuf_commented_addf() strbuf: accept a comment string for strbuf_stripspace() environment: store comment_line_char as a string strbuf: avoid shadowing global comment_line_char name commit: refactor base-case of adjust_comment_line_char() strbuf: avoid static variables in strbuf_add_commented_lines() strbuf: simplify comment-handling in add_lines() helper config: forbid newline as core.commentChar
2024-03-12environment: store comment_line_char as a stringJeff King
We'd like to eventually support multi-byte comment prefixes, but the comment_line_char variable is referenced in many spots, making the transition difficult. Let's start by storing the character in a NUL-terminated string. That will let us switch code over incrementally to the string format, and we can easily support the existing code with a macro wrapper (since we'll continue to allow only a single-byte prefix, this will behave identically). Once all references to the "char" variable have been converted, we can drop it and enable longer strings. We'll still have to touch all of the spots that create or set the variable in this patch, but there are only a few (reading the config, and the "auto" character selector). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07Merge branch 'jc/no-lazy-fetch'Junio C Hamano
"git --no-lazy-fetch cmd" allows to run "cmd" while disabling lazy fetching of objects from the promisor remote, which may be handy for debugging. * jc/no-lazy-fetch: git: extend --no-lazy-fetch to work across subprocesses git: document GIT_NO_REPLACE_OBJECTS environment variable git: --no-lazy-fetch option
2024-02-27git: extend --no-lazy-fetch to work across subprocessesJunio C Hamano
Modeling after how the `--no-replace-objects` option is made usable across subprocess spawning (e.g., cURL based remote helpers are spawned as a separate process while running "git fetch"), allow the `--no-lazy-fetch` option to be passed across process boundaries. Do not model how the value of GIT_NO_REPLACE_OBJECTS environment variable is ignored, though. Just use the usual git_env_bool() to allow "export GIT_NO_LAZY_FETCH=0" and "unset GIT_NO_LAZY_FETCH" to be equivalents. Also do not model how the request is not propagated to subprocesses we spawn (e.g. "git clone --local" that spawns a new process to work in the origin repository, while the original one working in the newly created one) by the "--no-replace-objects" option, as this "do not lazily fetch from the promisor" is more about a per-request debugging aid, not "this repository's promisor should not be relied upon" property specific to a repository. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09config: use git_config_string() for core.checkRoundTripEncodingJeff King
Since this code path was recently converted to check for a NULL value, it now behaves exactly like git_config_string(). We can shorten the code a bit by using that helper. Note that git_config_string() takes a const pointer, but our storage variable is non-const. We're better off making this "const", though, since the default value points to a string literal (and thus it would be an error if anybody tried to write to it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-02max_tree_depth: lower it for MSVC to avoid stack overflowsJohannes Schindelin
There seems to be some internal stack overflow detection in MSVC's `malloc()` machinery that seems to be independent of the `stack reserve` and `heap reserve` sizes specified in the executable (editable via `EDITBIN /STACK:<n> <exe>` and `EDITBIN /HEAP:<n> <exe>`). In the newly test cases added by `jk/tree-name-and-depth-limit`, this stack overflow detection is unfortunately triggered before Git can print out the error message about too-deep trees and exit gracefully. Instead, it exits with `STATUS_STACK_OVERFLOW`. This corresponds to the numeric value -1073741571, something the MSYS2 runtime we sadly need to use to run Git's test suite cannot handle and which it internally maps to the exit code 127. Git's test suite, in turn, mistakes this to mean that the command was not found, and fails both test cases. Here is an example stack trace from an example run: [0x0] ntdll!RtlpAllocateHeap+0x31 0x4212603f50 0x7ff9d6d4cd49 [0x1] ntdll!RtlpAllocateHeapInternal+0x6c9 0x42126041b0 0x7ff9d6e14512 [0x2] ntdll!RtlDebugAllocateHeap+0x102 0x42126042b0 0x7ff9d6dcd8b0 [0x3] ntdll!RtlpAllocateHeap+0x7ec70 0x4212604350 0x7ff9d6d4cd49 [0x4] ntdll!RtlpAllocateHeapInternal+0x6c9 0x42126045b0 0x7ff9596ed480 [0x5] ucrtbased!heap_alloc_dbg_internal+0x210 0x42126046b0 0x7ff9596ed20d [0x6] ucrtbased!heap_alloc_dbg+0x4d 0x4212604750 0x7ff9596f037f [0x7] ucrtbased!_malloc_dbg+0x2f 0x42126047a0 0x7ff9596f0dee [0x8] ucrtbased!malloc+0x1e 0x42126047d0 0x7ff730fcc1ef [0x9] git!do_xmalloc+0x2f 0x4212604800 0x7ff730fcc2b9 [0xa] git!do_xmallocz+0x59 0x4212604840 0x7ff730fca779 [0xb] git!xmallocz_gently+0x19 0x4212604880 0x7ff7311b0883 [0xc] git!unpack_compressed_entry+0x43 0x42126048b0 0x7ff7311ac9a4 [0xd] git!unpack_entry+0x554 0x42126049a0 0x7ff7311b0628 [0xe] git!cache_or_unpack_entry+0x58 0x4212605250 0x7ff7311ad3a8 [0xf] git!packed_object_info+0x98 0x42126052a0 0x7ff7310a92da [0x10] git!do_oid_object_info_extended+0x3fa 0x42126053b0 0x7ff7310a44e7 [0x11] git!oid_object_info_extended+0x37 0x4212605460 0x7ff7310a38ba [0x12] git!repo_read_object_file+0x9a 0x42126054a0 0x7ff7310a6147 [0x13] git!read_object_with_reference+0x97 0x4212605560 0x7ff7310b4656 [0x14] git!fill_tree_descriptor+0x66 0x4212605620 0x7ff7310dc0a5 [0x15] git!traverse_trees_recursive+0x3f5 0x4212605680 0x7ff7310dd831 [0x16] git!unpack_callback+0x441 0x4212605790 0x7ff7310b4c95 [0x17] git!traverse_trees+0x5d5 0x42126058a0 0x7ff7310dc0f2 [0x18] git!traverse_trees_recursive+0x442 0x4212605980 0x7ff7310dd831 [0x19] git!unpack_callback+0x441 0x4212605a90 0x7ff7310b4c95 [0x1a] git!traverse_trees+0x5d5 0x4212605ba0 0x7ff7310dc0f2 [0x1b] git!traverse_trees_recursive+0x442 0x4212605c80 0x7ff7310dd831 [0x1c] git!unpack_callback+0x441 0x4212605d90 0x7ff7310b4c95 [0x1d] git!traverse_trees+0x5d5 0x4212605ea0 0x7ff7310dc0f2 [0x1e] git!traverse_trees_recursive+0x442 0x4212605f80 0x7ff7310dd831 [0x1f] git!unpack_callback+0x441 0x4212606090 0x7ff7310b4c95 [0x20] git!traverse_trees+0x5d5 0x42126061a0 0x7ff7310dc0f2 [0x21] git!traverse_trees_recursive+0x442 0x4212606280 0x7ff7310dd831 [...] [0xfad] git!cmd_main+0x2a2 0x42126ff740 0x7ff730fb6345 [0xfae] git!main+0xe5 0x42126ff7c0 0x7ff730fbff93 [0xfaf] git!wmain+0x2a3 0x42126ff830 0x7ff731318859 [0xfb0] git!invoke_main+0x39 0x42126ff8a0 0x7ff7313186fe [0xfb1] git!__scrt_common_main_seh+0x12e 0x42126ff8f0 0x7ff7313185be [0xfb2] git!__scrt_common_main+0xe 0x42126ff960 0x7ff7313188ee [0xfb3] git!wmainCRTStartup+0xe 0x42126ff990 0x7ff9d5ed257d [0xfb4] KERNEL32!BaseThreadInitThunk+0x1d 0x42126ff9c0 0x7ff9d6d6aa78 [0xfb5] ntdll!RtlUserThreadStart+0x28 0x42126ff9f0 0x0 I verified manually that `traverse_trees_cur_depth` was 562 when that happened, which is far below the 2048 that were already accepted into Git as a hard limit. Despite many attempts to figure out which of the internals trigger this `STATUS_STACK_OVERFLOW` and how to maybe increase certain sizes to avoid running into this issue and let Git behave the same way as under Linux, I failed to find any build-time/runtime knob we could turn to that effect. Note: even switching to using a different allocator (I used mimalloc because that's what Git for Windows uses for its GCC builds) does not help, as the zlib code used to unpack compressed pack entries _still_ uses the regular `malloc()`. And runs into the same issue. Note also: switching to using a different allocator _also_ for zlib code seems _also_ not to help. I tried that, and it still exited with `STATUS_STACK_OVERFLOW` that seems to have been triggered by a `mi_assert_internal()`, i.e. an internal assertion of mimalloc... So the best bet to work around this for now seems to just lower the maximum allowed tree depth _even further_ for MSVC builds. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31lower core.maxTreeDepth default to 2048Jeff King
On my Linux system, all of our recursive tree walking algorithms can run up to the 4096 default limit without segfaulting. But not all platforms will have stack sizes as generous (nor might even Linux if we kick off a recursive walk within a thread). In particular, several of the tests added in the previous few commits fail in our Windows CI environment. Through some guess-and-check pushing, I found that 3072 is still too much, but 2048 is OK. These are obviously vague heuristics, and there is nothing to promise that another system might not have trouble at even lower values. But it seems unlikely anybody will be too angry about a 2048-depth limit (this is close to the default max-pathname limit on Linux even for a pathological path like "a/a/a/..."). So let's just lower it. Some alternatives are: - configure separate defaults for Windows versus other platforms. - just skip the tests on Windows. This leaves Windows users with the annoying case that they can be crashed by running out of stack space, but there shouldn't be any security implications (they can't go deep enough to hit integer overflow problems). Since the original default was arbitrary, it seems less confusing to just lower it, keeping behavior consistent across platforms. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-31add core.maxTreeDepth configJeff King
Most of our tree traversal algorithms use recursion to visit sub-trees. For pathologically large trees, this can cause us to run out of stack space and abort in an uncontrolled way. Let's put our own limit here so that we can fail gracefully rather than segfaulting. In similar cases where we recursed along the commit graph, we rewrote the algorithms to avoid recursion and keep any stack data on the heap. But the commit graph is meant to grow without bound, whereas it's not an imposition to put a limit on the maximum size of tree we'll handle. And this has a bonus side effect: coupled with a limit on individual tree entry names, this limits the total size of a path we may encounter. This gives us an extra protection against code handling long path names which may suffer from integer overflows in the size (which could then be exploited by malicious trees). The default of 4096 is set to be much longer than anybody would care about in the real world. Even with single-letter interior tree names (like "a/b/c"), such a path is at least 8191 bytes. While most operating systems will let you create such a path incrementally, trying to reference the whole thing in a system call (as Git would do when actually trying to access it) will result in ENAMETOOLONG. Coupled with the recent fsck.largePathname warning, the maximum total pathname Git will handle is (by default) 16MB. This config option doesn't do anything yet; future patches will convert various algorithms to respect the limit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-28Merge branch 'rs/pack-objects-parseopt-fix'Junio C Hamano
Command line parser fix. * rs/pack-objects-parseopt-fix: pack-objects: fix --no-quiet pack-objects: fix --no-keep-true-parents
2023-07-21pack-objects: fix --no-keep-true-parentsRené Scharfe
Since 99fb6e04cb (pack-objects: convert to use parse_options(), 2012-02-01) git pack-objects has accepted --no-keep-true-parents, but this option does the same as --keep-true-parents. That's because it's defined using OPT_SET_INT with a value of 0, which sets 0 when negated as well. Turn --no-keep-true-parents into the opposite of --keep-true-parents by using OPT_BOOL and storing the option's status directly in a variable named "grafts_keep_true_parents" instead of in negative form in "grafts_replace_parents". Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05treewide: remove unnecessary includes for wrapper.hCalvin Wan
Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29Merge branch 'en/header-split-cache-h-part-3'Junio C Hamano
Header files cleanup. * en/header-split-cache-h-part-3: (28 commits) fsmonitor-ll.h: split this header out of fsmonitor.h hash-ll, hashmap: move oidhash() to hash-ll object-store-ll.h: split this header out of object-store.h khash: name the structs that khash declares merge-ll: rename from ll-merge git-compat-util.h: remove unneccessary include of wildmatch.h builtin.h: remove unneccessary includes list-objects-filter-options.h: remove unneccessary include diff.h: remove unnecessary include of oidset.h repository: remove unnecessary include of path.h log-tree: replace include of revision.h with simple forward declaration cache.h: remove this no-longer-used header read-cache*.h: move declarations for read-cache.c functions from cache.h repository.h: move declaration of the_index from cache.h merge.h: move declarations for merge.c from cache.h diff.h: move declaration for global in diff.c from cache.h preload-index.h: move declarations for preload-index.c from elsewhere sparse-index.h: move declarations for sparse-index.c from cache.h name-hash.h: move declarations for name-hash.c from cache.h run-command.h: move declarations for run-command.c from cache.h ...
2023-06-22Merge branch 'ds/disable-replace-refs'Junio C Hamano
Introduce a mechanism to disable replace refs globally and per repository. * ds/disable-replace-refs: repository: create read_replace_refs setting replace-objects: create wrapper around setting repository: create disable_replace_refs()
2023-06-21object-store-ll.h: split this header out of object-store.hElijah Newren
The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store | sort | uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21repository: remove unnecessary include of path.hElijah Newren
This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21cache.h: remove this no-longer-used headerElijah Newren
Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12repository: create read_replace_refs settingDerrick Stolee
The 'read_replace_refs' global specifies whether or not we should respect the references of the form 'refs/replace/<oid>' to replace which object we look up when asking for '<oid>'. This global has caused issues when it is not initialized properly, such as in b6551feadfd (merge-tree: load default git config, 2023-05-10). To make this more robust, move its config-based initialization out of git_default_config and into prepare_repo_settings(). This provides a repository-scoped version of the 'read_replace_refs' global. The global still has its purpose: it is disabled process-wide by the GIT_NO_REPLACE_OBJECTS environment variable or by a call to disable_replace_refs() in some specific Git commands. Since we already encapsulated the use of the constant inside replace_refs_enabled(), we can perform the initialization inside that method, if necessary. This solves the problem of forgetting to check the config, as we will check it before returning this value. Due to this encapsulation, the global can move to be static within replace-object.c. There is an interesting behavior change possible here: we now have a repository-scoped understanding of this config value. Thus, if there was a command that recurses into submodules and might follow replace refs, then it would now respect the core.useReplaceRefs config value in each repository. 'git grep --recurse-submodules' is such a command that recurses into submodules in-process. We can demonstrate the granularity of this config value via a test in t7814. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-12repository: create disable_replace_refs()Derrick Stolee
Several builtins depend on being able to disable the replace references so we actually operate on each object individually. These currently do so by directly mutating the 'read_replace_refs' global. A future change will move this global into a different place, so it will be necessary to change all of these lines. However, we can simplify that transition by abstracting the purpose of these global assignments with a method call. We will need to keep this read_replace_refs global forever, as we want to make sure that we never use replace refs throughout the life of the process if this method is called. Future changes may present a repository-scoped version of the variable to represent that repository's core.useReplaceRefs config value, but a zero-valued read_replace_refs will always override such a setting. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>