diff options
| author | Derrick Stolee <stolee@gmail.com> | 2025-09-12 10:30:06 +0000 |
|---|---|---|
| committer | Junio C Hamano <gitster@pobox.com> | 2025-09-12 08:59:52 -0700 |
| commit | 2520efd3bc8c81cb4bd8f832b241c3b2b8c0630f (patch) | |
| tree | 6616c4f6b52d70575ed08eff7a8a2d872ae9e4a0 /builtin/sparse-checkout.c | |
| parent | 064468e89919e9cf16d2afa59de33a242a4d71ab (diff) | |
sparse-checkout: add basics of 'clean' command
When users change their sparse-checkout definitions to add new
directories and remove old ones, there may be a few reasons why
directories no longer in scope remain (ignored or excluded files still
exist, Windows handles are still open, etc.). When these files still
exist, the sparse index feature notices that a tracked, but sparse,
directory still exists on disk and thus the index expands. This causes a
performance hit _and_ the advice printed isn't very helpful. Using 'git
clean' isn't enough (generally '-dfx' may be needed) but also this may
not be sufficient.
Add a new subcommand to 'git sparse-checkout' that removes these
tracked-but-sparse directories.
The implementation details provide a clear definition of what is happening,
but it is difficult to describe this without including the internal
implementation details. The core operation converts the index to a sparse
index (in memory if not already on disk) and then deletes any directories in
the worktree that correspond with a sparse directory entry in that sparse
index.
In the most common case, this means that a file will be removed if it is
contained within a directory that is both tracked and outside of the
sparse-checkout definition. However, there can be exceptions depending on
the current state of the index:
* If the worktree has a modification to a tracked, sparse file, then that
file's parent directories will be expanded instead of represented as
sparse directories. Siblings of those parent directories may be
considered sparse.
* If the user staged a sparse file with "git add --sparse", then that file
loses the SKIP_WORKTREE bit until the sparse-checkout is reapplied. Until
then, that file's parent directories are not represented as sparse
directory entries and thus will not be removed. Siblings of those parent
directories may be considered sparse. (There may be other reasons why
the SKIP_WORKTREE bit was removed for a file and this impact on the
sparse directories will apply to those as well.)
* If the user has a merge conflict outside of the sparse-checkout
definition, then those conflict entries prevent the parent directories
from being represented as sparse directory entries and thus are not
removed.
* The cases above present reasons why certain _file conditions_ will impact
which _directories_ are considered sparse. The list of tracked
directories that are outside of the sparse-checkout definition but not
represented as a sparse directory further reduces the list of files that
will be removed.
For these complicated reasons, the documentation details a potential list of
files that will be "considered for removal" instead of defining the list
concretely. The special cases can be handled by resolving conflicts,
committing staged changes, and running 'git sparse-checkout reapply' to
update the SKIP_WORKTREE bits as expected by the sparse-checkout definition.
It is important to make clear that this operation will remove ignored and
excluded files which would normally be ignored even by 'git clean -f' unless
the '-x' or '-X' option is provided. This is the most extreme method for
doing this, but it works when the sparse-checkout is in cone mode and is
expected to rescope based on directories, not files.
The current implementation always deletes these sparse directories
without warning. This is unacceptable for a released version, but those
features will be added in changes coming immediately after this one.
Note that this will not remove an untracked directory (or any of its
contents) if its parent is a tracked directory within the sparse-checkout
definition. This is required to prevent removing data created by tools that
perform caching operations for editors or build tools.
Thus, 'git sparse-checkout clean' is both more aggressive and more careful
than 'git clean -fx':
* It is more aggressive because it will remove _tracked_ files within the
sparse directories.
* It is less aggressive because it will leave _untracked_ files that are
not contained in sparse directories.
These special cases will be handled more explicitly in a future change that
expands tests for the 'git sparse-checkout clean' command. We handle some of
the modified, staged, and committed states including some impact on 'git
status' after cleaning.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'builtin/sparse-checkout.c')
| -rw-r--r-- | builtin/sparse-checkout.c | 64 |
1 files changed, 63 insertions, 1 deletions
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c index 06de61bd9d..f7caa28f3f 100644 --- a/builtin/sparse-checkout.c +++ b/builtin/sparse-checkout.c @@ -2,6 +2,7 @@ #define DISABLE_SIGN_COMPARE_WARNINGS #include "builtin.h" +#include "abspath.h" #include "config.h" #include "dir.h" #include "environment.h" @@ -23,7 +24,7 @@ static const char *empty_base = ""; static char const * const builtin_sparse_checkout_usage[] = { - N_("git sparse-checkout (init | list | set | add | reapply | disable | check-rules) [<options>]"), + N_("git sparse-checkout (init | list | set | add | reapply | disable | check-rules | clean) [<options>]"), NULL }; @@ -924,6 +925,66 @@ static int sparse_checkout_reapply(int argc, const char **argv, return update_working_directory(repo, NULL); } +static char const * const builtin_sparse_checkout_clean_usage[] = { + "git sparse-checkout clean [-n|--dry-run]", + NULL +}; + +static const char *msg_remove = N_("Removing %s\n"); + +static int sparse_checkout_clean(int argc, const char **argv, + const char *prefix, + struct repository *repo) +{ + struct strbuf full_path = STRBUF_INIT; + const char *msg = msg_remove; + size_t worktree_len; + + struct option builtin_sparse_checkout_clean_options[] = { + OPT_END(), + }; + + setup_work_tree(); + if (!core_apply_sparse_checkout) + die(_("must be in a sparse-checkout to clean directories")); + if (!core_sparse_checkout_cone) + die(_("must be in a cone-mode sparse-checkout to clean directories")); + + argc = parse_options(argc, argv, prefix, + builtin_sparse_checkout_clean_options, + builtin_sparse_checkout_clean_usage, 0); + + if (repo_read_index(repo) < 0) + die(_("failed to read index")); + + if (convert_to_sparse(repo->index, SPARSE_INDEX_MEMORY_ONLY) || + repo->index->sparse_index == INDEX_EXPANDED) + die(_("failed to convert index to a sparse index; resolve merge conflicts and try again")); + + strbuf_addstr(&full_path, repo->worktree); + strbuf_addch(&full_path, '/'); + worktree_len = full_path.len; + + for (size_t i = 0; i < repo->index->cache_nr; i++) { + struct cache_entry *ce = repo->index->cache[i]; + if (!S_ISSPARSEDIR(ce->ce_mode)) + continue; + strbuf_setlen(&full_path, worktree_len); + strbuf_add(&full_path, ce->name, ce->ce_namelen); + + if (!is_directory(full_path.buf)) + continue; + + printf(msg, ce->name); + + if (remove_dir_recursively(&full_path, 0)) + warning_errno(_("failed to remove '%s'"), ce->name); + } + + strbuf_release(&full_path); + return 0; +} + static char const * const builtin_sparse_checkout_disable_usage[] = { "git sparse-checkout disable", NULL @@ -1080,6 +1141,7 @@ int cmd_sparse_checkout(int argc, OPT_SUBCOMMAND("set", &fn, sparse_checkout_set), OPT_SUBCOMMAND("add", &fn, sparse_checkout_add), OPT_SUBCOMMAND("reapply", &fn, sparse_checkout_reapply), + OPT_SUBCOMMAND("clean", &fn, sparse_checkout_clean), OPT_SUBCOMMAND("disable", &fn, sparse_checkout_disable), OPT_SUBCOMMAND("check-rules", &fn, sparse_checkout_check_rules), OPT_END(), |
