diff options
68 files changed, 3085 insertions, 1514 deletions
diff --git a/Documentation/BreakingChanges.adoc b/Documentation/BreakingChanges.adoc index 90b53abcea..f814450d2f 100644 --- a/Documentation/BreakingChanges.adoc +++ b/Documentation/BreakingChanges.adoc @@ -295,6 +295,26 @@ The command will be removed. + cf. <xmqqa59i45wc.fsf@gitster.g> +* Support for `core.preferSymlinkRefs=true` has been deprecated and will be + removed in Git 3.0. Writing symbolic refs as symbolic links will be phased + out in favor of using plain files using the textual representation of + symbolic refs. ++ +Symbolic references were initially always stored as a symbolic link. This was +changed in 9b143c6e15 (Teach update-ref about a symbolic ref stored in a +textfile., 2005-09-25), where a new textual symref format was introduced to +store those symbolic refs in a plain file. In 9f0bb90d16 +(core.prefersymlinkrefs: use symlinks for .git/HEAD, 2006-05-02), the Git +project switched the default to use the textual symrefs in favor of symbolic +links. ++ +The migration away from symbolic links has happened almost 20 years ago by now, +and there is no known reason why one should prefer them nowadays. Furthermore, +symbolic links are not supported on some platforms. ++ +Note that only the writing side for such symbolic links is deprecated. Reading +such symbolic links is still supported for now. + == Superseded features that will not be deprecated Some features have gained newer replacements that aim to improve the design in diff --git a/Documentation/MyFirstContribution.adoc b/Documentation/MyFirstContribution.adoc index 02ba8ba5f6..f186dfbc89 100644 --- a/Documentation/MyFirstContribution.adoc +++ b/Documentation/MyFirstContribution.adoc @@ -1153,6 +1153,11 @@ NOTE: When you are sending a real patch, it will go to git@vger.kernel.org - but please don't send your patchset from the tutorial to the real mailing list! For now, you can send it to yourself, to make sure you understand how it will look. +NOTE: After sending your patches, you can confirm that they reached the mailing +list by visiting https://lore.kernel.org/git/. Use the search bar to find your +name or the subject of your patch. If it appears, your email was successfully +delivered. + After you run the command above, you will be presented with an interactive prompt for each patch that's about to go out. This gives you one last chance to edit or quit sending something (but again, don't edit code this way). Once you diff --git a/Documentation/RelNotes/2.52.0.adoc b/Documentation/RelNotes/2.52.0.adoc index bfde4bda3b..ba213c0d6c 100644 --- a/Documentation/RelNotes/2.52.0.adoc +++ b/Documentation/RelNotes/2.52.0.adoc @@ -64,6 +64,16 @@ UI, Workflows & Features * "git fast-import" is taught to handle signed tags, just like it recently learned to handle signed commits, in different ways. + * A new configuration variable commitGraph.changedPaths allows to + turn "--changed-paths" on by default for "git commit-graph". + + * "Symlink symref" has been added to the list of things that will + disappear at Git 3.0 boundary. + + * "git maintenance" command learns the "geometric" strategy where it + avoids doing maintenance tasks that rebuilds everything from + scratch. + Performance, Internal Implementation, Development Support etc. -------------------------------------------------------------- @@ -146,6 +156,15 @@ Performance, Internal Implementation, Development Support etc. * CI improvements to handle the recent Rust integration better. + * The code in "git repack" machinery has been cleaned up to prepare + for incremental update of midx files. + + * Two slightly different ways to get at "all the packfiles" in API + has been cleaned up. + + * The code to walk revision graph to compute merge base has been + optimized. + Fixes since v2.51 ----------------- @@ -356,6 +375,28 @@ including security updates, are included in this release. fail doe to overly long pathname in our test environment, which has been worked around by using "ssh-agent -T". + * strbuf_split*() to split a string into multiple strbufs is often a + wrong API to use. A few uses of it have been removed by + simplifying the code. + (merge 2ab72a16d9 ob/gpg-interface-cleanup later to maint). + + * "git shortlog" knows "--committer" and "--author" options, which + the command line completion (in contrib/) did not handle well, + which has been corrected. + (merge c568fa8e1c kf/log-shortlog-completion-fix later to maint). + + * "git bisect" command did not react correctly to "git bisect help" + and "git bisect unknown", which has been corrected. + (merge 2bb3a012f3 rz/bisect-help-unknown later to maint). + + * The 'q'(uit) command in "git add -p" has been improved to quit + without doing any meaningless work before leaving, and giving EOF + (typically control-D) to the prompt is made to behave the same way. + + * The wildmatch code had a corner case bug that mistakenly makes + "foo**/bar" match with "foobar", which has been corrected. + (merge 1940a02dc1 jk/match-pathname-fix later to maint). + * Other code cleanup, docfix, build fix, etc. (merge 529a60a885 ua/t1517-short-help-tests later to maint). (merge 22d421fed9 ac/deglobal-fmt-merge-log-config later to maint). @@ -367,3 +408,4 @@ including security updates, are included in this release. (merge a66fc22bf9 rs/get-oid-with-flags-cleanup later to maint). (merge 15b8abde07 js/mingw-includes-cleanup later to maint). (merge 2cebca0582 tb/cat-file-objectmode-update later to maint). + (merge 8f487db07a kh/doc-patch-id-1 later to maint). diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index d620bd93bd..e270ccbe85 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -446,6 +446,34 @@ highlighted above. Only capitalize the very first letter of the trailer, i.e. favor "Signed-off-by" over "Signed-Off-By" and "Acked-by:" over "Acked-By". +[[ai]] +=== Use of Artificial Intelligence (AI) + +The Developer's Certificate of Origin requires contributors to certify +that they know the origin of their contributions to the project and +that they have the right to submit it under the project's license. +It's not yet clear that this can be legally satisfied when submitting +significant amount of content that has been generated by AI tools. + +Another issue with AI generated content is that AIs still often +hallucinate or just produce bad code, commit messages, documentation +or output, even when you point out their mistakes. + +To avoid these issues, we will reject anything that looks AI +generated, that sounds overly formal or bloated, that looks like AI +slop, that looks good on the surface but makes no sense, or that +senders don’t understand or cannot explain. + +We strongly recommend using AI tools carefully and responsibly. + +Contributors would often benefit more from AI by using it to guide and +help them step by step towards producing a solution by themselves +rather than by asking for a full solution that they would then mostly +copy-paste. They can also use AI to help with debugging, or with +checking for obvious mistakes, things that can be improved, things +that don’t match our style, guidelines or our feedback, before sending +it to us. + [[git-tools]] === Generate your patch using Git tools out of your commits. diff --git a/Documentation/config/commitgraph.adoc b/Documentation/config/commitgraph.adoc index 7f8c9d6638..70a56c53d2 100644 --- a/Documentation/config/commitgraph.adoc +++ b/Documentation/config/commitgraph.adoc @@ -8,6 +8,17 @@ commitGraph.maxNewFilters:: Specifies the default value for the `--max-new-filters` option of `git commit-graph write` (c.f., linkgit:git-commit-graph[1]). +commitGraph.changedPaths:: + If true, then `git commit-graph write` will compute and write + changed-path Bloom filters by default, equivalent to passing + `--changed-paths`. If false or unset, changed-paths Bloom filters will + be written during `git commit-graph write` only if the filters already + exist in the current commit-graph file. This matches the default + behavior of `git commit-graph write` without any `--[no-]changed-paths` + option. To rewrite a commit-graph file without any filters, use the + `--no-changed-paths` option. Command-line option `--[no-]changed-paths` + always takes precedence over this configuration. Defaults to unset. + commitGraph.readChangedPaths:: Deprecated. Equivalent to commitGraph.changedPathsVersion=-1 if true, and commitGraph.changedPathsVersion=0 if false. (If commitGraph.changedPathVersion diff --git a/Documentation/config/core.adoc b/Documentation/config/core.adoc index e2de270c86..11efad189e 100644 --- a/Documentation/config/core.adoc +++ b/Documentation/config/core.adoc @@ -290,6 +290,9 @@ core.preferSymlinkRefs:: and other symbolic reference files, use symbolic links. This is sometimes needed to work with old scripts that expect HEAD to be a symbolic link. ++ +This configuration is deprecated and will be removed in Git 3.0. Symbolic refs +will always be written as textual symrefs. core.alternateRefsCommand:: When advertising tips of available history from an alternate, use the shell to diff --git a/Documentation/config/maintenance.adoc b/Documentation/config/maintenance.adoc index 2f71934218..d0c38f03fa 100644 --- a/Documentation/config/maintenance.adoc +++ b/Documentation/config/maintenance.adoc @@ -16,19 +16,36 @@ detach. maintenance.strategy:: This string config option provides a way to specify one of a few - recommended schedules for background maintenance. This only affects - which tasks are run during `git maintenance run --schedule=X` - commands, provided no `--task=<task>` arguments are provided. - Further, if a `maintenance.<task>.schedule` config value is set, - then that value is used instead of the one provided by - `maintenance.strategy`. The possible strategy strings are: + recommended strategies for repository maintenance. This affects + which tasks are run during `git maintenance run`, provided no + `--task=<task>` arguments are provided. This setting impacts manual + maintenance, auto-maintenance as well as scheduled maintenance. The + tasks that run may be different depending on the maintenance type. + -* `none`: This default setting implies no tasks are run at any schedule. +The maintenance strategy can be further tweaked by setting +`maintenance.<task>.enabled` and `maintenance.<task>.schedule`. If set, these +values are used instead of the defaults provided by `maintenance.strategy`. ++ +The possible strategies are: ++ +* `none`: This strategy implies no tasks are run at all. This is the default + strategy for scheduled maintenance. +* `gc`: This strategy runs the `gc` task. This is the default strategy for + manual maintenance. +* `geometric`: This strategy performs geometric repacking of packfiles and + keeps auxiliary data structures up-to-date. The strategy expires data in the + reflog and removes worktrees that cannot be located anymore. When the + geometric repacking strategy would decide to do an all-into-one repack, then + the strategy generates a cruft pack for all unreachable objects. Objects that + are already part of a cruft pack will be expired. ++ +This repacking strategy is a full replacement for the `gc` strategy and is +recommended for large repositories. * `incremental`: This setting optimizes for performing small maintenance activities that do not delete any data. This does not schedule the `gc` task, but runs the `prefetch` and `commit-graph` tasks hourly, the `loose-objects` and `incremental-repack` tasks daily, and the `pack-refs` - task weekly. + task weekly. Manual repository maintenance uses the `gc` task. maintenance.<task>.enabled:: This boolean config option controls whether the maintenance task @@ -75,6 +92,22 @@ maintenance.incremental-repack.auto:: number of pack-files not in the multi-pack-index is at least the value of `maintenance.incremental-repack.auto`. The default value is 10. +maintenance.geometric-repack.auto:: + This integer config option controls how often the `geometric-repack` + task should be run as part of `git maintenance run --auto`. If zero, + then the `geometric-repack` task will not run with the `--auto` + option. A negative value will force the task to run every time. + Otherwise, a positive value implies the command should run either when + there are packfiles that need to be merged together to retain the + geometric progression, or when there are at least this many loose + objects that would be written into a new packfile. The default value is + 100. + +maintenance.geometric-repack.splitFactor:: + This integer config option controls the factor used for the geometric + sequence. See the `--geometric=` option in linkgit:git-repack[1] for + more details. Defaults to `2`. + maintenance.reflog-expire.auto:: This integer config option controls how often the `reflog-expire` task should be run as part of `git maintenance run --auto`. If zero, then diff --git a/Documentation/git-commit-graph.adoc b/Documentation/git-commit-graph.adoc index e9558173c0..6d19026035 100644 --- a/Documentation/git-commit-graph.adoc +++ b/Documentation/git-commit-graph.adoc @@ -71,7 +71,7 @@ take a while on large repositories. It provides significant performance gains for getting history of a directory or a file with `git log -- <path>`. If this option is given, future commit-graph writes will automatically assume that this option was intended. Use `--no-changed-paths` to stop storing this -data. +data. `--changed-paths` is implied by config `commitGraph.changedPaths=true`. + With the `--max-new-filters=<n>` option, generate at most `n` new Bloom filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index 45da0f27ac..92a1af36a2 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -7,8 +7,8 @@ git-patch-id - Compute unique ID for a patch SYNOPSIS -------- -[verse] -'git patch-id' [--stable | --unstable | --verbatim] +[synopsis] +git patch-id [--stable | --unstable | --verbatim] DESCRIPTION ----------- @@ -21,7 +21,7 @@ the same time also reasonably unique, i.e., two patches that have the same The main usecase for this command is to look for likely duplicate commits. -When dealing with 'git diff-tree' output, it takes advantage of +When dealing with `git diff-tree` output, it takes advantage of the fact that the patch is prefixed with the object name of the commit, and outputs two 40-byte hexadecimal strings. The first string is the patch ID, and the second string is the commit ID. @@ -30,35 +30,35 @@ This can be used to make a mapping from patch ID to commit ID. OPTIONS ------- ---verbatim:: +`--verbatim`:: Calculate the patch-id of the input as it is given, do not strip any whitespace. + -This is the default if patchid.verbatim is true. +This is the default if `patchid.verbatim` is `true`. ---stable:: +`--stable`:: Use a "stable" sum of hashes as the patch ID. With this option: + -- - Reordering file diffs that make up a patch does not affect the ID. In particular, two patches produced by comparing the same two trees - with two different settings for "-O<orderfile>" result in the same + with two different settings for `-O<orderfile>` result in the same patch ID signature, thereby allowing the computed result to be used as a key to index some meta-information about the change between the two trees; - Result is different from the value produced by git 1.9 and older - or produced when an "unstable" hash (see --unstable below) is + or produced when an "unstable" hash (see `--unstable` below) is configured - even when used on a diff output taken without any use - of "-O<orderfile>", thereby making existing databases storing such + of `-O<orderfile>`, thereby making existing databases storing such "unstable" or historical patch-ids unusable. - All whitespace within the patch is ignored and does not affect the id. -- + -This is the default if patchid.stable is set to true. +This is the default if `patchid.stable` is set to `true`. ---unstable:: +`--unstable`:: Use an "unstable" hash as the patch ID. With this option, the result produced is compatible with the patch-id value produced by git 1.9 and older and whitespace is ignored. Users with pre-existing diff --git a/Documentation/git-repo.adoc b/Documentation/git-repo.adoc index 209afd1b61..ce43cb19c8 100644 --- a/Documentation/git-repo.adoc +++ b/Documentation/git-repo.adoc @@ -9,6 +9,7 @@ SYNOPSIS -------- [synopsis] git repo info [--format=(keyvalue|nul)] [-z] [<key>...] +git repo structure [--format=(table|keyvalue|nul)] DESCRIPTION ----------- @@ -43,6 +44,35 @@ supported: + `-z` is an alias for `--format=nul`. +`structure [--format=(table|keyvalue|nul)]`:: + Retrieve statistics about the current repository structure. The + following kinds of information are reported: ++ +* Reference counts categorized by type +* Reachable object counts categorized by type + ++ +The output format can be chosen through the flag `--format`. Three formats are +supported: ++ +`table`::: + Outputs repository stats in a human-friendly table. This format may + change and is not intended for machine parsing. This is the default + format. + +`keyvalue`::: + Each line of output contains a key-value pair for a repository stat. + The '=' character is used to delimit between the key and the value. + Values containing "unusual" characters are quoted as explained for the + configuration variable `core.quotePath` (see linkgit:git-config[1]). + +`nul`::: + Similar to `keyvalue`, but uses a NUL character to delimit between + key-value pairs instead of a newline. Also uses a newline character as + the delimiter between the key and value instead of '='. Unlike the + `keyvalue` format, values containing "unusual" characters are never + quoted. + INFO KEYS --------- In order to obtain a set of values from `git repo info`, you should provide diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index b16db85e77..c43f33d889 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,6 +1,6 @@ #!/bin/sh -DEF_VER=v2.51.GIT +DEF_VER=v2.52.0-rc0 LF=' ' @@ -1267,6 +1267,12 @@ LIB_OBJS += reftable/table.o LIB_OBJS += reftable/tree.o LIB_OBJS += reftable/writer.o LIB_OBJS += remote.o +LIB_OBJS += repack.o +LIB_OBJS += repack-cruft.o +LIB_OBJS += repack-filtered.o +LIB_OBJS += repack-geometry.o +LIB_OBJS += repack-midx.o +LIB_OBJS += repack-promisor.o LIB_OBJS += replace-object.o LIB_OBJS += repo-settings.o LIB_OBJS += repository.o diff --git a/add-patch.c b/add-patch.c index ae9a20d8f2..173a53241e 100644 --- a/add-patch.c +++ b/add-patch.c @@ -1569,8 +1569,10 @@ static int patch_update_file(struct add_p_state *s, if (*s->s.reset_color_interactive) fputs(s->s.reset_color_interactive, stdout); fflush(stdout); - if (read_single_character(s) == EOF) + if (read_single_character(s) == EOF) { + quit = 1; break; + } if (!s->answer.len) continue; @@ -1601,7 +1603,7 @@ soft_increment: } else if (hunk->use == UNDECIDED_HUNK) { hunk->use = USE_HUNK; } - } else if (ch == 'd' || ch == 'q') { + } else if (ch == 'd') { if (file_diff->hunk_nr) { for (; hunk_index < file_diff->hunk_nr; hunk_index++) { hunk = file_diff->hunk + hunk_index; @@ -1613,10 +1615,9 @@ soft_increment: } else if (hunk->use == UNDECIDED_HUNK) { hunk->use = SKIP_HUNK; } - if (ch == 'q') { - quit = 1; - break; - } + } else if (ch == 'q') { + quit = 1; + break; } else if (s->answer.buf[0] == 'K') { if (permitted & ALLOW_GOTO_PREVIOUS_HUNK) hunk_index = dec_mod(hunk_index, diff --git a/builtin/bisect.c b/builtin/bisect.c index 8b8d870cd1..993caf545d 100644 --- a/builtin/bisect.c +++ b/builtin/bisect.c @@ -1453,9 +1453,13 @@ int cmd_bisect(int argc, if (!argc) usage_msg_opt(_("need a command"), git_bisect_usage, options); + if (!strcmp(argv[0], "help")) + usage_with_options(git_bisect_usage, options); + set_terms(&terms, "bad", "good"); get_terms(&terms); - if (check_and_set_terms(&terms, argv[0])) + if (check_and_set_terms(&terms, argv[0]) || + !one_of(argv[0], terms.term_good, terms.term_bad, NULL)) usage_msg_optf(_("unknown command: '%s'"), git_bisect_usage, options, argv[0]); res = bisect_state(&terms, argc, argv); diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 5ca2ca3852..983ecec837 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -852,10 +852,9 @@ static void batch_each_object(struct batch_options *opt, if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter, batch_one_object_bitmapped, &payload)) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *pack; - for (pack = packfile_store_get_all_packs(packs); pack; pack = pack->next) { + repo_for_each_pack(the_repository, pack) { if (bitmap_index_contains_pack(bitmap, pack) || open_pack_index(pack)) continue; diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c index fe3ebaadad..d62005edc0 100644 --- a/builtin/commit-graph.c +++ b/builtin/commit-graph.c @@ -210,6 +210,8 @@ static int git_commit_graph_write_config(const char *var, const char *value, { if (!strcmp(var, "commitgraph.maxnewfilters")) write_opts.max_new_filters = git_config_int(var, value, ctx->kvi); + else if (!strcmp(var, "commitgraph.changedpaths")) + opts.enable_changed_paths = git_config_bool(var, value) ? 1 : -1; /* * No need to fall-back to 'git_default_config', since this was already * called in 'cmd_commit_graph()'. diff --git a/builtin/count-objects.c b/builtin/count-objects.c index f2f407c2a7..18f6e33b6f 100644 --- a/builtin/count-objects.c +++ b/builtin/count-objects.c @@ -122,7 +122,6 @@ int cmd_count_objects(int argc, count_loose, count_cruft, NULL, NULL); if (verbose) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; unsigned long num_pack = 0; off_t size_pack = 0; @@ -130,7 +129,7 @@ int cmd_count_objects(int argc, struct strbuf pack_buf = STRBUF_INIT; struct strbuf garbage_buf = STRBUF_INIT; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (!p->pack_local) continue; if (open_pack_index(p)) diff --git a/builtin/fast-import.c b/builtin/fast-import.c index 8714edfc65..54d3e592c6 100644 --- a/builtin/fast-import.c +++ b/builtin/fast-import.c @@ -979,7 +979,7 @@ static int store_object( if (e->idx.offset) { duplicate_count_by_type[type]++; return 1; - } else if (find_oid_pack(&oid, packfile_store_get_all_packs(packs))) { + } else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) { e->type = type; e->pack_id = MAX_PACK_ID; e->idx.offset = 1; /* just not zero! */ @@ -1180,7 +1180,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark) duplicate_count_by_type[OBJ_BLOB]++; truncate_pack(&checkpoint); - } else if (find_oid_pack(&oid, packfile_store_get_all_packs(packs))) { + } else if (find_oid_pack(&oid, packfile_store_get_packs(packs))) { e->type = OBJ_BLOB; e->pack_id = MAX_PACK_ID; e->idx.offset = 1; /* just not zero! */ diff --git a/builtin/fsck.c b/builtin/fsck.c index 8ee95e0d67..b1a650c673 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -867,20 +867,20 @@ static int mark_packed_for_connectivity(const struct object_id *oid, static int check_pack_rev_indexes(struct repository *r, int show_progress) { - struct packfile_store *packs = r->objects->packfiles; struct progress *progress = NULL; + struct packed_git *p; uint32_t pack_count = 0; int res = 0; if (show_progress) { - for (struct packed_git *p = packfile_store_get_all_packs(packs); p; p = p->next) + repo_for_each_pack(r, p) pack_count++; progress = start_delayed_progress(the_repository, "Verifying reverse pack-indexes", pack_count); pack_count = 0; } - for (struct packed_git *p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(r, p) { int load_error = load_pack_revindex_from_disk(p); if (load_error < 0) { @@ -1000,8 +1000,6 @@ int cmd_fsck(int argc, for_each_packed_object(the_repository, mark_packed_for_connectivity, NULL, 0); } else { - struct packfile_store *packs = the_repository->objects->packfiles; - odb_prepare_alternates(the_repository->objects); for (source = the_repository->objects->sources; source; source = source->next) fsck_source(source); @@ -1012,8 +1010,7 @@ int cmd_fsck(int argc, struct progress *progress = NULL; if (show_progress) { - for (p = packfile_store_get_all_packs(packs); p; - p = p->next) { + repo_for_each_pack(the_repository, p) { if (open_pack_index(p)) continue; total += p->num_objects; @@ -1022,8 +1019,8 @@ int cmd_fsck(int argc, progress = start_progress(the_repository, _("Checking objects"), total); } - for (p = packfile_store_get_all_packs(packs); p; - p = p->next) { + + repo_for_each_pack(the_repository, p) { /* verify gives error messages itself */ if (verify_pack(the_repository, p, fsck_obj_buffer, diff --git a/builtin/gc.c b/builtin/gc.c index e19e13d978..d212cbb9b8 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -34,6 +34,7 @@ #include "pack-objects.h" #include "path.h" #include "reflog.h" +#include "repack.h" #include "rerere.h" #include "blob.h" #include "tree.h" @@ -55,7 +56,6 @@ static const char * const builtin_gc_usage[] = { }; static timestamp_t gc_log_expire_time; -static struct strvec repack = STRVEC_INIT; static struct tempfile *pidfile; static struct lock_file log_lock; static struct string_list pack_garbage = STRING_LIST_INIT_DUP; @@ -255,6 +255,7 @@ enum maintenance_task_label { TASK_PREFETCH, TASK_LOOSE_OBJECTS, TASK_INCREMENTAL_REPACK, + TASK_GEOMETRIC_REPACK, TASK_GC, TASK_COMMIT_GRAPH, TASK_PACK_REFS, @@ -448,7 +449,7 @@ out: return should_gc; } -static int too_many_loose_objects(struct gc_config *cfg) +static int too_many_loose_objects(int limit) { /* * Quickly check if a "gc" is needed, by estimating how @@ -470,7 +471,7 @@ static int too_many_loose_objects(struct gc_config *cfg) if (!dir) return 0; - auto_threshold = DIV_ROUND_UP(cfg->gc_auto_threshold, 256); + auto_threshold = DIV_ROUND_UP(limit, 256); while ((ent = readdir(dir)) != NULL) { if (strspn(ent->d_name, "0123456789abcdef") != hexsz_loose || ent->d_name[hexsz_loose] != '\0') @@ -487,10 +488,9 @@ static int too_many_loose_objects(struct gc_config *cfg) static struct packed_git *find_base_packs(struct string_list *packs, unsigned long limit) { - struct packfile_store *packfiles = the_repository->objects->packfiles; struct packed_git *p, *base = NULL; - for (p = packfile_store_get_all_packs(packfiles); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (!p->pack_local || p->is_cruft) continue; if (limit) { @@ -509,14 +509,13 @@ static struct packed_git *find_base_packs(struct string_list *packs, static int too_many_packs(struct gc_config *cfg) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; - int cnt; + int cnt = 0; if (cfg->gc_auto_pack_limit <= 0) return 0; - for (cnt = 0, p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (!p->pack_local) continue; if (p->pack_keep) @@ -618,48 +617,50 @@ static uint64_t estimate_repack_memory(struct gc_config *cfg, return os_cache + heap; } -static int keep_one_pack(struct string_list_item *item, void *data UNUSED) +static int keep_one_pack(struct string_list_item *item, void *data) { - strvec_pushf(&repack, "--keep-pack=%s", basename(item->string)); + struct strvec *args = data; + strvec_pushf(args, "--keep-pack=%s", basename(item->string)); return 0; } static void add_repack_all_option(struct gc_config *cfg, - struct string_list *keep_pack) + struct string_list *keep_pack, + struct strvec *args) { if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now") && !(cfg->cruft_packs && cfg->repack_expire_to)) - strvec_push(&repack, "-a"); + strvec_push(args, "-a"); else if (cfg->cruft_packs) { - strvec_push(&repack, "--cruft"); + strvec_push(args, "--cruft"); if (cfg->prune_expire) - strvec_pushf(&repack, "--cruft-expiration=%s", cfg->prune_expire); + strvec_pushf(args, "--cruft-expiration=%s", cfg->prune_expire); if (cfg->max_cruft_size) - strvec_pushf(&repack, "--max-cruft-size=%lu", + strvec_pushf(args, "--max-cruft-size=%lu", cfg->max_cruft_size); if (cfg->repack_expire_to) - strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to); + strvec_pushf(args, "--expire-to=%s", cfg->repack_expire_to); } else { - strvec_push(&repack, "-A"); + strvec_push(args, "-A"); if (cfg->prune_expire) - strvec_pushf(&repack, "--unpack-unreachable=%s", cfg->prune_expire); + strvec_pushf(args, "--unpack-unreachable=%s", cfg->prune_expire); } if (keep_pack) - for_each_string_list(keep_pack, keep_one_pack, NULL); + for_each_string_list(keep_pack, keep_one_pack, args); if (cfg->repack_filter && *cfg->repack_filter) - strvec_pushf(&repack, "--filter=%s", cfg->repack_filter); + strvec_pushf(args, "--filter=%s", cfg->repack_filter); if (cfg->repack_filter_to && *cfg->repack_filter_to) - strvec_pushf(&repack, "--filter-to=%s", cfg->repack_filter_to); + strvec_pushf(args, "--filter-to=%s", cfg->repack_filter_to); } -static void add_repack_incremental_option(void) +static void add_repack_incremental_option(struct strvec *args) { - strvec_push(&repack, "--no-write-bitmap-index"); + strvec_push(args, "--no-write-bitmap-index"); } -static int need_to_gc(struct gc_config *cfg) +static int need_to_gc(struct gc_config *cfg, struct strvec *repack_args) { /* * Setting gc.auto to 0 or negative can disable the @@ -700,10 +701,10 @@ static int need_to_gc(struct gc_config *cfg) string_list_clear(&keep_pack, 0); } - add_repack_all_option(cfg, &keep_pack); + add_repack_all_option(cfg, &keep_pack, repack_args); string_list_clear(&keep_pack, 0); - } else if (too_many_loose_objects(cfg)) - add_repack_incremental_option(); + } else if (too_many_loose_objects(cfg->gc_auto_threshold)) + add_repack_incremental_option(repack_args); else return 0; @@ -852,6 +853,7 @@ int cmd_gc(int argc, int keep_largest_pack = -1; int skip_foreground_tasks = 0; timestamp_t dummy; + struct strvec repack_args = STRVEC_INIT; struct maintenance_run_opts opts = MAINTENANCE_RUN_OPTS_INIT; struct gc_config cfg = GC_CONFIG_INIT; const char *prune_expire_sentinel = "sentinel"; @@ -891,7 +893,7 @@ int cmd_gc(int argc, show_usage_with_options_if_asked(argc, argv, builtin_gc_usage, builtin_gc_options); - strvec_pushl(&repack, "repack", "-d", "-l", NULL); + strvec_pushl(&repack_args, "repack", "-d", "-l", NULL); gc_config(&cfg); @@ -914,14 +916,14 @@ int cmd_gc(int argc, die(_("failed to parse prune expiry value %s"), cfg.prune_expire); if (aggressive) { - strvec_push(&repack, "-f"); + strvec_push(&repack_args, "-f"); if (cfg.aggressive_depth > 0) - strvec_pushf(&repack, "--depth=%d", cfg.aggressive_depth); + strvec_pushf(&repack_args, "--depth=%d", cfg.aggressive_depth); if (cfg.aggressive_window > 0) - strvec_pushf(&repack, "--window=%d", cfg.aggressive_window); + strvec_pushf(&repack_args, "--window=%d", cfg.aggressive_window); } if (opts.quiet) - strvec_push(&repack, "-q"); + strvec_push(&repack_args, "-q"); if (opts.auto_flag) { if (cfg.detach_auto && opts.detach < 0) @@ -930,7 +932,7 @@ int cmd_gc(int argc, /* * Auto-gc should be least intrusive as possible. */ - if (!need_to_gc(&cfg)) { + if (!need_to_gc(&cfg, &repack_args)) { ret = 0; goto out; } @@ -952,7 +954,7 @@ int cmd_gc(int argc, find_base_packs(&keep_pack, cfg.big_pack_threshold); } - add_repack_all_option(&cfg, &keep_pack); + add_repack_all_option(&cfg, &keep_pack, &repack_args); string_list_clear(&keep_pack, 0); } @@ -1014,9 +1016,9 @@ int cmd_gc(int argc, repack_cmd.git_cmd = 1; repack_cmd.close_object_store = 1; - strvec_pushv(&repack_cmd.args, repack.v); + strvec_pushv(&repack_cmd.args, repack_args.v); if (run_command(&repack_cmd)) - die(FAILED_RUN, repack.v[0]); + die(FAILED_RUN, repack_args.v[0]); if (cfg.prune_expire) { struct child_process prune_cmd = CHILD_PROCESS_INIT; @@ -1055,7 +1057,7 @@ int cmd_gc(int argc, !opts.quiet && !daemonized ? COMMIT_GRAPH_WRITE_PROGRESS : 0, NULL); - if (opts.auto_flag && too_many_loose_objects(&cfg)) + if (opts.auto_flag && too_many_loose_objects(cfg.gc_auto_threshold)) warning(_("There are too many unreachable loose objects; " "run 'git prune' to remove them.")); @@ -1067,6 +1069,7 @@ int cmd_gc(int argc, out: maintenance_run_opts_release(&opts); + strvec_clear(&repack_args); gc_config_release(&cfg); return 0; } @@ -1269,6 +1272,19 @@ static int maintenance_task_gc_background(struct maintenance_run_opts *opts, return run_command(&child); } +static int gc_condition(struct gc_config *cfg) +{ + /* + * Note that it's fine to drop the repack arguments here, as we execute + * git-gc(1) as a separate child process anyway. So it knows to compute + * these arguments again. + */ + struct strvec repack_args = STRVEC_INIT; + int ret = need_to_gc(cfg, &repack_args); + strvec_clear(&repack_args); + return ret; +} + static int prune_packed(struct maintenance_run_opts *opts) { struct child_process child = CHILD_PROCESS_INIT; @@ -1425,9 +1441,9 @@ static int incremental_repack_auto_condition(struct gc_config *cfg UNUSED) if (incremental_repack_auto_limit < 0) return 1; - for (p = packfile_store_get_packs(the_repository->objects->packfiles); - count < incremental_repack_auto_limit && p; - p = p->next) { + repo_for_each_pack(the_repository, p) { + if (count >= incremental_repack_auto_limit) + break; if (!p->multi_pack_index) count++; } @@ -1494,7 +1510,7 @@ static off_t get_auto_pack_size(void) struct repository *r = the_repository; odb_reprepare(r->objects); - for (p = packfile_store_get_all_packs(r->objects->packfiles); p; p = p->next) { + repo_for_each_pack(r, p) { if (p->pack_size > max_size) { second_largest_size = max_size; max_size = p->pack_size; @@ -1550,6 +1566,108 @@ static int maintenance_task_incremental_repack(struct maintenance_run_opts *opts return 0; } +static int maintenance_task_geometric_repack(struct maintenance_run_opts *opts, + struct gc_config *cfg) +{ + struct pack_geometry geometry = { + .split_factor = 2, + }; + struct pack_objects_args po_args = { + .local = 1, + }; + struct existing_packs existing_packs = EXISTING_PACKS_INIT; + struct string_list kept_packs = STRING_LIST_INIT_DUP; + struct child_process child = CHILD_PROCESS_INIT; + int ret; + + repo_config_get_int(the_repository, "maintenance.geometric-repack.splitFactor", + &geometry.split_factor); + + existing_packs.repo = the_repository; + existing_packs_collect(&existing_packs, &kept_packs); + pack_geometry_init(&geometry, &existing_packs, &po_args); + pack_geometry_split(&geometry); + + child.git_cmd = 1; + + strvec_pushl(&child.args, "repack", "-d", "-l", NULL); + if (geometry.split < geometry.pack_nr) + strvec_pushf(&child.args, "--geometric=%d", + geometry.split_factor); + else + add_repack_all_option(cfg, NULL, &child.args); + if (opts->quiet) + strvec_push(&child.args, "--quiet"); + if (the_repository->settings.core_multi_pack_index) + strvec_push(&child.args, "--write-midx"); + + if (run_command(&child)) { + ret = error(_("failed to perform geometric repack")); + goto out; + } + + ret = 0; + +out: + existing_packs_release(&existing_packs); + pack_geometry_release(&geometry); + return ret; +} + +static int geometric_repack_auto_condition(struct gc_config *cfg UNUSED) +{ + struct pack_geometry geometry = { + .split_factor = 2, + }; + struct pack_objects_args po_args = { + .local = 1, + }; + struct existing_packs existing_packs = EXISTING_PACKS_INIT; + struct string_list kept_packs = STRING_LIST_INIT_DUP; + int auto_value = 100; + int ret; + + repo_config_get_int(the_repository, "maintenance.geometric-repack.auto", + &auto_value); + if (!auto_value) + return 0; + if (auto_value < 0) + return 1; + + repo_config_get_int(the_repository, "maintenance.geometric-repack.splitFactor", + &geometry.split_factor); + + existing_packs.repo = the_repository; + existing_packs_collect(&existing_packs, &kept_packs); + pack_geometry_init(&geometry, &existing_packs, &po_args); + pack_geometry_split(&geometry); + + /* + * When we'd merge at least two packs with one another we always + * perform the repack. + */ + if (geometry.split) { + ret = 1; + goto out; + } + + /* + * Otherwise, we estimate the number of loose objects to determine + * whether we want to create a new packfile or not. + */ + if (too_many_loose_objects(auto_value)) { + ret = 1; + goto out; + } + + ret = 0; + +out: + existing_packs_release(&existing_packs); + pack_geometry_release(&geometry); + return ret; +} + typedef int (*maintenance_task_fn)(struct maintenance_run_opts *opts, struct gc_config *cfg); typedef int (*maintenance_auto_fn)(struct gc_config *cfg); @@ -1592,11 +1710,16 @@ static const struct maintenance_task tasks[] = { .background = maintenance_task_incremental_repack, .auto_condition = incremental_repack_auto_condition, }, + [TASK_GEOMETRIC_REPACK] = { + .name = "geometric-repack", + .background = maintenance_task_geometric_repack, + .auto_condition = geometric_repack_auto_condition, + }, [TASK_GC] = { .name = "gc", .foreground = maintenance_task_gc_foreground, .background = maintenance_task_gc_background, - .auto_condition = need_to_gc, + .auto_condition = gc_condition, }, [TASK_COMMIT_GRAPH] = { .name = "commit-graph", @@ -1702,39 +1825,116 @@ static int maintenance_run_tasks(struct maintenance_run_opts *opts, return result; } +enum maintenance_type { + /* As invoked via `git maintenance run --schedule=`. */ + MAINTENANCE_TYPE_SCHEDULED = (1 << 0), + /* As invoked via `git maintenance run` and with `--auto`. */ + MAINTENANCE_TYPE_MANUAL = (1 << 1), +}; + struct maintenance_strategy { struct { - int enabled; + unsigned type; enum schedule_priority schedule; } tasks[TASK__COUNT]; }; static const struct maintenance_strategy none_strategy = { 0 }; -static const struct maintenance_strategy default_strategy = { + +static const struct maintenance_strategy gc_strategy = { .tasks = { - [TASK_GC].enabled = 1, + [TASK_GC] = { + .type = MAINTENANCE_TYPE_MANUAL | MAINTENANCE_TYPE_SCHEDULED, + .schedule = SCHEDULE_DAILY, + }, }, }; + static const struct maintenance_strategy incremental_strategy = { .tasks = { - [TASK_COMMIT_GRAPH].enabled = 1, - [TASK_COMMIT_GRAPH].schedule = SCHEDULE_HOURLY, - [TASK_PREFETCH].enabled = 1, - [TASK_PREFETCH].schedule = SCHEDULE_HOURLY, - [TASK_INCREMENTAL_REPACK].enabled = 1, - [TASK_INCREMENTAL_REPACK].schedule = SCHEDULE_DAILY, - [TASK_LOOSE_OBJECTS].enabled = 1, - [TASK_LOOSE_OBJECTS].schedule = SCHEDULE_DAILY, - [TASK_PACK_REFS].enabled = 1, - [TASK_PACK_REFS].schedule = SCHEDULE_WEEKLY, + [TASK_COMMIT_GRAPH] = { + .type = MAINTENANCE_TYPE_SCHEDULED, + .schedule = SCHEDULE_HOURLY, + }, + [TASK_PREFETCH] = { + .type = MAINTENANCE_TYPE_SCHEDULED, + .schedule = SCHEDULE_HOURLY, + }, + [TASK_INCREMENTAL_REPACK] = { + .type = MAINTENANCE_TYPE_SCHEDULED, + .schedule = SCHEDULE_DAILY, + }, + [TASK_LOOSE_OBJECTS] = { + .type = MAINTENANCE_TYPE_SCHEDULED, + .schedule = SCHEDULE_DAILY, + }, + [TASK_PACK_REFS] = { + .type = MAINTENANCE_TYPE_SCHEDULED, + .schedule = SCHEDULE_WEEKLY, + }, + /* + * Historically, the "incremental" strategy was only available + * in the context of scheduled maintenance when set up via + * "maintenance.strategy". We have later expanded that config + * to also cover manual maintenance. + * + * To retain backwards compatibility with the previous status + * quo we thus run git-gc(1) in case manual maintenance was + * requested. This is the same as the default strategy, which + * would have been in use beforehand. + */ + [TASK_GC] = { + .type = MAINTENANCE_TYPE_MANUAL, + }, + }, +}; + +static const struct maintenance_strategy geometric_strategy = { + .tasks = { + [TASK_COMMIT_GRAPH] = { + .type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL, + .schedule = SCHEDULE_HOURLY, + }, + [TASK_GEOMETRIC_REPACK] = { + .type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL, + .schedule = SCHEDULE_DAILY, + }, + [TASK_PACK_REFS] = { + .type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL, + .schedule = SCHEDULE_DAILY, + }, + [TASK_RERERE_GC] = { + .type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL, + .schedule = SCHEDULE_WEEKLY, + }, + [TASK_REFLOG_EXPIRE] = { + .type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL, + .schedule = SCHEDULE_WEEKLY, + }, + [TASK_WORKTREE_PRUNE] = { + .type = MAINTENANCE_TYPE_SCHEDULED | MAINTENANCE_TYPE_MANUAL, + .schedule = SCHEDULE_WEEKLY, + }, }, }; +static struct maintenance_strategy parse_maintenance_strategy(const char *name) +{ + if (!strcasecmp(name, "incremental")) + return incremental_strategy; + if (!strcasecmp(name, "gc")) + return gc_strategy; + if (!strcasecmp(name, "geometric")) + return geometric_strategy; + die(_("unknown maintenance strategy: '%s'"), name); +} + static void initialize_task_config(struct maintenance_run_opts *opts, const struct string_list *selected_tasks) { struct strbuf config_name = STRBUF_INIT; struct maintenance_strategy strategy; + enum maintenance_type type; const char *config_str; /* @@ -1762,19 +1962,20 @@ static void initialize_task_config(struct maintenance_run_opts *opts, * - Unscheduled maintenance uses our default strategy. * * Both of these are affected by the gitconfig though, which may - * override specific aspects of our strategy. + * override specific aspects of our strategy. Furthermore, both + * strategies can be overridden by setting "maintenance.strategy". */ if (opts->schedule) { strategy = none_strategy; - - if (!repo_config_get_string_tmp(the_repository, "maintenance.strategy", &config_str)) { - if (!strcasecmp(config_str, "incremental")) - strategy = incremental_strategy; - } + type = MAINTENANCE_TYPE_SCHEDULED; } else { - strategy = default_strategy; + strategy = gc_strategy; + type = MAINTENANCE_TYPE_MANUAL; } + if (!repo_config_get_string_tmp(the_repository, "maintenance.strategy", &config_str)) + strategy = parse_maintenance_strategy(config_str); + for (size_t i = 0; i < TASK__COUNT; i++) { int config_value; @@ -1782,8 +1983,8 @@ static void initialize_task_config(struct maintenance_run_opts *opts, strbuf_addf(&config_name, "maintenance.%s.enabled", tasks[i].name); if (!repo_config_get_bool(the_repository, config_name.buf, &config_value)) - strategy.tasks[i].enabled = config_value; - if (!strategy.tasks[i].enabled) + strategy.tasks[i].type = config_value ? type : 0; + if (!(strategy.tasks[i].type & type)) continue; if (opts->schedule) { diff --git a/builtin/grep.c b/builtin/grep.c index 13841fbf00..53cccf2d25 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -1214,7 +1214,7 @@ int cmd_grep(int argc, if (recurse_submodules) repo_read_gitmodules(the_repository, 1); if (startup_info->have_repository) - (void)packfile_store_get_packs(the_repository->objects->packfiles); + packfile_store_prepare(the_repository->objects->packfiles); start_threads(&opt); } else { diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 5bdc44fb2d..b5454e5df1 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3831,12 +3831,10 @@ static int pack_mtime_cmp(const void *_a, const void *_b) static void read_packs_list_from_stdin(struct rev_info *revs) { - struct packfile_store *packs = the_repository->objects->packfiles; struct strbuf buf = STRBUF_INIT; struct string_list include_packs = STRING_LIST_INIT_DUP; struct string_list exclude_packs = STRING_LIST_INIT_DUP; struct string_list_item *item = NULL; - struct packed_git *p; while (strbuf_getline(&buf, stdin) != EOF) { @@ -3856,7 +3854,7 @@ static void read_packs_list_from_stdin(struct rev_info *revs) string_list_sort(&exclude_packs); string_list_remove_duplicates(&exclude_packs, 0); - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { const char *pack_name = pack_basename(p); if ((item = string_list_lookup(&include_packs, pack_name))) @@ -4077,7 +4075,6 @@ static void enumerate_cruft_objects(void) static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; struct rev_info revs; int ret; @@ -4107,7 +4104,7 @@ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs * Re-mark only the fresh packs as kept so that objects in * unknown packs do not halt the reachability traversal early. */ - for (p = packfile_store_get_all_packs(packs); p; p = p->next) + repo_for_each_pack(the_repository, p) p->pack_keep_in_core = 0; mark_pack_kept_in_core(fresh_packs, 1); @@ -4124,7 +4121,6 @@ static void enumerate_and_traverse_cruft_objects(struct string_list *fresh_packs static void read_cruft_objects(void) { - struct packfile_store *packs = the_repository->objects->packfiles; struct strbuf buf = STRBUF_INIT; struct string_list discard_packs = STRING_LIST_INIT_DUP; struct string_list fresh_packs = STRING_LIST_INIT_DUP; @@ -4145,7 +4141,7 @@ static void read_cruft_objects(void) string_list_sort(&discard_packs); string_list_sort(&fresh_packs); - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { const char *pack_name = pack_basename(p); struct string_list_item *item; @@ -4398,7 +4394,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid) struct packed_git *p; p = (last_found != (void *)1) ? last_found : - packfile_store_get_all_packs(packs); + packfile_store_get_packs(packs); while (p) { if ((!p->pack_local || p->pack_keep || @@ -4408,7 +4404,7 @@ static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid) return 1; } if (p == last_found) - p = packfile_store_get_all_packs(packs); + p = packfile_store_get_packs(packs); else p = p->next; if (p == last_found) @@ -4440,13 +4436,12 @@ static int loosened_object_can_be_discarded(const struct object_id *oid, static void loosen_unused_packed_objects(void) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; uint32_t i; uint32_t loosened_objects_nr = 0; struct object_id oid; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (!p->pack_local || p->pack_keep || p->pack_keep_in_core) continue; @@ -4747,13 +4742,12 @@ static void get_object_list(struct rev_info *revs, struct strvec *argv) static void add_extra_kept_packs(const struct string_list *names) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; if (!names->nr) return; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { const char *name = basename(p->pack_name); int i; @@ -5191,10 +5185,9 @@ int cmd_pack_objects(int argc, add_extra_kept_packs(&keep_pack_list); if (ignore_packed_keep_on_disk) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) + repo_for_each_pack(the_repository, p) if (p->pack_local && p->pack_keep) break; if (!p) /* no keep-able packs found */ @@ -5206,10 +5199,9 @@ int cmd_pack_objects(int argc, * want to unset "local" based on looking at packs, as * it also covers non-local objects */ - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (!p->pack_local) { have_non_local_packs = 1; break; diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c index 80743d8806..e4ecf774ca 100644 --- a/builtin/pack-redundant.c +++ b/builtin/pack-redundant.c @@ -566,29 +566,23 @@ static struct pack_list * add_pack(struct packed_git *p) static struct pack_list * add_pack_file(const char *filename) { - struct packfile_store *packs = the_repository->objects->packfiles; - struct packed_git *p = packfile_store_get_all_packs(packs); + struct packed_git *p; if (strlen(filename) < 40) die("Bad pack filename: %s", filename); - while (p) { + repo_for_each_pack(the_repository, p) if (strstr(p->pack_name, filename)) return add_pack(p); - p = p->next; - } die("Filename %s not found in packed_git", filename); } static void load_all(void) { - struct packfile_store *packs = the_repository->objects->packfiles; - struct packed_git *p = packfile_store_get_all_packs(packs); + struct packed_git *p; - while (p) { + repo_for_each_pack(the_repository, p) add_pack(p); - p = p->next; - } } int cmd_pack_redundant(int argc, const char **argv, const char *prefix UNUSED, struct repository *repo UNUSED) { diff --git a/builtin/repack.c b/builtin/repack.c index e8730808c5..cfdb4c0920 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -3,27 +3,18 @@ #include "builtin.h" #include "config.h" -#include "dir.h" #include "environment.h" -#include "gettext.h" -#include "hex.h" #include "parse-options.h" #include "path.h" #include "run-command.h" #include "server-info.h" -#include "strbuf.h" #include "string-list.h" -#include "strvec.h" #include "midx.h" #include "packfile.h" #include "prune-packed.h" -#include "odb.h" #include "promisor-remote.h" +#include "repack.h" #include "shallow.h" -#include "pack.h" -#include "pack-bitmap.h" -#include "refs.h" -#include "list-objects-filter-options.h" #define ALL_INTO_ONE 1 #define LOOSEN_UNREACHABLE 2 @@ -33,8 +24,6 @@ #define RETAIN_PACK 2 static int pack_everything; -static int delta_base_offset = 1; -static int pack_kept_objects = -1; static int write_bitmaps = -1; static int use_delta_islands; static int run_update_server_info = 1; @@ -53,31 +42,23 @@ static const char incremental_bitmap_conflict_error[] = N_( "--no-write-bitmap-index or disable the pack.writeBitmaps configuration." ); -struct pack_objects_args { - char *window; - char *window_memory; - char *depth; - char *threads; - unsigned long max_pack_size; - int no_reuse_delta; - int no_reuse_object; - int quiet; - int local; - int name_hash_version; - int path_walk; - struct list_objects_filter_options filter_options; +struct repack_config_ctx { + struct pack_objects_args *po_args; + struct pack_objects_args *cruft_po_args; }; static int repack_config(const char *var, const char *value, const struct config_context *ctx, void *cb) { - struct pack_objects_args *cruft_po_args = cb; + struct repack_config_ctx *repack_ctx = cb; + struct pack_objects_args *po_args = repack_ctx->po_args; + struct pack_objects_args *cruft_po_args = repack_ctx->cruft_po_args; if (!strcmp(var, "repack.usedeltabaseoffset")) { - delta_base_offset = git_config_bool(var, value); + po_args->delta_base_offset = git_config_bool(var, value); return 0; } if (!strcmp(var, "repack.packkeptobjects")) { - pack_kept_objects = git_config_bool(var, value); + po_args->pack_kept_objects = git_config_bool(var, value); return 0; } if (!strcmp(var, "repack.writebitmaps") || @@ -116,1138 +97,10 @@ static int repack_config(const char *var, const char *value, return git_default_config(var, value, ctx, cb); } -static void pack_objects_args_release(struct pack_objects_args *args) -{ - free(args->window); - free(args->window_memory); - free(args->depth); - free(args->threads); - list_objects_filter_release(&args->filter_options); -} - -struct existing_packs { - struct string_list kept_packs; - struct string_list non_kept_packs; - struct string_list cruft_packs; -}; - -#define EXISTING_PACKS_INIT { \ - .kept_packs = STRING_LIST_INIT_DUP, \ - .non_kept_packs = STRING_LIST_INIT_DUP, \ - .cruft_packs = STRING_LIST_INIT_DUP, \ -} - -static int has_existing_non_kept_packs(const struct existing_packs *existing) -{ - return existing->non_kept_packs.nr || existing->cruft_packs.nr; -} - -static void pack_mark_for_deletion(struct string_list_item *item) -{ - item->util = (void*)((uintptr_t)item->util | DELETE_PACK); -} - -static void pack_unmark_for_deletion(struct string_list_item *item) -{ - item->util = (void*)((uintptr_t)item->util & ~DELETE_PACK); -} - -static int pack_is_marked_for_deletion(struct string_list_item *item) -{ - return (uintptr_t)item->util & DELETE_PACK; -} - -static void pack_mark_retained(struct string_list_item *item) -{ - item->util = (void*)((uintptr_t)item->util | RETAIN_PACK); -} - -static int pack_is_retained(struct string_list_item *item) -{ - return (uintptr_t)item->util & RETAIN_PACK; -} - -static void mark_packs_for_deletion_1(struct string_list *names, - struct string_list *list) -{ - struct string_list_item *item; - const int hexsz = the_hash_algo->hexsz; - - for_each_string_list_item(item, list) { - char *sha1; - size_t len = strlen(item->string); - if (len < hexsz) - continue; - sha1 = item->string + len - hexsz; - - if (pack_is_retained(item)) { - pack_unmark_for_deletion(item); - } else if (!string_list_has_string(names, sha1)) { - /* - * Mark this pack for deletion, which ensures - * that this pack won't be included in a MIDX - * (if `--write-midx` was given) and that we - * will actually delete this pack (if `-d` was - * given). - */ - pack_mark_for_deletion(item); - } - } -} - -static void retain_cruft_pack(struct existing_packs *existing, - struct packed_git *cruft) -{ - struct strbuf buf = STRBUF_INIT; - struct string_list_item *item; - - strbuf_addstr(&buf, pack_basename(cruft)); - strbuf_strip_suffix(&buf, ".pack"); - - item = string_list_lookup(&existing->cruft_packs, buf.buf); - if (!item) - BUG("could not find cruft pack '%s'", pack_basename(cruft)); - - pack_mark_retained(item); - strbuf_release(&buf); -} - -static void mark_packs_for_deletion(struct existing_packs *existing, - struct string_list *names) - -{ - mark_packs_for_deletion_1(names, &existing->non_kept_packs); - mark_packs_for_deletion_1(names, &existing->cruft_packs); -} - -static void remove_redundant_pack(const char *dir_name, const char *base_name) -{ - struct strbuf buf = STRBUF_INIT; - struct odb_source *source = the_repository->objects->sources; - struct multi_pack_index *m = get_multi_pack_index(source); - strbuf_addf(&buf, "%s.pack", base_name); - if (m && source->local && midx_contains_pack(m, buf.buf)) - clear_midx_file(the_repository); - strbuf_insertf(&buf, 0, "%s/", dir_name); - unlink_pack_path(buf.buf, 1); - strbuf_release(&buf); -} - -static void remove_redundant_packs_1(struct string_list *packs) -{ - struct string_list_item *item; - for_each_string_list_item(item, packs) { - if (!pack_is_marked_for_deletion(item)) - continue; - remove_redundant_pack(packdir, item->string); - } -} - -static void remove_redundant_existing_packs(struct existing_packs *existing) -{ - remove_redundant_packs_1(&existing->non_kept_packs); - remove_redundant_packs_1(&existing->cruft_packs); -} - -static void existing_packs_release(struct existing_packs *existing) -{ - string_list_clear(&existing->kept_packs, 0); - string_list_clear(&existing->non_kept_packs, 0); - string_list_clear(&existing->cruft_packs, 0); -} - -/* - * Adds all packs hex strings (pack-$HASH) to either packs->non_kept - * or packs->kept based on whether each pack has a corresponding - * .keep file or not. Packs without a .keep file are not to be kept - * if we are going to pack everything into one file. - */ -static void collect_pack_filenames(struct existing_packs *existing, - const struct string_list *extra_keep) -{ - struct packfile_store *packs = the_repository->objects->packfiles; - struct packed_git *p; - struct strbuf buf = STRBUF_INIT; - - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { - int i; - const char *base; - - if (!p->pack_local) - continue; - - base = pack_basename(p); - - for (i = 0; i < extra_keep->nr; i++) - if (!fspathcmp(base, extra_keep->items[i].string)) - break; - - strbuf_reset(&buf); - strbuf_addstr(&buf, base); - strbuf_strip_suffix(&buf, ".pack"); - - if ((extra_keep->nr > 0 && i < extra_keep->nr) || p->pack_keep) - string_list_append(&existing->kept_packs, buf.buf); - else if (p->is_cruft) - string_list_append(&existing->cruft_packs, buf.buf); - else - string_list_append(&existing->non_kept_packs, buf.buf); - } - - string_list_sort(&existing->kept_packs); - string_list_sort(&existing->non_kept_packs); - string_list_sort(&existing->cruft_packs); - strbuf_release(&buf); -} - -static void prepare_pack_objects(struct child_process *cmd, - const struct pack_objects_args *args, - const char *out) -{ - strvec_push(&cmd->args, "pack-objects"); - if (args->window) - strvec_pushf(&cmd->args, "--window=%s", args->window); - if (args->window_memory) - strvec_pushf(&cmd->args, "--window-memory=%s", args->window_memory); - if (args->depth) - strvec_pushf(&cmd->args, "--depth=%s", args->depth); - if (args->threads) - strvec_pushf(&cmd->args, "--threads=%s", args->threads); - if (args->max_pack_size) - strvec_pushf(&cmd->args, "--max-pack-size=%lu", args->max_pack_size); - if (args->no_reuse_delta) - strvec_pushf(&cmd->args, "--no-reuse-delta"); - if (args->no_reuse_object) - strvec_pushf(&cmd->args, "--no-reuse-object"); - if (args->name_hash_version) - strvec_pushf(&cmd->args, "--name-hash-version=%d", args->name_hash_version); - if (args->path_walk) - strvec_pushf(&cmd->args, "--path-walk"); - if (args->local) - strvec_push(&cmd->args, "--local"); - if (args->quiet) - strvec_push(&cmd->args, "--quiet"); - if (delta_base_offset) - strvec_push(&cmd->args, "--delta-base-offset"); - strvec_push(&cmd->args, out); - cmd->git_cmd = 1; - cmd->out = -1; -} - -/* - * Write oid to the given struct child_process's stdin, starting it first if - * necessary. - */ -static int write_oid(const struct object_id *oid, - struct packed_git *pack UNUSED, - uint32_t pos UNUSED, void *data) -{ - struct child_process *cmd = data; - - if (cmd->in == -1) { - if (start_command(cmd)) - die(_("could not start pack-objects to repack promisor objects")); - } - - if (write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz) < 0 || - write_in_full(cmd->in, "\n", 1) < 0) - die(_("failed to feed promisor objects to pack-objects")); - return 0; -} - -static struct { - const char *name; - unsigned optional:1; -} exts[] = { - {".pack"}, - {".rev", 1}, - {".mtimes", 1}, - {".bitmap", 1}, - {".promisor", 1}, - {".idx"}, -}; - -struct generated_pack_data { - struct tempfile *tempfiles[ARRAY_SIZE(exts)]; -}; - -static struct generated_pack_data *populate_pack_exts(const char *name) -{ - struct stat statbuf; - struct strbuf path = STRBUF_INIT; - struct generated_pack_data *data = xcalloc(1, sizeof(*data)); - int i; - - for (i = 0; i < ARRAY_SIZE(exts); i++) { - strbuf_reset(&path); - strbuf_addf(&path, "%s-%s%s", packtmp, name, exts[i].name); - - if (stat(path.buf, &statbuf)) - continue; - - data->tempfiles[i] = register_tempfile(path.buf); - } - - strbuf_release(&path); - return data; -} - -static int has_pack_ext(const struct generated_pack_data *data, - const char *ext) -{ - int i; - for (i = 0; i < ARRAY_SIZE(exts); i++) { - if (strcmp(exts[i].name, ext)) - continue; - return !!data->tempfiles[i]; - } - BUG("unknown pack extension: '%s'", ext); -} - -static void repack_promisor_objects(const struct pack_objects_args *args, - struct string_list *names) -{ - struct child_process cmd = CHILD_PROCESS_INIT; - FILE *out; - struct strbuf line = STRBUF_INIT; - - prepare_pack_objects(&cmd, args, packtmp); - cmd.in = -1; - - /* - * NEEDSWORK: Giving pack-objects only the OIDs without any ordering - * hints may result in suboptimal deltas in the resulting pack. See if - * the OIDs can be sent with fake paths such that pack-objects can use a - * {type -> existing pack order} ordering when computing deltas instead - * of a {type -> size} ordering, which may produce better deltas. - */ - for_each_packed_object(the_repository, write_oid, &cmd, - FOR_EACH_OBJECT_PROMISOR_ONLY); - - if (cmd.in == -1) { - /* No packed objects; cmd was never started */ - child_process_clear(&cmd); - return; - } - - close(cmd.in); - - out = xfdopen(cmd.out, "r"); - while (strbuf_getline_lf(&line, out) != EOF) { - struct string_list_item *item; - char *promisor_name; - - if (line.len != the_hash_algo->hexsz) - die(_("repack: Expecting full hex object ID lines only from pack-objects.")); - item = string_list_append(names, line.buf); - - /* - * pack-objects creates the .pack and .idx files, but not the - * .promisor file. Create the .promisor file, which is empty. - * - * NEEDSWORK: fetch-pack sometimes generates non-empty - * .promisor files containing the ref names and associated - * hashes at the point of generation of the corresponding - * packfile, but this would not preserve their contents. Maybe - * concatenate the contents of all .promisor files instead of - * just creating a new empty file. - */ - promisor_name = mkpathdup("%s-%s.promisor", packtmp, - line.buf); - write_promisor_file(promisor_name, NULL, 0); - - item->util = populate_pack_exts(item->string); - - free(promisor_name); - } - - fclose(out); - if (finish_command(&cmd)) - die(_("could not finish pack-objects to repack promisor objects")); - strbuf_release(&line); -} - -struct pack_geometry { - struct packed_git **pack; - uint32_t pack_nr, pack_alloc; - uint32_t split; - - int split_factor; -}; - -static uint32_t geometry_pack_weight(struct packed_git *p) -{ - if (open_pack_index(p)) - die(_("cannot open index for %s"), p->pack_name); - return p->num_objects; -} - -static int geometry_cmp(const void *va, const void *vb) -{ - uint32_t aw = geometry_pack_weight(*(struct packed_git **)va), - bw = geometry_pack_weight(*(struct packed_git **)vb); - - if (aw < bw) - return -1; - if (aw > bw) - return 1; - return 0; -} - -static void init_pack_geometry(struct pack_geometry *geometry, - struct existing_packs *existing, - const struct pack_objects_args *args) -{ - struct packfile_store *packs = the_repository->objects->packfiles; - struct packed_git *p; - struct strbuf buf = STRBUF_INIT; - - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { - if (args->local && !p->pack_local) - /* - * When asked to only repack local packfiles we skip - * over any packfiles that are borrowed from alternate - * object directories. - */ - continue; - - if (!pack_kept_objects) { - /* - * Any pack that has its pack_keep bit set will - * appear in existing->kept_packs below, but - * this saves us from doing a more expensive - * check. - */ - if (p->pack_keep) - continue; - - /* - * The pack may be kept via the --keep-pack - * option; check 'existing->kept_packs' to - * determine whether to ignore it. - */ - strbuf_reset(&buf); - strbuf_addstr(&buf, pack_basename(p)); - strbuf_strip_suffix(&buf, ".pack"); - - if (string_list_has_string(&existing->kept_packs, buf.buf)) - continue; - } - if (p->is_cruft) - continue; - - ALLOC_GROW(geometry->pack, - geometry->pack_nr + 1, - geometry->pack_alloc); - - geometry->pack[geometry->pack_nr] = p; - geometry->pack_nr++; - } - - QSORT(geometry->pack, geometry->pack_nr, geometry_cmp); - strbuf_release(&buf); -} - -static void split_pack_geometry(struct pack_geometry *geometry) -{ - uint32_t i; - uint32_t split; - off_t total_size = 0; - - if (!geometry->pack_nr) { - geometry->split = geometry->pack_nr; - return; - } - - /* - * First, count the number of packs (in descending order of size) which - * already form a geometric progression. - */ - for (i = geometry->pack_nr - 1; i > 0; i--) { - struct packed_git *ours = geometry->pack[i]; - struct packed_git *prev = geometry->pack[i - 1]; - - if (unsigned_mult_overflows(geometry->split_factor, - geometry_pack_weight(prev))) - die(_("pack %s too large to consider in geometric " - "progression"), - prev->pack_name); - - if (geometry_pack_weight(ours) < - geometry->split_factor * geometry_pack_weight(prev)) - break; - } - - split = i; - - if (split) { - /* - * Move the split one to the right, since the top element in the - * last-compared pair can't be in the progression. Only do this - * when we split in the middle of the array (otherwise if we got - * to the end, then the split is in the right place). - */ - split++; - } - - /* - * Then, anything to the left of 'split' must be in a new pack. But, - * creating that new pack may cause packs in the heavy half to no longer - * form a geometric progression. - * - * Compute an expected size of the new pack, and then determine how many - * packs in the heavy half need to be joined into it (if any) to restore - * the geometric progression. - */ - for (i = 0; i < split; i++) { - struct packed_git *p = geometry->pack[i]; - - if (unsigned_add_overflows(total_size, geometry_pack_weight(p))) - die(_("pack %s too large to roll up"), p->pack_name); - total_size += geometry_pack_weight(p); - } - for (i = split; i < geometry->pack_nr; i++) { - struct packed_git *ours = geometry->pack[i]; - - if (unsigned_mult_overflows(geometry->split_factor, - total_size)) - die(_("pack %s too large to roll up"), ours->pack_name); - - if (geometry_pack_weight(ours) < - geometry->split_factor * total_size) { - if (unsigned_add_overflows(total_size, - geometry_pack_weight(ours))) - die(_("pack %s too large to roll up"), - ours->pack_name); - - split++; - total_size += geometry_pack_weight(ours); - } else - break; - } - - geometry->split = split; -} - -static struct packed_git *get_preferred_pack(struct pack_geometry *geometry) -{ - uint32_t i; - - if (!geometry) { - /* - * No geometry means either an all-into-one repack (in which - * case there is only one pack left and it is the largest) or an - * incremental one. - * - * If repacking incrementally, then we could check the size of - * all packs to determine which should be preferred, but leave - * this for later. - */ - return NULL; - } - if (geometry->split == geometry->pack_nr) - return NULL; - - /* - * The preferred pack is the largest pack above the split line. In - * other words, it is the largest pack that does not get rolled up in - * the geometric repack. - */ - for (i = geometry->pack_nr; i > geometry->split; i--) - /* - * A pack that is not local would never be included in a - * multi-pack index. We thus skip over any non-local packs. - */ - if (geometry->pack[i - 1]->pack_local) - return geometry->pack[i - 1]; - - return NULL; -} - -static void geometry_remove_redundant_packs(struct pack_geometry *geometry, - struct string_list *names, - struct existing_packs *existing) -{ - struct strbuf buf = STRBUF_INIT; - uint32_t i; - - for (i = 0; i < geometry->split; i++) { - struct packed_git *p = geometry->pack[i]; - if (string_list_has_string(names, hash_to_hex(p->hash))) - continue; - - strbuf_reset(&buf); - strbuf_addstr(&buf, pack_basename(p)); - strbuf_strip_suffix(&buf, ".pack"); - - if ((p->pack_keep) || - (string_list_has_string(&existing->kept_packs, buf.buf))) - continue; - - remove_redundant_pack(packdir, buf.buf); - } - - strbuf_release(&buf); -} - -static void free_pack_geometry(struct pack_geometry *geometry) -{ - if (!geometry) - return; - - free(geometry->pack); -} - -static int midx_has_unknown_packs(char **midx_pack_names, - size_t midx_pack_names_nr, - struct string_list *include, - struct pack_geometry *geometry, - struct existing_packs *existing) -{ - size_t i; - - string_list_sort(include); - - for (i = 0; i < midx_pack_names_nr; i++) { - const char *pack_name = midx_pack_names[i]; - - /* - * Determine whether or not each MIDX'd pack from the existing - * MIDX (if any) is represented in the new MIDX. For each pack - * in the MIDX, it must either be: - * - * - In the "include" list of packs to be included in the new - * MIDX. Note this function is called before the include - * list is populated with any cruft pack(s). - * - * - Below the geometric split line (if using pack geometry), - * indicating that the pack won't be included in the new - * MIDX, but its contents were rolled up as part of the - * geometric repack. - * - * - In the existing non-kept packs list (if not using pack - * geometry), and marked as non-deleted. - */ - if (string_list_has_string(include, pack_name)) { - continue; - } else if (geometry) { - struct strbuf buf = STRBUF_INIT; - uint32_t j; - - for (j = 0; j < geometry->split; j++) { - strbuf_reset(&buf); - strbuf_addstr(&buf, pack_basename(geometry->pack[j])); - strbuf_strip_suffix(&buf, ".pack"); - strbuf_addstr(&buf, ".idx"); - - if (!strcmp(pack_name, buf.buf)) { - strbuf_release(&buf); - break; - } - } - - strbuf_release(&buf); - - if (j < geometry->split) - continue; - } else { - struct string_list_item *item; - - item = string_list_lookup(&existing->non_kept_packs, - pack_name); - if (item && !pack_is_marked_for_deletion(item)) - continue; - } - - /* - * If we got to this point, the MIDX includes some pack that we - * don't know about. - */ - return 1; - } - - return 0; -} - -struct midx_snapshot_ref_data { - struct tempfile *f; - struct oidset seen; - int preferred; -}; - -static int midx_snapshot_ref_one(const char *refname UNUSED, - const char *referent UNUSED, - const struct object_id *oid, - int flag UNUSED, void *_data) -{ - struct midx_snapshot_ref_data *data = _data; - struct object_id peeled; - - if (!peel_iterated_oid(the_repository, oid, &peeled)) - oid = &peeled; - - if (oidset_insert(&data->seen, oid)) - return 0; /* already seen */ - - if (odb_read_object_info(the_repository->objects, oid, NULL) != OBJ_COMMIT) - return 0; - - fprintf(data->f->fp, "%s%s\n", data->preferred ? "+" : "", - oid_to_hex(oid)); - - return 0; -} - -static void midx_snapshot_refs(struct tempfile *f) -{ - struct midx_snapshot_ref_data data; - const struct string_list *preferred = bitmap_preferred_tips(the_repository); - - data.f = f; - data.preferred = 0; - oidset_init(&data.seen, 0); - - if (!fdopen_tempfile(f, "w")) - die(_("could not open tempfile %s for writing"), - get_tempfile_path(f)); - - if (preferred) { - struct string_list_item *item; - - data.preferred = 1; - for_each_string_list_item(item, preferred) - refs_for_each_ref_in(get_main_ref_store(the_repository), - item->string, - midx_snapshot_ref_one, &data); - data.preferred = 0; - } - - refs_for_each_ref(get_main_ref_store(the_repository), - midx_snapshot_ref_one, &data); - - if (close_tempfile_gently(f)) { - int save_errno = errno; - delete_tempfile(&f); - errno = save_errno; - die_errno(_("could not close refs snapshot tempfile")); - } - - oidset_clear(&data.seen); -} - -static void midx_included_packs(struct string_list *include, - struct existing_packs *existing, - char **midx_pack_names, - size_t midx_pack_names_nr, - struct string_list *names, - struct pack_geometry *geometry) -{ - struct string_list_item *item; - struct strbuf buf = STRBUF_INIT; - - for_each_string_list_item(item, &existing->kept_packs) { - strbuf_reset(&buf); - strbuf_addf(&buf, "%s.idx", item->string); - string_list_insert(include, buf.buf); - } - - for_each_string_list_item(item, names) { - strbuf_reset(&buf); - strbuf_addf(&buf, "pack-%s.idx", item->string); - string_list_insert(include, buf.buf); - } - - if (geometry->split_factor) { - uint32_t i; - - for (i = geometry->split; i < geometry->pack_nr; i++) { - struct packed_git *p = geometry->pack[i]; - - /* - * The multi-pack index never refers to packfiles part - * of an alternate object database, so we skip these. - * While git-multi-pack-index(1) would silently ignore - * them anyway, this allows us to skip executing the - * command completely when we have only non-local - * packfiles. - */ - if (!p->pack_local) - continue; - - strbuf_reset(&buf); - strbuf_addstr(&buf, pack_basename(p)); - strbuf_strip_suffix(&buf, ".pack"); - strbuf_addstr(&buf, ".idx"); - - string_list_insert(include, buf.buf); - } - } else { - for_each_string_list_item(item, &existing->non_kept_packs) { - if (pack_is_marked_for_deletion(item)) - continue; - - strbuf_reset(&buf); - strbuf_addf(&buf, "%s.idx", item->string); - string_list_insert(include, buf.buf); - } - } - - if (midx_must_contain_cruft || - midx_has_unknown_packs(midx_pack_names, midx_pack_names_nr, - include, geometry, existing)) { - /* - * If there are one or more unknown pack(s) present (see - * midx_has_unknown_packs() for what makes a pack - * "unknown") in the MIDX before the repack, keep them - * as they may be required to form a reachability - * closure if the MIDX is bitmapped. - * - * For example, a cruft pack can be required to form a - * reachability closure if the MIDX is bitmapped and one - * or more of the bitmap's selected commits reaches a - * once-cruft object that was later made reachable. - */ - for_each_string_list_item(item, &existing->cruft_packs) { - /* - * When doing a --geometric repack, there is no - * need to check for deleted packs, since we're - * by definition not doing an ALL_INTO_ONE - * repack (hence no packs will be deleted). - * Otherwise we must check for and exclude any - * packs which are enqueued for deletion. - * - * So we could omit the conditional below in the - * --geometric case, but doing so is unnecessary - * since no packs are marked as pending - * deletion (since we only call - * `mark_packs_for_deletion()` when doing an - * all-into-one repack). - */ - if (pack_is_marked_for_deletion(item)) - continue; - - strbuf_reset(&buf); - strbuf_addf(&buf, "%s.idx", item->string); - string_list_insert(include, buf.buf); - } - } else { - /* - * Modern versions of Git (with the appropriate - * configuration setting) will write new copies of - * once-cruft objects when doing a --geometric repack. - * - * If the MIDX has no cruft pack, new packs written - * during a --geometric repack will not rely on the - * cruft pack to form a reachability closure, so we can - * avoid including them in the MIDX in that case. - */ - ; - } - - strbuf_release(&buf); -} - -static int write_midx_included_packs(struct string_list *include, - struct pack_geometry *geometry, - struct string_list *names, - const char *refs_snapshot, - int show_progress, int write_bitmaps) -{ - struct child_process cmd = CHILD_PROCESS_INIT; - struct string_list_item *item; - struct packed_git *preferred = get_preferred_pack(geometry); - FILE *in; - int ret; - - if (!include->nr) - return 0; - - cmd.in = -1; - cmd.git_cmd = 1; - - strvec_push(&cmd.args, "multi-pack-index"); - strvec_pushl(&cmd.args, "write", "--stdin-packs", NULL); - - if (show_progress) - strvec_push(&cmd.args, "--progress"); - else - strvec_push(&cmd.args, "--no-progress"); - - if (write_bitmaps) - strvec_push(&cmd.args, "--bitmap"); - - if (preferred) - strvec_pushf(&cmd.args, "--preferred-pack=%s", - pack_basename(preferred)); - else if (names->nr) { - /* The largest pack was repacked, meaning that either - * one or two packs exist depending on whether the - * repository has a cruft pack or not. - * - * Select the non-cruft one as preferred to encourage - * pack-reuse among packs containing reachable objects - * over unreachable ones. - * - * (Note we could write multiple packs here if - * `--max-pack-size` was given, but any one of them - * will suffice, so pick the first one.) - */ - for_each_string_list_item(item, names) { - struct generated_pack_data *data = item->util; - if (has_pack_ext(data, ".mtimes")) - continue; - - strvec_pushf(&cmd.args, "--preferred-pack=pack-%s.pack", - item->string); - break; - } - } else { - /* - * No packs were kept, and no packs were written. The - * only thing remaining are .keep packs (unless - * --pack-kept-objects was given). - * - * Set the `--preferred-pack` arbitrarily here. - */ - ; - } - - if (refs_snapshot) - strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot); - - ret = start_command(&cmd); - if (ret) - return ret; - - in = xfdopen(cmd.in, "w"); - for_each_string_list_item(item, include) - fprintf(in, "%s\n", item->string); - fclose(in); - - return finish_command(&cmd); -} - -static void remove_redundant_bitmaps(struct string_list *include, - const char *packdir) -{ - struct strbuf path = STRBUF_INIT; - struct string_list_item *item; - size_t packdir_len; - - strbuf_addstr(&path, packdir); - strbuf_addch(&path, '/'); - packdir_len = path.len; - - /* - * Remove any pack bitmaps corresponding to packs which are now - * included in the MIDX. - */ - for_each_string_list_item(item, include) { - strbuf_addstr(&path, item->string); - strbuf_strip_suffix(&path, ".idx"); - strbuf_addstr(&path, ".bitmap"); - - if (unlink(path.buf) && errno != ENOENT) - warning_errno(_("could not remove stale bitmap: %s"), - path.buf); - - strbuf_setlen(&path, packdir_len); - } - strbuf_release(&path); -} - -static int finish_pack_objects_cmd(struct child_process *cmd, - struct string_list *names, - int local) -{ - FILE *out; - struct strbuf line = STRBUF_INIT; - - out = xfdopen(cmd->out, "r"); - while (strbuf_getline_lf(&line, out) != EOF) { - struct string_list_item *item; - - if (line.len != the_hash_algo->hexsz) - die(_("repack: Expecting full hex object ID lines only " - "from pack-objects.")); - /* - * Avoid putting packs written outside of the repository in the - * list of names. - */ - if (local) { - item = string_list_append(names, line.buf); - item->util = populate_pack_exts(line.buf); - } - } - fclose(out); - - strbuf_release(&line); - - return finish_command(cmd); -} - -static int write_filtered_pack(const struct pack_objects_args *args, - const char *destination, - const char *pack_prefix, - struct existing_packs *existing, - struct string_list *names) -{ - struct child_process cmd = CHILD_PROCESS_INIT; - struct string_list_item *item; - FILE *in; - int ret; - const char *caret; - const char *scratch; - int local = skip_prefix(destination, packdir, &scratch); - - prepare_pack_objects(&cmd, args, destination); - - strvec_push(&cmd.args, "--stdin-packs"); - - if (!pack_kept_objects) - strvec_push(&cmd.args, "--honor-pack-keep"); - for_each_string_list_item(item, &existing->kept_packs) - strvec_pushf(&cmd.args, "--keep-pack=%s", item->string); - - cmd.in = -1; - - ret = start_command(&cmd); - if (ret) - return ret; - - /* - * Here 'names' contains only the pack(s) that were just - * written, which is exactly the packs we want to keep. Also - * 'existing_kept_packs' already contains the packs in - * 'keep_pack_list'. - */ - in = xfdopen(cmd.in, "w"); - for_each_string_list_item(item, names) - fprintf(in, "^%s-%s.pack\n", pack_prefix, item->string); - for_each_string_list_item(item, &existing->non_kept_packs) - fprintf(in, "%s.pack\n", item->string); - for_each_string_list_item(item, &existing->cruft_packs) - fprintf(in, "%s.pack\n", item->string); - caret = pack_kept_objects ? "" : "^"; - for_each_string_list_item(item, &existing->kept_packs) - fprintf(in, "%s%s.pack\n", caret, item->string); - fclose(in); - - return finish_pack_objects_cmd(&cmd, names, local); -} - -static void combine_small_cruft_packs(FILE *in, size_t combine_cruft_below_size, - struct existing_packs *existing) -{ - struct packfile_store *packs = the_repository->objects->packfiles; - struct packed_git *p; - struct strbuf buf = STRBUF_INIT; - size_t i; - - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { - if (!(p->is_cruft && p->pack_local)) - continue; - - strbuf_reset(&buf); - strbuf_addstr(&buf, pack_basename(p)); - strbuf_strip_suffix(&buf, ".pack"); - - if (!string_list_has_string(&existing->cruft_packs, buf.buf)) - continue; - - if (p->pack_size < combine_cruft_below_size) { - fprintf(in, "-%s\n", pack_basename(p)); - } else { - retain_cruft_pack(existing, p); - fprintf(in, "%s\n", pack_basename(p)); - } - } - - for (i = 0; i < existing->non_kept_packs.nr; i++) - fprintf(in, "-%s.pack\n", - existing->non_kept_packs.items[i].string); - - strbuf_release(&buf); -} - -static int write_cruft_pack(const struct pack_objects_args *args, - const char *destination, - const char *pack_prefix, - const char *cruft_expiration, - unsigned long combine_cruft_below_size, - struct string_list *names, - struct existing_packs *existing) -{ - struct child_process cmd = CHILD_PROCESS_INIT; - struct string_list_item *item; - FILE *in; - int ret; - const char *scratch; - int local = skip_prefix(destination, packdir, &scratch); - - prepare_pack_objects(&cmd, args, destination); - - strvec_push(&cmd.args, "--cruft"); - if (cruft_expiration) - strvec_pushf(&cmd.args, "--cruft-expiration=%s", - cruft_expiration); - - strvec_push(&cmd.args, "--honor-pack-keep"); - strvec_push(&cmd.args, "--non-empty"); - - cmd.in = -1; - - ret = start_command(&cmd); - if (ret) - return ret; - - /* - * names has a confusing double use: it both provides the list - * of just-written new packs, and accepts the name of the cruft - * pack we are writing. - * - * By the time it is read here, it contains only the pack(s) - * that were just written, which is exactly the set of packs we - * want to consider kept. - * - * If `--expire-to` is given, the double-use served by `names` - * ensures that the pack written to `--expire-to` excludes any - * objects contained in the cruft pack. - */ - in = xfdopen(cmd.in, "w"); - for_each_string_list_item(item, names) - fprintf(in, "%s-%s.pack\n", pack_prefix, item->string); - if (combine_cruft_below_size && !cruft_expiration) { - combine_small_cruft_packs(in, combine_cruft_below_size, - existing); - } else { - for_each_string_list_item(item, &existing->non_kept_packs) - fprintf(in, "-%s.pack\n", item->string); - for_each_string_list_item(item, &existing->cruft_packs) - fprintf(in, "-%s.pack\n", item->string); - } - for_each_string_list_item(item, &existing->kept_packs) - fprintf(in, "%s.pack\n", item->string); - fclose(in); - - return finish_pack_objects_cmd(&cmd, names, local); -} - -static const char *find_pack_prefix(const char *packdir, const char *packtmp) -{ - const char *pack_prefix; - if (!skip_prefix(packtmp, packdir, &pack_prefix)) - die(_("pack prefix %s does not begin with objdir %s"), - packtmp, packdir); - if (*pack_prefix == '/') - pack_prefix++; - return pack_prefix; -} - int cmd_repack(int argc, const char **argv, const char *prefix, - struct repository *repo UNUSED) + struct repository *repo) { struct child_process cmd = CHILD_PROCESS_INIT; struct string_list_item *item; @@ -1255,18 +108,17 @@ int cmd_repack(int argc, struct existing_packs existing = EXISTING_PACKS_INIT; struct pack_geometry geometry = { 0 }; struct tempfile *refs_snapshot = NULL; - int i, ext, ret; + int i, ret; int show_progress; - char **midx_pack_names = NULL; - size_t midx_pack_names_nr = 0; /* variables to be filled by option parsing */ + struct repack_config_ctx config_ctx; int delete_redundant = 0; const char *unpack_unreachable = NULL; int keep_unreachable = 0; struct string_list keep_pack_list = STRING_LIST_INIT_NODUP; - struct pack_objects_args po_args = { 0 }; - struct pack_objects_args cruft_po_args = { 0 }; + struct pack_objects_args po_args = PACK_OBJECTS_ARGS_INIT; + struct pack_objects_args cruft_po_args = PACK_OBJECTS_ARGS_INIT; int write_midx = 0; const char *cruft_expiration = NULL; const char *expire_to = NULL; @@ -1327,7 +179,7 @@ int cmd_repack(int argc, OPT_UNSIGNED(0, "max-pack-size", &po_args.max_pack_size, N_("maximum size of each packfile")), OPT_PARSE_LIST_OBJECTS_FILTER(&po_args.filter_options), - OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects, + OPT_BOOL(0, "pack-kept-objects", &po_args.pack_kept_objects, N_("repack objects in packs marked with .keep")), OPT_STRING_LIST(0, "keep-pack", &keep_pack_list, N_("name"), N_("do not repack this pack")), @@ -1344,7 +196,11 @@ int cmd_repack(int argc, list_objects_filter_init(&po_args.filter_options); - repo_config(the_repository, repack_config, &cruft_po_args); + memset(&config_ctx, 0, sizeof(config_ctx)); + config_ctx.po_args = &po_args; + config_ctx.cruft_po_args = &cruft_po_args; + + repo_config(repo, repack_config, &config_ctx); argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); @@ -1354,7 +210,7 @@ int cmd_repack(int argc, po_args.depth = xstrdup_or_null(opt_depth); po_args.threads = xstrdup_or_null(opt_threads); - if (delete_redundant && the_repository->repository_format_precious_objects) + if (delete_redundant && repo->repository_format_precious_objects) die(_("cannot delete packs in a precious-objects repo")); die_for_incompatible_opt3(unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE), "-A", @@ -1369,14 +225,14 @@ int cmd_repack(int argc, (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository())) write_bitmaps = 0; } - if (pack_kept_objects < 0) - pack_kept_objects = write_bitmaps > 0 && !write_midx; + if (po_args.pack_kept_objects < 0) + po_args.pack_kept_objects = write_bitmaps > 0 && !write_midx; if (write_bitmaps && !(pack_everything & ALL_INTO_ONE) && !write_midx) die(_(incremental_bitmap_conflict_error)); if (write_bitmaps && po_args.local && - odb_has_alternates(the_repository->objects)) { + odb_has_alternates(repo->objects)) { /* * When asked to do a local repack, but we have * packfiles that are inherited from an alternate, then @@ -1391,26 +247,28 @@ int cmd_repack(int argc, if (write_midx && write_bitmaps) { struct strbuf path = STRBUF_INIT; - strbuf_addf(&path, "%s/%s_XXXXXX", repo_get_object_directory(the_repository), + strbuf_addf(&path, "%s/%s_XXXXXX", + repo_get_object_directory(repo), "bitmap-ref-tips"); refs_snapshot = xmks_tempfile(path.buf); - midx_snapshot_refs(refs_snapshot); + midx_snapshot_refs(repo, refs_snapshot); strbuf_release(&path); } - packdir = mkpathdup("%s/pack", repo_get_object_directory(the_repository)); + packdir = mkpathdup("%s/pack", repo_get_object_directory(repo)); packtmp_name = xstrfmt(".tmp-%d-pack", (int)getpid()); packtmp = mkpathdup("%s/%s", packdir, packtmp_name); - collect_pack_filenames(&existing, &keep_pack_list); + existing.repo = repo; + existing_packs_collect(&existing, &keep_pack_list); if (geometry.split_factor) { if (pack_everything) die(_("options '%s' and '%s' cannot be used together"), "--geometric", "-A/-a"); - init_pack_geometry(&geometry, &existing, &po_args); - split_pack_geometry(&geometry); + pack_geometry_init(&geometry, &existing, &po_args); + pack_geometry_split(&geometry); } prepare_pack_objects(&cmd, &po_args, packtmp); @@ -1418,8 +276,6 @@ int cmd_repack(int argc, show_progress = !po_args.quiet && isatty(2); strvec_push(&cmd.args, "--keep-true-parents"); - if (!pack_kept_objects) - strvec_push(&cmd.args, "--honor-pack-keep"); for (i = 0; i < keep_pack_list.nr; i++) strvec_pushf(&cmd.args, "--keep-pack=%s", keep_pack_list.items[i].string); @@ -1439,7 +295,7 @@ int cmd_repack(int argc, strvec_push(&cmd.args, "--reflog"); strvec_push(&cmd.args, "--indexed-objects"); } - if (repo_has_promisor_remote(the_repository)) + if (repo_has_promisor_remote(repo)) strvec_push(&cmd.args, "--exclude-promisor-objects"); if (!write_midx) { if (write_bitmaps > 0) @@ -1451,9 +307,9 @@ int cmd_repack(int argc, strvec_push(&cmd.args, "--delta-islands"); if (pack_everything & ALL_INTO_ONE) { - repack_promisor_objects(&po_args, &names); + repack_promisor_objects(repo, &po_args, &names, packtmp); - if (has_existing_non_kept_packs(&existing) && + if (existing_packs_has_non_kept(&existing) && delete_redundant && !(pack_everything & PACK_CRUFT)) { for_each_string_list_item(item, &names) { @@ -1515,9 +371,17 @@ int cmd_repack(int argc, fclose(in); } - ret = finish_pack_objects_cmd(&cmd, &names, 1); - if (ret) - goto cleanup; + { + struct write_pack_opts opts = { + .packdir = packdir, + .destination = packdir, + .packtmp = packtmp, + }; + ret = finish_pack_objects_cmd(repo->hash_algo, &opts, &cmd, + &names); + if (ret) + goto cleanup; + } if (!names.nr) { if (!po_args.quiet) @@ -1535,12 +399,17 @@ int cmd_repack(int argc, * midx_has_unknown_packs() will make the decision for * us. */ - if (!get_multi_pack_index(the_repository->objects->sources)) + if (!get_multi_pack_index(repo->objects->sources)) midx_must_contain_cruft = 1; } if (pack_everything & PACK_CRUFT) { - const char *pack_prefix = find_pack_prefix(packdir, packtmp); + struct write_pack_opts opts = { + .po_args = &cruft_po_args, + .destination = packtmp, + .packtmp = packtmp, + .packdir = packdir, + }; if (!cruft_po_args.window) cruft_po_args.window = xstrdup_or_null(po_args.window); @@ -1555,9 +424,10 @@ int cmd_repack(int argc, cruft_po_args.local = po_args.local; cruft_po_args.quiet = po_args.quiet; + cruft_po_args.delta_base_offset = po_args.delta_base_offset; + cruft_po_args.pack_kept_objects = 0; - ret = write_cruft_pack(&cruft_po_args, packtmp, pack_prefix, - cruft_expiration, + ret = write_cruft_pack(&opts, cruft_expiration, combine_cruft_below_size, &names, &existing); if (ret) @@ -1592,11 +462,8 @@ int cmd_repack(int argc, * pack, but rather removing all cruft packs from the * main repository regardless of size. */ - ret = write_cruft_pack(&cruft_po_args, expire_to, - pack_prefix, - NULL, - 0ul, - &names, + opts.destination = expire_to; + ret = write_cruft_pack(&opts, NULL, 0ul, &names, &existing); if (ret) goto cleanup; @@ -1604,99 +471,63 @@ int cmd_repack(int argc, } if (po_args.filter_options.choice) { - if (!filter_to) - filter_to = packtmp; - - ret = write_filtered_pack(&po_args, - filter_to, - find_pack_prefix(packdir, packtmp), - &existing, - &names); + struct write_pack_opts opts = { + .po_args = &po_args, + .destination = filter_to, + .packdir = packdir, + .packtmp = packtmp, + }; + + if (!opts.destination) + opts.destination = packtmp; + + ret = write_filtered_pack(&opts, &existing, &names); if (ret) goto cleanup; } string_list_sort(&names); - if (get_multi_pack_index(the_repository->objects->sources)) { - struct multi_pack_index *m = - get_multi_pack_index(the_repository->objects->sources); - - ALLOC_ARRAY(midx_pack_names, - m->num_packs + m->num_packs_in_base); - - for (; m; m = m->base_midx) - for (uint32_t i = 0; i < m->num_packs; i++) - midx_pack_names[midx_pack_names_nr++] = - xstrdup(m->pack_names[i]); - } - - close_object_store(the_repository->objects); + close_object_store(repo->objects); /* * Ok we have prepared all new packfiles. */ - for_each_string_list_item(item, &names) { - struct generated_pack_data *data = item->util; - - for (ext = 0; ext < ARRAY_SIZE(exts); ext++) { - char *fname; - - fname = mkpathdup("%s/pack-%s%s", - packdir, item->string, exts[ext].name); - - if (data->tempfiles[ext]) { - const char *fname_old = get_tempfile_path(data->tempfiles[ext]); - struct stat statbuffer; - - if (!stat(fname_old, &statbuffer)) { - statbuffer.st_mode &= ~(S_IWUSR | S_IWGRP | S_IWOTH); - chmod(fname_old, statbuffer.st_mode); - } - - if (rename_tempfile(&data->tempfiles[ext], fname)) - die_errno(_("renaming pack to '%s' failed"), fname); - } else if (!exts[ext].optional) - die(_("pack-objects did not write a '%s' file for pack %s-%s"), - exts[ext].name, packtmp, item->string); - else if (unlink(fname) < 0 && errno != ENOENT) - die_errno(_("could not unlink: %s"), fname); - - free(fname); - } - } + for_each_string_list_item(item, &names) + generated_pack_install(item->util, item->string, packdir, + packtmp); /* End of pack replacement. */ if (delete_redundant && pack_everything & ALL_INTO_ONE) - mark_packs_for_deletion(&existing, &names); + existing_packs_mark_for_deletion(&existing, &names); if (write_midx) { - struct string_list include = STRING_LIST_INIT_DUP; - midx_included_packs(&include, &existing, midx_pack_names, - midx_pack_names_nr, &names, &geometry); - - ret = write_midx_included_packs(&include, &geometry, &names, - refs_snapshot ? get_tempfile_path(refs_snapshot) : NULL, - show_progress, write_bitmaps > 0); - - if (!ret && write_bitmaps) - remove_redundant_bitmaps(&include, packdir); - - string_list_clear(&include, 0); + struct repack_write_midx_opts opts = { + .existing = &existing, + .geometry = &geometry, + .names = &names, + .refs_snapshot = refs_snapshot ? get_tempfile_path(refs_snapshot) : NULL, + .packdir = packdir, + .show_progress = show_progress, + .write_bitmaps = write_bitmaps > 0, + .midx_must_contain_cruft = midx_must_contain_cruft + }; + + ret = write_midx_included_packs(&opts); if (ret) goto cleanup; } - odb_reprepare(the_repository->objects); + odb_reprepare(repo->objects); if (delete_redundant) { int opts = 0; - remove_redundant_existing_packs(&existing); + existing_packs_remove_redundant(&existing, packdir); if (geometry.split_factor) - geometry_remove_redundant_packs(&geometry, &names, - &existing); + pack_geometry_remove_redundant(&geometry, &names, + &existing, packdir); if (show_progress) opts |= PRUNE_PACKED_VERBOSE; prune_packed_objects(opts); @@ -1704,18 +535,18 @@ int cmd_repack(int argc, if (!keep_unreachable && (!(pack_everything & LOOSEN_UNREACHABLE) || unpack_unreachable) && - is_repository_shallow(the_repository)) + is_repository_shallow(repo)) prune_shallow(PRUNE_QUICK); } if (run_update_server_info) - update_server_info(the_repository, 0); + update_server_info(repo, 0); if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) { unsigned flags = 0; if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL, 0)) flags |= MIDX_WRITE_INCREMENTAL; - write_midx_file(the_repository->objects->sources, + write_midx_file(repo->objects->sources, NULL, NULL, flags); } @@ -1723,10 +554,7 @@ cleanup: string_list_clear(&keep_pack_list, 0); string_list_clear(&names, 1); existing_packs_release(&existing); - free_pack_geometry(&geometry); - for (size_t i = 0; i < midx_pack_names_nr; i++) - free(midx_pack_names[i]); - free(midx_pack_names); + pack_geometry_release(&geometry); pack_objects_args_release(&po_args); pack_objects_args_release(&cruft_po_args); diff --git a/builtin/repo.c b/builtin/repo.c index bbb0966f2d..9d4749f79b 100644 --- a/builtin/repo.c +++ b/builtin/repo.c @@ -3,19 +3,27 @@ #include "builtin.h" #include "environment.h" #include "parse-options.h" +#include "path-walk.h" +#include "progress.h" #include "quote.h" +#include "ref-filter.h" #include "refs.h" +#include "revision.h" #include "strbuf.h" +#include "string-list.h" #include "shallow.h" +#include "utf8.h" static const char *const repo_usage[] = { "git repo info [--format=(keyvalue|nul)] [-z] [<key>...]", + "git repo structure [--format=(table|keyvalue|nul)]", NULL }; typedef int get_value_fn(struct repository *repo, struct strbuf *buf); enum output_format { + FORMAT_TABLE, FORMAT_KEYVALUE, FORMAT_NUL_TERMINATED, }; @@ -130,14 +138,16 @@ static int parse_format_cb(const struct option *opt, *format = FORMAT_NUL_TERMINATED; else if (!strcmp(arg, "keyvalue")) *format = FORMAT_KEYVALUE; + else if (!strcmp(arg, "table")) + *format = FORMAT_TABLE; else die(_("invalid format '%s'"), arg); return 0; } -static int repo_info(int argc, const char **argv, const char *prefix, - struct repository *repo) +static int cmd_repo_info(int argc, const char **argv, const char *prefix, + struct repository *repo) { enum output_format format = FORMAT_KEYVALUE; struct option options[] = { @@ -152,16 +162,380 @@ static int repo_info(int argc, const char **argv, const char *prefix, }; argc = parse_options(argc, argv, prefix, options, repo_usage, 0); + if (format != FORMAT_KEYVALUE && format != FORMAT_NUL_TERMINATED) + die(_("unsupported output format")); return print_fields(argc, argv, repo, format); } +struct ref_stats { + size_t branches; + size_t remotes; + size_t tags; + size_t others; +}; + +struct object_stats { + size_t tags; + size_t commits; + size_t trees; + size_t blobs; +}; + +struct repo_structure { + struct ref_stats refs; + struct object_stats objects; +}; + +struct stats_table { + struct string_list rows; + + int name_col_width; + int value_col_width; +}; + +/* + * Holds column data that gets stored for each row. + */ +struct stats_table_entry { + char *value; +}; + +static void stats_table_vaddf(struct stats_table *table, + struct stats_table_entry *entry, + const char *format, va_list ap) +{ + struct strbuf buf = STRBUF_INIT; + struct string_list_item *item; + char *formatted_name; + int name_width; + + strbuf_vaddf(&buf, format, ap); + formatted_name = strbuf_detach(&buf, NULL); + name_width = utf8_strwidth(formatted_name); + + item = string_list_append_nodup(&table->rows, formatted_name); + item->util = entry; + + if (name_width > table->name_col_width) + table->name_col_width = name_width; + if (entry) { + int value_width = utf8_strwidth(entry->value); + if (value_width > table->value_col_width) + table->value_col_width = value_width; + } +} + +static void stats_table_addf(struct stats_table *table, const char *format, ...) +{ + va_list ap; + + va_start(ap, format); + stats_table_vaddf(table, NULL, format, ap); + va_end(ap); +} + +static void stats_table_count_addf(struct stats_table *table, size_t value, + const char *format, ...) +{ + struct stats_table_entry *entry; + va_list ap; + + CALLOC_ARRAY(entry, 1); + entry->value = xstrfmt("%" PRIuMAX, (uintmax_t)value); + + va_start(ap, format); + stats_table_vaddf(table, entry, format, ap); + va_end(ap); +} + +static inline size_t get_total_reference_count(struct ref_stats *stats) +{ + return stats->branches + stats->remotes + stats->tags + stats->others; +} + +static inline size_t get_total_object_count(struct object_stats *stats) +{ + return stats->tags + stats->commits + stats->trees + stats->blobs; +} + +static void stats_table_setup_structure(struct stats_table *table, + struct repo_structure *stats) +{ + struct object_stats *objects = &stats->objects; + struct ref_stats *refs = &stats->refs; + size_t object_total; + size_t ref_total; + + ref_total = get_total_reference_count(refs); + stats_table_addf(table, "* %s", _("References")); + stats_table_count_addf(table, ref_total, " * %s", _("Count")); + stats_table_count_addf(table, refs->branches, " * %s", _("Branches")); + stats_table_count_addf(table, refs->tags, " * %s", _("Tags")); + stats_table_count_addf(table, refs->remotes, " * %s", _("Remotes")); + stats_table_count_addf(table, refs->others, " * %s", _("Others")); + + object_total = get_total_object_count(objects); + stats_table_addf(table, ""); + stats_table_addf(table, "* %s", _("Reachable objects")); + stats_table_count_addf(table, object_total, " * %s", _("Count")); + stats_table_count_addf(table, objects->commits, " * %s", _("Commits")); + stats_table_count_addf(table, objects->trees, " * %s", _("Trees")); + stats_table_count_addf(table, objects->blobs, " * %s", _("Blobs")); + stats_table_count_addf(table, objects->tags, " * %s", _("Tags")); +} + +static void stats_table_print_structure(const struct stats_table *table) +{ + const char *name_col_title = _("Repository structure"); + const char *value_col_title = _("Value"); + int name_col_width = utf8_strwidth(name_col_title); + int value_col_width = utf8_strwidth(value_col_title); + struct string_list_item *item; + + if (table->name_col_width > name_col_width) + name_col_width = table->name_col_width; + if (table->value_col_width > value_col_width) + value_col_width = table->value_col_width; + + printf("| %-*s | %-*s |\n", name_col_width, name_col_title, + value_col_width, value_col_title); + printf("| "); + for (int i = 0; i < name_col_width; i++) + putchar('-'); + printf(" | "); + for (int i = 0; i < value_col_width; i++) + putchar('-'); + printf(" |\n"); + + for_each_string_list_item(item, &table->rows) { + struct stats_table_entry *entry = item->util; + const char *value = ""; + + if (entry) { + struct stats_table_entry *entry = item->util; + value = entry->value; + } + + printf("| %-*s | %*s |\n", name_col_width, item->string, + value_col_width, value); + } +} + +static void stats_table_clear(struct stats_table *table) +{ + struct stats_table_entry *entry; + struct string_list_item *item; + + for_each_string_list_item(item, &table->rows) { + entry = item->util; + if (entry) + free(entry->value); + } + + string_list_clear(&table->rows, 1); +} + +static void structure_keyvalue_print(struct repo_structure *stats, + char key_delim, char value_delim) +{ + printf("references.branches.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->refs.branches, value_delim); + printf("references.tags.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->refs.tags, value_delim); + printf("references.remotes.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->refs.remotes, value_delim); + printf("references.others.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->refs.others, value_delim); + + printf("objects.commits.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->objects.commits, value_delim); + printf("objects.trees.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->objects.trees, value_delim); + printf("objects.blobs.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->objects.blobs, value_delim); + printf("objects.tags.count%c%" PRIuMAX "%c", key_delim, + (uintmax_t)stats->objects.tags, value_delim); + + fflush(stdout); +} + +struct count_references_data { + struct ref_stats *stats; + struct rev_info *revs; + struct progress *progress; +}; + +static int count_references(const char *refname, + const char *referent UNUSED, + const struct object_id *oid, + int flags UNUSED, void *cb_data) +{ + struct count_references_data *data = cb_data; + struct ref_stats *stats = data->stats; + size_t ref_count; + + switch (ref_kind_from_refname(refname)) { + case FILTER_REFS_BRANCHES: + stats->branches++; + break; + case FILTER_REFS_REMOTES: + stats->remotes++; + break; + case FILTER_REFS_TAGS: + stats->tags++; + break; + case FILTER_REFS_OTHERS: + stats->others++; + break; + default: + BUG("unexpected reference type"); + } + + /* + * While iterating through references for counting, also add OIDs in + * preparation for the path walk. + */ + add_pending_oid(data->revs, NULL, oid, 0); + + ref_count = get_total_reference_count(stats); + display_progress(data->progress, ref_count); + + return 0; +} + +static void structure_count_references(struct ref_stats *stats, + struct rev_info *revs, + struct repository *repo, + int show_progress) +{ + struct count_references_data data = { + .stats = stats, + .revs = revs, + }; + + if (show_progress) + data.progress = start_delayed_progress(repo, + _("Counting references"), 0); + + refs_for_each_ref(get_main_ref_store(repo), count_references, &data); + stop_progress(&data.progress); +} + +struct count_objects_data { + struct object_stats *stats; + struct progress *progress; +}; + +static int count_objects(const char *path UNUSED, struct oid_array *oids, + enum object_type type, void *cb_data) +{ + struct count_objects_data *data = cb_data; + struct object_stats *stats = data->stats; + size_t object_count; + + switch (type) { + case OBJ_TAG: + stats->tags += oids->nr; + break; + case OBJ_COMMIT: + stats->commits += oids->nr; + break; + case OBJ_TREE: + stats->trees += oids->nr; + break; + case OBJ_BLOB: + stats->blobs += oids->nr; + break; + default: + BUG("invalid object type"); + } + + object_count = get_total_object_count(stats); + display_progress(data->progress, object_count); + + return 0; +} + +static void structure_count_objects(struct object_stats *stats, + struct rev_info *revs, + struct repository *repo, int show_progress) +{ + struct path_walk_info info = PATH_WALK_INFO_INIT; + struct count_objects_data data = { + .stats = stats, + }; + + info.revs = revs; + info.path_fn = count_objects; + info.path_fn_data = &data; + + if (show_progress) + data.progress = start_delayed_progress(repo, _("Counting objects"), 0); + + walk_objects_by_path(&info); + path_walk_info_clear(&info); + stop_progress(&data.progress); +} + +static int cmd_repo_structure(int argc, const char **argv, const char *prefix, + struct repository *repo) +{ + struct stats_table table = { + .rows = STRING_LIST_INIT_DUP, + }; + enum output_format format = FORMAT_TABLE; + struct repo_structure stats = { 0 }; + struct rev_info revs; + int show_progress = -1; + struct option options[] = { + OPT_CALLBACK_F(0, "format", &format, N_("format"), + N_("output format"), + PARSE_OPT_NONEG, parse_format_cb), + OPT_BOOL(0, "progress", &show_progress, N_("show progress")), + OPT_END() + }; + + argc = parse_options(argc, argv, prefix, options, repo_usage, 0); + if (argc) + usage(_("too many arguments")); + + repo_init_revisions(repo, &revs, prefix); + + if (show_progress < 0) + show_progress = isatty(2); + + structure_count_references(&stats.refs, &revs, repo, show_progress); + structure_count_objects(&stats.objects, &revs, repo, show_progress); + + switch (format) { + case FORMAT_TABLE: + stats_table_setup_structure(&table, &stats); + stats_table_print_structure(&table); + break; + case FORMAT_KEYVALUE: + structure_keyvalue_print(&stats, '=', '\n'); + break; + case FORMAT_NUL_TERMINATED: + structure_keyvalue_print(&stats, '\n', '\0'); + break; + default: + BUG("invalid output format"); + } + + stats_table_clear(&table); + release_revisions(&revs); + + return 0; +} + int cmd_repo(int argc, const char **argv, const char *prefix, struct repository *repo) { parse_opt_subcommand_fn *fn = NULL; struct option options[] = { - OPT_SUBCOMMAND("info", &fn, repo_info), + OPT_SUBCOMMAND("info", &fn, cmd_repo_info), + OPT_SUBCOMMAND("structure", &fn, cmd_repo_structure), OPT_END() }; diff --git a/commit-reach.c b/commit-reach.c index a339e41aa4..cc18c86d3b 100644 --- a/commit-reach.c +++ b/commit-reach.c @@ -60,6 +60,7 @@ static int paint_down_to_common(struct repository *r, struct prio_queue queue = { compare_commits_by_gen_then_commit_date }; int i; timestamp_t last_gen = GENERATION_NUMBER_INFINITY; + struct commit_list **tail = result; if (!min_generation && !corrected_commit_dates_enabled(r)) queue.compare = compare_commits_by_commit_date; @@ -95,7 +96,7 @@ static int paint_down_to_common(struct repository *r, if (flags == (PARENT1 | PARENT2)) { if (!(commit->object.flags & RESULT)) { commit->object.flags |= RESULT; - commit_list_insert_by_date(commit, result); + tail = commit_list_append(commit, tail); } /* Mark parents of a found merge stale */ flags |= STALE; @@ -128,6 +129,7 @@ static int paint_down_to_common(struct repository *r, } clear_prio_queue(&queue); + commit_list_sort_by_date(result); return 0; } @@ -136,7 +138,7 @@ static int merge_bases_many(struct repository *r, struct commit **twos, struct commit_list **result) { - struct commit_list *list = NULL; + struct commit_list *list = NULL, **tail = result; int i; for (i = 0; i < n; i++) { @@ -171,8 +173,9 @@ static int merge_bases_many(struct repository *r, while (list) { struct commit *commit = pop_commit(&list); if (!(commit->object.flags & STALE)) - commit_list_insert_by_date(commit, result); + tail = commit_list_append(commit, tail); } + commit_list_sort_by_date(result); return 0; } @@ -425,7 +428,7 @@ static int get_merge_bases_many_0(struct repository *r, int cleanup, struct commit_list **result) { - struct commit_list *list; + struct commit_list *list, **tail = result; struct commit **rslt; size_t cnt, i; int ret; @@ -461,7 +464,8 @@ static int get_merge_bases_many_0(struct repository *r, return -1; } for (i = 0; i < cnt; i++) - commit_list_insert_by_date(rslt[i], result); + tail = commit_list_append(rslt[i], tail); + commit_list_sort_by_date(result); free(rslt); return 0; } diff --git a/connected.c b/connected.c index b288a18b17..79403108dd 100644 --- a/connected.c +++ b/connected.c @@ -74,10 +74,9 @@ int check_connected(oid_iterate_fn fn, void *cb_data, */ odb_reprepare(the_repository->objects); do { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *p; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (!p->pack_promisor) continue; if (find_pack_entry_one(oid, p)) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index e3d88b0672..73abea31b4 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2218,7 +2218,7 @@ __git_log_gitk_options=" " # Options that go well for log and shortlog (not gitk) __git_log_shortlog_options=" - --author= --committer= --grep= + --author= --grep= --exclude= --all-match --invert-grep " # Options accepted by log and show @@ -2296,6 +2296,7 @@ __git_complete_log_opts () $__git_log_shortlog_options $__git_log_gitk_options $__git_log_show_options + --committer= --root --topo-order --date-order --reverse --follow --full-diff --abbrev-commit --no-abbrev-commit --abbrev= @@ -3229,7 +3230,7 @@ _git_shortlog () __gitcomp " $__git_log_common_options $__git_log_shortlog_options - --numbered --summary --email + --committer --numbered --summary --email " return ;; diff --git a/contrib/credential/libsecret/Makefile b/contrib/credential/libsecret/Makefile index 7cacc57681..9309cfb78c 100644 --- a/contrib/credential/libsecret/Makefile +++ b/contrib/credential/libsecret/Makefile @@ -10,6 +10,7 @@ gitexecdir ?= $(prefix)/libexec/git-core CC ?= gcc CFLAGS ?= -g -O2 -Wall PKG_CONFIG ?= pkg-config +INSTALL ?= install RM ?= rm -f INCS:=$(shell $(PKG_CONFIG) --cflags libsecret-1 glib-2.0) @@ -21,7 +22,11 @@ LIBS:=$(shell $(PKG_CONFIG) --libs libsecret-1 glib-2.0) git-credential-libsecret: git-credential-libsecret.o $(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS) $(LIBS) +install: git-credential-libsecret + $(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir) + $(INSTALL) -m 755 $< $(DESTDIR)$(gitexecdir) + clean: $(RM) git-credential-libsecret git-credential-libsecret.o -.PHONY: all clean +.PHONY: all install clean diff --git a/contrib/credential/osxkeychain/Makefile b/contrib/credential/osxkeychain/Makefile index c7d9121022..9680717abe 100644 --- a/contrib/credential/osxkeychain/Makefile +++ b/contrib/credential/osxkeychain/Makefile @@ -9,6 +9,7 @@ gitexecdir ?= $(prefix)/libexec/git-core CC ?= gcc CFLAGS ?= -g -O2 -Wall +INSTALL ?= install RM ?= rm -f %.o: %.c @@ -18,7 +19,11 @@ git-credential-osxkeychain: git-credential-osxkeychain.o $(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS) \ -framework Security -framework CoreFoundation +install: git-credential-osxkeychain + $(INSTALL) -d -m 755 $(DESTDIR)$(gitexecdir) + $(INSTALL) -m 755 $< $(DESTDIR)$(gitexecdir) + clean: $(RM) git-credential-osxkeychain git-credential-osxkeychain.o -.PHONY: all clean +.PHONY: all install clean @@ -1351,6 +1351,9 @@ static void emit_diff_symbol_from_struct(struct diff_options *o, int len = eds->len; unsigned flags = eds->flags; + if (!o->file) + return; + switch (s) { case DIFF_SYMBOL_NO_LF_EOF: context = diff_get_color_opt(o, DIFF_CONTEXT); @@ -3762,9 +3765,9 @@ static void builtin_diff(const char *name_a, if (o->word_diff) init_diff_words_data(&ecbdata, o, one, two); - if (o->dry_run) { + if (!o->file) { /* - * Unlike the !dry_run case, we need to ignore the + * Unlike the normal output case, we need to ignore the * return value from xdi_diff_outf() here, because * xdi_diff_outf() takes non-zero return from its * callback function as a sign of error and returns @@ -4420,7 +4423,6 @@ static void run_external_diff(const struct external_diff *pgm, { struct child_process cmd = CHILD_PROCESS_INIT; struct diff_queue_struct *q = &diff_queued_diff; - int quiet = !(o->output_format & DIFF_FORMAT_PATCH); int rc; /* @@ -4429,7 +4431,7 @@ static void run_external_diff(const struct external_diff *pgm, * external diff program lacks the ability to tell us whether * it's empty then we consider it non-empty without even asking. */ - if (!pgm->trust_exit_code && quiet) { + if (!pgm->trust_exit_code && !o->file) { o->found_changes = 1; return; } @@ -4454,7 +4456,10 @@ static void run_external_diff(const struct external_diff *pgm, diff_free_filespec_data(one); diff_free_filespec_data(two); cmd.use_shell = 1; - cmd.no_stdout = quiet; + if (!o->file) + cmd.no_stdout = 1; + else if (o->file != stdout) + cmd.out = xdup(fileno(o->file)); rc = run_command(&cmd); if (!pgm->trust_exit_code && rc == 0) o->found_changes = 1; @@ -4615,7 +4620,8 @@ static void run_diff_cmd(const struct external_diff *pgm, p->status == DIFF_STATUS_RENAMED) o->found_changes = 1; } else { - fprintf(o->file, "* Unmerged path %s\n", name); + if (o->file) + fprintf(o->file, "* Unmerged path %s\n", name); o->found_changes = 1; } } @@ -6192,15 +6198,15 @@ static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o) /* return 1 if any change is found; otherwise, return 0 */ static int diff_flush_patch_quietly(struct diff_filepair *p, struct diff_options *o) { - int saved_dry_run = o->dry_run; + FILE *saved_file = o->file; int saved_found_changes = o->found_changes; int ret; - o->dry_run = 1; + o->file = NULL; o->found_changes = 0; diff_flush_patch(p, o); ret = o->found_changes; - o->dry_run = saved_dry_run; + o->file = saved_file; o->found_changes |= saved_found_changes; return ret; } @@ -6828,38 +6834,18 @@ void diff_flush(struct diff_options *options) DIFF_FORMAT_NAME | DIFF_FORMAT_NAME_STATUS | DIFF_FORMAT_CHECKDIFF)) { - /* - * make sure diff_Flush_patch_quietly() to be silent. - */ - FILE *dev_null = NULL; - int saved_color_moved = options->color_moved; - - if (options->flags.diff_from_contents) { - dev_null = xfopen("/dev/null", "w"); - options->color_moved = 0; - } for (i = 0; i < q->nr; i++) { struct diff_filepair *p = q->queue[i]; if (!check_pair_status(p)) continue; - if (options->flags.diff_from_contents) { - FILE *saved_file = options->file; - int found_changes; + if (options->flags.diff_from_contents && + !diff_flush_patch_quietly(p, options)) + continue; - options->file = dev_null; - found_changes = diff_flush_patch_quietly(p, options); - options->file = saved_file; - if (!found_changes) - continue; - } flush_one_pair(p, options); } - if (options->flags.diff_from_contents) { - fclose(dev_null); - options->color_moved = saved_color_moved; - } separator++; } @@ -6910,15 +6896,6 @@ void diff_flush(struct diff_options *options) if (output_format & DIFF_FORMAT_NO_OUTPUT && options->flags.exit_with_status && options->flags.diff_from_contents) { - /* - * run diff_flush_patch for the exit status. setting - * options->file to /dev/null should be safe, because we - * aren't supposed to produce any output anyway. - */ - diff_free_file(options); - options->file = xfopen("/dev/null", "w"); - options->close_file = 1; - options->color_moved = 0; for (i = 0; i < q->nr; i++) { struct diff_filepair *p = q->queue[i]; if (check_pair_status(p)) @@ -408,8 +408,6 @@ struct diff_options { #define COLOR_MOVED_WS_ERROR (1<<0) unsigned color_moved_ws_handling; - bool dry_run; - struct repository *repo; struct strmap *additional_path_headers; @@ -1388,18 +1388,25 @@ int match_pathname(const char *pathname, int pathlen, if (fspathncmp(pattern, name, prefix)) return 0; - pattern += prefix; - patternlen -= prefix; - name += prefix; - namelen -= prefix; /* * If the whole pattern did not have a wildcard, * then our prefix match is all we need; we * do not need to call fnmatch at all. */ - if (!patternlen && !namelen) + if (patternlen == prefix && namelen == prefix) return 1; + + /* + * Retain one character of the prefix to + * pass to fnmatch, which lets it distinguish + * the start of a directory component correctly. + */ + prefix--; + pattern += prefix; + patternlen -= prefix; + name += prefix; + namelen -= prefix; } return fnmatch_icase_mem(pattern, patternlen, diff --git a/gpg-interface.c b/gpg-interface.c index 2f4f0e32cb..d1e88da8c1 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -821,8 +821,7 @@ static char *get_ssh_key_fingerprint(const char *signing_key) struct child_process ssh_keygen = CHILD_PROCESS_INIT; int ret = -1; struct strbuf fingerprint_stdout = STRBUF_INIT; - struct strbuf **fingerprint; - char *fingerprint_ret; + char *fingerprint_ret, *begin, *delim; const char *literal_key = NULL; /* @@ -845,13 +844,17 @@ static char *get_ssh_key_fingerprint(const char *signing_key) die_errno(_("failed to get the ssh fingerprint for key '%s'"), signing_key); - fingerprint = strbuf_split_max(&fingerprint_stdout, ' ', 3); - if (!fingerprint[1]) - die_errno(_("failed to get the ssh fingerprint for key '%s'"), + begin = fingerprint_stdout.buf; + delim = strchr(begin, ' '); + if (!delim) + die(_("failed to get the ssh fingerprint for key %s"), signing_key); - - fingerprint_ret = strbuf_detach(fingerprint[1], NULL); - strbuf_list_free(fingerprint); + begin = delim + 1; + delim = strchr(begin, ' '); + if (!delim) + die(_("failed to get the ssh fingerprint for key %s"), + signing_key); + fingerprint_ret = xmemdupz(begin, delim - begin); strbuf_release(&fingerprint_stdout); return fingerprint_ret; } @@ -862,12 +865,12 @@ static char *get_default_ssh_signing_key(void) struct child_process ssh_default_key = CHILD_PROCESS_INIT; int ret = -1; struct strbuf key_stdout = STRBUF_INIT, key_stderr = STRBUF_INIT; - struct strbuf **keys; char *key_command = NULL; const char **argv; int n; char *default_key = NULL; const char *literal_key = NULL; + char *begin, *new_line, *first_line; if (!ssh_default_key_command) die(_("either user.signingkey or gpg.ssh.defaultKeyCommand needs to be configured")); @@ -884,19 +887,24 @@ static char *get_default_ssh_signing_key(void) &key_stderr, 0); if (!ret) { - keys = strbuf_split_max(&key_stdout, '\n', 2); - if (keys[0] && is_literal_ssh_key(keys[0]->buf, &literal_key)) { + begin = key_stdout.buf; + new_line = strchr(begin, '\n'); + if (new_line) + first_line = xmemdupz(begin, new_line - begin); + else + first_line = xstrdup(begin); + if (is_literal_ssh_key(first_line, &literal_key)) { /* * We only use `is_literal_ssh_key` here to check validity * The prefix will be stripped when the key is used. */ - default_key = strbuf_detach(keys[0], NULL); + default_key = first_line; } else { + free(first_line); warning(_("gpg.ssh.defaultKeyCommand succeeded but returned no keys: %s %s"), key_stderr.buf, key_stdout.buf); } - strbuf_list_free(keys); } else { warning(_("gpg.ssh.defaultKeyCommand failed: %s %s"), key_stderr.buf, key_stdout.buf); diff --git a/http-backend.c b/http-backend.c index 9084058f1e..52f0483dd3 100644 --- a/http-backend.c +++ b/http-backend.c @@ -603,19 +603,18 @@ static void get_head(struct strbuf *hdr, char *arg UNUSED) static void get_info_packs(struct strbuf *hdr, char *arg UNUSED) { size_t objdirlen = strlen(repo_get_object_directory(the_repository)); - struct packfile_store *packs = the_repository->objects->packfiles; struct strbuf buf = STRBUF_INIT; struct packed_git *p; size_t cnt = 0; select_getanyfile(hdr); - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (p->pack_local) cnt++; } strbuf_grow(&buf, cnt * 53 + 2); - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (p->pack_local) strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6); } @@ -2416,7 +2416,6 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url) static int fetch_and_setup_pack_index(struct packed_git **packs_head, unsigned char *sha1, const char *base_url) { - struct packfile_store *packs = the_repository->objects->packfiles; struct packed_git *new_pack, *p; char *tmp_idx = NULL; int ret; @@ -2425,7 +2424,7 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head, * If we already have the pack locally, no need to fetch its index or * even add it to list; we already have all of its objects. */ - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(the_repository, p) { if (hasheq(p->hash, sha1, the_repository->hash_algo)) return 0; } diff --git a/meson.build b/meson.build index 308798e861..2b763f7c53 100644 --- a/meson.build +++ b/meson.build @@ -463,6 +463,12 @@ libgit_sources = [ 'reftable/tree.c', 'reftable/writer.c', 'remote.c', + 'repack.c', + 'repack-cruft.c', + 'repack-filtered.c', + 'repack-geometry.c', + 'repack-midx.c', + 'repack-promisor.c', 'replace-object.c', 'repo-settings.c', 'repository.c', diff --git a/object-name.c b/object-name.c index f6902e140d..766c757042 100644 --- a/object-name.c +++ b/object-name.c @@ -213,9 +213,11 @@ static void find_short_packed_object(struct disambiguate_state *ds) unique_in_midx(m, ds); } - for (p = packfile_store_get_packs(ds->repo->objects->packfiles); p && !ds->ambiguous; - p = p->next) + repo_for_each_pack(ds->repo, p) { + if (ds->ambiguous) + break; unique_in_pack(p, ds); + } } static int finish_object_disambiguation(struct disambiguate_state *ds, @@ -805,7 +807,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad) find_abbrev_len_for_midx(m, mad); } - for (p = packfile_store_get_packs(mad->repo->objects->packfiles); p; p = p->next) + repo_for_each_pack(mad->repo, p) find_abbrev_len_for_pack(p, mad); } diff --git a/pack-bitmap.c b/pack-bitmap.c index ac71035d77..291e1a9cf4 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -664,7 +664,7 @@ static int open_pack_bitmap(struct repository *r, struct packed_git *p; int ret = -1; - for (p = packfile_store_get_all_packs(r->objects->packfiles); p; p = p->next) { + repo_for_each_pack(r, p) { if (open_pack_bitmap_1(bitmap_git, p) == 0) { ret = 0; /* @@ -3347,6 +3347,7 @@ static int verify_bitmap_file(const struct git_hash_algo *algop, int verify_bitmap_files(struct repository *r) { struct odb_source *source; + struct packed_git *p; int res = 0; odb_prepare_alternates(r->objects); @@ -3362,8 +3363,7 @@ int verify_bitmap_files(struct repository *r) free(midx_bitmap_name); } - for (struct packed_git *p = packfile_store_get_all_packs(r->objects->packfiles); - p; p = p->next) { + repo_for_each_pack(r, p) { char *pack_bitmap_name = pack_bitmap_filename(p); res |= verify_bitmap_file(r->hash_algo, pack_bitmap_name); free(pack_bitmap_name); diff --git a/pack-objects.c b/pack-objects.c index 9d6ee72569..48510dd343 100644 --- a/pack-objects.c +++ b/pack-objects.c @@ -87,7 +87,6 @@ struct object_entry *packlist_find(struct packing_data *pdata, static void prepare_in_pack_by_idx(struct packing_data *pdata) { - struct packfile_store *packs = pdata->repo->objects->packfiles; struct packed_git **mapping, *p; int cnt = 0, nr = 1U << OE_IN_PACK_BITS; @@ -97,13 +96,13 @@ static void prepare_in_pack_by_idx(struct packing_data *pdata) * (i.e. in_pack_idx also zero) should return NULL. */ mapping[cnt++] = NULL; - for (p = packfile_store_get_all_packs(packs); p; p = p->next, cnt++) { + repo_for_each_pack(pdata->repo, p) { if (cnt == nr) { free(mapping); return; } p->index = cnt; - mapping[cnt] = p; + mapping[cnt++] = p; } pdata->in_pack_by_idx = mapping; } diff --git a/packfile.c b/packfile.c index 5a7caec292..1ae2b2fe1e 100644 --- a/packfile.c +++ b/packfile.c @@ -1030,12 +1030,6 @@ void packfile_store_reprepare(struct packfile_store *store) struct packed_git *packfile_store_get_packs(struct packfile_store *store) { packfile_store_prepare(store); - return store->packs; -} - -struct packed_git *packfile_store_get_all_packs(struct packfile_store *store) -{ - packfile_store_prepare(store); for (struct odb_source *source = store->odb->sources; source; source = source->next) { struct multi_pack_index *m = source->midx; @@ -2105,7 +2099,7 @@ struct packed_git **kept_pack_cache(struct repository *r, unsigned flags) * covers, one kept and one not kept, but the midx returns only * the non-kept version. */ - for (p = packfile_store_get_all_packs(r->objects->packfiles); p; p = p->next) { + repo_for_each_pack(r, p) { if ((p->pack_keep && (flags & ON_DISK_KEEP_PACKS)) || (p->pack_keep_in_core && (flags & IN_CORE_KEEP_PACKS))) { ALLOC_GROW(packs, nr + 1, alloc); @@ -2202,7 +2196,7 @@ int for_each_packed_object(struct repository *repo, each_packed_object_fn cb, int r = 0; int pack_errors = 0; - for (p = packfile_store_get_all_packs(repo->objects->packfiles); p; p = p->next) { + repo_for_each_pack(repo, p) { if ((flags & FOR_EACH_OBJECT_LOCAL_ONLY) && !p->pack_local) continue; if ((flags & FOR_EACH_OBJECT_PROMISOR_ONLY) && diff --git a/packfile.h b/packfile.h index e7a5792b6c..c9d0b93446 100644 --- a/packfile.h +++ b/packfile.h @@ -137,16 +137,18 @@ void packfile_store_add_pack(struct packfile_store *store, struct packed_git *pack); /* - * Get packs managed by the given store. Does not load the MIDX or any packs - * referenced by it. + * Load and iterate through all packs of the given repository. This helper + * function will yield packfiles from all object sources connected to the + * repository. */ -struct packed_git *packfile_store_get_packs(struct packfile_store *store); +#define repo_for_each_pack(repo, p) \ + for (p = packfile_store_get_packs(repo->objects->packfiles); p; p = p->next) /* * Get all packs managed by the given store, including packfiles that are * referenced by multi-pack indices. */ -struct packed_git *packfile_store_get_all_packs(struct packfile_store *store); +struct packed_git *packfile_store_get_packs(struct packfile_store *store); /* * Get all packs in most-recently-used order. diff --git a/ref-filter.c b/ref-filter.c index 520d2539c9..30cc488d8a 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -2664,7 +2664,7 @@ static int match_name_as_path(const char **pattern, const char *refname, /* Return 1 if the refname matches one of the patterns, otherwise 0. */ static int filter_pattern_match(struct ref_filter *filter, const char *refname) { - if (!*filter->name_patterns) + if (!filter->name_patterns || !*filter->name_patterns) return 1; /* No pattern always matches */ if (filter->match_as_path) return match_name_as_path(filter->name_patterns, refname, @@ -2751,7 +2751,7 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter, return for_each_fullref_with_seek(filter, cb, cb_data, 0); } - if (!filter->name_patterns[0]) { + if (!filter->name_patterns || !filter->name_patterns[0]) { /* no patterns; we have to look at everything */ return for_each_fullref_with_seek(filter, cb, cb_data, 0); } @@ -2833,7 +2833,7 @@ struct ref_array_item *ref_array_push(struct ref_array *array, return ref; } -static int ref_kind_from_refname(const char *refname) +int ref_kind_from_refname(const char *refname) { unsigned int i; diff --git a/ref-filter.h b/ref-filter.h index 81f2c229a9..235c60f79c 100644 --- a/ref-filter.h +++ b/ref-filter.h @@ -135,6 +135,8 @@ struct ref_format { OPT_STRVEC(0, "exclude", &(var)->exclude, \ N_("pattern"), N_("exclude refs which match pattern")) +/* Get the reference kind from the provided reference name. */ +int ref_kind_from_refname(const char *refname); /* * API for filtering a set of refs. Based on the type of refs the user * has requested, we iterate through those refs and apply filters diff --git a/refs/debug.c b/refs/debug.c index 697adbd0dc..c59c1728a3 100644 --- a/refs/debug.c +++ b/refs/debug.c @@ -47,6 +47,14 @@ static int debug_create_on_disk(struct ref_store *refs, int flags, struct strbuf return res; } +static int debug_remove_on_disk(struct ref_store *refs, struct strbuf *err) +{ + struct debug_ref_store *drefs = (struct debug_ref_store *)refs; + int res = drefs->refs->be->remove_on_disk(drefs->refs, err); + trace_printf_key(&trace_refs, "remove_on_disk: %d\n", res); + return res; +} + static int debug_transaction_prepare(struct ref_store *refs, struct ref_transaction *transaction, struct strbuf *err) @@ -432,6 +440,7 @@ struct ref_storage_be refs_be_debug = { .init = NULL, .release = debug_release, .create_on_disk = debug_create_on_disk, + .remove_on_disk = debug_remove_on_disk, /* * None of these should be NULL. If the "files" backend (in diff --git a/refs/files-backend.c b/refs/files-backend.c index 8d7007f4aa..054cf42f4e 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2113,20 +2113,35 @@ static int commit_ref_update(struct files_ref_store *refs, return 0; } -#ifdef NO_SYMLINK_HEAD +#if defined(NO_SYMLINK_HEAD) || defined(WITH_BREAKING_CHANGES) #define create_ref_symlink(a, b) (-1) #else static int create_ref_symlink(struct ref_lock *lock, const char *target) { + static int warn_once = 1; + char *ref_path; int ret = -1; - char *ref_path = get_locked_file_path(&lock->lk); + ref_path = get_locked_file_path(&lock->lk); unlink(ref_path); ret = symlink(target, ref_path); free(ref_path); if (ret) fprintf(stderr, "no symlink - falling back to symbolic ref\n"); + + if (warn_once) + warning(_("'core.preferSymlinkRefs=true' is nominated for removal.\n" + "hint: The use of symbolic links for symbolic refs is deprecated\n" + "hint: and will be removed in Git 3.0. The configuration that\n" + "hint: tells Git to use them is thus going away. You can unset\n" + "hint: it with:\n" + "hint:\n" + "hint:\tgit config unset core.preferSymlinkRefs\n" + "hint:\n" + "hint: Git will then use the textual symref format instead.")); + warn_once = 0; + return ret; } #endif diff --git a/repack-cruft.c b/repack-cruft.c new file mode 100644 index 0000000000..0653e88792 --- /dev/null +++ b/repack-cruft.c @@ -0,0 +1,98 @@ +#include "git-compat-util.h" +#include "repack.h" +#include "packfile.h" +#include "repository.h" +#include "run-command.h" + +static void combine_small_cruft_packs(FILE *in, off_t combine_cruft_below_size, + struct existing_packs *existing) +{ + struct packed_git *p; + struct strbuf buf = STRBUF_INIT; + size_t i; + + repo_for_each_pack(existing->repo, p) { + if (!(p->is_cruft && p->pack_local)) + continue; + + strbuf_reset(&buf); + strbuf_addstr(&buf, pack_basename(p)); + strbuf_strip_suffix(&buf, ".pack"); + + if (!string_list_has_string(&existing->cruft_packs, buf.buf)) + continue; + + if (p->pack_size < combine_cruft_below_size) { + fprintf(in, "-%s\n", pack_basename(p)); + } else { + existing_packs_retain_cruft(existing, p); + fprintf(in, "%s\n", pack_basename(p)); + } + } + + for (i = 0; i < existing->non_kept_packs.nr; i++) + fprintf(in, "-%s.pack\n", + existing->non_kept_packs.items[i].string); + + strbuf_release(&buf); +} + +int write_cruft_pack(const struct write_pack_opts *opts, + const char *cruft_expiration, + unsigned long combine_cruft_below_size, + struct string_list *names, + struct existing_packs *existing) +{ + struct child_process cmd = CHILD_PROCESS_INIT; + struct string_list_item *item; + FILE *in; + int ret; + const char *pack_prefix = write_pack_opts_pack_prefix(opts); + + prepare_pack_objects(&cmd, opts->po_args, opts->destination); + + strvec_push(&cmd.args, "--cruft"); + if (cruft_expiration) + strvec_pushf(&cmd.args, "--cruft-expiration=%s", + cruft_expiration); + + strvec_push(&cmd.args, "--non-empty"); + + cmd.in = -1; + + ret = start_command(&cmd); + if (ret) + return ret; + + /* + * names has a confusing double use: it both provides the list + * of just-written new packs, and accepts the name of the cruft + * pack we are writing. + * + * By the time it is read here, it contains only the pack(s) + * that were just written, which is exactly the set of packs we + * want to consider kept. + * + * If `--expire-to` is given, the double-use served by `names` + * ensures that the pack written to `--expire-to` excludes any + * objects contained in the cruft pack. + */ + in = xfdopen(cmd.in, "w"); + for_each_string_list_item(item, names) + fprintf(in, "%s-%s.pack\n", pack_prefix, item->string); + if (combine_cruft_below_size && !cruft_expiration) { + combine_small_cruft_packs(in, combine_cruft_below_size, + existing); + } else { + for_each_string_list_item(item, &existing->non_kept_packs) + fprintf(in, "-%s.pack\n", item->string); + for_each_string_list_item(item, &existing->cruft_packs) + fprintf(in, "-%s.pack\n", item->string); + } + for_each_string_list_item(item, &existing->kept_packs) + fprintf(in, "%s.pack\n", item->string); + fclose(in); + + return finish_pack_objects_cmd(existing->repo->hash_algo, opts, &cmd, + names); +} diff --git a/repack-filtered.c b/repack-filtered.c new file mode 100644 index 0000000000..edcf7667c5 --- /dev/null +++ b/repack-filtered.c @@ -0,0 +1,51 @@ +#include "git-compat-util.h" +#include "repack.h" +#include "repository.h" +#include "run-command.h" +#include "string-list.h" + +int write_filtered_pack(const struct write_pack_opts *opts, + struct existing_packs *existing, + struct string_list *names) +{ + struct child_process cmd = CHILD_PROCESS_INIT; + struct string_list_item *item; + FILE *in; + int ret; + const char *caret; + const char *pack_prefix = write_pack_opts_pack_prefix(opts); + + prepare_pack_objects(&cmd, opts->po_args, opts->destination); + + strvec_push(&cmd.args, "--stdin-packs"); + + for_each_string_list_item(item, &existing->kept_packs) + strvec_pushf(&cmd.args, "--keep-pack=%s", item->string); + + cmd.in = -1; + + ret = start_command(&cmd); + if (ret) + return ret; + + /* + * Here 'names' contains only the pack(s) that were just + * written, which is exactly the packs we want to keep. Also + * 'existing_kept_packs' already contains the packs in + * 'keep_pack_list'. + */ + in = xfdopen(cmd.in, "w"); + for_each_string_list_item(item, names) + fprintf(in, "^%s-%s.pack\n", pack_prefix, item->string); + for_each_string_list_item(item, &existing->non_kept_packs) + fprintf(in, "%s.pack\n", item->string); + for_each_string_list_item(item, &existing->cruft_packs) + fprintf(in, "%s.pack\n", item->string); + caret = opts->po_args->pack_kept_objects ? "" : "^"; + for_each_string_list_item(item, &existing->kept_packs) + fprintf(in, "%s%s.pack\n", caret, item->string); + fclose(in); + + return finish_pack_objects_cmd(existing->repo->hash_algo, opts, &cmd, + names); +} diff --git a/repack-geometry.c b/repack-geometry.c new file mode 100644 index 0000000000..b3e32cd07e --- /dev/null +++ b/repack-geometry.c @@ -0,0 +1,232 @@ +#define DISABLE_SIGN_COMPARE_WARNINGS + +#include "git-compat-util.h" +#include "repack.h" +#include "repository.h" +#include "hex.h" +#include "packfile.h" + +static uint32_t pack_geometry_weight(struct packed_git *p) +{ + if (open_pack_index(p)) + die(_("cannot open index for %s"), p->pack_name); + return p->num_objects; +} + +static int pack_geometry_cmp(const void *va, const void *vb) +{ + uint32_t aw = pack_geometry_weight(*(struct packed_git **)va), + bw = pack_geometry_weight(*(struct packed_git **)vb); + + if (aw < bw) + return -1; + if (aw > bw) + return 1; + return 0; +} + +void pack_geometry_init(struct pack_geometry *geometry, + struct existing_packs *existing, + const struct pack_objects_args *args) +{ + struct packed_git *p; + struct strbuf buf = STRBUF_INIT; + + repo_for_each_pack(existing->repo, p) { + if (args->local && !p->pack_local) + /* + * When asked to only repack local packfiles we skip + * over any packfiles that are borrowed from alternate + * object directories. + */ + continue; + + if (!args->pack_kept_objects) { + /* + * Any pack that has its pack_keep bit set will + * appear in existing->kept_packs below, but + * this saves us from doing a more expensive + * check. + */ + if (p->pack_keep) + continue; + + /* + * The pack may be kept via the --keep-pack + * option; check 'existing->kept_packs' to + * determine whether to ignore it. + */ + strbuf_reset(&buf); + strbuf_addstr(&buf, pack_basename(p)); + strbuf_strip_suffix(&buf, ".pack"); + + if (string_list_has_string(&existing->kept_packs, buf.buf)) + continue; + } + if (p->is_cruft) + continue; + + ALLOC_GROW(geometry->pack, + geometry->pack_nr + 1, + geometry->pack_alloc); + + geometry->pack[geometry->pack_nr] = p; + geometry->pack_nr++; + } + + QSORT(geometry->pack, geometry->pack_nr, pack_geometry_cmp); + strbuf_release(&buf); +} + +void pack_geometry_split(struct pack_geometry *geometry) +{ + uint32_t i; + uint32_t split; + off_t total_size = 0; + + if (!geometry->pack_nr) { + geometry->split = geometry->pack_nr; + return; + } + + /* + * First, count the number of packs (in descending order of size) which + * already form a geometric progression. + */ + for (i = geometry->pack_nr - 1; i > 0; i--) { + struct packed_git *ours = geometry->pack[i]; + struct packed_git *prev = geometry->pack[i - 1]; + + if (unsigned_mult_overflows(geometry->split_factor, + pack_geometry_weight(prev))) + die(_("pack %s too large to consider in geometric " + "progression"), + prev->pack_name); + + if (pack_geometry_weight(ours) < + geometry->split_factor * pack_geometry_weight(prev)) + break; + } + + split = i; + + if (split) { + /* + * Move the split one to the right, since the top element in the + * last-compared pair can't be in the progression. Only do this + * when we split in the middle of the array (otherwise if we got + * to the end, then the split is in the right place). + */ + split++; + } + + /* + * Then, anything to the left of 'split' must be in a new pack. But, + * creating that new pack may cause packs in the heavy half to no longer + * form a geometric progression. + * + * Compute an expected size of the new pack, and then determine how many + * packs in the heavy half need to be joined into it (if any) to restore + * the geometric progression. + */ + for (i = 0; i < split; i++) { + struct packed_git *p = geometry->pack[i]; + + if (unsigned_add_overflows(total_size, pack_geometry_weight(p))) + die(_("pack %s too large to roll up"), p->pack_name); + total_size += pack_geometry_weight(p); + } + for (i = split; i < geometry->pack_nr; i++) { + struct packed_git *ours = geometry->pack[i]; + + if (unsigned_mult_overflows(geometry->split_factor, + total_size)) + die(_("pack %s too large to roll up"), ours->pack_name); + + if (pack_geometry_weight(ours) < + geometry->split_factor * total_size) { + if (unsigned_add_overflows(total_size, + pack_geometry_weight(ours))) + die(_("pack %s too large to roll up"), + ours->pack_name); + + split++; + total_size += pack_geometry_weight(ours); + } else + break; + } + + geometry->split = split; +} + +struct packed_git *pack_geometry_preferred_pack(struct pack_geometry *geometry) +{ + uint32_t i; + + if (!geometry) { + /* + * No geometry means either an all-into-one repack (in which + * case there is only one pack left and it is the largest) or an + * incremental one. + * + * If repacking incrementally, then we could check the size of + * all packs to determine which should be preferred, but leave + * this for later. + */ + return NULL; + } + if (geometry->split == geometry->pack_nr) + return NULL; + + /* + * The preferred pack is the largest pack above the split line. In + * other words, it is the largest pack that does not get rolled up in + * the geometric repack. + */ + for (i = geometry->pack_nr; i > geometry->split; i--) + /* + * A pack that is not local would never be included in a + * multi-pack index. We thus skip over any non-local packs. + */ + if (geometry->pack[i - 1]->pack_local) + return geometry->pack[i - 1]; + + return NULL; +} + +void pack_geometry_remove_redundant(struct pack_geometry *geometry, + struct string_list *names, + struct existing_packs *existing, + const char *packdir) +{ + const struct git_hash_algo *algop = existing->repo->hash_algo; + struct strbuf buf = STRBUF_INIT; + uint32_t i; + + for (i = 0; i < geometry->split; i++) { + struct packed_git *p = geometry->pack[i]; + if (string_list_has_string(names, hash_to_hex_algop(p->hash, + algop))) + continue; + + strbuf_reset(&buf); + strbuf_addstr(&buf, pack_basename(p)); + strbuf_strip_suffix(&buf, ".pack"); + + if ((p->pack_keep) || + (string_list_has_string(&existing->kept_packs, buf.buf))) + continue; + + repack_remove_redundant_pack(existing->repo, packdir, buf.buf); + } + + strbuf_release(&buf); +} + +void pack_geometry_release(struct pack_geometry *geometry) +{ + if (!geometry) + return; + + free(geometry->pack); +} diff --git a/repack-midx.c b/repack-midx.c new file mode 100644 index 0000000000..6f6202c5bc --- /dev/null +++ b/repack-midx.c @@ -0,0 +1,372 @@ +#include "git-compat-util.h" +#include "repack.h" +#include "hash.h" +#include "hex.h" +#include "odb.h" +#include "oidset.h" +#include "pack-bitmap.h" +#include "refs.h" +#include "run-command.h" +#include "tempfile.h" + +struct midx_snapshot_ref_data { + struct repository *repo; + struct tempfile *f; + struct oidset seen; + int preferred; +}; + +static int midx_snapshot_ref_one(const char *refname UNUSED, + const char *referent UNUSED, + const struct object_id *oid, + int flag UNUSED, void *_data) +{ + struct midx_snapshot_ref_data *data = _data; + struct object_id peeled; + + if (!peel_iterated_oid(data->repo, oid, &peeled)) + oid = &peeled; + + if (oidset_insert(&data->seen, oid)) + return 0; /* already seen */ + + if (odb_read_object_info(data->repo->objects, oid, NULL) != OBJ_COMMIT) + return 0; + + fprintf(data->f->fp, "%s%s\n", data->preferred ? "+" : "", + oid_to_hex(oid)); + + return 0; +} + +void midx_snapshot_refs(struct repository *repo, struct tempfile *f) +{ + struct midx_snapshot_ref_data data; + const struct string_list *preferred = bitmap_preferred_tips(repo); + + data.repo = repo; + data.f = f; + data.preferred = 0; + oidset_init(&data.seen, 0); + + if (!fdopen_tempfile(f, "w")) + die(_("could not open tempfile %s for writing"), + get_tempfile_path(f)); + + if (preferred) { + struct string_list_item *item; + + data.preferred = 1; + for_each_string_list_item(item, preferred) + refs_for_each_ref_in(get_main_ref_store(repo), + item->string, + midx_snapshot_ref_one, &data); + data.preferred = 0; + } + + refs_for_each_ref(get_main_ref_store(repo), + midx_snapshot_ref_one, &data); + + if (close_tempfile_gently(f)) { + int save_errno = errno; + delete_tempfile(&f); + errno = save_errno; + die_errno(_("could not close refs snapshot tempfile")); + } + + oidset_clear(&data.seen); +} + +static int midx_has_unknown_packs(struct string_list *include, + struct pack_geometry *geometry, + struct existing_packs *existing) +{ + struct string_list_item *item; + + string_list_sort(include); + + for_each_string_list_item(item, &existing->midx_packs) { + const char *pack_name = item->string; + + /* + * Determine whether or not each MIDX'd pack from the existing + * MIDX (if any) is represented in the new MIDX. For each pack + * in the MIDX, it must either be: + * + * - In the "include" list of packs to be included in the new + * MIDX. Note this function is called before the include + * list is populated with any cruft pack(s). + * + * - Below the geometric split line (if using pack geometry), + * indicating that the pack won't be included in the new + * MIDX, but its contents were rolled up as part of the + * geometric repack. + * + * - In the existing non-kept packs list (if not using pack + * geometry), and marked as non-deleted. + */ + if (string_list_has_string(include, pack_name)) { + continue; + } else if (geometry) { + struct strbuf buf = STRBUF_INIT; + uint32_t j; + + for (j = 0; j < geometry->split; j++) { + strbuf_reset(&buf); + strbuf_addstr(&buf, pack_basename(geometry->pack[j])); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".idx"); + + if (!strcmp(pack_name, buf.buf)) { + strbuf_release(&buf); + break; + } + } + + strbuf_release(&buf); + + if (j < geometry->split) + continue; + } else { + struct string_list_item *item; + + item = string_list_lookup(&existing->non_kept_packs, + pack_name); + if (item && !existing_pack_is_marked_for_deletion(item)) + continue; + } + + /* + * If we got to this point, the MIDX includes some pack that we + * don't know about. + */ + return 1; + } + + return 0; +} + +static void midx_included_packs(struct string_list *include, + struct repack_write_midx_opts *opts) +{ + struct existing_packs *existing = opts->existing; + struct pack_geometry *geometry = opts->geometry; + struct string_list *names = opts->names; + struct string_list_item *item; + struct strbuf buf = STRBUF_INIT; + + for_each_string_list_item(item, &existing->kept_packs) { + strbuf_reset(&buf); + strbuf_addf(&buf, "%s.idx", item->string); + string_list_insert(include, buf.buf); + } + + for_each_string_list_item(item, names) { + strbuf_reset(&buf); + strbuf_addf(&buf, "pack-%s.idx", item->string); + string_list_insert(include, buf.buf); + } + + if (geometry->split_factor) { + uint32_t i; + + for (i = geometry->split; i < geometry->pack_nr; i++) { + struct packed_git *p = geometry->pack[i]; + + /* + * The multi-pack index never refers to packfiles part + * of an alternate object database, so we skip these. + * While git-multi-pack-index(1) would silently ignore + * them anyway, this allows us to skip executing the + * command completely when we have only non-local + * packfiles. + */ + if (!p->pack_local) + continue; + + strbuf_reset(&buf); + strbuf_addstr(&buf, pack_basename(p)); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".idx"); + + string_list_insert(include, buf.buf); + } + } else { + for_each_string_list_item(item, &existing->non_kept_packs) { + if (existing_pack_is_marked_for_deletion(item)) + continue; + + strbuf_reset(&buf); + strbuf_addf(&buf, "%s.idx", item->string); + string_list_insert(include, buf.buf); + } + } + + if (opts->midx_must_contain_cruft || + midx_has_unknown_packs(include, geometry, existing)) { + /* + * If there are one or more unknown pack(s) present (see + * midx_has_unknown_packs() for what makes a pack + * "unknown") in the MIDX before the repack, keep them + * as they may be required to form a reachability + * closure if the MIDX is bitmapped. + * + * For example, a cruft pack can be required to form a + * reachability closure if the MIDX is bitmapped and one + * or more of the bitmap's selected commits reaches a + * once-cruft object that was later made reachable. + */ + for_each_string_list_item(item, &existing->cruft_packs) { + /* + * When doing a --geometric repack, there is no + * need to check for deleted packs, since we're + * by definition not doing an ALL_INTO_ONE + * repack (hence no packs will be deleted). + * Otherwise we must check for and exclude any + * packs which are enqueued for deletion. + * + * So we could omit the conditional below in the + * --geometric case, but doing so is unnecessary + * since no packs are marked as pending + * deletion (since we only call + * `existing_packs_mark_for_deletion()` when + * doing an all-into-one repack). + */ + if (existing_pack_is_marked_for_deletion(item)) + continue; + + strbuf_reset(&buf); + strbuf_addf(&buf, "%s.idx", item->string); + string_list_insert(include, buf.buf); + } + } else { + /* + * Modern versions of Git (with the appropriate + * configuration setting) will write new copies of + * once-cruft objects when doing a --geometric repack. + * + * If the MIDX has no cruft pack, new packs written + * during a --geometric repack will not rely on the + * cruft pack to form a reachability closure, so we can + * avoid including them in the MIDX in that case. + */ + ; + } + + strbuf_release(&buf); +} + +static void remove_redundant_bitmaps(struct string_list *include, + const char *packdir) +{ + struct strbuf path = STRBUF_INIT; + struct string_list_item *item; + size_t packdir_len; + + strbuf_addstr(&path, packdir); + strbuf_addch(&path, '/'); + packdir_len = path.len; + + /* + * Remove any pack bitmaps corresponding to packs which are now + * included in the MIDX. + */ + for_each_string_list_item(item, include) { + strbuf_addstr(&path, item->string); + strbuf_strip_suffix(&path, ".idx"); + strbuf_addstr(&path, ".bitmap"); + + if (unlink(path.buf) && errno != ENOENT) + warning_errno(_("could not remove stale bitmap: %s"), + path.buf); + + strbuf_setlen(&path, packdir_len); + } + strbuf_release(&path); +} + +int write_midx_included_packs(struct repack_write_midx_opts *opts) +{ + struct child_process cmd = CHILD_PROCESS_INIT; + struct string_list include = STRING_LIST_INIT_DUP; + struct string_list_item *item; + struct packed_git *preferred = pack_geometry_preferred_pack(opts->geometry); + FILE *in; + int ret = 0; + + midx_included_packs(&include, opts); + if (!include.nr) + goto done; + + cmd.in = -1; + cmd.git_cmd = 1; + + strvec_push(&cmd.args, "multi-pack-index"); + strvec_pushl(&cmd.args, "write", "--stdin-packs", NULL); + + if (opts->show_progress) + strvec_push(&cmd.args, "--progress"); + else + strvec_push(&cmd.args, "--no-progress"); + + if (opts->write_bitmaps) + strvec_push(&cmd.args, "--bitmap"); + + if (preferred) + strvec_pushf(&cmd.args, "--preferred-pack=%s", + pack_basename(preferred)); + else if (opts->names->nr) { + /* The largest pack was repacked, meaning that either + * one or two packs exist depending on whether the + * repository has a cruft pack or not. + * + * Select the non-cruft one as preferred to encourage + * pack-reuse among packs containing reachable objects + * over unreachable ones. + * + * (Note we could write multiple packs here if + * `--max-pack-size` was given, but any one of them + * will suffice, so pick the first one.) + */ + for_each_string_list_item(item, opts->names) { + struct generated_pack *pack = item->util; + if (generated_pack_has_ext(pack, ".mtimes")) + continue; + + strvec_pushf(&cmd.args, "--preferred-pack=pack-%s.pack", + item->string); + break; + } + } else { + /* + * No packs were kept, and no packs were written. The + * only thing remaining are .keep packs (unless + * --pack-kept-objects was given). + * + * Set the `--preferred-pack` arbitrarily here. + */ + ; + } + + if (opts->refs_snapshot) + strvec_pushf(&cmd.args, "--refs-snapshot=%s", + opts->refs_snapshot); + + ret = start_command(&cmd); + if (ret) + goto done; + + in = xfdopen(cmd.in, "w"); + for_each_string_list_item(item, &include) + fprintf(in, "%s\n", item->string); + fclose(in); + + ret = finish_command(&cmd); +done: + if (!ret && opts->write_bitmaps) + remove_redundant_bitmaps(&include, opts->packdir); + + string_list_clear(&include, 0); + + return ret; +} diff --git a/repack-promisor.c b/repack-promisor.c new file mode 100644 index 0000000000..ee6e0669f6 --- /dev/null +++ b/repack-promisor.c @@ -0,0 +1,102 @@ +#include "git-compat-util.h" +#include "repack.h" +#include "hex.h" +#include "pack.h" +#include "packfile.h" +#include "path.h" +#include "repository.h" +#include "run-command.h" + +struct write_oid_context { + struct child_process *cmd; + const struct git_hash_algo *algop; +}; + +/* + * Write oid to the given struct child_process's stdin, starting it first if + * necessary. + */ +static int write_oid(const struct object_id *oid, + struct packed_git *pack UNUSED, + uint32_t pos UNUSED, void *data) +{ + struct write_oid_context *ctx = data; + struct child_process *cmd = ctx->cmd; + + if (cmd->in == -1) { + if (start_command(cmd)) + die(_("could not start pack-objects to repack promisor objects")); + } + + if (write_in_full(cmd->in, oid_to_hex(oid), ctx->algop->hexsz) < 0 || + write_in_full(cmd->in, "\n", 1) < 0) + die(_("failed to feed promisor objects to pack-objects")); + return 0; +} + +void repack_promisor_objects(struct repository *repo, + const struct pack_objects_args *args, + struct string_list *names, const char *packtmp) +{ + struct write_oid_context ctx; + struct child_process cmd = CHILD_PROCESS_INIT; + FILE *out; + struct strbuf line = STRBUF_INIT; + + prepare_pack_objects(&cmd, args, packtmp); + cmd.in = -1; + + /* + * NEEDSWORK: Giving pack-objects only the OIDs without any ordering + * hints may result in suboptimal deltas in the resulting pack. See if + * the OIDs can be sent with fake paths such that pack-objects can use a + * {type -> existing pack order} ordering when computing deltas instead + * of a {type -> size} ordering, which may produce better deltas. + */ + ctx.cmd = &cmd; + ctx.algop = repo->hash_algo; + for_each_packed_object(repo, write_oid, &ctx, + FOR_EACH_OBJECT_PROMISOR_ONLY); + + if (cmd.in == -1) { + /* No packed objects; cmd was never started */ + child_process_clear(&cmd); + return; + } + + close(cmd.in); + + out = xfdopen(cmd.out, "r"); + while (strbuf_getline_lf(&line, out) != EOF) { + struct string_list_item *item; + char *promisor_name; + + if (line.len != repo->hash_algo->hexsz) + die(_("repack: Expecting full hex object ID lines only from pack-objects.")); + item = string_list_append(names, line.buf); + + /* + * pack-objects creates the .pack and .idx files, but not the + * .promisor file. Create the .promisor file, which is empty. + * + * NEEDSWORK: fetch-pack sometimes generates non-empty + * .promisor files containing the ref names and associated + * hashes at the point of generation of the corresponding + * packfile, but this would not preserve their contents. Maybe + * concatenate the contents of all .promisor files instead of + * just creating a new empty file. + */ + promisor_name = mkpathdup("%s-%s.promisor", packtmp, + line.buf); + write_promisor_file(promisor_name, NULL, 0); + + item->util = generated_pack_populate(item->string, packtmp); + + free(promisor_name); + } + + fclose(out); + if (finish_command(&cmd)) + die(_("could not finish pack-objects to repack promisor objects")); + strbuf_release(&line); +} diff --git a/repack.c b/repack.c new file mode 100644 index 0000000000..596841027a --- /dev/null +++ b/repack.c @@ -0,0 +1,359 @@ +#include "git-compat-util.h" +#include "dir.h" +#include "midx.h" +#include "odb.h" +#include "packfile.h" +#include "path.h" +#include "repack.h" +#include "repository.h" +#include "run-command.h" +#include "tempfile.h" + +void prepare_pack_objects(struct child_process *cmd, + const struct pack_objects_args *args, + const char *out) +{ + strvec_push(&cmd->args, "pack-objects"); + if (args->window) + strvec_pushf(&cmd->args, "--window=%s", args->window); + if (args->window_memory) + strvec_pushf(&cmd->args, "--window-memory=%s", args->window_memory); + if (args->depth) + strvec_pushf(&cmd->args, "--depth=%s", args->depth); + if (args->threads) + strvec_pushf(&cmd->args, "--threads=%s", args->threads); + if (args->max_pack_size) + strvec_pushf(&cmd->args, "--max-pack-size=%lu", args->max_pack_size); + if (args->no_reuse_delta) + strvec_pushf(&cmd->args, "--no-reuse-delta"); + if (args->no_reuse_object) + strvec_pushf(&cmd->args, "--no-reuse-object"); + if (args->name_hash_version) + strvec_pushf(&cmd->args, "--name-hash-version=%d", args->name_hash_version); + if (args->path_walk) + strvec_pushf(&cmd->args, "--path-walk"); + if (args->local) + strvec_push(&cmd->args, "--local"); + if (args->quiet) + strvec_push(&cmd->args, "--quiet"); + if (args->delta_base_offset) + strvec_push(&cmd->args, "--delta-base-offset"); + if (!args->pack_kept_objects) + strvec_push(&cmd->args, "--honor-pack-keep"); + strvec_push(&cmd->args, out); + cmd->git_cmd = 1; + cmd->out = -1; +} + +void pack_objects_args_release(struct pack_objects_args *args) +{ + free(args->window); + free(args->window_memory); + free(args->depth); + free(args->threads); + list_objects_filter_release(&args->filter_options); +} + +void repack_remove_redundant_pack(struct repository *repo, const char *dir_name, + const char *base_name) +{ + struct strbuf buf = STRBUF_INIT; + struct odb_source *source = repo->objects->sources; + struct multi_pack_index *m = get_multi_pack_index(source); + strbuf_addf(&buf, "%s.pack", base_name); + if (m && source->local && midx_contains_pack(m, buf.buf)) + clear_midx_file(repo); + strbuf_insertf(&buf, 0, "%s/", dir_name); + unlink_pack_path(buf.buf, 1); + strbuf_release(&buf); +} + +const char *write_pack_opts_pack_prefix(const struct write_pack_opts *opts) +{ + const char *pack_prefix; + if (!skip_prefix(opts->packtmp, opts->packdir, &pack_prefix)) + die(_("pack prefix %s does not begin with objdir %s"), + opts->packtmp, opts->packdir); + if (*pack_prefix == '/') + pack_prefix++; + return pack_prefix; +} + +bool write_pack_opts_is_local(const struct write_pack_opts *opts) +{ + return starts_with(opts->destination, opts->packdir); +} + +int finish_pack_objects_cmd(const struct git_hash_algo *algop, + const struct write_pack_opts *opts, + struct child_process *cmd, + struct string_list *names) +{ + FILE *out; + bool local = write_pack_opts_is_local(opts); + struct strbuf line = STRBUF_INIT; + + out = xfdopen(cmd->out, "r"); + while (strbuf_getline_lf(&line, out) != EOF) { + struct string_list_item *item; + + if (line.len != algop->hexsz) + die(_("repack: Expecting full hex object ID lines only " + "from pack-objects.")); + /* + * Avoid putting packs written outside of the repository in the + * list of names. + */ + if (local) { + item = string_list_append(names, line.buf); + item->util = generated_pack_populate(line.buf, + opts->packtmp); + } + } + fclose(out); + + strbuf_release(&line); + + return finish_command(cmd); +} + +#define DELETE_PACK 1 +#define RETAIN_PACK 2 + +void existing_packs_collect(struct existing_packs *existing, + const struct string_list *extra_keep) +{ + struct packed_git *p; + struct strbuf buf = STRBUF_INIT; + + repo_for_each_pack(existing->repo, p) { + size_t i; + const char *base; + + if (p->multi_pack_index) + string_list_append(&existing->midx_packs, + pack_basename(p)); + if (!p->pack_local) + continue; + + base = pack_basename(p); + + for (i = 0; i < extra_keep->nr; i++) + if (!fspathcmp(base, extra_keep->items[i].string)) + break; + + strbuf_reset(&buf); + strbuf_addstr(&buf, base); + strbuf_strip_suffix(&buf, ".pack"); + + if ((extra_keep->nr > 0 && i < extra_keep->nr) || p->pack_keep) + string_list_append(&existing->kept_packs, buf.buf); + else if (p->is_cruft) + string_list_append(&existing->cruft_packs, buf.buf); + else + string_list_append(&existing->non_kept_packs, buf.buf); + } + + string_list_sort(&existing->kept_packs); + string_list_sort(&existing->non_kept_packs); + string_list_sort(&existing->cruft_packs); + string_list_sort(&existing->midx_packs); + strbuf_release(&buf); +} + +int existing_packs_has_non_kept(const struct existing_packs *existing) +{ + return existing->non_kept_packs.nr || existing->cruft_packs.nr; +} + +static void existing_pack_mark_for_deletion(struct string_list_item *item) +{ + item->util = (void*)((uintptr_t)item->util | DELETE_PACK); +} + +static void existing_pack_unmark_for_deletion(struct string_list_item *item) +{ + item->util = (void*)((uintptr_t)item->util & ~DELETE_PACK); +} + +int existing_pack_is_marked_for_deletion(struct string_list_item *item) +{ + return (uintptr_t)item->util & DELETE_PACK; +} + +static void existing_packs_mark_retained(struct string_list_item *item) +{ + item->util = (void*)((uintptr_t)item->util | RETAIN_PACK); +} + +static int existing_pack_is_retained(struct string_list_item *item) +{ + return (uintptr_t)item->util & RETAIN_PACK; +} + +static void existing_packs_mark_for_deletion_1(const struct git_hash_algo *algop, + struct string_list *names, + struct string_list *list) +{ + struct string_list_item *item; + const size_t hexsz = algop->hexsz; + + for_each_string_list_item(item, list) { + char *sha1; + size_t len = strlen(item->string); + if (len < hexsz) + continue; + sha1 = item->string + len - hexsz; + + if (existing_pack_is_retained(item)) { + existing_pack_unmark_for_deletion(item); + } else if (!string_list_has_string(names, sha1)) { + /* + * Mark this pack for deletion, which ensures + * that this pack won't be included in a MIDX + * (if `--write-midx` was given) and that we + * will actually delete this pack (if `-d` was + * given). + */ + existing_pack_mark_for_deletion(item); + } + } +} + +void existing_packs_retain_cruft(struct existing_packs *existing, + struct packed_git *cruft) +{ + struct strbuf buf = STRBUF_INIT; + struct string_list_item *item; + + strbuf_addstr(&buf, pack_basename(cruft)); + strbuf_strip_suffix(&buf, ".pack"); + + item = string_list_lookup(&existing->cruft_packs, buf.buf); + if (!item) + BUG("could not find cruft pack '%s'", pack_basename(cruft)); + + existing_packs_mark_retained(item); + strbuf_release(&buf); +} + +void existing_packs_mark_for_deletion(struct existing_packs *existing, + struct string_list *names) + +{ + const struct git_hash_algo *algop = existing->repo->hash_algo; + existing_packs_mark_for_deletion_1(algop, names, + &existing->non_kept_packs); + existing_packs_mark_for_deletion_1(algop, names, + &existing->cruft_packs); +} + +static void remove_redundant_packs_1(struct repository *repo, + struct string_list *packs, + const char *packdir) +{ + struct string_list_item *item; + for_each_string_list_item(item, packs) { + if (!existing_pack_is_marked_for_deletion(item)) + continue; + repack_remove_redundant_pack(repo, packdir, item->string); + } +} + +void existing_packs_remove_redundant(struct existing_packs *existing, + const char *packdir) +{ + remove_redundant_packs_1(existing->repo, &existing->non_kept_packs, + packdir); + remove_redundant_packs_1(existing->repo, &existing->cruft_packs, + packdir); +} + +void existing_packs_release(struct existing_packs *existing) +{ + string_list_clear(&existing->kept_packs, 0); + string_list_clear(&existing->non_kept_packs, 0); + string_list_clear(&existing->cruft_packs, 0); + string_list_clear(&existing->midx_packs, 0); +} + +static struct { + const char *name; + unsigned optional:1; +} exts[] = { + {".pack"}, + {".rev", 1}, + {".mtimes", 1}, + {".bitmap", 1}, + {".promisor", 1}, + {".idx"}, +}; + +struct generated_pack { + struct tempfile *tempfiles[ARRAY_SIZE(exts)]; +}; + +struct generated_pack *generated_pack_populate(const char *name, + const char *packtmp) +{ + struct stat statbuf; + struct strbuf path = STRBUF_INIT; + struct generated_pack *pack = xcalloc(1, sizeof(*pack)); + size_t i; + + for (i = 0; i < ARRAY_SIZE(exts); i++) { + strbuf_reset(&path); + strbuf_addf(&path, "%s-%s%s", packtmp, name, exts[i].name); + + if (stat(path.buf, &statbuf)) + continue; + + pack->tempfiles[i] = register_tempfile(path.buf); + } + + strbuf_release(&path); + return pack; +} + +int generated_pack_has_ext(const struct generated_pack *pack, const char *ext) +{ + size_t i; + for (i = 0; i < ARRAY_SIZE(exts); i++) { + if (strcmp(exts[i].name, ext)) + continue; + return !!pack->tempfiles[i]; + } + BUG("unknown pack extension: '%s'", ext); +} + +void generated_pack_install(struct generated_pack *pack, const char *name, + const char *packdir, const char *packtmp) +{ + size_t ext; + for (ext = 0; ext < ARRAY_SIZE(exts); ext++) { + char *fname; + + fname = mkpathdup("%s/pack-%s%s", packdir, name, + exts[ext].name); + + if (pack->tempfiles[ext]) { + const char *fname_old = get_tempfile_path(pack->tempfiles[ext]); + struct stat statbuffer; + + if (!stat(fname_old, &statbuffer)) { + statbuffer.st_mode &= ~(S_IWUSR | S_IWGRP | S_IWOTH); + chmod(fname_old, statbuffer.st_mode); + } + + if (rename_tempfile(&pack->tempfiles[ext], fname)) + die_errno(_("renaming pack to '%s' failed"), + fname); + } else if (!exts[ext].optional) + die(_("pack-objects did not write a '%s' file for pack %s-%s"), + exts[ext].name, packtmp, name); + else if (unlink(fname) < 0 && errno != ENOENT) + die_errno(_("could not unlink: %s"), fname); + + free(fname); + } +} diff --git a/repack.h b/repack.h new file mode 100644 index 0000000000..3a688a12ee --- /dev/null +++ b/repack.h @@ -0,0 +1,146 @@ +#ifndef REPACK_H +#define REPACK_H + +#include "list-objects-filter-options.h" +#include "string-list.h" + +struct pack_objects_args { + char *window; + char *window_memory; + char *depth; + char *threads; + unsigned long max_pack_size; + int no_reuse_delta; + int no_reuse_object; + int quiet; + int local; + int name_hash_version; + int path_walk; + int delta_base_offset; + int pack_kept_objects; + struct list_objects_filter_options filter_options; +}; + +#define PACK_OBJECTS_ARGS_INIT { \ + .delta_base_offset = 1, \ + .pack_kept_objects = -1, \ +} + +struct child_process; + +void prepare_pack_objects(struct child_process *cmd, + const struct pack_objects_args *args, + const char *out); +void pack_objects_args_release(struct pack_objects_args *args); + +void repack_remove_redundant_pack(struct repository *repo, const char *dir_name, + const char *base_name); + +struct write_pack_opts { + struct pack_objects_args *po_args; + const char *destination; + const char *packdir; + const char *packtmp; +}; + +const char *write_pack_opts_pack_prefix(const struct write_pack_opts *opts); +bool write_pack_opts_is_local(const struct write_pack_opts *opts); + +int finish_pack_objects_cmd(const struct git_hash_algo *algop, + const struct write_pack_opts *opts, + struct child_process *cmd, + struct string_list *names); + +struct repository; +struct packed_git; + +struct existing_packs { + struct repository *repo; + struct string_list kept_packs; + struct string_list non_kept_packs; + struct string_list cruft_packs; + struct string_list midx_packs; +}; + +#define EXISTING_PACKS_INIT { \ + .kept_packs = STRING_LIST_INIT_DUP, \ + .non_kept_packs = STRING_LIST_INIT_DUP, \ + .cruft_packs = STRING_LIST_INIT_DUP, \ +} + +/* + * Adds all packs hex strings (pack-$HASH) to either packs->non_kept + * or packs->kept based on whether each pack has a corresponding + * .keep file or not. Packs without a .keep file are not to be kept + * if we are going to pack everything into one file. + */ +void existing_packs_collect(struct existing_packs *existing, + const struct string_list *extra_keep); +int existing_packs_has_non_kept(const struct existing_packs *existing); +int existing_pack_is_marked_for_deletion(struct string_list_item *item); +void existing_packs_retain_cruft(struct existing_packs *existing, + struct packed_git *cruft); +void existing_packs_mark_for_deletion(struct existing_packs *existing, + struct string_list *names); +void existing_packs_remove_redundant(struct existing_packs *existing, + const char *packdir); +void existing_packs_release(struct existing_packs *existing); + +struct generated_pack; + +struct generated_pack *generated_pack_populate(const char *name, + const char *packtmp); +int generated_pack_has_ext(const struct generated_pack *pack, const char *ext); +void generated_pack_install(struct generated_pack *pack, const char *name, + const char *packdir, const char *packtmp); + +void repack_promisor_objects(struct repository *repo, + const struct pack_objects_args *args, + struct string_list *names, const char *packtmp); + +struct pack_geometry { + struct packed_git **pack; + uint32_t pack_nr, pack_alloc; + uint32_t split; + + int split_factor; +}; + +void pack_geometry_init(struct pack_geometry *geometry, + struct existing_packs *existing, + const struct pack_objects_args *args); +void pack_geometry_split(struct pack_geometry *geometry); +struct packed_git *pack_geometry_preferred_pack(struct pack_geometry *geometry); +void pack_geometry_remove_redundant(struct pack_geometry *geometry, + struct string_list *names, + struct existing_packs *existing, + const char *packdir); +void pack_geometry_release(struct pack_geometry *geometry); + +struct tempfile; + +struct repack_write_midx_opts { + struct existing_packs *existing; + struct pack_geometry *geometry; + struct string_list *names; + const char *refs_snapshot; + const char *packdir; + int show_progress; + int write_bitmaps; + int midx_must_contain_cruft; +}; + +void midx_snapshot_refs(struct repository *repo, struct tempfile *f); +int write_midx_included_packs(struct repack_write_midx_opts *opts); + +int write_filtered_pack(const struct write_pack_opts *opts, + struct existing_packs *existing, + struct string_list *names); + +int write_cruft_pack(const struct write_pack_opts *opts, + const char *cruft_expiration, + unsigned long combine_cruft_below_size, + struct string_list *names, + struct existing_packs *existing); + +#endif /* REPACK_H */ @@ -166,6 +166,7 @@ static int set_recommended_config(int reconfigure) #endif /* Optional */ { "status.aheadBehind", "false" }, + { "commitGraph.changedPaths", "true" }, { "commitGraph.generationVersion", "1" }, { "core.autoCRLF", "false" }, { "core.safeCRLF", "false" }, diff --git a/server-info.c b/server-info.c index 1d33de821e..b9a710544a 100644 --- a/server-info.c +++ b/server-info.c @@ -287,13 +287,12 @@ static int compare_info(const void *a_, const void *b_) static void init_pack_info(struct repository *r, const char *infofile, int force) { - struct packfile_store *packs = r->objects->packfiles; struct packed_git *p; int stale; int i; size_t alloc = 0; - for (p = packfile_store_get_all_packs(packs); p; p = p->next) { + repo_for_each_pack(r, p) { /* we ignore things on alternate path since they are * not available to the pullers in general. */ diff --git a/t/helper/test-find-pack.c b/t/helper/test-find-pack.c index e001dc3066..fc4b8a77b3 100644 --- a/t/helper/test-find-pack.c +++ b/t/helper/test-find-pack.c @@ -39,11 +39,12 @@ int cmd__find_pack(int argc, const char **argv) if (repo_get_oid(the_repository, argv[0], &oid)) die("cannot parse %s as an object name", argv[0]); - for (p = packfile_store_get_all_packs(the_repository->objects->packfiles); p; p = p->next) + repo_for_each_pack(the_repository, p) { if (find_pack_entry_one(&oid, p)) { printf("%s\n", p->pack_name); actual_count++; } + } if (count > -1 && count != actual_count) die("bad packfile count %d instead of %d", actual_count, count); diff --git a/t/helper/test-pack-mtimes.c b/t/helper/test-pack-mtimes.c index 7c428c1601..7a8ee1de24 100644 --- a/t/helper/test-pack-mtimes.c +++ b/t/helper/test-pack-mtimes.c @@ -37,7 +37,7 @@ int cmd__pack_mtimes(int argc, const char **argv) if (argc != 2) usage(pack_mtimes_usage); - for (p = packfile_store_get_all_packs(the_repository->objects->packfiles); p; p = p->next) { + repo_for_each_pack(the_repository, p) { strbuf_addstr(&buf, basename(p->pack_name)); strbuf_strip_suffix(&buf, ".pack"); strbuf_addstr(&buf, ".mtimes"); diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh index b99ae39a06..97268ae07c 100644 --- a/t/lib-gpg.sh +++ b/t/lib-gpg.sh @@ -71,6 +71,7 @@ test_lazy_prereq GPG2 ' exit 1 ;; *) + prepare_gnupghome && (gpgconf --kill all || : ) && # NEEDSWORK: prepare_gnupghome() should definitely be diff --git a/t/meson.build b/t/meson.build index c9ddd89889..a5531df415 100644 --- a/t/meson.build +++ b/t/meson.build @@ -238,6 +238,7 @@ integration_tests = [ 't1701-racy-split-index.sh', 't1800-hook.sh', 't1900-repo.sh', + 't1901-repo-structure.sh', 't2000-conflict-when-checking-files-out.sh', 't2002-checkout-cache-u.sh', 't2003-checkout-cache-mkdir.sh', diff --git a/t/perf/p6010-merge-base.sh b/t/perf/p6010-merge-base.sh new file mode 100755 index 0000000000..54f52fa23e --- /dev/null +++ b/t/perf/p6010-merge-base.sh @@ -0,0 +1,101 @@ +#!/bin/sh + +test_description='Test git merge-base' + +. ./perf-lib.sh + +test_perf_fresh_repo + +# +# Creates lots of merges to make history traversal costly. In +# particular it creates 2^($max_level-1)-1 2-way merges on top of +# 2^($max_level-1) root commits. E.g., the commit history looks like +# this for a $max_level of 3: +# +# _1_ +# / \ +# 2 3 +# / \ / \ +# 4 5 6 7 +# +# The numbers are the fast-import marks, which also are the commit +# messages. 1 is the HEAD commit and a merge, 2 and 3 are also merges, +# 4-7 are the root commits. +# +build_history () { + local max_level="$1" && + local level="${2:-1}" && + local mark="${3:-1}" && + if test $level -eq $max_level + then + echo "reset refs/heads/master" && + echo "from $ZERO_OID" && + echo "commit refs/heads/master" && + echo "mark :$mark" && + echo "committer C <c@example.com> 1234567890 +0000" && + echo "data <<EOF" && + echo "$mark" && + echo "EOF" + else + local level1=$((level+1)) && + local mark1=$((2*mark)) && + local mark2=$((2*mark+1)) && + build_history $max_level $level1 $mark1 && + build_history $max_level $level1 $mark2 && + echo "commit refs/heads/master" && + echo "mark :$mark" && + echo "committer C <c@example.com> 1234567890 +0000" && + echo "data <<EOF" && + echo "$mark" && + echo "EOF" && + echo "from :$mark1" && + echo "merge :$mark2" + fi +} + +# +# Creates a new merge history in the same shape as build_history does, +# while reusing the same root commits. This way the two top commits +# have 2^($max_level-1) merge bases between them. +# +build_history2 () { + local max_level="$1" && + local level="${2:-1}" && + local mark="${3:-1}" && + if test $level -lt $max_level + then + local level1=$((level+1)) && + local mark1=$((2*mark)) && + local mark2=$((2*mark+1)) && + build_history2 $max_level $level1 $mark1 && + build_history2 $max_level $level1 $mark2 && + echo "commit refs/heads/master" && + echo "mark :$mark" && + echo "committer C <c@example.com> 1234567890 +0000" && + echo "data <<EOF" && + echo "$mark II" && + echo "EOF" && + echo "from :$mark1" && + echo "merge :$mark2" + fi +} + +test_expect_success 'setup' ' + max_level=15 && + build_history $max_level | git fast-import --export-marks=marks && + git tag one && + build_history2 $max_level | git fast-import --import-marks=marks --force && + git tag two && + git gc && + git log --format=%H --no-merges >expect +' + +test_perf 'git merge-base' ' + git merge-base --all one two >actual +' + +test_expect_success 'verify result' ' + test_cmp expect actual +' + +test_done diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh index 273d71411f..db8bde280e 100755 --- a/t/t0008-ignores.sh +++ b/t/t0008-ignores.sh @@ -847,6 +847,17 @@ test_expect_success 'directories and ** matches' ' test_cmp expect actual ' +test_expect_success '** not confused by matching leading prefix' ' + cat >.gitignore <<-\EOF && + foo**/bar + EOF + git check-ignore foobar foo/bar >actual && + cat >expect <<-\EOF && + foo/bar + EOF + test_cmp expect actual +' + ############################################################################ # # test whitespace handling diff --git a/t/t0600-reffiles-backend.sh b/t/t0600-reffiles-backend.sh index 1e62c791d9..b11126ed47 100755 --- a/t/t0600-reffiles-backend.sh +++ b/t/t0600-reffiles-backend.sh @@ -477,9 +477,29 @@ test_expect_success SYMLINKS 'symref transaction supports symlinks' ' prepare commit EOF - git update-ref --no-deref --stdin <stdin && - test_path_is_symlink .git/TEST_SYMREF_HEAD && - test "$(test_readlink .git/TEST_SYMREF_HEAD)" = refs/heads/new + git update-ref --no-deref --stdin <stdin 2>err && + if test_have_prereq WITH_BREAKING_CHANGES + then + test_path_is_file .git/TEST_SYMREF_HEAD && + echo "ref: refs/heads/new" >expect && + test_cmp expect .git/TEST_SYMREF_HEAD && + test_must_be_empty err + else + test_path_is_symlink .git/TEST_SYMREF_HEAD && + test "$(test_readlink .git/TEST_SYMREF_HEAD)" = refs/heads/new && + cat >expect <<-EOF && + warning: ${SQ}core.preferSymlinkRefs=true${SQ} is nominated for removal. + hint: The use of symbolic links for symbolic refs is deprecated + hint: and will be removed in Git 3.0. The configuration that + hint: tells Git to use them is thus going away. You can unset + hint: it with: + hint: + hint: git config unset core.preferSymlinkRefs + hint: + hint: Git will then use the textual symref format instead. + EOF + test_cmp expect err + fi ' test_expect_success 'symref transaction supports false symlink config' ' diff --git a/t/t1901-repo-structure.sh b/t/t1901-repo-structure.sh new file mode 100755 index 0000000000..36a71a144e --- /dev/null +++ b/t/t1901-repo-structure.sh @@ -0,0 +1,129 @@ +#!/bin/sh + +test_description='test git repo structure' + +. ./test-lib.sh + +test_expect_success 'empty repository' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + cat >expect <<-\EOF && + | Repository structure | Value | + | -------------------- | ----- | + | * References | | + | * Count | 0 | + | * Branches | 0 | + | * Tags | 0 | + | * Remotes | 0 | + | * Others | 0 | + | | | + | * Reachable objects | | + | * Count | 0 | + | * Commits | 0 | + | * Trees | 0 | + | * Blobs | 0 | + | * Tags | 0 | + EOF + + git repo structure >out 2>err && + + test_cmp expect out && + test_line_count = 0 err + ) +' + +test_expect_success 'repository with references and objects' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit_bulk 42 && + git tag -a foo -m bar && + + oid="$(git rev-parse HEAD)" && + git update-ref refs/remotes/origin/foo "$oid" && + + # Also creates a commit, tree, and blob. + git notes add -m foo && + + cat >expect <<-\EOF && + | Repository structure | Value | + | -------------------- | ----- | + | * References | | + | * Count | 4 | + | * Branches | 1 | + | * Tags | 1 | + | * Remotes | 1 | + | * Others | 1 | + | | | + | * Reachable objects | | + | * Count | 130 | + | * Commits | 43 | + | * Trees | 43 | + | * Blobs | 43 | + | * Tags | 1 | + EOF + + git repo structure >out 2>err && + + test_cmp expect out && + test_line_count = 0 err + ) +' + +test_expect_success 'keyvalue and nul format' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit_bulk 42 && + git tag -a foo -m bar && + + cat >expect <<-\EOF && + references.branches.count=1 + references.tags.count=1 + references.remotes.count=0 + references.others.count=0 + objects.commits.count=42 + objects.trees.count=42 + objects.blobs.count=42 + objects.tags.count=1 + EOF + + git repo structure --format=keyvalue >out 2>err && + + test_cmp expect out && + test_line_count = 0 err && + + # Replace key and value delimiters for nul format. + tr "\n=" "\0\n" <expect >expect_nul && + git repo structure --format=nul >out 2>err && + + test_cmp expect_nul out && + test_line_count = 0 err + ) +' + +test_expect_success 'progress meter option' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit foo && + + GIT_PROGRESS_DELAY=0 git repo structure --progress >out 2>err && + + test_file_not_empty out && + test_grep "Counting references: 2, done." err && + test_grep "Counting objects: 3, done." err && + + GIT_PROGRESS_DELAY=0 git repo structure --no-progress >out 2>err && + + test_file_not_empty out && + test_line_count = 0 err + ) +' + +test_done diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index 851ca6dd91..4285314f35 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -1431,4 +1431,15 @@ test_expect_success 'invalid option s is rejected' ' test_cmp expect actual ' +test_expect_success 'EOF quits' ' + echo a >file && + echo a >file2 && + git add file file2 && + echo X >file && + echo X >file2 && + git add -p </dev/null >out && + test_grep file out && + test_grep ! file2 out +' + test_done diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh index 55a06eadb3..d35695f5b0 100755 --- a/t/t4013-diff-various.sh +++ b/t/t4013-diff-various.sh @@ -661,6 +661,43 @@ test_expect_success 'diff -I<regex>: ignore matching file' ' test_grep ! "file1" actual ' +test_expect_success 'diff -I<regex>: ignore all content changes' ' + test_when_finished "git rm -f file1 file2 file3" && + : >file1 && + git add file1 && + : >file2 && + git add file2 && + : >file3 && + git add file3 && + + rm -f file1 file2 && + mkdir file2 && + echo "A" >file3 && + A_hash=$(git hash-object -w file3) && + echo "B" >file3 && + B_hash=$(git hash-object -w file3) && + cat <<-EOF | git update-index --index-info && + 100644 $A_hash 1 file3 + 100644 $B_hash 2 file3 + EOF + + test_diff_no_content_changes () { + git diff $1 --ignore-blank-lines -I".*" >actual && + test_line_count = 3 actual && + test_grep "file1" actual && + test_grep "file2" actual && + test_grep "file3" actual && + test_grep ! "diff --git" actual + } && + test_diff_no_content_changes "--raw" && + test_diff_no_content_changes "--name-only" && + test_diff_no_content_changes "--name-status" && + + : >actual && + test_must_fail git diff --quiet -I".*" >actual && + test_must_be_empty actual +' + # check_prefix <patch> <src> <dst> # check only lines with paths to avoid dependency on exact oid/contents check_prefix () { diff --git a/t/t4020-diff-external.sh b/t/t4020-diff-external.sh index c8a23d5148..7ec5854f74 100755 --- a/t/t4020-diff-external.sh +++ b/t/t4020-diff-external.sh @@ -44,6 +44,16 @@ test_expect_success 'GIT_EXTERNAL_DIFF environment and --no-ext-diff' ' ' +test_expect_success 'GIT_EXTERNAL_DIFF and --output' ' + cat >expect <<-EOF && + file $(git rev-parse --verify HEAD:file) 100644 file $(test_oid zero) 100644 + EOF + GIT_EXTERNAL_DIFF=echo git diff --output=out >stdout && + cut -d" " -f1,3- <out >actual && + test_must_be_empty stdout && + test_cmp expect actual +' + test_expect_success SYMLINKS 'typechange diff' ' rm -f file && ln -s elif file && diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh index 0b3404f58f..98c6910963 100755 --- a/t/t5318-commit-graph.sh +++ b/t/t5318-commit-graph.sh @@ -946,4 +946,48 @@ test_expect_success 'stale commit cannot be parsed when traversing graph' ' ) ' +test_expect_success 'config commitGraph.changedPaths acts like --changed-paths' ' + git init config-changed-paths && + ( + cd config-changed-paths && + + # commitGraph.changedPaths is not set and it should not write Bloom filters + test_commit first && + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error && + test_grep ! "Bloom filters" error && + + # Set commitGraph.changedPaths to true and it should write Bloom filters + test_commit second && + git config commitGraph.changedPaths true && + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error && + test_grep "Bloom filters" error && + + # Add one more config commitGraph.changedPaths as false to disable the previous true config value + # It should still write Bloom filters due to existing filters + test_commit third && + git config --add commitGraph.changedPaths false && + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error && + test_grep "Bloom filters" error && + + # commitGraph.changedPaths is still false and command line options should take precedence + test_commit fourth && + GIT_PROGRESS_DELAY=0 git commit-graph write --no-changed-paths --reachable --progress 2>error && + test_grep ! "Bloom filters" error && + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error && + test_grep ! "Bloom filters" error && + + # commitGraph.changedPaths is all cleared and then set to false again, command line options should take precedence + test_commit fifth && + git config --unset-all commitGraph.changedPaths && + git config commitGraph.changedPaths false && + GIT_PROGRESS_DELAY=0 git commit-graph write --changed-paths --reachable --progress 2>error && + test_grep "Bloom filters" error && + + # commitGraph.changedPaths is still false and it should write Bloom filters due to existing filters + test_commit sixth && + GIT_PROGRESS_DELAY=0 git commit-graph write --reachable --progress 2>error && + test_grep "Bloom filters" error + ) +' + test_done diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index ddd273d8dc..614184a097 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -465,6 +465,176 @@ test_expect_success 'maintenance.incremental-repack.auto (when config is unset)' ) ' +run_and_verify_geometric_pack () { + EXPECTED_PACKS="$1" && + + # Verify that we perform a geometric repack. + rm -f "trace2.txt" && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \ + git maintenance run --task=geometric-repack 2>/dev/null && + test_subcommand git repack -d -l --geometric=2 \ + --quiet --write-midx <trace2.txt && + + # Verify that the number of packfiles matches our expectation. + ls -l .git/objects/pack/*.pack >packfiles && + test_line_count = "$EXPECTED_PACKS" packfiles && + + # And verify that there are no loose objects anymore. + git count-objects -v >count && + test_grep '^count: 0$' count +} + +test_expect_success 'geometric repacking task' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + git config set maintenance.auto false && + test_commit initial && + + # The initial repack causes an all-into-one repack. + GIT_TRACE2_EVENT="$(pwd)/initial-repack.txt" \ + git maintenance run --task=geometric-repack 2>/dev/null && + test_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago \ + --quiet --write-midx <initial-repack.txt && + + # Repacking should now cause a no-op geometric repack because + # no packfiles need to be combined. + ls -l .git/objects/pack/*.pack >before && + run_and_verify_geometric_pack 1 && + ls -l .git/objects/pack/*.pack >after && + test_cmp before after && + + # This incremental change creates a new packfile that only + # soaks up loose objects. The packfiles are not getting merged + # at this point. + test_commit loose && + run_and_verify_geometric_pack 2 && + + # Both packfiles have 3 objects, so the next run would cause us + # to merge all packfiles together. This should be turned into + # an all-into-one-repack. + GIT_TRACE2_EVENT="$(pwd)/all-into-one-repack.txt" \ + git maintenance run --task=geometric-repack 2>/dev/null && + test_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago \ + --quiet --write-midx <all-into-one-repack.txt && + + # The geometric repack soaks up unreachable objects. + echo blob-1 | git hash-object -w --stdin -t blob && + run_and_verify_geometric_pack 2 && + + # A second unreachable object should be written into another packfile. + echo blob-2 | git hash-object -w --stdin -t blob && + run_and_verify_geometric_pack 3 && + + # And these two small packs should now be merged via the + # geometric repack. The large packfile should remain intact. + run_and_verify_geometric_pack 2 && + + # If we now add two more objects and repack twice we should + # then see another all-into-one repack. This time around + # though, as we have unreachable objects, we should also see a + # cruft pack. + echo blob-3 | git hash-object -w --stdin -t blob && + echo blob-4 | git hash-object -w --stdin -t blob && + run_and_verify_geometric_pack 3 && + GIT_TRACE2_EVENT="$(pwd)/cruft-repack.txt" \ + git maintenance run --task=geometric-repack 2>/dev/null && + test_subcommand git repack -d -l --cruft --cruft-expiration=2.weeks.ago \ + --quiet --write-midx <cruft-repack.txt && + ls .git/objects/pack/*.pack >packs && + test_line_count = 2 packs && + ls .git/objects/pack/*.mtimes >cruft && + test_line_count = 1 cruft + ) +' + +test_geometric_repack_needed () { + NEEDED="$1" + GEOMETRIC_CONFIG="$2" && + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \ + git ${GEOMETRIC_CONFIG:+-c maintenance.geometric-repack.$GEOMETRIC_CONFIG} \ + maintenance run --auto --task=geometric-repack 2>/dev/null && + case "$NEEDED" in + true) + test_grep "\[\"git\",\"repack\"," trace2.txt;; + false) + ! test_grep "\[\"git\",\"repack\"," trace2.txt;; + *) + BUG "invalid parameter: $NEEDED";; + esac +} + +test_expect_success 'geometric repacking with --auto' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + + # An empty repository does not need repacking, except when + # explicitly told to do it. + test_geometric_repack_needed false && + test_geometric_repack_needed false auto=0 && + test_geometric_repack_needed false auto=1 && + test_geometric_repack_needed true auto=-1 && + + test_oid_init && + + # Loose objects cause a repack when crossing the limit. Note + # that the number of objects gets extrapolated by having a look + # at the "objects/17/" shard. + test_commit "$(test_oid blob17_1)" && + test_geometric_repack_needed false && + test_commit "$(test_oid blob17_2)" && + test_geometric_repack_needed false auto=257 && + test_geometric_repack_needed true auto=256 && + + # Force another repack. + test_commit first && + test_commit second && + test_geometric_repack_needed true auto=-1 && + + # We now have two packfiles that would be merged together. As + # such, the repack should always happen unless the user has + # disabled the auto task. + test_geometric_repack_needed false auto=0 && + test_geometric_repack_needed true auto=9000 + ) +' + +test_expect_success 'geometric repacking honors configured split factor' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + git config set maintenance.auto false && + + # Create three different packs with 9, 2 and 1 object, respectively. + # This is done so that only a subset of packs would be merged + # together so that we can verify that `git repack` receives the + # correct geometric factor. + for i in $(test_seq 9) + do + echo first-$i | git hash-object -w --stdin -t blob || return 1 + done && + git repack --geometric=2 -d && + + for i in $(test_seq 2) + do + echo second-$i | git hash-object -w --stdin -t blob || return 1 + done && + git repack --geometric=2 -d && + + echo third | git hash-object -w --stdin -t blob && + git repack --geometric=2 -d && + + test_geometric_repack_needed false splitFactor=2 && + test_geometric_repack_needed true splitFactor=3 && + test_subcommand git repack -d -l --geometric=3 --quiet --write-midx <trace2.txt + ) +' + test_expect_success 'pack-refs task' ' for n in $(test_seq 1 5) do @@ -716,6 +886,76 @@ test_expect_success 'maintenance.strategy inheritance' ' <modified-daily.txt ' +test_strategy () { + STRATEGY="$1" + shift + + cat >expect && + rm -f trace2.txt && + GIT_TRACE2_EVENT="$(pwd)/trace2.txt" \ + git -c maintenance.strategy=$STRATEGY maintenance run --quiet "$@" && + sed -n 's/{"event":"child_start","sid":"[^/"]*",.*,"argv":\["\(.*\)\"]}/\1/p' <trace2.txt | + sed 's/","/ /g' >actual + test_cmp expect actual +} + +test_expect_success 'maintenance.strategy is respected' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit initial && + + test_must_fail git -c maintenance.strategy=unknown maintenance run 2>err && + test_grep "unknown maintenance strategy: .unknown." err && + + test_strategy incremental <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git gc --quiet --no-detach --skip-foreground-tasks + EOF + + test_strategy incremental --schedule=weekly <<-\EOF && + git pack-refs --all --prune + git prune-packed --quiet + git multi-pack-index write --no-progress + git multi-pack-index expire --no-progress + git multi-pack-index repack --no-progress --batch-size=1 + git commit-graph write --split --reachable --no-progress + EOF + + test_strategy gc <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git gc --quiet --no-detach --skip-foreground-tasks + EOF + + test_strategy gc --schedule=weekly <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git gc --quiet --no-detach --skip-foreground-tasks + EOF + + test_strategy geometric <<-\EOF && + git pack-refs --all --prune + git reflog expire --all + git repack -d -l --geometric=2 --quiet --write-midx + git commit-graph write --split --reachable --no-progress + git worktree prune --expire 3.months.ago + git rerere gc + EOF + + test_strategy geometric --schedule=weekly <<-\EOF + git pack-refs --all --prune + git reflog expire --all + git repack -d -l --geometric=2 --quiet --write-midx + git commit-graph write --split --reachable --no-progress + git worktree prune --expire 3.months.ago + git rerere gc + EOF + ) +' + test_expect_success 'register and unregister' ' test_when_finished git config --global --unset-all maintenance.repo && @@ -1093,6 +1333,11 @@ test_expect_success 'fails when running outside of a repository' ' nongit test_must_fail git maintenance unregister ' +test_expect_success 'fails when configured to use an invalid strategy' ' + test_must_fail git -c maintenance.strategy=invalid maintenance run --schedule=hourly 2>err && + test_grep "unknown maintenance strategy: .invalid." err +' + test_expect_success 'register and unregister bare repo' ' test_when_finished "git config --global --unset-all maintenance.repo || :" && test_might_fail git config --global --unset-all maintenance.repo && |
