summaryrefslogtreecommitdiff
path: root/Documentation/technical/api-path-walk.adoc
blob: 34c905eb9c313080fb3dd5f7c2be67adc9cfb90d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
Path-Walk API
=============

The path-walk API is used to walk reachable objects, but to visit objects
in batches based on a common path they appear in, or by type.

For example, all reachable commits are visited in a group. All tags are
visited in a group. Then, all root trees are visited. At some point, all
blobs reachable via a path `my/dir/to/A` are visited. When there are
multiple paths possible to reach the same object, then only one of those
paths is used to visit the object.

Basics
------

To use the path-walk API, include `path-walk.h` and call
`walk_objects_by_path()` with a customized `path_walk_info` struct. The
struct is used to set all of the options for how the walk should proceed.
Let's dig into the different options and their use.

`path_fn` and `path_fn_data`::
	The most important option is the `path_fn` option, which is a
	function pointer to the callback that can execute logic on the
	object IDs for objects grouped by type and path. This function
	also receives a `data` value that corresponds to the
	`path_fn_data` member, for providing custom data structures to
	this callback function.

`revs`::
	To configure the exact details of the reachable set of objects,
	use the `revs` member and initialize it using the revision
	machinery in `revision.h`. Initialize `revs` using calls such as
	`setup_revisions()` or `parse_revision_opt()`. Do not call
	`prepare_revision_walk()`, as that will be called within
	`walk_objects_by_path()`.
+
It is also important that you do not specify the `--objects` flag for the
`revs` struct. The revision walk should only be used to walk commits, and
the objects will be walked in a separate way based on those starting
commits.

`commits`, `blobs`, `trees`, `tags`::
	By default, these members are enabled and signal that the path-walk
	API should call the `path_fn` on objects of these types. Specialized
	applications could disable some options to make it simpler to walk
	the objects or to have fewer calls to `path_fn`.
+
While it is possible to walk only commits in this way, consumers would be
better off using the revision walk API instead.

`prune_all_uninteresting`::
	By default, all reachable paths are emitted by the path-walk API.
	This option allows consumers to declare that they are not
	interested in paths where all included objects are marked with the
	`UNINTERESTING` flag. This requires using the `boundary` option in
	the revision walk so that the walk emits commits marked with the
	`UNINTERESTING` flag.

`edge_aggressive`::
	For performance reasons, usually only the boundary commits are
	explored to find UNINTERESTING objects. However, in the case of
	shallow clones it can be helpful to mark all trees and blobs
	reachable from UNINTERESTING tip commits as UNINTERESTING. This
	matches the behavior of `--objects-edge-aggressive` in the
	revision API.

`pl`::
	This pattern list pointer allows focusing the path-walk search to
	a set of patterns, only emitting paths that match the given
	patterns. See linkgit:gitignore[5] or
	linkgit:git-sparse-checkout[1] for details about pattern lists.
	When the pattern list uses cone-mode patterns, then the path-walk
	API can prune the set of paths it walks to improve performance.

Examples
--------

See example usages in:
	`t/helper/test-path-walk.c`,
	`builtin/pack-objects.c`,
	`builtin/backfill.c`