Diffstat (limited to 'Documentation/gitformat-pack.txt')
| -rw-r--r-- | Documentation/gitformat-pack.txt | 54 |
1 files changed, 10 insertions, 44 deletions
diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
index c520a65e55..9fcb29a9c8 100644
--- a/Documentation/gitformat-pack.txt
+++ b/Documentation/gitformat-pack.txt
@@ -17,8 +17,8 @@ $GIT_DIR/objects/pack/multi-pack-index
 DESCRIPTION
 -----------

-The Git pack format is now Git stores most of its primary repository
-data. Over the lietime af a repository loose objects (if any) and
+The Git pack format is how Git stores most of its primary repository
+data. Over the lifetime of a repository, loose objects (if any) and
 smaller packs are consolidated into larger pack(s). See
 linkgit:git-gc[1] and linkgit:git-pack-objects[1].

@@ -48,7 +48,7 @@ Similarly, in SHA-256 repositories, these values are computed using SHA-256.
 Observation: we cannot have more than 4G versions ;-) and
 more than 4G objects in a pack.

- - The header is followed by number of object entries, each of
+ - The header is followed by a number of object entries, each of
 which looks like this:

 (undeltified representation)
@@ -62,7 +62,7 @@ Similarly, in SHA-256 repositories, these values are computed using SHA-256.
 is an OBJ_OFS_DELTA object
 compressed delta data

- Observation: length of each object is encoded in a variable
+ Observation: the length of each object is encoded in a variable
 length format and is not constrained to 32-bit or anything.

 - The trailer records a pack checksum of all of the above.
@@ -117,7 +117,7 @@ the delta data is a sequence of instructions to reconstruct the object
 from the base object. If the base object is deltified, it must be
 converted to canonical form first. Each instruction appends more and
 more data to the target object until it's complete. There are two
-supported instructions so far: one for copy a byte range from the
+supported instructions so far: one for copying a byte range from the
 source object and one for inserting new data embedded in the
 instruction itself.

@@ -137,7 +137,7 @@ copy. Offset and size are in little-endian order.

 All offset and size bytes are optional. This is to reduce the
 instruction size when encoding small offsets or sizes. The first seven
-bits in the first octet determines which of the next seven octets is
+bits in the first octet determine which of the next seven octets is
 present. If bit zero is set, offset1 is present. If bit one is set
 offset2 is present and so on.

@@ -161,9 +161,9 @@ converted to 0x10000.
 | 0xxxxxxx |    data    |
 +----------+============+

-This is the instruction to construct target object without the base
+This is the instruction to construct the target object without the base
 object. The following data is appended to the target object. The first
-seven bits of the first octet determines the size of data in
+seven bits of the first octet determine the size of data in
 bytes. The size must be non-zero.

 ==== Reserved instruction
@@ -294,7 +294,7 @@ Pack file entry: <+

 - The same trailer as a v1 pack file:

- A copy of the pack checksum at the end of
+ A copy of the pack checksum at the end of the
 corresponding packfile.

 Index checksum of all of the above.
@@ -589,51 +589,17 @@ later on. It is linkgit:git-gc[1] that is typically responsible for removing
 expired unreachable objects.

-=== Caution for mixed-version environments
-
-Repositories that have cruft packs in them will continue to work with any older
-version of Git. Note, however, that previous versions of Git which do not
-understand the `.mtimes` file will use the cruft pack's mtime as the mtime for
-all of the objects in it. In other words, do not expect older (pre-cruft pack)
-versions of Git to interpret or even read the contents of the `.mtimes` file.
-
-Note that having mixed versions of Git GC-ing the same repository can lead to
-unreachable objects never being completely pruned. This can happen under the
-following circumstances:
-
- - An older version of Git running GC explodes the contents of an existing
- cruft pack loose, using the cruft pack's mtime.
- - A newer version running GC collects those loose objects into a cruft pack,
- where the .mtime file reflects the loose object's actual mtimes, but the
- cruft pack mtime is "now".
-
-Repeating this process will lead to unreachable objects not getting pruned as a
-result of repeatedly resetting the objects' mtimes to the present time.
-
-If you are GC-ing repositories in a mixed version environment, consider omitting
-the `--cruft` option when using linkgit:git-repack[1] and linkgit:git-gc[1], and
-setting the `gc.cruftPacks` configuration to "false" until all writers
-understand cruft packs.
-
 === Alternatives

 Notable alternatives to this design include:

- - The location of the per-object mtime data, and
- - Storing unreachable objects in multiple cruft packs.
+ - The location of the per-object mtime data.

 On the location of mtime data, a new auxiliary file tied to the pack was chosen
 to avoid complicating the `.idx` format. If the `.idx` format were ever to gain
 support for optional chunks of data, it may make sense to consolidate the
 `.mtimes` format into the `.idx` itself.

-Storing unreachable objects among multiple cruft packs (e.g., creating a new
-cruft pack during each repacking operation including only unreachable objects
-which aren't already stored in an earlier cruft pack) is significantly more
-complicated to construct, and so aren't pursued here. The obvious drawback to
-the current implementation is that the entire cruft pack must be re-written from
-scratch.
-
 GIT
 ---
 Part of the linkgit:git[1] suite
