fsck: stop using object_info->type_name strbuf

When fsck-ing a loose object, we use object_info's type_name strbuf to
record the parsed object type as a string. For most objects this is
redundant with the object_type enum, but it does let us report the
string when we encounter an object with an unknown type (for which there
is no matching enum value).

There are a few downsides, though:

  1. The code to report these cases is not actually robust. Since we did
     not pass a strbuf to unpack_loose_header(), we only retrieved types
     from headers up to 32 bytes. In longer cases, we'd simply say
     "object corrupt or missing".

  2. This is the last caller that uses object_info's type_name strbuf
     support. It would be nice to refactor it so that we can simplify
     that code.

  3. Likewise, we'll check the hash of the object using its unknown type
     (again, as long as that type is short enough). That depends on the
     hash_object_file_literally() code, which we'd eventually like to
     get rid of.

So we can simplify things by bailing immediately in read_loose_object()
when we encounter an unknown type. This has a few user-visible effects:

  a. Instead of producing a single line of error output like this:

       error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object is of unknown type 'bogus': .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6

     we'll now issue two lines (the first from read_loose_object() when
     we see the unparsable header, and the second from the fsck code,
     since we couldn't read the object):

       error: unable to parse type from header 'bogus 4' of .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6
       error: 26ed13ce3564fbbb44e35bde42c7da717ea004a6: object corrupt or missing: .git/objects/26/ed13ce3564fbbb44e35bde42c7da717ea004a6

     This is a little more verbose, but this sort of error should be
     rare (such objects are almost impossible to work with, and cannot
     be transferred between repositories as they are not representable
     in packfiles).

     And as a bonus, reporting the broken header in full could help
     with debugging other cases (e.g., a header like "blob xyzzy\0"
     would fail in parsing the size, but previously we'd not have shown
     the offending bytes).

  b. An object with an unknown type will be reported as corrupt, without
     actually doing a hash check. Again, I think this is unlikely to
     matter in practice since such objects are totally unusable.

We'll update one fsck test to match the new error strings. And we can
remove another test that covered the case of an object with an unknown
type _and_ a hash corruption. Since we'll skip the hash check now in
this case, the test is no longer interesting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
author: Jeff King <peff@peff.net> 2025-05-16 00:49:53 -0400
committer: Junio C Hamano <gitster@pobox.com> 2025-05-16 09:43:10 -0700
commit: 4ae0e9423c95c63c17f66fb2de255c46dc14c4e5 (patch)
tree: 60deec1a839650d3e5b8906fd5d4a1b33402cb9f /object-file.c
parent: b32b434bfe241cde380c5f3aca48a1fdcd86961b (diff)
1 files changed, 9 insertions, 3 deletions
diff --git a/object-file.c b/object-file.c
index 1127e154f6..7a35bde96e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1662,6 +1662,12 @@ int read_loose_object(const char *path,
 		goto out_inflate;
 	}
 
+	if (*oi->typep < 0) {
+		error(_("unable to parse type from header '%s' of %s"),
+		      hdr, path);
+		goto out_inflate;
+	}
+
 	if (*oi->typep == OBJ_BLOB &&
 	    *size > repo_settings_get_big_file_threshold(the_repository)) {
 		if (check_stream_oid(&stream, hdr, *size, path, expected_oid) < 0)
@@ -1672,9 +1678,9 @@ int read_loose_object(const char *path,
 			error(_("unable to unpack contents of %s"), path);
 			goto out_inflate;
 		}
-		hash_object_file_literally(the_repository->hash_algo,
-					   *contents, *size,
-					   oi->type_name->buf, real_oid);
+		hash_object_file(the_repository->hash_algo,
+				 *contents, *size,
+				 *oi->typep, real_oid);
 		if (!oideq(expected_oid, real_oid))
 			goto out_inflate;
 	}
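
For readers who want to see the new control flow in isolation, here is a
minimal standalone sketch of the "bail on unknown type" check. It is not
git's code: the enum, both function names, and the "malformed header"
message are hypothetical stand-ins, and the header grammar is simplified
to "<type> <decimal size>". Only the "unable to parse type from header"
wording is taken from the patch above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's object_type enum; values are illustrative. */
enum sketch_type {
	SKETCH_BAD = -1,
	SKETCH_COMMIT = 1,
	SKETCH_TREE,
	SKETCH_BLOB,
	SKETCH_TAG
};

/* Map a type name of the given length to an enum value, or SKETCH_BAD. */
enum sketch_type sketch_type_from_name(const char *name, size_t len)
{
	static const char *names[] = { NULL, "commit", "tree", "blob", "tag" };

	for (int i = 1; i <= 4; i++)
		if (strlen(names[i]) == len && !memcmp(names[i], name, len))
			return (enum sketch_type)i;
	return SKETCH_BAD;
}

/*
 * Parse "<type> <decimal size>" from a NUL-terminated header.
 * Returns 0 if the syntax is valid (the type may still be unknown),
 * or -1 if the header is malformed.
 */
int sketch_parse_header(const char *hdr, enum sketch_type *type,
			unsigned long *size)
{
	const char *sp;
	char *end;

	assert(hdr && type && size);

	sp = strchr(hdr, ' ');
	if (!sp || sp == hdr)
		return -1;	/* no "<type> " prefix at all */
	*type = sketch_type_from_name(hdr, (size_t)(sp - hdr));

	/* The size must be a non-empty run of digits ending at the NUL. */
	if (sp[1] < '0' || sp[1] > '9')
		return -1;
	errno = 0;
	*size = strtoul(sp + 1, &end, 10);
	if (errno || *end != '\0')
		return -1;
	return 0;
}

/* Same order as the patched function: header syntax first, then the type. */
int sketch_check_header(const char *hdr, const char *path)
{
	enum sketch_type type;
	unsigned long size;

	if (sketch_parse_header(hdr, &type, &size) < 0) {
		/* Made-up message; not a string from git. */
		fprintf(stderr, "error: malformed header in %s\n", path);
		return -1;
	}
	if (type < 0) {
		/* Wording borrowed from the patch; prints the header in full. */
		fprintf(stderr,
			"error: unable to parse type from header '%s' of %s\n",
			hdr, path);
		return -1;
	}
	return 0;
}
```

In this sketch, sketch_check_header("bogus 4", path) fails at the type
check and echoes the whole header, while a header such as "blob xyzzy"
is rejected one step earlier, when the size is parsed. Because an unknown
type never gets past this point, any later hashing step only ever sees a
valid enum value, which is what lets the patch switch from a type string
to *oi->typep.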