From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 0/23] parsing and fsck cleanups
Date: Fri, 18 Oct 2019 00:41:03 -0400
Message-ID: <20191018044103.GA17625@sigill.intra.peff.net>

The thread starting at:

  https://public-inbox.org/git/xmqqo8zxnz0m.fsf@gitster-ct.c.googlers.com/

discusses some issues with our handling of corrupt objects, as well as
some weirdness in fsck. This series is my attempt to clean it up.

The number of patches is a little daunting, but the early ones are the
most interesting. The latter half is part of a big refactor/cleanup
that's mostly mechanical (and isn't strictly necessary; see below for
discussion).

  [01/23]: parse_commit_buffer(): treat lookup_commit() failure as parse error
  [02/23]: parse_commit_buffer(): treat lookup_tree() failure as parse error
  [03/23]: parse_tag_buffer(): treat NULL tag pointer as parse error
  [04/23]: remember commit/tag parse failures

    These ones are tightening up our parser to report failures more
    consistently. The first one definitely fixes a demonstrable bug, and
    I suspect the rest of them are fixing hard-to-trigger but lurking
    segfaults.

  [05/23]: fsck: stop checking commit->tree value
  [06/23]: fsck: stop checking commit->parent counts
  [07/23]: fsck: stop checking tag->tagged
  [08/23]: fsck: require an actual buffer for non-blobs

    These ones clean up weirdness where fsck is dependent on the results
    of parse_commit(), etc, rather than just looking at the buffer we
    gave it. I don't think they're _hurting_ anything, but it certainly
    makes following the fsck logic more confusing.

  [09/23]: fsck: unify object-name code

    Cleanup that fixes a few minor bugs.

  [10/23]: fsck_describe_object(): build on our get_object_name() primitive
  [11/23]: fsck: use oids rather than objects for object_name API
  [12/23]: fsck: don't require full object structs for display functions
  [13/23]: fsck: only provide oid/type in fsck_error callback
  [14/23]: fsck: only require an oid for skiplist functions
  [15/23]: fsck: don't require an object struct for report()
  [16/23]: fsck: accept an oid instead of a "struct blob" for fsck_blob()
  [17/23]: fsck: drop blob struct from fsck_finish()
  [18/23]: fsck: don't require an object struct for fsck_ident()
  [19/23]: fsck: don't require an object struct in verify_headers()
  [20/23]: fsck: rename vague "oid" local variables
  [21/23]: fsck: accept an oid instead of a "struct tag" for fsck_tag()
  [22/23]: fsck: accept an oid instead of a "struct commit" for fsck_commit()
  [23/23]: fsck: accept an oid instead of a "struct tree" for fsck_tree()

    This is a string of refactors that ends up with all of the
    type-specific fsck functions not getting an object struct at all.
    My goal there was two-fold:

      - it makes it harder to introduce weirdness like we saw in patches
        5-8.

      - it _could_ make things less awkward for callers like index-pack
        which don't necessarily have object structs.

    And at the end, we basically have an fsck_object() that doesn't need
    an object struct. But index-pack still calls fsck_walk(), which does
    (and which relies on the parsed values to traverse). It's not
    entirely clear to me whether index-pack needs to be doing fsck_walk()
    in the first place, or if it should be relying on the usual
    connectivity check.

    So I'm undecided whether this is worth taking on its own, or if
    trying to avoid object structs in the fsck code is just a fool's
    errand. I do think the result isn't too bad to look at, though, and
    there are some minor improvements along the way (e.g., patch 17 is
    able to drop some awkwardness).

    Most of the patches are pretty mechanical. There are so many because
    I split it by call stack layer. If A calls B calls C, then I
    converted "C" away from "struct object" first, which enables
    converting "B", and so on.

 builtin/fsck.c                         | 126 ++++----
 commit-graph.c                         |   3 -
 commit.c                               |  33 ++-
 fsck.c                                 | 386 +++++++++++--------------
 fsck.h                                 |  39 ++-
 t/t1450-fsck.sh                        |   2 +-
 t/t5318-commit-graph.sh                |   2 +-
 t/t6102-rev-list-unexpected-objects.sh |   2 +-
 tag.c                                  |  21 +-
 9 files changed, 312 insertions(+), 302 deletions(-)
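
To make the "A calls B calls C" ordering concrete, here is a rough sketch of
what the end state of such a bottom-up conversion might look like. It is not
code from the series: every name below (example_oid, example_oid_hex(),
example_report(), example_check_commit()) is a hypothetical stand-in, the
4-byte hash is a toy, and the "tree " check is only an illustrative rule.
The point is just that once the innermost helper takes an oid, each layer
above it can drop its object struct in turn.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT git's real definitions. */
#define EXAMPLE_RAWSZ 4
#define EXAMPLE_HEXSZ (2 * EXAMPLE_RAWSZ)

struct example_oid { unsigned char hash[EXAMPLE_RAWSZ]; };
enum example_type { EX_BLOB, EX_TREE, EX_COMMIT, EX_TAG };

static const char *example_type_name(enum example_type type)
{
	switch (type) {
	case EX_BLOB:   return "blob";
	case EX_TREE:   return "tree";
	case EX_COMMIT: return "commit";
	case EX_TAG:    return "tag";
	}
	return "unknown";
}

/* Layer "C" (innermost): converted first; needs nothing but the oid. */
static const char *example_oid_hex(const struct example_oid *oid,
				   char *out, size_t outlen)
{
	size_t i;

	if (outlen < EXAMPLE_HEXSZ + 1)
		return NULL;
	for (i = 0; i < EXAMPLE_RAWSZ; i++)
		snprintf(out + 2 * i, 3, "%02x", (unsigned)oid->hash[i]);
	return out;
}

/* Layer "B": can take oid/type once "C" no longer wants a struct. */
static int example_report(const struct example_oid *oid,
			  enum example_type type, const char *problem,
			  char *out, size_t outlen)
{
	char hex[EXAMPLE_HEXSZ + 1];

	example_oid_hex(oid, hex, sizeof(hex));
	snprintf(out, outlen, "error in %s %s: %s",
		 example_type_name(type), hex, problem);
	return 1; /* one problem reported */
}

/*
 * Layer "A" (outermost): looks only at the buffer it was handed, not at
 * any previously parsed struct fields. Returns 0 if OK, 1 on a problem.
 */
static int example_check_commit(const struct example_oid *oid,
				const char *buf, size_t size,
				char *out, size_t outlen)
{
	if (outlen)
		out[0] = '\0';
	if (!buf)
		return example_report(oid, EX_COMMIT, "missing buffer",
				      out, outlen);
	if (size < 5 || memcmp(buf, "tree ", 5))
		return example_report(oid, EX_COMMIT, "missing tree header",
				      out, outlen);
	return 0;
}
```

A caller that only has an id and a buffer in hand (the situation described
for index-pack above) could then call example_check_commit() directly,
without first building any object struct.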