Git Mailing List Archive on lore.kernel.org
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 0/23] parsing and fsck cleanups
Date: Fri, 18 Oct 2019 00:41:03 -0400
Message-ID: <20191018044103.GA17625@sigill.intra.peff.net>

The thread starting at:

  https://public-inbox.org/git/xmqqo8zxnz0m.fsf@gitster-ct.c.googlers.com/

discusses some issues with our handling of corrupt objects, as well as
some weirdness in fsck. This series is my attempt to clean it up. The
number of patches is a little daunting, but the early ones are the most
interesting. The latter half is part of a big refactor/cleanup that's
mostly mechanical (and isn't strictly necessary; see below for
discussion).

  [01/23]: parse_commit_buffer(): treat lookup_commit() failure as parse error
  [02/23]: parse_commit_buffer(): treat lookup_tree() failure as parse error
  [03/23]: parse_tag_buffer(): treat NULL tag pointer as parse error
  [04/23]: remember commit/tag parse failures

    These ones are tightening up our parser to report failures more
    consistently. The first one definitely fixes a demonstrable bug, and
    I suspect the rest of them are fixing hard-to-trigger but lurking
    segfaults.
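
As a rough illustration of the tightening described above, here is a
minimal self-contained sketch. It does not use Git's real data
structures: `toy_object`, `toy_lookup_commit()` and `toy_parse_parent()`
are hypothetical stand-ins, and the toy lookup is simpler than the real
one. The point is only that a lookup returning NULL is reported as a
parse error right away, rather than being stored and dereferenced later.

```c
#include <stddef.h>
#include <string.h>

enum toy_type { TOY_COMMIT, TOY_TREE };

struct toy_object {
	const char *id;
	enum toy_type type;
};

/* A tiny fixed table standing in for an in-memory object cache. */
static struct toy_object toy_table[] = {
	{ "aaaa", TOY_COMMIT },
	{ "bbbb", TOY_TREE },
};

/*
 * Hypothetical lookup: returns the object if it is known as a commit,
 * or NULL if the id is unknown or already recorded as another type.
 */
static struct toy_object *toy_lookup_commit(const char *id)
{
	size_t i;

	for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++) {
		if (!strcmp(toy_table[i].id, id))
			return toy_table[i].type == TOY_COMMIT ?
				&toy_table[i] : NULL;
	}
	return NULL;
}

/*
 * Parse a "parent <id>" line. Returns 0 and sets *out on success,
 * or -1 on any failure, including a failed lookup.
 */
static int toy_parse_parent(const char *line, struct toy_object **out)
{
	static const char prefix[] = "parent ";
	struct toy_object *parent;

	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return -1; /* malformed header */
	parent = toy_lookup_commit(line + sizeof(prefix) - 1);
	if (!parent)
		return -1; /* lookup failure is a parse error, not a silent NULL */
	*out = parent;
	return 0;
}
```

With this shape, a caller that sees 0 can rely on `*out` being a usable
commit, and a caller that sees -1 never touches it.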

  [05/23]: fsck: stop checking commit->tree value
  [06/23]: fsck: stop checking commit->parent counts
  [07/23]: fsck: stop checking tag->tagged
  [08/23]: fsck: require an actual buffer for non-blobs

    These ones clean up weirdness where fsck is dependent on the results
    of parse_commit(), etc., rather than just looking at the buffer we
    gave it. I don't think they're _hurting_ anything, but it certainly
    makes following the fsck logic more confusing.
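
To make the "just look at the buffer" idea concrete, here is a small
sketch, not the actual fsck code. The helper `toy_count_parents()` is
hypothetical; it counts "parent " header lines directly in a
commit-like buffer, so the answer does not depend on what an earlier
parse step did or did not manage to record.

```c
#include <string.h>

/*
 * Count "parent " header lines in a commit-like buffer. Headers end at
 * the first blank line, so text in the message body is never counted.
 */
static int toy_count_parents(const char *buf)
{
	int count = 0;

	while (*buf && *buf != '\n') {
		const char *eol = strchr(buf, '\n');

		if (!strncmp(buf, "parent ", 7))
			count++;
		if (!eol)
			break; /* last line has no trailing newline */
		buf = eol + 1;
	}
	return count;
}
```

A check written this way works from the bytes it was handed and nothing
else, which is the property that makes the logic easier to follow.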

  [09/23]: fsck: unify object-name code

    Cleanup that fixes a few minor bugs.

  [10/23]: fsck_describe_object(): build on our get_object_name() primitive
  [11/23]: fsck: use oids rather than objects for object_name API
  [12/23]: fsck: don't require full object structs for display functions
  [13/23]: fsck: only provide oid/type in fsck_error callback
  [14/23]: fsck: only require an oid for skiplist functions
  [15/23]: fsck: don't require an object struct for report()
  [16/23]: fsck: accept an oid instead of a "struct blob" for fsck_blob()
  [17/23]: fsck: drop blob struct from fsck_finish()
  [18/23]: fsck: don't require an object struct for fsck_ident()
  [19/23]: fsck: don't require an object struct in verify_headers()
  [20/23]: fsck: rename vague "oid" local variables
  [21/23]: fsck: accept an oid instead of a "struct tag" for fsck_tag()
  [22/23]: fsck: accept an oid instead of a "struct commit" for fsck_commit()
  [23/23]: fsck: accept an oid instead of a "struct tree" for fsck_tree()

    This is a string of refactors that ends up with all of the
    type-specific fsck functions not getting an object struct at all.
    My goal there was two-fold:

       - it makes it harder to introduce weirdness like we saw in
	 patches 5-8.

       - it _could_ make things less awkward for callers like index-pack
	 which don't necessarily have object structs. And at the end, we
	 basically have an fsck_object() that doesn't need an object
	 struct. But index-pack still calls fsck_walk(), which does (and
	 which relies on the parsed values to traverse). It's not
	 entirely clear to me whether index-pack needs to be doing
	 fsck_walk() in the first place, or if it should be relying on
	 the usual connectivity check.

	 So I'm undecided whether this is worth taking on its own, or if
	 trying to avoid object structs in the fsck code is just a
	 fool's errand. I do think the result isn't too bad to look at,
	 though, and there are some minor improvements along the way
	 (e.g., patch 17 is able to drop some awkwardness).

    Most of the patches are pretty mechanical. There are so many because
    I split it by call stack layer. If A calls B calls C, then I
    converted "C" away from "struct object" first, which enables
    converting "B", and so on.
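
The layer-by-layer conversion can be pictured with a toy call stack.
This is only a sketch under assumed names: `demo_oid`, `demo_object`,
`demo_describe()`, `demo_report()` and `demo_check()` are hypothetical
and are not the functions touched by the series. The innermost function
(the "C" of the paragraph above) takes an id and a type instead of a
whole object struct, which in turn lets its caller ("B") do the same,
leaving only the outermost caller ("A") holding the struct.

```c
#include <stdio.h>
#include <string.h>

struct demo_oid { char hex[41]; };
enum demo_type { DEMO_BLOB, DEMO_TREE };

struct demo_object {
	struct demo_oid oid;
	enum demo_type type;
	int parsed;
};

/* "C" (innermost): converted first; needs only an id and a type. */
static int demo_describe(char *buf, size_t len,
			 const struct demo_oid *oid, enum demo_type type)
{
	return snprintf(buf, len, "%s %s",
			type == DEMO_TREE ? "tree" : "blob", oid->hex);
}

/* "B": once "C" no longer needs the struct, "B" can drop it too. */
static int demo_report(char *buf, size_t len,
		       const struct demo_oid *oid, enum demo_type type,
		       const char *msg)
{
	char name[64];

	demo_describe(name, sizeof(name), oid, type);
	return snprintf(buf, len, "error in %s: %s", name, msg);
}

/* "A" (outermost): still has the struct, but passes only the pieces. */
static int demo_check(char *buf, size_t len, const struct demo_object *obj)
{
	return demo_report(buf, len, &obj->oid, obj->type, "example problem");
}
```

A caller that has only an id and a type, with no object struct at all,
could call `demo_report()` directly, which is the kind of flexibility
the refactor is after.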

 builtin/fsck.c                         | 126 ++++----
 commit-graph.c                         |   3 -
 commit.c                               |  33 ++-
 fsck.c                                 | 386 +++++++++++--------------
 fsck.h                                 |  39 ++-
 t/t1450-fsck.sh                        |   2 +-
 t/t5318-commit-graph.sh                |   2 +-
 t/t6102-rev-list-unexpected-objects.sh |   2 +-
 tag.c                                  |  21 +-
 9 files changed, 312 insertions(+), 302 deletions(-)



Thread overview: 38+ messages
2019-10-18  4:41 Jeff King [this message]
2019-10-18  4:42 ` [PATCH 01/23] parse_commit_buffer(): treat lookup_commit() failure as parse error Jeff King
2019-10-24  3:37   ` Junio C Hamano
2019-10-24 18:01     ` Jeff King
2019-10-18  4:45 ` [PATCH 03/23] parse_tag_buffer(): treat NULL tag pointer " Jeff King
2019-10-18  4:47 ` [PATCH 04/23] remember commit/tag parse failures Jeff King
2019-10-24  3:51   ` Junio C Hamano
2019-10-24 23:25   ` Jonathan Tan
2019-10-24 23:41     ` Jeff King
2019-10-18  4:48 ` [PATCH 05/23] fsck: stop checking commit->tree value Jeff King
2019-10-24  3:57   ` Junio C Hamano
2019-10-18  4:49 ` [PATCH 06/23] fsck: stop checking commit->parent counts Jeff King
2019-10-18  4:51 ` [PATCH 07/23] fsck: stop checking tag->tagged Jeff King
2019-10-18  4:54 ` [PATCH 08/23] fsck: require an actual buffer for non-blobs Jeff King
2019-10-18  4:56 ` [PATCH 09/23] fsck: unify object-name code Jeff King
2019-10-24  6:05   ` Junio C Hamano
2019-10-24 18:07     ` Jeff King
2019-10-25  3:23       ` Junio C Hamano
2019-10-25 21:20         ` Jeff King
2019-10-18  4:56 ` [PATCH 10/23] fsck_describe_object(): build on our get_object_name() primitive Jeff King
2019-10-24  6:06   ` Junio C Hamano
2019-10-18  4:57 ` [PATCH 11/23] fsck: use oids rather than objects for object_name API Jeff King
2019-10-18  4:58 ` [PATCH 12/23] fsck: don't require object structs for display functions Jeff King
2019-10-18  4:58 ` [PATCH 13/23] fsck: only provide oid/type in fsck_error callback Jeff King
2019-10-18  4:58 ` [PATCH 14/23] fsck: only require an oid for skiplist functions Jeff King
2019-10-18  4:59 ` [PATCH 15/23] fsck: don't require an object struct for report() Jeff King
2019-10-18  4:59 ` [PATCH 16/23] fsck: accept an oid instead of a "struct blob" for fsck_blob() Jeff King
2019-10-18  4:59 ` [PATCH 17/23] fsck: drop blob struct from fsck_finish() Jeff King
2019-10-18  5:00 ` [PATCH 18/23] fsck: don't require an object struct for fsck_ident() Jeff King
2019-10-18  5:00 ` [PATCH 19/23] fsck: don't require an object struct in verify_headers() Jeff King
2019-10-18  5:00 ` [PATCH 20/23] fsck: rename vague "oid" local variables Jeff King
2019-10-18  5:01 ` [PATCH 21/23] fsck: accept an oid instead of a "struct tag" for fsck_tag() Jeff King
2019-10-18  5:01 ` [PATCH 22/23] fsck: accept an oid instead of a "struct commit" for fsck_commit() Jeff King
2019-10-18  5:02 ` [PATCH 23/23] fsck: accept an oid instead of a "struct tree" for fsck_tree() Jeff King
2019-10-24 23:49 ` [PATCH 0/23] parsing and fsck cleanups Jonathan Tan
2019-10-25  3:11 ` Junio C Hamano
2019-10-18  4:43 [PATCH 02/23] parse_commit_buffer(): treat lookup_tree() failure as parse error Jeff King
2019-10-24 23:12 ` Jonathan Tan
2019-10-24 23:22   ` Jeff King
