From: Jeff King <peff@peff.net>
To: Patrick Steinhardt <ps@pks.im>
Cc: "R. Diez" <rdiez-temp3@rd10.de>, git@vger.kernel.org
Subject: Re: git fsck does not check the packed-refs file
Date: Fri, 19 Jan 2024 20:00:55 -0500
Message-ID: <20240120010055.GC117170@coredump.intra.peff.net>
In-Reply-To: <ZakIPEytlxHGCB9Y@tanuki>

On Thu, Jan 18, 2024 at 12:15:08PM +0100, Patrick Steinhardt wrote:

> > I am guessing that "git fsck" does not check file packed-refs at all.
> > I mean, it does not even attempt to parse it, in order to check
> > whether at least the format makes any sense. Only "git push" does it.
> 
> Indeed it doesn't. While the issue is comparatively easy to spot by
> manually inspecting the `packed-refs` file, I agree that it would be
> great if git-fsck(1) knew how to check the refdb for consistency. This
> problem is only going to get worse once the upcoming reftable backend
> lands -- it is a binary format, and just opening it with a text editor
> to check whether it looks sane-ish stops being a viable option here.

We don't check the packed-refs file explicitly, but we do open and parse
it to iterate over the refs it contains. E.g.:

  $ git init
  $ echo foo >.git/packed-refs
  $ git fsck
  Checking object directories: 100% (256/256), done.
  fatal: unexpected line in .git/packed-refs: foo

It's quite possible that the reading code could be more careful. I'd
have to see the exact corruption that "git fsck" didn't complain about
to say more. If there's a page full of NUL bytes at the end of the
file, I wouldn't be surprised if the reading code quietly ignores it,
which is obviously not ideal.
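
To make the "garbage in the file" case concrete, here is a minimal sketch
in Python of a stricter line-by-line scan. It is not git's actual reading
code; the function name find_garbage and the two regular expressions are
hypothetical. It assumes the usual text layout of packed-refs: optional
"#" header lines, "<hex oid> <refname>" entries, and "^<hex oid>" peeled
lines.

```python
import re

# Entry lines look like "<hex oid> <refname>"; peeled lines look like "^<hex oid>".
# 40 hex digits cover SHA-1, 64 cover SHA-256.
ENTRY = re.compile(rb"^[0-9a-f]{40}(?:[0-9a-f]{24})? \S.*$")
PEELED = re.compile(rb"^\^[0-9a-f]{40}(?:[0-9a-f]{24})?$")


def find_garbage(data: bytes) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for lines that do not look like
    packed-refs content. Hypothetical helper, for illustration only."""
    problems = []
    lines = data.split(b"\n")
    # A well-formed file ends with a newline, so the last element is empty.
    if lines and lines[-1] == b"":
        lines.pop()
    elif lines:
        problems.append((len(lines), "missing trailing newline"))
    for number, line in enumerate(lines, start=1):
        if b"\0" in line:
            problems.append((number, "contains NUL byte"))
        elif line.startswith(b"#"):
            continue  # header or comment line
        elif not (ENTRY.match(line) or PEELED.match(line)):
            problems.append((number, "unexpected line"))
    return problems
```

A scan like this would flag a trailing page of NUL bytes as well as the
"foo" line from the example above, though it says nothing about whether
the object IDs it accepts actually exist.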

Fundamentally we cannot catch all cases here; a simple truncation, for
example, might yield a valid file that is simply missing some entries.
Unlike objects (which make promises about reachability and so on), there
is no real "consistency" for the state of the refs. But warning when we
see a bunch of garbage in the file is probably a good thing.

I also agree that a specific refdb consistency check would be valuable.
There are some things that the regular reading code will not check, but
which an fsck should (e.g., if the packed-refs file claims to have the
"sorted" trait, we should confirm that).
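
As a rough illustration of the "sorted" check, the sketch below (Python,
not git's implementation) reads the traits from the "# pack-refs with:"
header and, if "sorted" is claimed, verifies that the refnames are in
strictly increasing bytewise order. The helper names are hypothetical,
and both the bytewise comparison and the treatment of duplicate names as
a violation are assumptions of this sketch.

```python
HEADER_PREFIX = b"# pack-refs with:"


def declared_traits(data: bytes) -> set[bytes]:
    """Return the traits listed on the header line, or an empty set."""
    first = data.split(b"\n", 1)[0]
    if not first.startswith(HEADER_PREFIX):
        return set()
    return set(first[len(HEADER_PREFIX):].split())


def refnames(data: bytes) -> list[bytes]:
    """Collect refnames in file order, skipping header and peeled lines."""
    names = []
    for line in data.split(b"\n"):
        if not line or line.startswith((b"#", b"^")):
            continue
        _oid, _, name = line.partition(b" ")
        names.append(name)
    return names


def sorted_claim_holds(data: bytes) -> bool:
    """True if the file does not claim "sorted", or claims it and is sorted."""
    if b"sorted" not in declared_traits(data):
        return True  # nothing was promised, so nothing to verify
    names = refnames(data)
    # Assumption: strict bytewise ordering; a duplicate name fails the check.
    return all(a < b for a, b in zip(names, names[1:]))
```

The point of doing this in an fsck rather than in the regular reader is
that the reader can then keep trusting the trait for fast lookups while
the slower, exhaustive check runs only on request.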

-Peff
