From: Jeff King <peff@peff.net>
To: Duy Nguyen <pclouds@gmail.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Van Oostenryck Luc" <luc.vanoostenryck@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Kevin Willford" <kewillf@microsoft.com>
Subject: Re: [PATCH] reopen_tempfile(): truncate opened file
Date: Wed, 5 Sep 2018 11:35:52 -0400
Message-ID: <20180905153551.GB24660@sigill.intra.peff.net>
In-Reply-To: <CACsJy8Ax4S9Sms6TY1dMV8M9-=hakEW8TCqn8yxb73Vbrpy_MQ@mail.gmail.com>

On Wed, Sep 05, 2018 at 05:27:11PM +0200, Duy Nguyen wrote:

> > +test_expect_success PERL 'commit -p with shrinking cache-tree' '
> > +       mkdir -p deep/subdir &&
> > +       echo content >deep/subdir/file &&
> > +       git add deep &&
> > +       git commit -m add &&
> > +       git rm -r deep &&
> 
> OK so I guess at this step, we invalidate some cache-tree blocks, but
> we write the same blocks down (with "invalid" flag), so pretty much
> the same size as before.

I didn't verify exactly what was in the index, but that was my
understanding, too (well, it's a little smaller because we drop the
actual index entries, but keep the invalidated cache-tree). I worry a
little that "rm" might eventually learn to drop those invalidated bits.
But hopefully finding this commit would lead that person to figure out
another way to accomplish the same thing, or to decide that carrying the
test forward isn't worth it.

> > +       after=$(wc -c <.git/index) &&
> > +
> > +       # double check that the index shrank
> > +       test $before -gt $after &&
> > +
> > +       # and that our index was not corrupted
> > +       git fsck
> 
> If the index is not shrunk, we parse the remaining rubbish as extensions.
> If by chance the rubbish extension name is in uppercase, then we
> ignore it (and do not flag it as an error). But then the chance of the
> next 4 bytes being the "right" extension size is so small that we would
> end up flagging it as a bad extension anyway. So it's good. But if you
> want to be even stricter (not necessary in my opinion), make sure that
> stderr is empty.

In this case, the size difference is only a few bytes, so the rubbish
actually ends up in the trailing sha1. The reason I use git-fsck here is
that it actually verifies the whole sha1 (since normal index reads no
longer do). In fact, a normal index read won't show any problem for this
case (since it is _only_ the trailing sha1 which is junk, and we no
longer verify it on every read).
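
To make the trailing-sha1 point concrete, here is a rough sketch of the check in shell. It is not how git-fsck is implemented; it only illustrates the idea that the last 20 bytes of the file should be the SHA-1 of everything before them. The function name index_trailer_ok is hypothetical, and the sketch assumes a SHA-1 repository and that sha1sum, od, and a head that accepts -c are available.

```shell
# Hypothetical helper, not part of git: succeed only if the last 20 bytes
# of file "$1" are the SHA-1 of all the bytes that precede them.
index_trailer_ok () {
	file=$1 &&
	# Arithmetic expansion strips any padding that wc may print.
	size=$(($(wc -c <"$file"))) &&
	test "$size" -gt 20 &&
	body=$((size - 20)) &&
	# Hash everything except the 20-byte trailer.
	want=$(head -c "$body" "$file" | sha1sum | cut -d" " -f1) &&
	# Render the trailer itself as lowercase hex for comparison.
	have=$(tail -c 20 "$file" | od -An -tx1 | tr -d " \n") &&
	test "$want" = "$have"
}
```

Running it as `index_trailer_ok .git/index` would be expected to fail as soon as stale bytes follow the freshly written data, because the final 20 bytes then no longer line up with the real checksum.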

In the original sparse-dev case, the size of the rubbish is much larger
(because we deleted a lot more entries), and we do interpret it as a
bogus extension. But it also triggers here, because the trailing sha1 is
_also_ wrong.
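
Where the rubbish comes from can be shown with a small shell sketch. This is only an analogy for rewriting a reopened file without truncating it, not the actual tempfile code; both function names are hypothetical. It relies on the POSIX "<>" redirection, which opens a file read-write without truncation.

```shell
# Hypothetical stand-in for writing shorter data into a reopened file.
# "1<>" opens "$1" read-write WITHOUT truncating, so any old bytes past
# the end of the new data are left in place.
rewrite_in_place () {
	printf "%s" "$2" 1<>"$1"
}

# The same write with truncation first (">" truncates on open), which is
# the behavior the patch subject asks for.
rewrite_truncated () {
	printf "%s" "$2" >"$1"
}
```

With a file containing "old-longer-contents", `rewrite_in_place file new` leaves "new-longer-contents" behind, while `rewrite_truncated file new` leaves just "new". The more the new data shrinks, the longer the stale tail.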

So AFAIK this fsck catches everything and yields a non-zero exit in the
error case. And it should work for even a single byte of rubbish.

-Peff
