All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Subject: Re: [PATCH 0/8] Speed up connectivity checks via quarantine dir
Date: Fri, 21 May 2021 05:30:59 -0400	[thread overview]
Message-ID: <YKd90xMgfOcuJfpI@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqr1i1t6zl.fsf@gitster.g>

On Fri, May 21, 2021 at 06:45:02AM +0900, Junio C Hamano wrote:

> >   2. This tightening is actually important if we want to avoid letting
> >      people _intentionally_ introduce the unreachable-but-incomplete
> >      scenario. Without it, an easy denial-of-service corruption against
> >      a repository you can push to is:
> >
> >        - push an update to change a ref from X to Y. Include all objects
> > 	 necessary for X..Y, but _also_ include a tree T which points to
> > 	 a missing blob B. This will be accepted by the current rules
> > 	 (but not by your patch).
> >
> >        - push an update to change the ref from Y to C, where C is a
> > 	 commit whose root tree is T. Your patch allows this (because we
> > 	 already have T in the repository). But the resulting repository
> > 	 is corrupt (the ref now points to an incomplete object graph).
> 
> Hmph, the last step of that attack would not work with our current
> check; is this the same new hole the series brings in as you
> explained earlier for a case where a newly pushed tree/commit starts
> to reference a left-over dangling tree already in the repository
> whose content blobs are missing?

Right. The current state is immune to this attack; the rule is "refs
must be complete, but nothing else has to be". The state after Patrick's
series is (I think) likewise immune; the rule is "all objects must be
complete".

I'm not sure if that was intentional, or a lucky save because the series
tries to optimize both parts (both which objects we decide to check, as
well as which we consider we "already have"). But it's quite subtle. :)

> > If we wanted to keep the existing rule (requiring that any objects that
> > sender didn't provide are actually reachable from the current refs),
> > then we'd want to be able to check reachability quickly. And there I'd
> > probably turn to reachability bitmaps.
> 
> True.  As we are not "performance is king---a code that corrupts
> repositories as quickly as possible is an improvement" kind of
> project, we should keep the existing "an object can become part of
> DAG referred by ref tips only when the objects it refers to all
> exist in the object store, because we want to keep the invariant: an
> object that is reachable from a ref is guaranteed to have everything
> reachable from it in the object store" rule, and find a way to make
> it fast to enforce that rule somehow.

That's my feeling, too. A rule of "you are not allowed to introduce
objects which directly reference a missing object" would be likewise
correct, if consistently enforced. But it makes me nervous to switch
from one to the other, especially in just one place.

And I guess in that sense, this series isn't immune as I said earlier.
It looks like a push from a shallow clone would use the old "reachable
from refs" check even after this series.

-Peff

  reply	other threads:[~2021-05-21  9:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-19 19:13 [PATCH 0/8] Speed up connectivity checks via quarantine dir Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 1/8] perf: fix when running with TEST_OUTPUT_DIRECTORY Patrick Steinhardt
2021-05-20  2:03   ` Chris Torek
2021-05-19 19:13 ` [PATCH 2/8] p5400: add perf tests for git-receive-pack(1) Patrick Steinhardt
2021-05-20  2:09   ` Chris Torek
2021-05-20 17:04   ` Jeff King
2021-05-21 15:03   ` SZEDER Gábor
2021-05-19 19:13 ` [PATCH 3/8] tmp-objdir: expose function to retrieve path Patrick Steinhardt
2021-05-20  0:16   ` Elijah Newren
2021-05-19 19:13 ` [PATCH 4/8] packfile: have `for_each_file_in_pack_dir()` return error codes Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 5/8] object-file: allow reading loose objects without reading their contents Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 6/8] connected: implement connectivity check via temporary object dirs Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 7/8] receive-pack: skip connectivity checks on delete-only commands Patrick Steinhardt
2021-05-21 18:53   ` Felipe Contreras
2021-05-27 14:38     ` Jeff King
2021-05-19 19:13 ` [PATCH 8/8] receive-pack: check connectivity via quarantined objects Patrick Steinhardt
2021-05-20  2:19 ` [PATCH 0/8] Speed up connectivity checks via quarantine dir Chris Torek
2021-05-20 16:50 ` Jeff King
2021-05-20 21:45   ` Junio C Hamano
2021-05-21  9:30     ` Jeff King [this message]
2021-05-21  9:42   ` Patrick Steinhardt
2021-05-21 11:20   ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKd90xMgfOcuJfpI@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.