All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Michael Haggerty" <mhagger@alum.mit.edu>,
	"Stefan Beller" <stefanbeller@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Matt McCutchen" <matt@mattmccutchen.net>
Subject: Re: [PATCH 0/4] gc docs: modernize and fix the documentation
Date: Mon, 06 May 2019 11:44:06 +0200	[thread overview]
Message-ID: <878svjj4t5.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <20190319001829.GL29661@sigill.intra.peff.net>


On Tue, Mar 19 2019, Jeff King wrote:

> On Mon, Mar 18, 2019 at 11:45:39PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> > I don't think the quarantine stuff should impact contention at all. It's
>> > only quarantining the objects, which are the least contentious part of
>> > Git (because object content is idempotent, so we don't do any locking
>> > there, and with two racing processes, one will just "win").
>>
>> Without the quarantine, isn't there the race that the NOTES section
>> talks about (unless I've misread it).
>
> Ah, OK, I wasn't quite sure which documentation you were talking about.
> I see the discussion now in the "NOTES" section of git-gc(1).
>
>> I.e. we have some loose object "ABCD" not referrred to by anything for
>> the last 2 weeks, as we're gc-ing a ref update comes in that makes it
>> referenced again. We then delete "ABCD" (not used!) at the same time the
>> ref update happens, and get corruption.
>>
>> Whereas the quarantine might work around since the client will have sent
>> ABCD with no reference pointing to it to the server in the temp pack,
>> which we then rename in-place and then update the ref, so we don't care
>> if "ABCD" goes away.
>
> tl;dr I don't think quarantine impacts this, but if you really want gory
> details, read on.
>
> This is a problem with or without the quarantine. It's fundamentally a
> race because we do not atomically say "is anybody using X? If not, we
> can delete it" and some other process saying "I'd like to use X".
>
> Pushes are actually better off than most operations, because we only
> advertise what's reachable, and the client is expected to send
> everything else. So with just a normal update-ref call, we could race
> like this:
>
>   1. ABCD is ancient.
>
>   2. Process 1 (update-ref) wants to reference ABCD. It sees that we
>      have it.
>
>   3. Process 2 (gc/prune) sees that nobody references it. It deletes
>      ABCD.
>
>   4. Process 1 writes out the reference.
>
> That doesn't happen with a push, because the server never would have
> told the client that it has ABCD in the first place (so process 1 here
> is the client). That is true with or without quarantine.
>
> But pushes aren't foolproof either. You said "loose object ABCD not
> referred t oby anything for the last 2 weeks". But that's not exactly
> how it works. It's "object with an mtime of more than 2 weeks which is
> not currently referenced". So imagine a sequence like:
>
>   1. ABCD is ancient.
>
>   2. refs/heads/foo points to ABCD.
>
>   3. Server receive-pack advertises foo pointing to ABCD.
>
>   4. Simultaneous process on the server deletes refs/heads/foo (or
>      perhaps somebody force-pushes over it).
>
>   5. Client prepares and sends pack without ABCD.
>
>   6. Server receive-pack checks that yes, we still have ABCD (i.e., the
>      usual connectivity check).
>
>   7. Server gc drops ABCD, which is now unreachable (reflogs can help
>      here, if you've enabled them; but we do delete reflogs when the
>      branch is deleted).
>
>   8. Server receive-pack writes corrupt repo mentioning ABCD.
>
> That's a lot more steps, though they might not be as implausible as you
> think (e.g., consider moving "refs/heads/foo" to "refs/heads/bar" in a
> single push; that's actually a delete and an update, which is all you
> need to race with a simultaneous gc).
>
> I have no idea how often this happens in practice. My subjective
> recollection is that most of the race corruptions I've seen were from
> local operations on the server. E.g., we compute a tentative merge for
> somebody's pull request which shares objects with an older tentative
> merge. They click the "merge" button and we reference that commit, which
> is recent, but unbeknownst to us, while we were creating our new
> tentative merge, a "gc" was deleting the old one.
>
> We're sometimes saved by the "transitive freshness" rules in d3038d22f9
> (prune: keep objects reachable from recent objects, 2014-10-15).  But
> they're far from perfect:
>
>  - some operations (like the push rename example) aren't writing new
>    objects, so the ref write _is_ the moment that gc would find out
>    something is reachable
>
>  - the "is it reachable?" and "no, then delete it" steps aren't atomic.
>    Unless you want a whole-repo stop-the-world lock, somebody can
>    reference the object in between. And since it may take many seconds
>    to compute reachability, stop-the-world is not great.
>
> I think there are probably ways to make it better. Perhaps some kind of
> lockless delete-but-be-able-to-rollback scheme (but keep in mind this
> has to be implemented no top of POSIX filesystem semantics). Or even
> just a "compute reachability, mark for deletion, and then hold a
> stop-the-world lock briefly to double-check that our reachability is
> still up to date".
>
> At least those seem plausible to me. I've never worked out the details,
> and our solution was to just stop deleting objects during routine
> maintenance (using "repack -adk"). We do still occasionally prune
> manually (e.g., when somebody writes to support to remove a confidential
> mistake).
>
> Anyway, that was more than you probably wanted to know. The short of it
> is that I don't think quarantines help (they may even make things worse
> by slightly increasing the length of the race window, though in practice
> I doubt it).
>
>> Unless that interacts racily with the receive.unpackLimit, but then I
>> have no idea that section is trying to say...
>
> No, I don't think unpackLimit really affects it much either way.
>
>> Also, surely the part where "NOTES" says something to the effect of "you
>> are subject to races unless gc.auto=0" is wrong. To the extent that
>> there's races it won't matter that you invoke "git gc" or "git gc
>> --auto", it's the concurrency that matters. So if there's still races we
>> should be saying the repo needs to be locked for writes for the duration
>> of the "gc".
>
> Correct. It's the very act of pruning that is problematic. I think the
> point is that if you are manually running "git gc", you'd presumably do
> it at a time when the repository is not otherwise active.

Thanks for that E-Mail. I'm hoping to get around to another set of "gc
doc" updates and will hopefully be able to steal liberally from it.

Maybe there's some case I haven't thought of that makes this stupid, but
I wonder if something like a "gc quarantine" might be a fix fo both of
the the issues you noted above.

I.e. it seems to me that the main issue is that we conflate "mtime 2
weeks old because it's unreferenced for 2 weeks" v.s. "mtime 2 weeks old
because we haven't gotten around to a 'gc'".

So in such a "gc quarantine" mode when we discover an object/pack that's
unreachable/purely made up of unreachable objects we'd move the relevant
loose object/"loose" pack to such a quarantine, which would just be
.git/unreferenced-objects/{??,pack}/ or whatever.

AFAICT both cases you mentioned above would be mitigated by this because
we'd no longer conflate "haven't gc'd this yet and it's 2 weeks old"
v.s. "hasn't been referenced in 2 weeks".

I started looking at this initially because I was wondering if the
--keep-unreachable mode you modified in e26a8c4721 ("repack: extend
--keep-unreachable to loose objects", 2016-06-13) could be made to write
out such "unreferenced" objects into their *own* pack, so we could
delete them all at once as a batch, and wouldn't create the "ref
explosions" mentioned in [1].

But of course without an accompanying quarantine described above doing
that would just make this race condition worse.

1. https://public-inbox.org/git/87fu6bmr0j.fsf@evledraar.gmail.com/

  reply	other threads:[~2019-05-06  9:44 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-18 16:14 [PATCH 0/4] gc docs: modernize and fix the documentation Ævar Arnfjörð Bjarmason
2019-03-18 16:14 ` [PATCH 1/4] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-18 21:27   ` Jeff King
2019-03-18 22:18     ` Ævar Arnfjörð Bjarmason
2019-03-18 16:15 ` [PATCH 2/4] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-18 21:31   ` Jeff King
2019-03-21 22:11     ` Andreas Heiduk
2019-03-19  2:08   ` Duy Nguyen
2019-03-18 16:15 ` [PATCH 3/4] gc docs: de-duplicate "OPTIONS" and "CONFIGURATION" Ævar Arnfjörð Bjarmason
2019-03-18 21:49   ` Jeff King
2019-03-18 22:48     ` Ævar Arnfjörð Bjarmason
2019-03-18 23:42       ` Jeff King
2019-03-18 16:15 ` [PATCH 4/4] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-18 20:28   ` Jonathan Nieder
2019-03-18 21:22     ` Jeff King
2019-03-18 22:13       ` Ævar Arnfjörð Bjarmason
2019-03-18 23:53         ` Jeff King
2019-03-19  6:54   ` Johannes Sixt
2019-03-19  9:28     ` Ævar Arnfjörð Bjarmason
2019-03-18 21:51 ` [PATCH 0/4] gc docs: modernize and fix the documentation Jeff King
2019-03-18 22:45   ` Ævar Arnfjörð Bjarmason
2019-03-19  0:18     ` Jeff King
2019-05-06  9:44       ` Ævar Arnfjörð Bjarmason [this message]
2019-05-07  7:51         ` Jeff King
2019-05-09 23:20           ` Ævar Arnfjörð Bjarmason
2019-07-31  4:26             ` Jeff King
2019-07-31 10:12               ` Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 00/10] " Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 00/11] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 01/11] " Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 02/11] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 03/11] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 04/11] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-30 18:04     ` Todd Zullinger
2019-04-07 19:52     ` [PATCH v4 00/11] gc docs: modernize and fix the documentation Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 01/11] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 02/11] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 03/11] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 04/11] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 05/11] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 06/11] gc docs: fix formatting for "gc.writeCommitGraph" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 07/11] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 08/11] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 09/11] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 10/11] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-04-07 19:52     ` [PATCH v4 11/11] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 05/11] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 06/11] gc docs: fix formatting for "gc.writeCommitGraph" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 07/11] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 08/11] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 09/11] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 10/11] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-03-22  9:32   ` [PATCH v3 11/11] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 01/10] gc docs: modernize the advice for manually running "gc" Ævar Arnfjörð Bjarmason
2019-03-22  6:01   ` Junio C Hamano
2019-03-21 20:50 ` [PATCH v2 02/10] gc docs: stop noting "repack" flags Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 03/10] gc docs: clean grammar for "gc.bigPackThreshold" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 04/10] gc docs: include the "gc.*" section from "config" in "gc" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 05/10] gc docs: re-flow the "gc.*" section in "config" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 06/10] gc docs: note how --aggressive impacts --window & --depth Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 07/10] gc docs: downplay the usefulness of --aggressive Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 08/10] gc docs: note "gc --aggressive" in "fast-import" Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 09/10] gc docs: clarify that "gc" doesn't throw away referenced objects Ævar Arnfjörð Bjarmason
2019-03-21 20:50 ` [PATCH v2 10/10] gc docs: remove incorrect reference to gc.auto=0 Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878svjj4t5.fsf@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=matt@mattmccutchen.net \
    --cc=mhagger@alum.mit.edu \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.