All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Jacob Vosmaer <jacob@gitlab.com>
Cc: Junio C Hamano <gitster@pobox.com>, Taylor Blau <me@ttaylorr.com>,
	Git Mailing List <git@vger.kernel.org>,
	ps@pks.im
Subject: Re: [PATCH v3 0/2] send_ref buffering
Date: Tue, 31 Aug 2021 20:15:30 -0400	[thread overview]
Message-ID: <YS7GItELaEM8lq2z@coredump.intra.peff.net> (raw)
In-Reply-To: <CADMWQoOfPr47a6bJtApt=wSE9BrgXQKJRfJuNAj0zbtio2BRHw@mail.gmail.com> <CADMWQoPerffQcTfKh3bfesjgHaBqXGzW2805knyRW3R_q4V-YA@mail.gmail.com>

On Tue, Aug 31, 2021 at 03:08:25PM +0200, Jacob Vosmaer wrote:

> > Does the 64k buffer actually improve things? Here are the timings I get
> > on a repo with ~1M refs (it's linux.git with one ref per commit).
> Thanks for challenging that. I have a repeatable benchmark where it
> matters, because each write syscall wakes up a chain of proxies
> between the user and git-upload-pack. Larger buffers means fewer
> wake-ups. But then I tried to simplify my example by having sshd as
> the only intermediary, and in that experiment 64K buffers were not
> better than 4K buffers. I think that goes to show that picking a good
> buffer size is hard, and we'd be better off picking one specifically
> for Gitaly (and GitLab) that works with our stack.

Thanks for explaining. Yeah, I think leaving it as a custom thing makes
sense, then.

> >   Summary
> >     'GIT_REF_PARANOIA=1 git.compile upload-pack .' ran
> >       2.17 ± 0.02 times faster than 'git.compile upload-pack .'
> >
> > It's not exactly the intended use of that environment variable, but its
> > side effect is that we do not call has_object_file() on each ref tip.
> That is nice to know, but as a user of Git I don't know when it is or
> is not safe to skip those has_object_file() calls. If it's safe to
> skip them then Git should skip them always. If not, then I will err on
> the side of caution and keep the checks.

Yeah, the use of REF_PARANOIA there was just an easy illustration. IMHO
it would be reasonable for upload-pack to just assume that the refs
files are valid. If they aren't, then either:

  - the receiver is uninterested in those objects or already has them,
    so won't ask for them. They're happy either way.

  - the receiver _will_ ask for them, at which point we'd barf later in
    pack-objects when we try to access them.

There are some thoughts in this old thread which introduce
GIT_REF_PARANOIA:

  https://lore.kernel.org/git/20150317073730.GA25267@peff.net/

I think I was mostly too cowardly to make the change back then. And I
hadn't considered that the performance implications would be all that
big (though I will say this million-ref repo is at the edge of what I'd
consider reasonable).

> > > I do think it would be nice to take the packet_writer
> > > interface further (letting it replace the static buf, and use stdio
> > > handles, and using it throughout upload-pack).
> > I would like that too, for the sake of neatness and general
> > performance, but I don't have the time to take on a larger project
> > like that at the moment.
> I gave solving the problem with packet_writer a couple of hours today.
> The diff gets too big, and I have too little confidence I'm not
> introducing deadlocks. This really is more work than I can chew off
> right now. Sorry!

Thanks for taking a look! I think we can proceed with your series for
now, then.

-Peff

  parent reply	other threads:[~2021-09-01  0:15 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-24 14:02 [PATCH 1/1] upload-pack: buffer ref advertisement writes Jacob Vosmaer
2021-08-24 21:07 ` Taylor Blau
2021-08-24 21:42   ` Junio C Hamano
2021-08-25  0:44     ` Jeff King
2021-08-26 10:02       ` Jacob Vosmaer
2021-08-26 10:06         ` [PATCH 1/2] pkt-line: add packet_fwrite Jacob Vosmaer
2021-08-26 10:06           ` [PATCH 2/2] upload-pack: use stdio in send_ref callbacks Jacob Vosmaer
2021-08-26 16:33             ` Junio C Hamano
2021-08-26 20:21               ` Junio C Hamano
2021-08-26 22:35               ` Taylor Blau
2021-08-26 23:24               ` Jeff King
2021-08-27 16:15                 ` Junio C Hamano
2021-08-31  9:34                   ` [PATCH v3 0/2] send_ref buffering Jacob Vosmaer
2021-08-31  9:34                     ` [PATCH v3 1/2] pkt-line: add stdio packet write functions Jacob Vosmaer
2021-08-31 10:37                       ` Jeff King
2021-08-31 18:13                       ` Junio C Hamano
2021-09-01 12:54                         ` [PATCH v4 0/2] send_ref buffering Jacob Vosmaer
2021-09-01 12:54                           ` [PATCH v4 1/2] pkt-line: add stdio packet write functions Jacob Vosmaer
2021-09-01 12:54                           ` [PATCH v4 2/2] upload-pack: use stdio in send_ref callbacks Jacob Vosmaer
2021-09-02  9:18                           ` [PATCH v4 0/2] send_ref buffering Jeff King
2021-08-31  9:34                     ` [PATCH v3 2/2] upload-pack: use stdio in send_ref callbacks Jacob Vosmaer
2021-08-31 10:25                     ` [PATCH v3 0/2] send_ref buffering Jeff King
2021-08-31 13:08                       ` Jacob Vosmaer
2021-08-31 17:44                         ` Jacob Vosmaer
2021-09-01  0:15                         ` Jeff King [this message]
2021-08-26 23:32             ` [PATCH 2/2] upload-pack: use stdio in send_ref callbacks Jeff King
2021-08-26 16:33           ` [PATCH 1/2] pkt-line: add packet_fwrite Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YS7GItELaEM8lq2z@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob@gitlab.com \
    --cc=me@ttaylorr.com \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.