All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Turner <David.Turner@twosigma.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: [PATCH v3] http.postbuffer: allow full range of ssize_t values
Date: Tue, 4 Apr 2017 18:42:23 +0000	[thread overview]
Message-ID: <6488d78232be49a69260436d1c6ed44f@exmbdft7.ad.twosigma.com> (raw)
In-Reply-To: <20170404020130.76thbl5rum2gxgtn@sigill.intra.peff.net>

> -----Original Message-----
> From: Jeff King [mailto:peff@peff.net]
> Sent: Monday, April 3, 2017 10:02 PM
> To: David Turner <David.Turner@twosigma.com>
> Cc: git@vger.kernel.org
> Subject: Re: [PATCH v3] http.postbuffer: allow full range of ssize_t values
> 
> On Mon, Apr 03, 2017 at 05:03:49PM +0000, David Turner wrote:
> 
> > > > Unfortunately, in order to push some large repos, the http
> > > > postbuffer must sometimes exceed two gigabytes.  On a 64-bit system,
> this is OK:
> > > > we just malloc a larger buffer.
> > >
> > > I'm still not sure why a 2GB post-buffer is necessary. It sounds
> > > like something is broken in your setup. Large pushes should be sent
> chunked.
> > >
> > > I know broken setups are a fact of life, but this feels like a
> > > really hacky work- around.
> >
> > I'm not sure what other workaround I should use.  I guess I could do
> > multiple pushes, but only if individual objects are under the size
> > limit, and I'm not sure all of mine are (maybe I'll get lucky tho).  I
> > know that this is a configuration issue with gitlab:
> > https://gitlab.com/gitlab-org/gitlab-ce/issues/30315 but I don't know
> > when that will get fixed.  I could manually copy the repo to the
> > server and do a local push, but I don't know that I have the necessary
> > permissions to do that. Or I could do this, which would hopefully
> > actually solve the problem.
> 
> I didn't think we had gotten details on what was actually broken. Is it really
> that GitLab does not support chunked transfers? I know that's what the issue
> above says, but it sounds like it is just assuming that is the problem based on
> the recent messages to the list.

I agree that we don't know for sure what's actually broken.  I think probably the
GitLab bug tracker might be the better place to figure that out, unless GitLab 
thinks they're hitting a bug in git.

> If that's really the issue, then OK. That's lame, but something the client has to
> work around. It seems like a pretty big gap, though (and one I'd think people
> would have hit before; the default post-buffer is only 1MB. Surely people
> routinely push more than that to GitLab servers?
> So I'm really wondering if there is something else going on here.
> 
> What does it look like when it fails? What does GIT_TRACE_CURL look like (or
> GIT_CURL_VERBOSE if your client is older, but remember to sanitize any auth
> lines)?

Unfortunately, we've already worked around the problem by pushing over SSH, 
so I no longer have a failing case to examine. Last time I tried, I actually did some 
hackery to create a push smaller than 2GB, but it still failed (this time, with 
"502 Bad Gateway").  So, something is clearly weird in GitLab land.

I did see "Transfer-Encoding: chunked" in one of the responses from the server,
 but not in the request (not sure if that's normal). The smaller push had: 
Content-Length: 1048908476

(For me to publish longer log traces requires a level of security review which is 
probably too much of a hassle unless you think it will be really useful).

> > > The ultimate fate of this number, though, is to be handed to:
> > >
> > >   curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, rpc->len);
> > >
> > > where the final argument is interpreted as a long. So I suspect that
> > > on 64-bit Windows, setting http.postbuffer to "3G" would cause some
> > > kind of weird error (either a truncated post or some internal curl
> > > error due to the negative size, depending on how curl handles it).
> >
> > Ah, so we would need to use CURLOPT_POSTFIELDSIZE_LARGE.  Will re-roll.
> 
> Ah, neat. I didn't even know about CURLOPT_POSTFIELDSIZE_LARGE, and
> thought we'd have to just limit 32-bit platforms. That's a much better
> solution.
> 
> > > I saw the earlier iteration used a size_t, but you switched it after
> > > the compiler
> > > (rightfully) complained about the signedness. But I'm not sure why
> > > we would want ssize_t here instead of just using git_parse_unsigned().
> >
> > It was originally signed.  I'm not sure why that was, but I figured it
> > would be simpler to save the extra bit just in case.
> 
> I think it was simply because git_config_int() is the generic "number"
> parser, and nobody ever thought about it.
> 
> In fact, passing a negative value to curl would be disastrous, as it would use
> strlen(). I don't think a negative value could ever get that far, though. It looks
> like the config code would silently turn a negative value into
> LARGE_PACKET_MAX.

I would still prefer to preserve the bit just in case we ever decide that a negative 
value should have some special meaning later.  Of course, that special meaning 
wouldn't be "pass directly to curl".  (As I think about git's http protocol, and how 
hard it is to change it, I always want to leave the maximal number of extra bits 
free possible for general future usage).

> IMHO, complaining about the negative number to the user would be an
> improvement.

That seems reasonable.

  reply	other threads:[~2017-04-04 18:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-31 17:26 David Turner
2017-03-31 19:51 ` Junio C Hamano
2017-04-01  6:01 ` Jeff King
2017-04-01 18:09   ` Junio C Hamano
2017-04-03 17:03   ` David Turner
2017-04-04  2:01     ` Jeff King
2017-04-04 18:42       ` David Turner [this message]
2017-04-04 20:40         ` Jeff King
2017-04-06 17:24           ` Christian Couder
2017-04-07  4:48             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6488d78232be49a69260436d1c6ed44f@exmbdft7.ad.twosigma.com \
    --to=david.turner@twosigma.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --subject='RE: [PATCH v3] http.postbuffer: allow full range of ssize_t values' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.