All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Turner <David.Turner@twosigma.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: [PATCH v3] http.postbuffer: allow full range of ssize_t values
Date: Tue, 4 Apr 2017 18:42:23 +0000	[thread overview]
Message-ID: <6488d78232be49a69260436d1c6ed44f@exmbdft7.ad.twosigma.com> (raw)
In-Reply-To: <20170404020130.76thbl5rum2gxgtn@sigill.intra.peff.net>

> -----Original Message-----
> From: Jeff King [mailto:peff@peff.net]
> Sent: Monday, April 3, 2017 10:02 PM
> To: David Turner <David.Turner@twosigma.com>
> Cc: git@vger.kernel.org
> Subject: Re: [PATCH v3] http.postbuffer: allow full range of ssize_t values
> 
> On Mon, Apr 03, 2017 at 05:03:49PM +0000, David Turner wrote:
> 
> > > > Unfortunately, in order to push some large repos, the http
> > > > postbuffer must sometimes exceed two gigabytes.  On a 64-bit system,
> this is OK:
> > > > we just malloc a larger buffer.
> > >
> > > I'm still not sure why a 2GB post-buffer is necessary. It sounds
> > > like something is broken in your setup. Large pushes should be sent
> chunked.
> > >
> > > I know broken setups are a fact of life, but this feels like a
> > > really hacky work- around.
> >
> > I'm not sure what other workaround I should use.  I guess I could do
> > multiple pushes, but only if individual objects are under the size
> > limit, and I'm not sure all of mine are (maybe I'll get lucky tho).  I
> > know that this is a configuration issue with gitlab:
> > https://gitlab.com/gitlab-org/gitlab-ce/issues/30315 but I don't know
> > when that will get fixed.  I could manually copy the repo to the
> > server and do a local push, but I don't know that I have the necessary
> > permissions to do that. Or I could do this, which would hopefully
> > actually solve the problem.
> 
> I didn't think we had gotten details on what was actually broken. Is it really
> that GitLab does not support chunked transfers? I know that's what the issue
> above says, but it sounds like it is just assuming that is the problem based on
> the recent messages to the list.

I agree that we don't know for sure what's actually broken.  I think probably the
GitLab bug tracker might be the better place to figure that out, unless GitLab 
thinks they're hitting a bug in git.

> If that's really the issue, then OK. That's lame, but something the client has to
> work around. It seems like a pretty big gap, though (and one I'd think people
> would have hit before; the default post-buffer is only 1MB. Surely people
> routinely push more than that to GitLab servers?
> So I'm really wondering if there is something else going on here.
> 
> What does it look like when it fails? What does GIT_TRACE_CURL look like (or
> GIT_CURL_VERBOSE if your client is older, but remember to sanitize any auth
> lines)?

Unfortunately, we've already worked around the problem by pushing over SSH, 
so I no longer have a failing case to examine. Last time I tried, I actually did some 
hackery to create a push smaller than 2GB, but it still failed (this time, with 
"502 Bad Gateway").  So, something is clearly weird in GitLab land.

I did see "Transfer-Encoding: chunked" in one of the responses from the server,
 but not in the request (not sure if that's normal). The smaller push had: 
Content-Length: 1048908476

(For me to publish longer log traces requires a level of security review which is 
probably too much of a hassle unless you think it will be really useful).

> > > The ultimate fate of this number, though, is to be handed to:
> > >
> > >   curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, rpc->len);
> > >
> > > where the final argument is interpreted as a long. So I suspect that
> > > on 64-bit Windows, setting http.postbuffer to "3G" would cause some
> > > kind of weird error (either a truncated post or some internal curl
> > > error due to the negative size, depending on how curl handles it).
> >
> > Ah, so we would need to use CURLOPT_POSTFIELDSIZE_LARGE.  Will re-roll.
> 
> Ah, neat. I didn't even know about CURLOPT_POSTFIELDSIZE_LARGE, and
> thought we'd have to just limit 32-bit platforms. That's a much better
> solution.
> 
> > > I saw the earlier iteration used a size_t, but you switched it after
> > > the compiler
> > > (rightfully) complained about the signedness. But I'm not sure why
> > > we would want ssize_t here instead of just using git_parse_unsigned().
> >
> > It was originally signed.  I'm not sure why that was, but I figured it
> > would be simpler to save the extra bit just in case.
> 
> I think it was simply because git_config_int() is the generic "number"
> parser, and nobody ever thought about it.
> 
> In fact, passing a negative value to curl would be disastrous, as it would use
> strlen(). I don't think a negative value could ever get that far, though. It looks
> like the config code would silently turn a negative value into
> LARGE_PACKET_MAX.

I would still prefer to preserve the bit just in case we ever decide that a negative 
value should have some special meaning later.  Of course, that special meaning 
wouldn't be "pass directly to curl".  (As I think about git's http protocol, and how 
hard it is to change it, I always want to leave the maximal number of extra bits 
free possible for general future usage).

> IMHO, complaining about the negative number to the user would be an
> improvement.

That seems reasonable.

  reply	other threads:[~2017-04-04 18:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-31 17:26 [PATCH v3] http.postbuffer: allow full range of ssize_t values David Turner
2017-03-31 19:51 ` Junio C Hamano
2017-04-01  6:01 ` Jeff King
2017-04-01 18:09   ` Junio C Hamano
2017-04-03 17:03   ` David Turner
2017-04-04  2:01     ` Jeff King
2017-04-04 18:42       ` David Turner [this message]
2017-04-04 20:40         ` Jeff King
2017-04-06 17:24           ` Christian Couder
2017-04-07  4:48             ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6488d78232be49a69260436d1c6ed44f@exmbdft7.ad.twosigma.com \
    --to=david.turner@twosigma.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.