All of lore.kernel.org
 help / color / mirror / Atom feed
From: Duy Nguyen <pclouds@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFH] limiting ref advertisements
Date: Tue, 25 Oct 2016 18:46:21 +0700	[thread overview]
Message-ID: <CACsJy8DwKxz14Dow9dEKeXnBriMzN_OptnGM7nPigPcS_pHX9w@mail.gmail.com> (raw)
In-Reply-To: <20161024132932.i42rqn2vlpocqmkq@sigill.intra.peff.net>

On Mon, Oct 24, 2016 at 8:29 PM, Jeff King <peff@peff.net> wrote:
> I'm looking into the oft-discussed idea of reducing the size of ref
> advertisements by having the client say "these are the refs I'm
> interested in". Let's set aside the protocol complexities for a
> moment and imagine we magically have some way to communicate a set of
> patterns to the server.
>
> What should those patterns look like?
>
> I had hoped that we could keep most of the pattern logic on the
> client-side. Otherwise we risk incompatibilities between how the client
> and server interpret a pattern. I had also hoped we could do some kind
> of prefix-matching, which would let the server look only at the
> interesting bits of the ref tree (so if you don't care about
> refs/changes, and the server has some ref storage that is hierarchical,
> they can literally get away without opening that sub-tree).
>
> The patch at the end of this email is what I came up with in that
> direction. It obviously won't compile without the twenty other patches
> implementing transport->advertise_prefixes

Yes! git-upload-pack-2 is making a come back, one form or another.

> but it gives you a sense of what I'm talking about.
>
> Unfortunately it doesn't work in all cases, because refspec sources may
> be unqualified. If I ask for:
>
>   git fetch $remote master:foo
>
> then we have to actually dwim-resolve "master" from the complete list of
> refs we get from the remote.  It could be "refs/heads/master",
> "refs/tags/master", etc. Worse, it could be "refs/master". In that case,
> at least, I think we are OK because we avoid advertising refs directly
> below "refs/" in the first place. But if you have a slash, like:
>
>   git fetch $remote jk/foo
>
> then that _could_ be "refs/jk/foo". Likewise, we cannot even optimize
> the common case of a fully-qualified ref, like "refs/heads/foo". If it
> exists, we obviously want to use that. But if it doesn't, then it
> could be refs/something-else/refs/heads/foo. That's unlikely, but it
> _does_ work now, and optimizing the advertisement would break it.
>
> So it seems like left-anchoring the refspecs can never be fully correct.
> We can communicate "master" to the server, who can then look at every
> ref it would advertise and ask "could this be called master"? But it
> will be setting in stone the set of "could this be" patterns. Granted,
> those haven't changed much over the history of git, but it seems awfully
> fragile.

The first thought that comes to mind is, if left anchoring does not
work, let's support both left and right anchoring. I guess you
considered and discarded this.

If prefix matching does not work, and assuming "some-prefix" sent by
client to be in fact "**/some-prefix" pattern at server side will set
the "could this be" in stone, how about use wildmatch? It's flexible
enough and we have full control over the pattern matching engine so C
Git <-> C Git should be good regardless of platforms. I understand
that wildmatch is still complicated enough that a re-implementation
can easily divert in behavior. But a pattern with only '*', '/**',
'/**/' and '**/' wildcards (in other words, no [] or ?) could make the
engine a lot simpler and still fit our needs (and give some room for
client-optimization).

> In an ideal world the client and server would negotiate to come to some
> agreement on the patterns being used. But as we are bolting this onto
> the existing protocol, I was really trying to do it without introducing
> an extra capabilities phase or extra round-trips. I.e., something like
> David Turner's "stick the refspec in the HTTP query parameters" trick,
> but working everywhere[1].
-- 
Duy

  reply	other threads:[~2016-10-25 11:46 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-24 13:29 [RFH] limiting ref advertisements Jeff King
2016-10-25 11:46 ` Duy Nguyen [this message]
2016-11-14 21:21   ` Jeff King
2016-11-16 13:42     ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACsJy8DwKxz14Dow9dEKeXnBriMzN_OptnGM7nPigPcS_pHX9w@mail.gmail.com \
    --to=pclouds@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.