From: Jeff King <peff@peff.net>
To: Christian Couder <christian.couder@gmail.com>
Cc: Jeff Hostetler <git@jeffhostetler.com>, git <git@vger.kernel.org>,
	Jeff Hostetler <jeffhost@microsoft.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Matthew DeVore <matvore@google.com>
Subject: Re: how does "clone --filter=sparse:path" work?
Date: Fri, 24 May 2019 04:31:42 -0400
Message-ID: <20190524083142.GC9082@sigill.intra.peff.net>
In-Reply-To: <CAP8UFD0XbOUj70pt4X=HDvGBoLaG9qBv9SWGnM6N8FG3t-57rg@mail.gmail.com>

On Fri, May 24, 2019 at 10:05:45AM +0200, Christian Couder wrote:

> (Sorry for the late reply to this.)

No problem. I've been meaning to pick it up again, and somehow it's been
6 months. ;)

> > > But mainly I was thinking of a use case on the client of the form:
> > >
> > >     git rev-list
> > >         --objects
> > >         --filter=spec:path=.git/sparse-checkout
> 
> Do you mean "sparse:path" instead of "spec:path"?

Yes, I think so.

> > > and get a list of the blobs that you don't have and would need before
> > > you could checkout <commit> using the current sparse-checkout definition.
> > > You could then have a pre-checkout hook that would bulk
> > > fetch them before starting the actual checkout.  Since that would be
> > > more efficient than demand-loading blobs individually during the
> > > checkout.  There's more work to do in this area, but that was the idea.
> > >
> > > But back to your point, yes, I think we should restrict this over the
> > > wire.
> >
> > Thanks for your thorough response, and sorry for the slow reply. I had
> > meant to reply with a patch adding in the restriction, but I haven't
> > quite gotten to it. :)
> 
> The way I see it could be restricted is by adding a config option on
> the server, maybe called "uploadpack.sparsePathFilter", to tell which
> filenames can be accessed using "--filter=sparse:path=".
> 
> For example with uploadpack.sparsePathFilter set to
> "/home/user/git/sparse/*" and "--filter=sparse:path=foo" then
> "/home/user/git/sparse/foo" on the server would be used if it exists.
> (Of course care should be taken that things like
> "--filter=sparse:path=bar/../../foo" are rejected.)
> 
> If uploadpack.sparsePathFilter is unset or set to "false", then
> "--filter=sparse:path=<stuff>" would always error out.
> 
> Is this what you had in mind?
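
The path restriction described in the quoted proposal can be made concrete
with a short sketch. It assumes the server keeps its sparse files under one
allowed directory and must reject any request that escapes it. The function
name resolve_sparse_path is hypothetical, and uploadpack.sparsePathFilter is
only the option name proposed above, not an existing Git setting.

```python
import os


def resolve_sparse_path(allowed_dir, requested):
    """Map a client-supplied name to a file under allowed_dir.

    Returns the absolute path, or None if the request must be rejected.
    Hypothetical helper; this is not how Git itself does it.
    """
    # Reject empty names and absolute paths outright.
    if not requested or os.path.isabs(requested):
        return None

    # Resolve symlinks and ".." components on both sides before comparing.
    base = os.path.realpath(allowed_dir)
    candidate = os.path.realpath(os.path.join(base, requested))

    # The resolved path must sit strictly below the allowed directory.
    if candidate == base:
        return None
    if os.path.commonpath([base, candidate]) != base:
        return None
    return candidate
```

With an allowed directory of /home/user/git/sparse, a request for "foo" would
map to /home/user/git/sparse/foo, while "bar/../../foo" resolves outside the
directory and is rejected.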

My plan had been to disallow it entirely, and allow some mechanism by
which the client could specify the actual set of sparse paths itself
(which it might get from a local file, or communicated in some
out-of-band way to the user cloning, etc).

If we just want a mechanism for the server to provide a pre-made sparse
list, then I think pointing people at sparse:oid=<blob> is simpler
there. I.e., your "foo" becomes "refs/sparse/foo" or even "HEAD:.sparse"
or similar, and the server admin just sticks the content into the repo
instead of dealing with exposing filesystem paths to the client.
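
As a rough illustration of the sparse:oid route, here is a minimal sketch
that builds the matching "git rev-list" invocation. It assumes the sparse
specification was committed as a blob reachable as "HEAD:.sparse" (the
example name from above). The helper names rev_list_args and
list_sparse_objects are hypothetical.

```python
import subprocess


def rev_list_args(commit, sparse_blob):
    """Build a rev-list command filtered by a sparse spec stored in a blob."""
    return [
        "git", "rev-list", "--objects",
        "--filter=sparse:oid=" + sparse_blob,
        commit,
    ]


def list_sparse_objects(commit, sparse_blob, repo="."):
    """Run the command in `repo` and return its output lines.

    Assumes git is installed and `sparse_blob` resolves to a blob there.
    """
    result = subprocess.run(
        rev_list_args(commit, sparse_blob),
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

For example, list_sparse_objects("HEAD", "HEAD:.sparse") would list the
objects selected by the in-repo spec, without the server ever reading a
client-named filesystem path.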

-Peff


Thread overview: 6+ messages
2018-11-08  5:07 how does "clone --filter=sparse:path" work? Jeff King
2018-11-08 18:57 ` Jeff Hostetler
2018-11-22 17:39   ` Jeff King
2019-05-24  8:05     ` Christian Couder
2019-05-24  8:31       ` Jeff King [this message]
2019-05-24  9:27         ` Christian Couder
