From: Taylor Blau <me@ttaylorr.com>
To: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, christian.couder@gmail.com,
	jonathantanmy@google.com,
	Marc Strapetz <marc.strapetz@syntevo.com>
Subject: Re: Questions about partial clone with '--filter=tree:0'
Date: Wed, 21 Oct 2020 13:31:53 -0400
Message-ID: <20201021173153.GC1237181@nand.local>
In-Reply-To: <a4a20c67-4ee3-77b2-8d57-f30843572aa4@syntevo.com>

On Wed, Oct 21, 2020 at 07:10:02PM +0200, Alexandr Miloslavskiy wrote:
> We currently do not intend to use '--filter=tree:0' ourselves, but we are
> trying to support all kinds of user repositories with our UI. So we
> basically have these choices:
>
> A) Declare '--filter=tree:0' repos as completely wrong and unsupported
>    in our UI, also giving an option to "un-partial" them.
>
> B) Support '--filter=tree:0' repos, but don't support operations such
>    as blame and file log
>
> C) Use some magic to efficiently download objects that will be needed
>    for a command such as Blame, while keeping the rest of the repository
>    partial. This is where the command described in (3) will help a lot.
>
> We would of course prefer (C) if it's reasonably possible.

(C) is probably the most reasonable. If your repository has a promisor
remote and is missing objects locally, running 'git blame' etc. will
transparently download whatever objects it is missing.
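
A minimal sketch of that lazy-fetch behavior, assuming a throwaway local
repository stands in for the server. The repository names, the file
contents, and the two uploadpack settings are assumptions made for this
illustration and do not come from the thread:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical "server" repository with two commits touching one file.
git init -q server
(
  cd server
  git config user.name "Example"
  git config user.email "example@example.com"
  mkdir -p 1/2
  echo one > 1/2/Foo.txt
  git add .
  git commit -q -m "first"
  echo two >> 1/2/Foo.txt
  git commit -q -am "second"
  # Let the server honor --filter and on-demand object requests.
  git config uploadpack.allowFilter true
  git config uploadpack.allowAnySHA1InWant true
)

# Partial clone that omits every tree (and therefore every blob).
git clone -q --bare --filter=tree:0 "file://$work/server" client.git
cd client.git

# Count objects that are promised by the remote but absent locally.
# --missing=print reports them as "?<oid>" without fetching them.
count_missing() {
  git rev-list --objects --missing=print HEAD | grep -c '^?' || true
}
missing_before=$(count_missing)

# Blame needs trees and blobs; git requests them from the promisor remote.
git blame HEAD -- 1/2/Foo.txt > blame.out

missing_after=$(count_missing)
echo "missing before blame: $missing_before, after blame: $missing_after"
```

Each missing object is requested as it is needed, so on a large history
this can turn into many small fetches.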

> Unfortunately this does not work as expected. Try the following steps:
>
> A) Clone repo with '--filter=tree:0'
>    $ git clone --bare --filter=tree:0 --branch master \
>        https://github.com/git/git.git
>
> B) Change filter to 'blob:none'
>    $ cd git.git
>    $ git config remote.origin.partialclonefilter 'blob:none'
>
> C) fetch
>    $ git fetch origin
>    Note that there is no 'Receiving objects:' output.

Ah; I would have thought that the server would have sent objects, even
though we have lots of 'have' lines, since we are treating the server as
a promisor remote and might not have the full reachability closure over
the haves.
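
The reported behavior can be checked without relying on the progress
output by counting promised-but-missing objects before and after the
fetch. This is only a sketch against a throwaway local repository; the
repository names and the uploadpack settings are assumptions made for
the illustration:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical "server" repository that accepts partial-clone filters.
git init -q server
(
  cd server
  git config user.name "Example"
  git config user.email "example@example.com"
  mkdir -p 1/2
  echo one > 1/2/Foo.txt
  git add .
  git commit -q -m "first"
  git config uploadpack.allowFilter true
  git config uploadpack.allowAnySHA1InWant true
)

git clone -q --bare --filter=tree:0 "file://$work/server" client.git
cd client.git

# "?<oid>" lines are objects the remote has promised but we do not have.
count_missing() {
  git rev-list --objects --missing=print HEAD | grep -c '^?' || true
}
before=$(count_missing)

# Same steps as in the report: switch the filter, then fetch.
git config remote.origin.partialclonefilter blob:none
git fetch -q origin

after=$(count_missing)
filter=$(git config remote.origin.partialclonefilter)
echo "filter: $filter, missing before fetch: $before, after fetch: $after"
```

If the two counts match, the fetch transferred none of the trees that
the new filter would otherwise allow, which is the situation described
in the report.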

Jonathan Tan knows better than I do here. Maybe he could chime in.

> > I think what you probably want is a step 1.5 to tell Git "I'm not going
> > to ask for or care about the entirety of my working copy, I really just
> > want objects in path...", and you can do that with sparse checkouts. See
> > https://git-scm.com/docs/git-sparse-checkout for more.
>
> For simplicity of discussion, let's focus on the problem of running
> Blame efficiently in a repo that was cloned with '--filter=tree:0'. In
> order to blame file '/1/2/Foo.txt', we will need the following:
>
> * Trees '/1'
> * Trees '/1/2'
> * Blobs '/1/2/Foo.txt'
>
> All of these will be needed to unknown commit depth. For simplicity,
> the proposed command will download these for all commits. Specifying
> a range of revisions could be nice, but I feel that it's not worth the
> complexity.
>
> Correct me if I'm wrong: I think that sparse checkout will not help to
> achieve the goal?

I see what you're saying. Here sparse-checkout and partial clones
confusingly diverge: what you really want is to say "I want all of the
objects that I need to construct this directory at any point in history"
so that you can run "git blame" on some path within that directory
without the need for a follow-up fetch.
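
To make that object set concrete, here is a rough sketch that lists the
trees and blobs needed for one path across its history. It runs in a
full (non-partial) throwaway repository; the repository layout and the
variable names are assumptions for the illustration. In a
'--filter=tree:0' clone, the same loop would itself trigger on-demand
fetches:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"

# Hypothetical repository with two commits touching 1/2/Foo.txt.
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir -p 1/2
echo one > 1/2/Foo.txt
git add .
git commit -q -m "first"
echo two >> 1/2/Foo.txt
git commit -q -am "second"

path=1/2/Foo.txt

# For every commit that touched the path, resolve the root tree, each
# intermediate tree, and the blob itself, then de-duplicate the IDs.
needed=$(
  git rev-list HEAD -- "$path" | while read -r commit; do
    git rev-parse "$commit^{tree}" "$commit:1" "$commit:1/2" "$commit:$path"
  done | sort -u
)
printf '%s\n' "$needed"
```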

> This is why I suggest a command that will accept paths and send
> requested objects, also forcing server to assume that all of them are
> missing in client's repository.

In any case, the '--filter=sparse:oid=<blob-ish>' bit is not recommended
for use, but perhaps this is a convincing use-case. I didn't follow the
partial clone development closely enough to know whether this has already
been discussed, but I'm sure that it has.

Thanks,
Taylor
