git.vger.kernel.org archive mirror
From: Chris Jerdonek <chris.jerdonek@gmail.com>
To: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: git-clone --single-branch clones objects outside of branch
Date: Sun, 26 Jan 2020 22:46:07 -0800
Message-ID: <CAOTb1wd9D3YytevTt0cGnw1o-9cN1-yxCqbuH4oLH1KB6mzEeA@mail.gmail.com>
In-Reply-To: <20200127055509.GA12108@coredump.intra.peff.net>

On Sun, Jan 26, 2020 at 9:55 PM Jeff King <peff@peff.net> wrote:
> On Sun, Jan 26, 2020 at 04:39:52AM -0800, Chris Jerdonek wrote:
> > However, when I attempted this with a local repo, I found that objects
> > located only in branches other than the branch I specified are also
> > cloned. Also, this is true even if the remote repo has only loose
> > objects (i.e. no pack files). So it doesn't appear to be doing this
> > only to avoid creating new files.
>
> This is the expected outcome, because in your example you're cloning on
> the local filesystem. By default that enables some optimizations, one of
> which is to hard-link the object files into the destination repository.
> That avoids the cost of copying and re-hashing them (which a normal
> cross-system clone would do). And it even avoids traversing the objects
> to find which are necessary, instead just hard-linking everything.

Thanks for the reply. It's okay for that to be the expected behavior.
My suggestion would just be that the documentation for --single-branch
be updated to clarify that objects unreachable from the specified
branch can still be in the cloned repo when run using the --local
optimizations. For example, it can matter for security if one is
trying to create a clone of a repo that doesn't include data from
branches with sensitive info (e.g. in following Git's advice to create
a separate repo if security of private data is desired:
https://git-scm.com/docs/gitnamespaces#_security ).
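
The behavior described above can be checked with a small script. This is
only a sketch under stated assumptions: the repository name (src), the
branch name (secret), the file names, and the clone directory names are
all made up for illustration. It relies on git-clone's existing
--no-local option, which makes a clone from a local path go through the
regular transport instead of hard-linking or copying the object files.

```shell
#!/bin/sh
# Sketch: compare a default local clone with a --no-local clone.
# All repo, branch, and file names here are hypothetical.
set -e

tmp=$(mktemp -d)
cd "$tmp"

# Build a source repo with one public branch and one "secret" branch.
git init -q src
cd src
git config user.email "you@example.com"
git config user.name "Example"
echo public > public.txt
git add public.txt
git commit -q -m "public commit"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b secret
echo sensitive > secret.txt
git add secret.txt
git commit -q -m "secret commit"
# Object id of the blob that exists only on the secret branch.
secret_blob=$(git rev-parse secret:secret.txt)
git checkout -q "$main_branch"
cd ..

# Default clone from a local path: the local optimizations apply.
git clone -q --single-branch --branch "$main_branch" src local-clone

# Same clone, but with the local optimizations disabled.
git clone -q --single-branch --branch "$main_branch" --no-local src nolocal-clone

# Report whether the secret-only blob exists in each clone's object store.
for c in local-clone nolocal-clone; do
  if git -C "$c" cat-file -e "$secret_blob" 2>/dev/null; then
    echo "$c: secret blob present"
  else
    echo "$c: secret blob absent"
  fi
done
```

Given the behavior discussed in this thread, the first clone should
report the blob as present and the second as absent, even though both
were made with --single-branch. Cloning from a file:// URL instead of a
plain path is another way to avoid the local optimizations.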

I'm guessing other flags also don't apply when --local is in effect.
For example, --reference may be ignored as well, though I haven't
checked yet to confirm. It would be nice if the documentation gave a
heads-up in cases like these. Even when hard links are used, the docs
don't make clear whether objects are filtered before hard-linking
when flags like --single-branch and --reference are passed.

> This one behaves as you expected because git-fetch does not perform the
> same optimizations (it wouldn't make as much sense there, as generally
> in a fetch we already have most of the objects from the other side
> anyway, so hard-linking would just give us duplicates).

Incidentally, here's a thread from 2010 requesting that this
optimization be available in the git-fetch case:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=573909
(I don't know how reports on that Debian list relate to this list.)

--Chris


Thread overview: 6+ messages
2020-01-26 12:39 git-clone --single-branch clones objects outside of branch Chris Jerdonek
2020-01-27  5:55 ` Jeff King
2020-01-27  6:46   ` Chris Jerdonek [this message]
2020-01-28  9:48     ` Jeff King
2020-01-29  1:59       ` Chris Jerdonek
2020-01-29  2:23         ` Jeff King
