From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, spearce@spearce.org, mfick@codeaurora.org
Subject: Re: [PATCH 0/2] Hiding some refs in ls-remote
Date: Sat, 19 Jan 2013 07:18:52 +0100
Message-ID: <50FA3ACC.6070909@alum.mit.edu>
In-Reply-To: <1358555826-11883-1-git-send-email-gitster@pobox.com>

On 01/19/2013 01:37 AM, Junio C Hamano wrote:
> This is an early preview of reducing the network cost while talking
> with a repository with tons of refs, most of which are of use by
> very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
> useful only for people who are interested in the changes under
> review).  As long as these narrow audiences have a way to learn the
> names of refs or objects pointed at by the refs out-of-band, it is
> not necessary to advertise these refs.
> 
> On the server end, you tell upload-pack that some refs do not have
> to be advertised with the uploadPack.hiderefs multi-valued
> configuration variable:
> 
> 	[uploadPack]
> 		hiderefs = refs/changes
> 
> The changes necessary on the client side to allow fetching objects
> at the tip of a ref in hidden hierarchies are much more involved and
> not part of this early preview, but the end user UI is expected to
> be like these:
> 
> 	$ git fetch $there refs/changes/72/41672/1
> 	$ git fetch $there 9598d59cdc098c5d9094d68024475e2430343182
> 
> That is, you ask for a refname as usual even though it is not part
> of ls-remote response, or you ask for the commit object that is at
> the tip of whatever hidden ref you are interested in.

Although I can understand the pain of slow network performance, somehow
this proposal gives me the feeling of being expedient rather than elegant.

Could the problem be solved in some other way?  Maybe such references
could be stored in a second repository or in a separate namespace (in
the sense of gitnamespaces(7)) to prevent their creating overhead when
they are unneeded?
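
To make the namespace idea concrete, here is a rough sketch using the
GIT_NAMESPACE environment variable from gitnamespaces(7).  The namespace
names "code" and "review", the throwaway repositories, and the ref names
are assumptions made up for illustration; this is not a worked-out
design, just a demonstration that a client talking to one namespace is
not shown the refs of another.

```shell
#!/bin/sh
# Sketch only: keep review refs in their own namespace so that clients
# using the "code" namespace never see them advertised.
# "code" and "review" are hypothetical namespace names.
set -e

tmp=$(mktemp -d)
cd "$tmp"
git init -q --bare server.git
git init -q work
cd work
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"

# Ordinary branches are pushed through the "code" namespace ...
GIT_NAMESPACE=code git push -q ../server.git HEAD:refs/heads/master
# ... and review refs through the "review" namespace.
GIT_NAMESPACE=review git push -q ../server.git HEAD:refs/changes/72/41672/1

# Each namespace only advertises its own refs.
code_refs=$(GIT_NAMESPACE=code git ls-remote ../server.git)
review_refs=$(GIT_NAMESPACE=review git ls-remote ../server.git)

printf '%s\n' "== code namespace ==" "$code_refs"
printf '%s\n' "== review namespace ==" "$review_refs"
```

One caveat with this sketch: a client that sets no namespace at all
still sees everything under refs/namespaces/, so the ordinary refs have
to live in a namespace of their own for the separation to help.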

And *if* reference hiding makes sense, it seems to me that the client,
not the server, should be the one who decides which server references it
is interested in (though I understand that would require a protocol
change).  Otherwise the git repository *relies* on out-of-band channels
for its functionality.  If I understand correctly, a user would have *no
way* to discover, via git, what hidden references are contained in a
remote repository, or indeed even that the repo contains a hidden
namespace.  For example this would make it impossible to clean up
obsolete "hidden" references on a remote repository without the
supplementary information stored elsewhere.  And if anybody accidentally
creates a reference in a hidden namespace by hand, it will just sit
there undetectably, forever.

I assume (though I've never checked) that a server does not let a client
ask for a SHA1 that is not currently reachable from a server-side
reference, and I assume that you are not proposing to change this
policy.  But allowing objects to be fetched from a hidden reference
opens up some "interesting" possibilities:

* A pusher could upload arbitrary content to a public git server under a
cryptic hidden reference name.  Most people would be completely unable
to see this content, unless given the SHA1 or the reference name by the
pusher.  Thus this mechanism could be used as a dark channel to exchange
arbitrary data relatively secretly.

* Somebody could push a trojan version of code to a hidden reference in
a project, then pass the SHA1 to a victim.  The victim might trust the
code because it comes from a known project website, even though the code
would be invisible to other project developers and thus impossible for
them to audit.  And even if they learned about the trojan's SHA1 they
would be unable to remove it from their repository because they have no
way to find out the name of the hidden reference!

Obviously these hacks would only be possible for a bad guy with push
privileges to a repository that has turned on hidden references, but I
think they are sobering nevertheless.

These worries would go away if reference hiding were configured on the
client rather than on the server.

A second point: currently, the outputs of "git show-ref -d" and "git
ls-remote ." are almost identical.  Under your proposal, I believe that
the hidden refs would be omitted only from the latter.  Would it be useful
to add an option to "git show-ref" to make it omit the "hiderefs" refs?
And maybe another option to make it display *only* the "hiderefs" refs?
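
For anyone who wants to compare the two listings side by side, a small
sketch follows.  It builds a throwaway repository (the directory name
"demo" and the tag "v1" are made up for the example) and prints what
each command reports, including the peeled entry for the annotated tag.

```shell
#!/bin/sh
# Sketch only: print the output of "git show-ref -d" and
# "git ls-remote ." for the same throwaway repository.
set -e

tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"
# An annotated tag, so that both commands also list a peeled "^{}" entry.
git -c user.name=Example -c user.email=example@example.com \
    tag -a -m "version 1" v1

local_view=$(git show-ref -d)
remote_view=$(git ls-remote .)

printf '%s\n' "== git show-ref -d ==" "$local_view"
printf '%s\n' "== git ls-remote . ==" "$remote_view"
```

As far as I can tell, the visible differences are the separator (a
space versus a tab) and the HEAD line, which "git show-ref" prints only
when given --head.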

And in the bikeshedding department, I wonder if "hiderefs" is the best
name for the config setting.  "hiderefs" implies to me that the refs
are actively hidden and not available to the client in any way.  But in
fact they are just not advertised; they can be fetched normally.  Maybe
another name would be more suggestive of its true effect, for example
"quietrefs" or "noadvertiserefs".

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/
