All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Hiding some refs in ls-remote
@ 2013-01-19  0:37 Junio C Hamano
  2013-01-19  0:37 ` [PATCH 1/2] upload-pack: share more code Junio C Hamano
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Junio C Hamano @ 2013-01-19  0:37 UTC (permalink / raw)
  To: git; +Cc: spearce, mfick

This is an early preview of reducing the network cost while talking
with a repository with tons of refs, most of which are of use by
very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
useful only for people who are interested in the changes under
review).  As long as these narrow audiences have a way to learn the
names of refs or objects pointed at by the refs out-of-band, it is
not necessary to advertise these refs.

On the server end, you tell upload-pack that some refs do not have
to be advertised with the uploadPack.hiderefs multi-valued
configuration variable:

	[uploadPack]
		hiderefs = refs/changes

The changes necessary on the client side to allow fetching objects
at the tip of a ref in hidden hierarchies are much more involved and
not part of this early preview, but the end user UI is expected to
be like these:

	$ git fetch $there refs/changes/72/41672/1
	$ git fetch $there 9598d59cdc098c5d9094d68024475e2430343182

That is, you ask for a refname as usual even though it is not part
of ls-remote response, or you ask for the commit object that is at
the tip of whatever hidden ref you are interested in.

Junio C Hamano (2):
  upload-pack: share more code
  upload-pack: allow hiding ref hiearchies

 t/t5512-ls-remote.sh |  9 ++++++
 upload-pack.c        | 86 ++++++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 79 insertions(+), 16 deletions(-)

-- 
1.8.1.1.454.g48d39c0

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] upload-pack: share more code
  2013-01-19  0:37 [PATCH 0/2] Hiding some refs in ls-remote Junio C Hamano
@ 2013-01-19  0:37 ` Junio C Hamano
  2013-01-19  0:37 ` [PATCH 2/2] upload-pack: allow hiding ref hiearchies Junio C Hamano
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2013-01-19  0:37 UTC (permalink / raw)
  To: git; +Cc: spearce, mfick

We mark the objects pointed at our refs with "OUR_REF" flag in two
functions (mark_our_ref() and send_ref()), but we can just use the
former as a helper for the latter.

Update the way mark_our_ref() prepares in-core object to use
lookup_unknown_object() to delay reading the actual object data,
just like we did in 435c833 (upload-pack: use peel_ref for ref
advertisements, 2012-10-04).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 upload-pack.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 95d8313..3dd220d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -722,15 +722,28 @@ static void receive_needs(void)
 	free(shallows.objects);
 }
 
+static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
+{
+	struct object *o = lookup_unknown_object(sha1);
+	if (!o)
+		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
+	if (!(o->flags & OUR_REF)) {
+		o->flags |= OUR_REF;
+		nr_our_refs++;
+	}
+	return 0;
+}
+
 static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
 		" include-tag multi_ack_detailed";
-	struct object *o = lookup_unknown_object(sha1);
 	const char *refname_nons = strip_namespace(refname);
 	unsigned char peeled[20];
 
+	mark_our_ref(refname, sha1, flag, cb_data);
+
 	if (capabilities)
 		packet_write(1, "%s %s%c%s%s agent=%s\n",
 			     sha1_to_hex(sha1), refname_nons,
@@ -740,27 +753,11 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 	else
 		packet_write(1, "%s %s\n", sha1_to_hex(sha1), refname_nons);
 	capabilities = NULL;
-	if (!(o->flags & OUR_REF)) {
-		o->flags |= OUR_REF;
-		nr_our_refs++;
-	}
 	if (!peel_ref(refname, peeled))
 		packet_write(1, "%s %s^{}\n", sha1_to_hex(peeled), refname_nons);
 	return 0;
 }
 
-static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
-{
-	struct object *o = parse_object(sha1);
-	if (!o)
-		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
-	if (!(o->flags & OUR_REF)) {
-		o->flags |= OUR_REF;
-		nr_our_refs++;
-	}
-	return 0;
-}
-
 static void upload_pack(void)
 {
 	if (advertise_refs || !stateless_rpc) {
-- 
1.8.1.1.454.g48d39c0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] upload-pack: allow hiding ref hiearchies
  2013-01-19  0:37 [PATCH 0/2] Hiding some refs in ls-remote Junio C Hamano
  2013-01-19  0:37 ` [PATCH 1/2] upload-pack: share more code Junio C Hamano
@ 2013-01-19  0:37 ` Junio C Hamano
  2013-01-19  5:50 ` [PATCH 0/2] Hiding some refs in ls-remote Duy Nguyen
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2013-01-19  0:37 UTC (permalink / raw)
  To: git; +Cc: spearce, mfick

Teach upload-pack to omit some refs from the initial advertisement
by introducing the uploadPack.hiderefs multivalued configuration
variable.  Any ref that is under the hierarchies listed on the value
of this variable is excluded from responses to "ls-remote", "fetch"
or "clone" requests.  One typical use case may be

	[uploadPack]
		hiderefs = refs/changes

Note that the underlying protocol allows a request to fetch objects
at the tip of any ref, including the hidden ones, but on the client
side (e.g. fetch-pack), the requests are checked against the
ls-remote response, so it cannot ask for refs/changes/72/41672/1, or
for the object name (which is what eventually goes over the wire on
"want" line) the user may obtain out of band (e.g. Gerrit
dashboard).  A new capability "allow-tip-sha1-in-want" is returned
by upload-pack to signal updated clients that it may be OK to ask
for objects that were not advertised.

If we want to allow fetching refname that is hidden, e.g.

	$ git fetch $there refs/changes/72/41672/1

we need to further update the server side to understand a message
that has the refname instead of object name on "want" line.  The
necessary change to the server side to support that is not in this
step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5512-ls-remote.sh |  9 ++++++++
 upload-pack.c        | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index d16e5d3..9ee3d2a 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -126,4 +126,13 @@ test_expect_success 'Report match with --exit-code' '
 	test_cmp expect actual
 '
 
+test_expect_success 'Hide some refs' '
+	test_config uploadPack.hiderefs refs/tags &&
+	git ls-remote . >actual &&
+	test_unconfig uploadPack.hiderefs &&
+	git ls-remote . |
+	sed -e "/	refs\/tags\//d" >expect &&
+	test_cmp expect actual
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 3dd220d..54c304d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -12,6 +12,7 @@
 #include "run-command.h"
 #include "sigchain.h"
 #include "version.h"
+#include "string-list.h"
 
 static const char upload_pack_usage[] = "git upload-pack [--strict] [--timeout=<n>] <dir>";
 
@@ -28,10 +29,13 @@ static const char upload_pack_usage[] = "git upload-pack [--strict] [--timeout=<
 
 static unsigned long oldest_have;
 
-static int multi_ack, nr_our_refs;
+static int multi_ack;
+static int nr_our_refs; /* This counts both advertised and unadvertised */
 static int no_done;
 static int use_thin_pack, use_ofs_delta, use_include_tag;
 static int no_progress, daemon_mode;
+static struct string_list *hide_refs;
+static int allow_tip_sha1_in_want;
 static int shallow_nr;
 static struct object_array have_obj;
 static struct object_array want_obj;
@@ -722,6 +726,23 @@ static void receive_needs(void)
 	free(shallows.objects);
 }
 
+static int ref_is_hidden(const char *refname)
+{
+	struct string_list_item *item;
+
+	if (!hide_refs)
+		return 0;
+	for_each_string_list_item(item, hide_refs) {
+		int len;
+		if (prefixcmp(refname, item->string))
+			continue;
+		len = strlen(item->string);
+		if (!refname[len] || refname[len] == '/')
+			return 1;
+	}
+	return 0;
+}
+
 static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
 {
 	struct object *o = lookup_unknown_object(sha1);
@@ -743,11 +764,14 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 	unsigned char peeled[20];
 
 	mark_our_ref(refname, sha1, flag, cb_data);
+	if (allow_tip_sha1_in_want && ref_is_hidden(refname))
+		return 0;
 
 	if (capabilities)
-		packet_write(1, "%s %s%c%s%s agent=%s\n",
+		packet_write(1, "%s %s%c%s%s%s agent=%s\n",
 			     sha1_to_hex(sha1), refname_nons,
 			     0, capabilities,
+			     allow_tip_sha1_in_want ? " allow-tip-sha1-in-want" : "",
 			     stateless_rpc ? " no-done" : "",
 			     git_user_agent_sanitized());
 	else
@@ -758,11 +782,21 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 	return 0;
 }
 
+static int check_hidden_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
+{
+	if (!ref_is_hidden(refname))
+		return 0;
+	allow_tip_sha1_in_want = 1;
+	return 1; /* terminate iteration over refs early */
+}
+
 static void upload_pack(void)
 {
 	if (advertise_refs || !stateless_rpc) {
 		reset_timeout();
 		head_ref_namespaced(send_ref, NULL);
+		if (hide_refs)
+			for_each_namespaced_ref(check_hidden_ref, NULL);
 		for_each_namespaced_ref(send_ref, NULL);
 		packet_flush(1);
 	} else {
@@ -779,6 +813,28 @@ static void upload_pack(void)
 	}
 }
 
+static int upload_pack_config(const char *var, const char *value, void *unused)
+{
+	if (!strcmp("uploadpack.hiderefs", var)) {
+		char *ref;
+		int len;
+
+		if (!value)
+			return config_error_nonbool(var);
+		ref = xstrdup(value);
+		len = strlen(ref);
+		while (len && ref[len - 1] == '/')
+			ref[--len] = '\0';
+		if (!hide_refs) {
+			hide_refs = xcalloc(1, sizeof(*hide_refs));
+			hide_refs->strdup_strings = 1;
+		}
+		string_list_append(hide_refs, ref);
+		return 0;
+	}
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	char *dir;
@@ -830,6 +886,7 @@ int main(int argc, char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 	if (is_repository_shallow())
 		die("attempt to fetch/clone from a shallow repository");
+	git_config(upload_pack_config, NULL);
 	if (getenv("GIT_DEBUG_SEND_PACK"))
 		debug_fd = atoi(getenv("GIT_DEBUG_SEND_PACK"));
 	upload_pack();
-- 
1.8.1.1.454.g48d39c0

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19  0:37 [PATCH 0/2] Hiding some refs in ls-remote Junio C Hamano
  2013-01-19  0:37 ` [PATCH 1/2] upload-pack: share more code Junio C Hamano
  2013-01-19  0:37 ` [PATCH 2/2] upload-pack: allow hiding ref hiearchies Junio C Hamano
@ 2013-01-19  5:50 ` Duy Nguyen
  2013-01-19 19:16   ` Junio C Hamano
  2013-01-19  6:18 ` Michael Haggerty
  2013-01-19 16:50 ` Jeff King
  4 siblings, 1 reply; 16+ messages in thread
From: Duy Nguyen @ 2013-01-19  5:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On Sat, Jan 19, 2013 at 7:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
> This is an early preview of reducing the network cost while talking
> with a repository with tons of refs, most of which are of use by
> very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
> useful only for people who are interested in the changes under
> review).  As long as these narrow audiences have a way to learn the
> names of refs or objects pointed at by the refs out-of-band, it is
> not necessary to advertise these refs.
>
> On the server end, you tell upload-pack that some refs do not have
> to be advertised with the uploadPack.hiderefs multi-valued
> configuration variable:
>
>         [uploadPack]
>                 hiderefs = refs/changes
>
> The changes necessary on the client side to allow fetching objects
> at the tip of a ref in hidden hierarchies are much more involved and
> not part of this early preview, but the end user UI is expected to
> be like these:
>
>         $ git fetch $there refs/changes/72/41672/1
>         $ git fetch $there 9598d59cdc098c5d9094d68024475e2430343182
>
> That is, you ask for a refname as usual even though it is not part
> of ls-remote response, or you ask for the commit object that is at
> the tip of whatever hidden ref you are interested in.

Should the client side learn how to list hidden refs too? I'm thinking
of an extreme case where upload-pack advertises nothing (or maybe just
refs/heads/master) and it's up to the client to ask for the ref
selection it's interested in. upload-pack may need more updates to do
that, I think.
-- 
Duy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19  0:37 [PATCH 0/2] Hiding some refs in ls-remote Junio C Hamano
                   ` (2 preceding siblings ...)
  2013-01-19  5:50 ` [PATCH 0/2] Hiding some refs in ls-remote Duy Nguyen
@ 2013-01-19  6:18 ` Michael Haggerty
  2013-01-19 16:50 ` Jeff King
  4 siblings, 0 replies; 16+ messages in thread
From: Michael Haggerty @ 2013-01-19  6:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On 01/19/2013 01:37 AM, Junio C Hamano wrote:
> This is an early preview of reducing the network cost while talking
> with a repository with tons of refs, most of which are of use by
> very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
> useful only for people who are interested in the changes under
> review).  As long as these narrow audiences have a way to learn the
> names of refs or objects pointed at by the refs out-of-band, it is
> not necessary to advertise these refs.
> 
> On the server end, you tell upload-pack that some refs do not have
> to be advertised with the uploadPack.hiderefs multi-valued
> configuration variable:
> 
> 	[uploadPack]
> 		hiderefs = refs/changes
> 
> The changes necessary on the client side to allow fetching objects
> at the tip of a ref in hidden hierarchies are much more involved and
> not part of this early preview, but the end user UI is expected to
> be like these:
> 
> 	$ git fetch $there refs/changes/72/41672/1
> 	$ git fetch $there 9598d59cdc098c5d9094d68024475e2430343182
> 
> That is, you ask for a refname as usual even though it is not part
> of ls-remote response, or you ask for the commit object that is at
> the tip of whatever hidden ref you are interested in.

Although I can understand the pain of slow network performance, somehow
this proposal gives me the feeling of being expeditious rather than elegant.

Could the problem be solved in some other way?  Maybe such references
could be stored in a second repository or in a separate namespace (in
the sense of gitnamespaces(7)) to prevent their creating overhead when
they are unneeded?

And *if* reference hiding makes sense, it seems to me that the client,
not the server, should be the one who decides which server references it
is interested in (though I understand that would require a protocol
change).  Otherwise the git repository *relies* on out-of-band channels
for its functionality.  If I understand correctly, a user would have *no
way* to discover, via git, what hidden references are contained in a
remote repository, or indeed even that the repo contains a hidden
namespace.  For example this would make it impossible to clean up
obsolete "hidden" references on a remote repository without the
supplementary information stored elsewhere.  And if anybody accidentally
creates a reference in a hidden namespace by hand, it will just sit
there undetectably, forever.

I assume (though I've never checked) that a server does not let a client
ask for a SHA1 that is not currently reachable from a server-side
reference, and I assume that that you are not proposing to change this
policy.  But allowing objects to be fetched from a hidden reference
opens up some "interesting" possibilities:

* A pusher could upload arbitrary content to a public git server under a
cryptic hidden reference name.  Most people would be completely unable
to see this content, unless given the SHA1 or the reference name by the
pusher.  Thus this mechanism could be used as a dark channel to exchange
arbitrary data relatively secretly.

* Somebody could push a trojan version of code to a hidden reference in
a project, then pass the SHA1 to a victim.  The victim might trust the
code because it comes from a known project website, even though the code
would be invisible to other project developers and thus impossible for
them to audit.  And even if they learned about the trojan's SHA1 they
would be unable to remove it from their repository because they have no
way to find out the name of the hidden reference!

Obviously these hacks would only be possible for a bad guy with push
privileges to a repository that has turned on hidden references, but I
think they are sobering nevertheless.

These worries would go away if reference hiding were configured on the
client rather than on the server.

A second point: currently, the output of "git show-ref -d" and "git
ls-remote ." are almost identical.  Under your proposal, I believe that
the hiderefs would only be omitted from the latter.  Would it be useful
to add an option to "git show-ref" to make it omit the "hiderefs" refs?
 And maybe another option to make it display *only* the hideref refs?

And in the bikeshedding department, I wonder if "hiderefs" is the best
name for the config setting.  "hiderefs", implies to me that the refs
are actively hidden and not available to the client in any way.  But in
fact they are just not advertised; they can be fetched normally.  Maybe
another name would be more suggestive of its true effect, for example
"quietrefs" or "noadvertiserefs".

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19  0:37 [PATCH 0/2] Hiding some refs in ls-remote Junio C Hamano
                   ` (3 preceding siblings ...)
  2013-01-19  6:18 ` Michael Haggerty
@ 2013-01-19 16:50 ` Jeff King
  2013-01-20 18:06   ` Junio C Hamano
  4 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2013-01-19 16:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On Fri, Jan 18, 2013 at 04:37:04PM -0800, Junio C Hamano wrote:

> This is an early preview of reducing the network cost while talking
> with a repository with tons of refs, most of which are of use by
> very narrow audiences (e.g. refs under Gerrit's refs/changes/ are
> useful only for people who are interested in the changes under
> review).  As long as these narrow audiences have a way to learn the
> names of refs or objects pointed at by the refs out-of-band, it is
> not necessary to advertise these refs.
> 
> On the server end, you tell upload-pack that some refs do not have
> to be advertised with the uploadPack.hiderefs multi-valued
> configuration variable:
> 
> 	[uploadPack]
> 		hiderefs = refs/changes

Would you want to do the same thing on receive-pack? It could benefit
from the same reduction in network cost (although it tends to be invoked
less frequently than upload-pack).

At GitHub, we have a similar patch (we even call it hiderefs), but we do
it only for receive-pack. In our case, it is not about network traffic,
but rather that we provide a set of read-only refs in the refs/pull
hierarchy. These are generated upstream by the creation of pull
requests, and we reject any updates to them via the git protocol using a
pre-receive hook.

However, if a client without these refs uses "git push --mirror", it
will attempt to delete them (which will fail). Meaning that a mirror
push will always report failure, because it will always fail to push the
refs/pull deletions.

I don't know much about Gerrit's inner workings. Are refs/changes also
read-only?

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19  5:50 ` [PATCH 0/2] Hiding some refs in ls-remote Duy Nguyen
@ 2013-01-19 19:16   ` Junio C Hamano
  2013-01-20 18:19     ` Junio C Hamano
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Junio C Hamano @ 2013-01-19 19:16 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git, spearce, mfick

Duy Nguyen <pclouds@gmail.com> writes:

> Should the client side learn how to list hidden refs too? I'm thinking
> of an extreme case where upload-pack advertises nothing (or maybe just
> refs/heads/master) and it's up to the client to ask for the ref
> selection it's interested in. upload-pack may need more updates to do
> that, I think.

What you are talking about is a different goal.

Look at this as a mechanism for the repository owner to control the
clutter in what is shown to the intended audience of what s/he
publishes in the repository.  Network bandwidth reduction of
advertisement is a side effect of clutter reduction, and not
necessarily the primary goal.

It lets me as the repository owner to say "do not list them to make
these discoverable"; letting the client side learn defeats the whole
point of this mechanism.

For example, if you mirror-clone from either of my repositories from
GitHub:

    git clone --mirror git://github.com/git/git/
    git clone --mirror git://github.com/gitster/git/

you will see stuff that does not belong to the project; objects that
are only reachable from refs/pull/* are not something I approved to
be placed in my repository; I as the project owner do not want to
give these objects to a cloner as if they are part of my project.

The server side can hide refs/pull/ and refs/pull-request-tags/ so
that clones (and ls-remote) will not see clutter, and nobody gets
hurt.

These refs are there only to support GitHub "pull requests"; the
advertisement of them to ls-remote and appearance of them in mirror
clones are undesirable side effects of how GitHub happened to
implement the feature.  GitHub has its way to notify of these pull
requests, and it makes these pull requests discoverable out of band
without involving Git.

Ability to say "git fetch origin $SHA1" (or "git fetch origin
pull/1") is the only thing lacking, in order to get rid of these
clutter.  The patches do the "get rid of clutter" part by letting
you hide these refs from ls-remote and clone; allowing "fetch by tip
SHA1" needs to come latter.

Another example.  If you run ls-remote against the above two
repositories, you will notice that the latter has tons more
branches.  The former are to publish only the primary integration
branches, while the latter are to show individual topics.

I wish I didn't have to do this if I could.

We have periodical "What's cooking" postings that let interested
parties learn topics out-of-band.  If I can hide refs/heads/??/*
from normal users' clones while actually keeping individual topics
there at the same place, I do not have to push into two places.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19 16:50 ` Jeff King
@ 2013-01-20 18:06   ` Junio C Hamano
  2013-01-20 22:08     ` Junio C Hamano
  2013-01-21 23:03     ` Jeff King
  0 siblings, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2013-01-20 18:06 UTC (permalink / raw)
  To: Jeff King; +Cc: git, spearce, mfick

Jeff King <peff@peff.net> writes:

>> 	[uploadPack]
>> 		hiderefs = refs/changes
>
> Would you want to do the same thing on receive-pack? It could benefit
> from the same reduction in network cost (although it tends to be invoked
> less frequently than upload-pack).

Do *I* want to do?  Not when there already is a patch exists; I'd
rather take existing and tested patch ;-)

As I said, I view this primarily as solving the cluttering issue.
The use of hiderefs to hide these refs that point at objects I do
not consider belong to my repository from the pushers equally makes
sense as such, I think.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19 19:16   ` Junio C Hamano
@ 2013-01-20 18:19     ` Junio C Hamano
  2013-01-21  1:46     ` Duy Nguyen
  2013-01-21 22:56     ` Jeff King
  2 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2013-01-20 18:19 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git, spearce, mfick

Junio C Hamano <gitster@pobox.com> writes:

> Duy Nguyen <pclouds@gmail.com> writes:
>
>> Should the client side learn how to list hidden refs too? I'm thinking
>> of an extreme case where upload-pack advertises nothing (or maybe just
>> refs/heads/master) and it's up to the client to ask for the ref
>> selection it's interested in. upload-pack may need more updates to do
>> that, I think.
>
> What you are talking about is a different goal.
>
> Look at this as a mechanism for the repository owner to control the
> clutter in what is shown to the intended audience of what s/he
> publishes in the repository.  Network bandwidth reduction of
> advertisement is a side effect of clutter reduction, and not
> necessarily the primary goal.

As a separate and follow-up topic, I can see a future where we also
have .delayref (in addition to .hideref) configuration, that causes
the upload-pack to:

 * omit refs that are marked as "delayed";

 * advertise "allow-expand-ref=<value>" where the value is the
   pattern used to specify which refs are omitted via this mechanism
   (e.g. "refs/*" in your extreme example, or perhaps "refs/tags/"
   if you only advertise branches in order to limit the bandwidth);

and then a fetch-pack that is interested in fetching "refs/foo/bar",
after seeing that it matches one of the "allow-expand-ref" patterns,
to ask "expand-ref refs/foo/bar", expect the upload-pack to respond
with "refs/foo/bar <value of that ref>" (or "refs/foo/bar does not
exist"), before going into the usual "want" exchange ("ls-remote"
would of course do the same, and pattern-less "ls-remote" may even
ask to expand everything by saying "expand-ref refs/*" (where the
capability said "allow-expand-ref=refs/*" in your extreme example)
to grab everything discoverable).

That is _primarily_ for trading network bandwidth with initial
latency; as far as the end-result is concerned, delayed ones to be
expanded on demand are just as discoverable as the ones advertised
initially.

I think we'd want to have both in the end.  It just happens I just
posted the beginning of clutter-reduction one, as I think developing
both in parallel would be messy and hideref is probably less
involved than the delayref.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-20 18:06   ` Junio C Hamano
@ 2013-01-20 22:08     ` Junio C Hamano
  2013-01-21 23:01       ` Jeff King
  2013-01-21 23:03     ` Jeff King
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2013-01-20 22:08 UTC (permalink / raw)
  To: Jeff King; +Cc: git, spearce, mfick

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>>> 	[uploadPack]
>>> 		hiderefs = refs/changes
>>
>> Would you want to do the same thing on receive-pack? It could benefit
>> from the same reduction in network cost (although it tends to be invoked
>> less frequently than upload-pack).
> ...
> As I said, I view this primarily as solving the cluttering issue.
> The use of hiderefs to hide these refs that point at objects I do
> not consider belong to my repository from the pushers equally makes
> sense as such, I think.

Now that raises one question.  Should this be transfer.hiderefs that
governs both upload-pack and receive-pack?  I tend to think the
answer is yes.

It may even make sense to have "git push" honor it, excluding them
from "git push --mirror" (or "git push --all" if some of the
branches are hidden); I haven't thought things through, though.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19 19:16   ` Junio C Hamano
  2013-01-20 18:19     ` Junio C Hamano
@ 2013-01-21  1:46     ` Duy Nguyen
  2013-01-21 22:56     ` Jeff King
  2 siblings, 0 replies; 16+ messages in thread
From: Duy Nguyen @ 2013-01-21  1:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On Sun, Jan 20, 2013 at 2:16 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> Should the client side learn how to list hidden refs too? I'm thinking
>> of an extreme case where upload-pack advertises nothing (or maybe just
>> refs/heads/master) and it's up to the client to ask for the ref
>> selection it's interested in. upload-pack may need more updates to do
>> that, I think.
>
> What you are talking about is a different goal.
>
> Look at this as a mechanism for the repository owner to control the
> clutter in what is shown to the intended audience of what s/he
> publishes in the repository.  Network bandwidth reduction of
> advertisement is a side effect of clutter reduction, and not
> necessarily the primary goal.

Probably stupid question: does gitnamespaces.txt attempt to achieve
the same? The document says smart http supports passing namespace,
nothing about git protocol so I guess we need some extension in
upload-pack (or git-daemon) for specifying namespace over git
protocol.
-- 
Duy

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-19 19:16   ` Junio C Hamano
  2013-01-20 18:19     ` Junio C Hamano
  2013-01-21  1:46     ` Duy Nguyen
@ 2013-01-21 22:56     ` Jeff King
  2 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2013-01-21 22:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, git, spearce, mfick

On Sat, Jan 19, 2013 at 11:16:00AM -0800, Junio C Hamano wrote:

> For example, if you mirror-clone from either of my repositories from
> GitHub:
> 
>     git clone --mirror git://github.com/git/git/
>     git clone --mirror git://github.com/gitster/git/
> 
> you will see stuff that does not belong to the project; objects that
> are only reachable from refs/pull/* are not something I approved to
> be placed in my repository; I as the project owner do not want to
> give these objects to a cloner as if they are part of my project.
> 
> The server side can hide refs/pull/ and refs/pull-request-tags/ so
> that clones (and ls-remote) will not see clutter, and nobody gets
> hurt.

I think it is unfortunately not so simple as that. You do not want them
to be part of your "clone --mirror", but some people do (because they
will inspect them when evaluating the pull request). And that decision
is, I think, in the eye of the cloner, not the eye of the repo owner. I
think in your case it is a little harder to see, in that you do not care
about the PR mechanism for git.git at all, but let us think for a minute
about a project that actually uses PRs.

For an ordinary developer, they would be content to fetch the branch
tips and tags (and that is what we do by default with "git clone). They
do not need to fetch all of refs/pull. If they learn out-of-band that PR
123 exists, they can in theory ask for refs/pull/123/head. That works
even with your proposal, because it is just about dropping the
advertisement, not the availability of refs.

But what about entities which really do want all of refs/pull?  I can
think of two good reasons to want them:

  1. You really do want a mirror, because you are creating a backup or
     alternate distribution channel. IOW, you are using "git clone
     --mirror", but it is missing those refs.

  2. You are a developer who is sometimes disconnected. You want to
     fetch all of the PRs, and then examine them at your leisure.

Without advertising, how do they ask for the wildcard of `refs/pull/*`?
They're stuck massaging some out-of-band data into a set of distinct
fetch refspecs.

I don't know much about what's in Gerrit's refs/changes, but I imagine
one could make a similar argument that their value is in the eye of the
client. And later you give this example:

> Another example.  If you run ls-remote against the above two
> repositories, you will notice that the latter has tons more
> branches.  The former are to publish only the primary integration
> branches, while the latter are to show individual topics.
> 
> I wish I didn't have to do this if I could.
> 
> We have periodical "What's cooking" postings that let interested
> parties learn topics out-of-band.  If I can hide refs/heads/??/*
> from normal users' clones while actually keeping individual topics
> there at the same place, I do not have to push into two places.

Most people do not want to see those heads. But some people (like me)
do, and it is a great convenience to be able to fetch them all with a
wildcard refspec, which cannot work if they are not advertised. I would
have to parse what's cooking and fetch them each individually. That's
not a technologically insurmountable problem, but it means git is being
a lot less helpful than it could be.


So I think in these cases of "extra refs", some clients would want them,
and some would not. And only they can know which camp they are in, not
the server side. Which is why the current system works pretty well: we
advertise everything and let the client decide what it wants. Clone very
sanely fetches only refs/heads/* (and associated tags), and you can put
whatever extra junk you want elsewhere in the hierarchy. A mirrored
clone will fetch it, but to me that is the point of --mirror. And if you
want some arbitrary subset, then you can implement that, too, by the
use of fetch refspecs[1].

The downside, of course, is that the ref advertisement is bigger than
many clients will want. But dealing with that is a separate issue from
"these refs should never be shown", as solutions for one may not work
from the other (e.g., if we delayed ref advertisement until the client
had spoken, the client could tell us what refspecs they are interested
in).

For your topic branch example, why don't you push to refs/topics of the
main git repository? Then normal cloners wouldn't see it, but anybody
interested could add the appropriate fetch refspec.

-Peff

[1] It may be that refspecs are not as expressive as we would like,
    but that is a client side problem we can deal with. For instance,
    you cannot currently say "give me everything except refs/foo/*".

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-20 22:08     ` Junio C Hamano
@ 2013-01-21 23:01       ` Jeff King
  2013-01-21 23:33         ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2013-01-21 23:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On Sun, Jan 20, 2013 at 02:08:32PM -0800, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Jeff King <peff@peff.net> writes:
> >
> >>> 	[uploadPack]
> >>> 		hiderefs = refs/changes
> >>
> >> Would you want to do the same thing on receive-pack? It could benefit
> >> from the same reduction in network cost (although it tends to be invoked
> >> less frequently than upload-pack).
> > ...
> > As I said, I view this primarily as solving the cluttering issue.
> > The use of hiderefs to hide these refs that point at objects I do
> > not consider belong to my repository from the pushers equally makes
> > sense as such, I think.
> 
> Now that raises one question.  Should this be transfer.hiderefs that
> governs both upload-pack and receive-pack?  I tend to think the
> answer is yes.

Yes, that is what I was getting at (and it should have individual
hiderefs for each side that override the transfer.*, in case somebody
wants to make the distinction).

> It may even make sense to have "git push" honor it, excluding them
> from "git push --mirror" (or "git push --all" if some of the
> branches are hidden); I haven't thought things through, though.

That is harder, as that is something that happens on the client. How
does the client learn about the transfer.hiderefs setting on the remote?
It has to be out-of-band, at which point calling it by the same name has
little benefit.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-20 18:06   ` Junio C Hamano
  2013-01-20 22:08     ` Junio C Hamano
@ 2013-01-21 23:03     ` Jeff King
  1 sibling, 0 replies; 16+ messages in thread
From: Jeff King @ 2013-01-21 23:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On Sun, Jan 20, 2013 at 10:06:33AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> 	[uploadPack]
> >> 		hiderefs = refs/changes
> >
> > Would you want to do the same thing on receive-pack? It could benefit
> > from the same reduction in network cost (although it tends to be invoked
> > less frequently than upload-pack).
> 
> Do *I* want to do?  Not when there already is a patch exists; I'd
> rather take existing and tested patch ;-)

The patch we have is below, but I do not think you want it, as it is
missing several things:

  - docs and tests

  - handling multiple values

  - anything but raw prefix matching (you even have to put in the "/"
    yourself).

It was not my patch originally, and I never bothered to clean and
upstream it because I didn't think it was something anybody else would
be interested in. I'm happy to do so, but if you are working in the area
anyway, it would make sense to be part of your series.

-Peff

---
diff --git b/builtin/receive-pack.c a/builtin/receive-pack.c
index 0afb8b2..b22670c 100644
--- b/builtin/receive-pack.c
+++ a/builtin/receive-pack.c
@@ -39,6 +39,7 @@ static void *head_name_to_free;
 static int auto_gc = 1;
 static const char *head_name;
 static void *head_name_to_free;
+static const char *hide_refs;
 static int sent_capabilities;
 
 static enum deny_action parse_deny_action(const char *var, const char *value)
@@ -113,11 +114,19 @@ static void show_ref(const char *path, const unsigned char *sha1)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.hiderefs") == 0) {
+		git_config_string(&hide_refs, var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
 static void show_ref(const char *path, const unsigned char *sha1)
 {
+	if (hide_refs && strncmp(path, hide_refs, strlen(hide_refs)) == 0)
+		return 0;
+
 	if (sent_capabilities)
 		packet_write(1, "%s %s\n", sha1_to_hex(sha1), path);
 	else

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-21 23:01       ` Jeff King
@ 2013-01-21 23:33         ` Junio C Hamano
  2013-01-21 23:45           ` Jeff King
  0 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2013-01-21 23:33 UTC (permalink / raw)
  To: Jeff King; +Cc: git, spearce, mfick

Jeff King <peff@peff.net> writes:

>> It may even make sense to have "git push" honor it, excluding them
>> from "git push --mirror" (or "git push --all" if some of the
>> branches are hidden); I haven't thought things through, though.
>
> That is harder, as that is something that happens on the client. How
> does the client learn about the transfer.hiderefs setting on the remote?

No, I was talking about running "git push" from a repository that
has the transfer.hiderefs set, emulating a fetch from a client by
pushing out in the reverse direction.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/2] Hiding some refs in ls-remote
  2013-01-21 23:33         ` Junio C Hamano
@ 2013-01-21 23:45           ` Jeff King
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff King @ 2013-01-21 23:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, spearce, mfick

On Mon, Jan 21, 2013 at 03:33:46PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> It may even make sense to have "git push" honor it, excluding them
> >> from "git push --mirror" (or "git push --all" if some of the
> >> branches are hidden); I haven't thought things through, though.
> >
> > That is harder, as that is something that happens on the client. How
> > does the client learn about the transfer.hiderefs setting on the remote?
> 
> No, I was talking about running "git push" from a repository that
> has the transfer.hiderefs set, emulating a fetch from a client by
> pushing out in the reverse direction.

Ah. That seems a bit more questionable to me, as you are assuming that
the face that the repository shows to network clients is the same as it
would show when it is the client itself. That wouldn't be true, for
example, when pushing to a backup repository which would expect to get
everything.

Of course the same problem comes from a backup repository which wants to
fetch from you. Which again comes down to the fact that I think this
ref-hiding is really in the eye of the receiver, not the sender.

-Peff

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2013-01-21 23:46 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-19  0:37 [PATCH 0/2] Hiding some refs in ls-remote Junio C Hamano
2013-01-19  0:37 ` [PATCH 1/2] upload-pack: share more code Junio C Hamano
2013-01-19  0:37 ` [PATCH 2/2] upload-pack: allow hiding ref hiearchies Junio C Hamano
2013-01-19  5:50 ` [PATCH 0/2] Hiding some refs in ls-remote Duy Nguyen
2013-01-19 19:16   ` Junio C Hamano
2013-01-20 18:19     ` Junio C Hamano
2013-01-21  1:46     ` Duy Nguyen
2013-01-21 22:56     ` Jeff King
2013-01-19  6:18 ` Michael Haggerty
2013-01-19 16:50 ` Jeff King
2013-01-20 18:06   ` Junio C Hamano
2013-01-20 22:08     ` Junio C Hamano
2013-01-21 23:01       ` Jeff King
2013-01-21 23:33         ` Junio C Hamano
2013-01-21 23:45           ` Jeff King
2013-01-21 23:03     ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.