git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] "fetch $there +refs/*:refs/*" fails if there is a stash
@ 2012-05-01 22:19 Junio C Hamano
  2012-05-02  9:15 ` Michael Haggerty
  2012-05-04 22:30 ` [PATCH] fetch/push: allow refs/*:refs/* Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-05-01 22:19 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

This ought to work:

    $ git checkout HEAD^0 ;# make sure we are on detached HEAD
    $ git fetch $somewhere +refs/*:refs/*

if you want to pull down all the branches from somewhere, potentially
nuking the refs we currently have.

However, if $somewhere has a stash, i.e. refs/stash, even though our
"fetch" sees it in the "ls-remote $somewhere" output and tries to make
sure that the object comes over the wire, we never show "want" line for
refs/stash, because we silently drop it with check_ref_format().

I have run out concentration before digging this through, but the attached
single liner patch fixes it.  The thing is, this function is given a list
of refs and drops refs/stash (which is not what I want in the context of
the above +refs/*:refs/*), and modifies the caller's list of refs, and
then the caller (i.e. do_fetch_pack() that called everything_local()) then
uses updated list (i.e. without refs/stash) to run find_common() and
get_pack(), but the layer higher above (namely, do_fetch() that calls
fetch_refs() that in turn calls store_updated_refs(), all in
builtin/fetch.c) is _not_ aware of this filtering and expects that the
code at this layer *must* have asked for and obtained all the objects
reachable from what was listed in ls-remote output, leading to an
inconsistent state.

[Michael CC'ed as he seems to be dead set tightening check_ref_format()]

The specific failure of "refs/stash" is fixed with this patch, but I think
this call to check_ref_format() in the filter_refs() should be removed.
The function is trying to see what we _asked_ against what the remote side
said they have, and if we tried to recover objects from a broken remote by
doing:

	git fetch $somewhere refs/heads/a..b:refs/heads/a_dot_dot_b

and the remote side advertised that it has such a ref (note that a..b is
an illegal path component), we should be able to do so without a misguided
call to check_refname_format() getting in the way.

It is the same story if the name advertised by a broken remote were
"refs/head/../heads/master".  As long as the RHS of the refspec
(i.e. refs/heads/a_dot_dot_b) is kosher, we must allow such a request to
succeed, so that people can interoperate in less than perfect world.

 builtin/fetch-pack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 7124c4b..34cd8a5 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -541,7 +541,7 @@ static void filter_refs(struct ref **refs, int nr_match, char **match)
 	for (ref = *refs; ref; ref = next) {
 		next = ref->next;
 		if (!memcmp(ref->name, "refs/", 5) &&
-		    check_refname_format(ref->name + 5, 0))
+		    check_refname_format(ref->name, 0))
 			; /* trash */
 		else if (args.fetch_all &&
 			 (!args.depth || prefixcmp(ref->name, "refs/tags/") )) {

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [BUG] "fetch $there +refs/*:refs/*" fails if there is a stash
  2012-05-01 22:19 [BUG] "fetch $there +refs/*:refs/*" fails if there is a stash Junio C Hamano
@ 2012-05-02  9:15 ` Michael Haggerty
  2012-05-02 16:26   ` Junio C Hamano
  2012-05-04 22:30 ` [PATCH] fetch/push: allow refs/*:refs/* Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Michael Haggerty @ 2012-05-02  9:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 05/02/2012 12:19 AM, Junio C Hamano wrote:
> This ought to work:
>
>      $ git checkout HEAD^0 ;# make sure we are on detached HEAD
>      $ git fetch $somewhere +refs/*:refs/*
>
> if you want to pull down all the branches from somewhere, potentially
> nuking the refs we currently have.
>
> However, if $somewhere has a stash, i.e. refs/stash, even though our
> "fetch" sees it in the "ls-remote $somewhere" output and tries to make
> sure that the object comes over the wire, we never show "want" line for
> refs/stash, because we silently drop it with check_ref_format().

As you obviously know (but for the information of other readers), the 
reason for the failure is that "stash" (not "refs/stash") is passed to 
check_ref_format().  "stash" is not a valid refname because it has only 
a single level (i.e., does not contain a '/').  check_ref_format() would 
accept the refname if it were passed the REFNAME_ALLOW_ONELEVEL option, 
but passing it the full refname (as your patch does) is better.

> I have run out concentration before digging this through, but the attached
> single liner patch fixes it.  The thing is, this function is given a list
> of refs and drops refs/stash (which is not what I want in the context of
> the above +refs/*:refs/*), and modifies the caller's list of refs, and
> then the caller (i.e. do_fetch_pack() that called everything_local()) then
> uses updated list (i.e. without refs/stash) to run find_common() and
> get_pack(), but the layer higher above (namely, do_fetch() that calls
> fetch_refs() that in turn calls store_updated_refs(), all in
> builtin/fetch.c) is _not_ aware of this filtering and expects that the
> code at this layer *must* have asked for and obtained all the objects
> reachable from what was listed in ls-remote output, leading to an
> inconsistent state.
>
> [Michael CC'ed as he seems to be dead set tightening check_ref_format()]
>
> The specific failure of "refs/stash" is fixed with this patch, but I think
> this call to check_ref_format() in the filter_refs() should be removed.
> The function is trying to see what we _asked_ against what the remote side
> said they have, and if we tried to recover objects from a broken remote by
> doing:
>
> 	git fetch $somewhere refs/heads/a..b:refs/heads/a_dot_dot_b
 >
> and the remote side advertised that it has such a ref (note that a..b is
> an illegal path component), we should be able to do so without a misguided
> call to check_refname_format() getting in the way.

I agree; if the remote reference name never gets used as a local 
reference name, there is no reason to call check_ref_format() on it at 
all.  The local reference name that is derived from the remote reference 
name (even if it is derived via an identity operation) *should* be 
checked using check_ref_format(); presumably that is already the case? 
Optimally the local refnames should be checked *before* the transfer to 
avoid wasting the user's time.

> It is the same story if the name advertised by a broken remote were
> "refs/head/../heads/master".  As long as the RHS of the refspec
> (i.e. refs/heads/a_dot_dot_b) is kosher, we must allow such a request to
> succeed, so that people can interoperate in less than perfect world.

A slightly more awkward question is if the broken remote advertises many 
references including "refs/head/../heads/master" and we fetch refspec 
"+refs/*:refs/*".  In this case it is pretty clear that we should fetch 
all of the valid references; otherwise, working around the problem would 
be quite awkward.  But what to do about "refs/head/../heads/master"?  I 
think we should emit a warning naming the reference that will not be 
fetched "because it is formatted incorrectly", and not include it in the 
"want" lines.  If the user really wants the reference, he can fetch it 
to another name using an explicit "from:to" refspec.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [BUG] "fetch $there +refs/*:refs/*" fails if there is a stash
  2012-05-02  9:15 ` Michael Haggerty
@ 2012-05-02 16:26   ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-05-02 16:26 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 05/02/2012 12:19 AM, Junio C Hamano wrote:
>
>> The specific failure of "refs/stash" is fixed with this patch, but I think
>> this call to check_ref_format() in the filter_refs() should be removed.
>> The function is trying to see what we _asked_ against what the remote side
>> said they have, and if we tried to recover objects from a broken remote by
>> doing:
>>
>> 	git fetch $somewhere refs/heads/a..b:refs/heads/a_dot_dot_b
>>
>> and the remote side advertised that it has such a ref (note that a..b is
>> an illegal path component), we should be able to do so without a misguided
>> call to check_refname_format() getting in the way.
>
> I agree; if the remote reference name never gets used as a local
> reference name, there is no reason to call check_ref_format() on it at
> all.  The local reference name that is derived from the remote
> reference name (even if it is derived via an identity operation)
> *should* be checked using check_ref_format(); presumably that is
> already the case?

We should make sure the ref the refspec mapping told us to update is a
valid ref locally (didn't I say that), but I do not know offhand if we do
so using check_ref_format() and/or if the check is not overly loose.

Honestly, I didn't expect _you_ to be _asking_ that question; as you are
the person who has been advocating tightening of the function, you would
know the callers of the function better than anybody, no?

>> It is the same story if the name advertised by a broken remote were
>> "refs/head/../heads/master".  As long as the RHS of the refspec
>> (i.e. refs/heads/a_dot_dot_b) is kosher, we must allow such a request to
>> succeed, so that people can interoperate in less than perfect world.
>
> A slightly more awkward question is if the broken remote advertises
> many references including "refs/head/../heads/master" and we fetch
> refspec "+refs/*:refs/*".  In this case it is pretty clear that we
> should fetch all of the valid references; otherwise, working around
> the problem would be quite awkward.

If we ignore what will be stored in FETCH_HEAD, we could either (1) fetch
histories leading to all the valid commits, but not store incorrectly
formatted refs, or (2) fetch histories leading to commits that will be
stored in only the valid _refs_.

But because we cannot ignore FETCH_HEAD, we have to do (1).  That is,
history leading to any ref the remote advertises that matches the LHS of
the refspec has to be fetched, and listed in the resulting FETCH_HEAD.
Some of them may map to invalid refs on the local side, and we do not
store it inside our refs/, so that they do not pollute our local ref
namespace.

> But what to do about
> "refs/head/../heads/master"?  I think we should emit a warning naming
> the reference that will not be fetched "because it is formatted
> incorrectly", and not include it in the "want" lines.  

We have to ask for it, so we include it "want".  Remember, $A without
colon is a short-hand for $A: (fetch but not store locally), and as long
as $A (the LHS of the refspec) matches the refs advertised by the remote
side, we fetch it and list it in FETCH_HEAD.

As the user is _not_ asking to store it locally in our refs/ namespace, we
won't store it.  But if the request were $A:$A for any invalid refname $A,
the story is the same.  We fetch it because LHS of refspec matches what
they advertise, and we list it in FETCH_HEAD.  We do _not_ store it in our
local refs/ space, because RHS of refspec is invalid.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] fetch/push: allow refs/*:refs/*
  2012-05-01 22:19 [BUG] "fetch $there +refs/*:refs/*" fails if there is a stash Junio C Hamano
  2012-05-02  9:15 ` Michael Haggerty
@ 2012-05-04 22:30 ` Junio C Hamano
  2012-05-04 22:35   ` [PATCH] get_fetch_map(): tighten checks on dest refs Junio C Hamano
  2012-05-05  6:03   ` [PATCH] fetch/push: allow refs/*:refs/* Michael Haggerty
  1 sibling, 2 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-05-04 22:30 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

There are a handful of places where we call check_refname_format() on a
substring after "refs/" of a refname we are going to use, and filter out a
valid match with "refs/stash" with such a pathspec.  Not sending a stash
may arguably be a feature (as stash is inherently a local workflow
element), but the code in the transport layer is oblivious to this
filtering performed by the lower layer of the code, and complains that the
other side did not send all the objects that needs to complete refs/stash
at the end, even though the code will not write refs/stash out anyway, and
making the whole command fail.

This is an attempt to "fix" it by using check_refname_format() on the
whole "refs/....." string and allowing refs/stash to be also copied.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * With this patch:

    $ git checkout HEAD^0 ;# make sure we are on detached HEAD
    $ git fetch $somewhere +refs/*:refs/*

   and

    victim$ git config receive.denyCurrentBranch warn
    master$ git push victim +refs/*:refs/*

   should work.

 builtin/fetch-pack.c   |  2 +-
 builtin/receive-pack.c |  2 +-
 remote.c               |  2 +-
 t/t5516-fetch-push.sh  | 30 ++++++++++++++++++++++++++++++
 4 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 6207ecd..a3e3fa3 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -546,7 +546,7 @@ static void filter_refs(struct ref **refs, int nr_match, char **match)
 	for (ref = *refs; ref; ref = next) {
 		next = ref->next;
 		if (!memcmp(ref->name, "refs/", 5) &&
-		    check_refname_format(ref->name + 5, 0))
+		    check_refname_format(ref->name, 0))
 			; /* trash */
 		else if (args.fetch_all &&
 			 (!args.depth || prefixcmp(ref->name, "refs/tags/") )) {
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 7ec68a1..1935b80 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -401,7 +401,7 @@ static const char *update(struct command *cmd)
 	struct ref_lock *lock;
 
 	/* only refs/... are allowed */
-	if (prefixcmp(name, "refs/") || check_refname_format(name + 5, 0)) {
+	if (prefixcmp(name, "refs/") || check_refname_format(name, 0)) {
 		rp_error("refusing to create funny ref '%s' remotely", name);
 		return "funny refname";
 	}
diff --git a/remote.c b/remote.c
index e2ef991..eacd8ad 100644
--- a/remote.c
+++ b/remote.c
@@ -1595,7 +1595,7 @@ static int one_local_ref(const char *refname, const unsigned char *sha1, int fla
 	int len;
 
 	/* we already know it starts with refs/ to get here */
-	if (check_refname_format(refname + 5, 0))
+	if (check_refname_format(refname, 0))
 		return 0;
 
 	len = strlen(refname) + 1;
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index b69cf57..1a45b19 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -929,6 +929,36 @@ test_expect_success 'push into aliased refs (inconsistent)' '
 	)
 '
 
+test_expect_success 'push all hierarchies with stash' '
+	mk_empty &&
+	git stash clear &&
+	git reset --hard &&
+	echo >>path1 &&
+	git stash save "Tweak path1" &&
+	git push testrepo "refs/*:refs/*" &&
+	git ls-remote . >expect &&
+	git ls-remote testrepo >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'fetch all hierarchies with stash' '
+	mk_empty &&
+	git stash clear &&
+	git reset --hard &&
+	echo >>path1 &&
+	git stash save "Tweak path1" &&
+	(
+		cd testrepo &&
+		git commit --allow-empty -m initial &&
+		git checkout HEAD^0 &&
+		git fetch .. "+refs/*:refs/*" &&
+		git checkout master
+	) &&
+	git ls-remote . >expect &&
+	git ls-remote testrepo >actual &&
+	test_cmp actual expect
+'
+
 test_expect_success 'push --porcelain' '
 	mk_empty &&
 	echo >.git/foo  "To testrepo" &&
-- 
1.7.10.1.500.g37b1e9a

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] get_fetch_map(): tighten checks on dest refs
  2012-05-04 22:30 ` [PATCH] fetch/push: allow refs/*:refs/* Junio C Hamano
@ 2012-05-04 22:35   ` Junio C Hamano
  2012-05-05  6:03   ` [PATCH] fetch/push: allow refs/*:refs/* Michael Haggerty
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-05-04 22:35 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

The code to check the refname we store the fetched result locally did not
bother checking the first 5 bytes of it, presumably assuming that it
always begin with "refs/".  For a fetch refspec (or the result of applying
wildcard on one), we always want the RHS to map to something inside
"refs/" hierarchy, so let's spell that rule out in a more explicit way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/remote.c b/remote.c
index eacd8ad..0f2b1db 100644
--- a/remote.c
+++ b/remote.c
@@ -1402,8 +1402,8 @@ int get_fetch_map(const struct ref *remote_refs,
 
 	for (rmp = &ref_map; *rmp; ) {
 		if ((*rmp)->peer_ref) {
-			if (check_refname_format((*rmp)->peer_ref->name + 5,
-				REFNAME_ALLOW_ONELEVEL)) {
+			if (prefixcmp((*rmp)->peer_ref->name, "refs/") ||
+			    check_refname_format((*rmp)->peer_ref->name, 0)) {
 				struct ref *ignore = *rmp;
 				error("* Ignoring funny ref '%s' locally",
 				      (*rmp)->peer_ref->name);
-- 
1.7.10.1.500.g37b1e9a

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch/push: allow refs/*:refs/*
  2012-05-04 22:30 ` [PATCH] fetch/push: allow refs/*:refs/* Junio C Hamano
  2012-05-04 22:35   ` [PATCH] get_fetch_map(): tighten checks on dest refs Junio C Hamano
@ 2012-05-05  6:03   ` Michael Haggerty
  2012-05-05 18:22     ` Michael Haggerty
  2012-05-07 16:24     ` Junio C Hamano
  1 sibling, 2 replies; 10+ messages in thread
From: Michael Haggerty @ 2012-05-05  6:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 05/05/2012 12:30 AM, Junio C Hamano wrote:
> There are a handful of places where we call check_refname_format() on a
> substring after "refs/" of a refname we are going to use, and filter out a
> valid match with "refs/stash" with such a pathspec.  Not sending a stash
> may arguably be a feature (as stash is inherently a local workflow
> element), but the code in the transport layer is oblivious to this
> filtering performed by the lower layer of the code, and complains that the
> other side did not send all the objects that needs to complete refs/stash
> at the end, even though the code will not write refs/stash out anyway, and
> making the whole command fail.
>
> This is an attempt to "fix" it by using check_refname_format() on the
> whole "refs/....." string and allowing refs/stash to be also copied.
>
> Signed-off-by: Junio C Hamano<gitster@pobox.com>
> ---
>
>   * With this patch:
>
>      $ git checkout HEAD^0 ;# make sure we are on detached HEAD
>      $ git fetch $somewhere +refs/*:refs/*
>
>     and
>
>      victim$ git config receive.denyCurrentBranch warn
>      master$ git push victim +refs/*:refs/*
>
>     should work.
>
>   builtin/fetch-pack.c   |  2 +-
>   builtin/receive-pack.c |  2 +-
>   remote.c               |  2 +-
>   t/t5516-fetch-push.sh  | 30 ++++++++++++++++++++++++++++++
>   4 files changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 6207ecd..a3e3fa3 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -546,7 +546,7 @@ static void filter_refs(struct ref **refs, int nr_match, char **match)
>   	for (ref = *refs; ref; ref = next) {
>   		next = ref->next;
>   		if (!memcmp(ref->name, "refs/", 5)&&
> -		    check_refname_format(ref->name + 5, 0))
> +		    check_refname_format(ref->name, 0))

The patch looks fine to me.

This combination "!memcmp(ref->name, "refs/", 5) && 
check_refname_format(ref->name, 0)" is the reason that I suggested 
adding a REFNAME_FULL option [1], in which case it could be written 
"check_refname_format(ref->name, REFNAME_FULL)".  However, now I think 
that the options should be constructed a little differently:

flags==0: Require refname to start with "refs/"

flags==REFNAME_ALLOW_SPECIAL: Also accept single-level ALL_CAPS refnames.

flags==REFNAME_ALLOW_PARTIAL: Don't check the namespace or require '/'). 
  This could be used for checking partial names like "master" as 
shorthand for "refs/master".

Does this sound reasonable to you?

Michael

[1] http://article.gmane.org/gmane.comp.version-control.git/196632

>   			; /* trash */
>   		else if (args.fetch_all&&
>   			(!args.depth || prefixcmp(ref->name, "refs/tags/") )) {
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 7ec68a1..1935b80 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -401,7 +401,7 @@ static const char *update(struct command *cmd)
>   	struct ref_lock *lock;
>
>   	/* only refs/... are allowed */
> -	if (prefixcmp(name, "refs/") || check_refname_format(name + 5, 0)) {
> +	if (prefixcmp(name, "refs/") || check_refname_format(name, 0)) {
>   		rp_error("refusing to create funny ref '%s' remotely", name);
>   		return "funny refname";
>   	}
> diff --git a/remote.c b/remote.c
> index e2ef991..eacd8ad 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -1595,7 +1595,7 @@ static int one_local_ref(const char *refname, const unsigned char *sha1, int fla
>   	int len;
>
>   	/* we already know it starts with refs/ to get here */
> -	if (check_refname_format(refname + 5, 0))
> +	if (check_refname_format(refname, 0))
>   		return 0;
>
>   	len = strlen(refname) + 1;
> diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
> index b69cf57..1a45b19 100755
> --- a/t/t5516-fetch-push.sh
> +++ b/t/t5516-fetch-push.sh
> @@ -929,6 +929,36 @@ test_expect_success 'push into aliased refs (inconsistent)' '
>   	)
>   '
>
> +test_expect_success 'push all hierarchies with stash' '
> +	mk_empty&&
> +	git stash clear&&
> +	git reset --hard&&
> +	echo>>path1&&
> +	git stash save "Tweak path1"&&
> +	git push testrepo "refs/*:refs/*"&&
> +	git ls-remote .>expect&&
> +	git ls-remote testrepo>actual&&
> +	test_cmp actual expect
> +'
> +
> +test_expect_success 'fetch all hierarchies with stash' '
> +	mk_empty&&
> +	git stash clear&&
> +	git reset --hard&&
> +	echo>>path1&&
> +	git stash save "Tweak path1"&&
> +	(
> +		cd testrepo&&
> +		git commit --allow-empty -m initial&&
> +		git checkout HEAD^0&&
> +		git fetch .. "+refs/*:refs/*"&&
> +		git checkout master
> +	)&&
> +	git ls-remote .>expect&&
> +	git ls-remote testrepo>actual&&
> +	test_cmp actual expect
> +'
> +
>   test_expect_success 'push --porcelain' '
>   	mk_empty&&
>   	echo>.git/foo  "To testrepo"&&


-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch/push: allow refs/*:refs/*
  2012-05-05  6:03   ` [PATCH] fetch/push: allow refs/*:refs/* Michael Haggerty
@ 2012-05-05 18:22     ` Michael Haggerty
  2012-05-07 16:24     ` Junio C Hamano
  1 sibling, 0 replies; 10+ messages in thread
From: Michael Haggerty @ 2012-05-05 18:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 05/05/2012 08:03 AM, Michael Haggerty wrote:
> On 05/05/2012 12:30 AM, Junio C Hamano wrote:
>> There are a handful of places where we call check_refname_format() on a
>> substring after "refs/" of a refname we are going to use, and filter
>> out a
>> valid match with "refs/stash" with such a pathspec. Not sending a stash
>> may arguably be a feature (as stash is inherently a local workflow
>> element), but the code in the transport layer is oblivious to this
>> filtering performed by the lower layer of the code, and complains that
>> the
>> other side did not send all the objects that needs to complete refs/stash
>> at the end, even though the code will not write refs/stash out anyway,
>> and
>> making the whole command fail.
>>
>> This is an attempt to "fix" it by using check_refname_format() on the
>> whole "refs/....." string and allowing refs/stash to be also copied.
>>
>> Signed-off-by: Junio C Hamano<gitster@pobox.com>
>> ---
>>
>> * With this patch:
>>
>> $ git checkout HEAD^0 ;# make sure we are on detached HEAD
>> $ git fetch $somewhere +refs/*:refs/*
>>
>> and
>>
>> victim$ git config receive.denyCurrentBranch warn
>> master$ git push victim +refs/*:refs/*
>>
>> should work.
>>
>> builtin/fetch-pack.c | 2 +-
>> builtin/receive-pack.c | 2 +-
>> remote.c | 2 +-
>> t/t5516-fetch-push.sh | 30 ++++++++++++++++++++++++++++++
>> 4 files changed, 33 insertions(+), 3 deletions(-)
>>
>> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
>> index 6207ecd..a3e3fa3 100644
>> --- a/builtin/fetch-pack.c
>> +++ b/builtin/fetch-pack.c
>> @@ -546,7 +546,7 @@ static void filter_refs(struct ref **refs, int
>> nr_match, char **match)
>> for (ref = *refs; ref; ref = next) {
>> next = ref->next;
>> if (!memcmp(ref->name, "refs/", 5)&&
>> - check_refname_format(ref->name + 5, 0))
>> + check_refname_format(ref->name, 0))
>
> The patch looks fine to me.
>
> This combination "!memcmp(ref->name, "refs/", 5) &&
> check_refname_format(ref->name, 0)" is the reason that I suggested
> adding a REFNAME_FULL option [1], in which case it could be written
> "check_refname_format(ref->name, REFNAME_FULL)". However, now I think
> that the options should be constructed a little differently:
>
> flags==0: Require refname to start with "refs/"
>
> flags==REFNAME_ALLOW_SPECIAL: Also accept single-level ALL_CAPS refnames.
>
> flags==REFNAME_ALLOW_PARTIAL: Don't check the namespace or require '/').
> This could be used for checking partial names like "master" as shorthand
> for "refs/master".
>
> Does this sound reasonable to you?

Oops; I just realized that this particular example doesn't check "starts 
with "refs/" and valid refname" as expected.  In fact it checks "starts 
with "refs/" and INVALID refname".  The proposed REFNAME_FULL option 
would of course be equivalent to the former.  (Why the latter is needed 
here is something that needs a larger investment of time to figure out.)

The other examples that you patched do the expected combination of checks.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch/push: allow refs/*:refs/*
  2012-05-05  6:03   ` [PATCH] fetch/push: allow refs/*:refs/* Michael Haggerty
  2012-05-05 18:22     ` Michael Haggerty
@ 2012-05-07 16:24     ` Junio C Hamano
  2012-05-15  8:49       ` Michael Haggerty
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-05-07 16:24 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> This combination "!memcmp(ref->name, "refs/", 5) &&
> check_refname_format(ref->name, 0)" is the reason that I suggested
> adding a REFNAME_FULL option [1], in which case it could be written
> "check_refname_format(ref->name, REFNAME_FULL)".

I personally think that check-refname-format should lose its second
parameter and always check for a concrete full refs, so that we can
tighten the current allow-onelevel case a bit further (i.e. ".git/HEAD" is
OK at the root level, but ".git/refs/heads/HEAD" is asking for trouble,
and ".git/config" is downright confusing. The function does not get a good
enough clue by only ONELEVEL).  That would mean REFNAME_FULL will not be
necessary---it should be the only mode of operation, and the callers need
to be adjusted accordingly.

We should also introduce another function check-refspec-half-format to be
used for the checks for left and right halves of refspec ("refs/heads/*"
is OK but "refs/heads*" is not, perhaps).

A single function trying to do both and not doing a good job in either
case is not a pretty picture.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch/push: allow refs/*:refs/*
  2012-05-07 16:24     ` Junio C Hamano
@ 2012-05-15  8:49       ` Michael Haggerty
  2012-05-15 15:19         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Haggerty @ 2012-05-15  8:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 05/07/2012 06:24 PM, Junio C Hamano wrote:
> Michael Haggerty<mhagger@alum.mit.edu>  writes:
>
>> This combination "!memcmp(ref->name, "refs/", 5)&&
>> check_refname_format(ref->name, 0)" is the reason that I suggested
>> adding a REFNAME_FULL option [1], in which case it could be written
>> "check_refname_format(ref->name, REFNAME_FULL)".
>
> I personally think that check-refname-format should lose its second
> parameter and always check for a concrete full refs, so that we can
> tighten the current allow-onelevel case a bit further (i.e. ".git/HEAD" is
> OK at the root level, but ".git/refs/heads/HEAD" is asking for trouble,
> and ".git/config" is downright confusing. The function does not get a good
> enough clue by only ONELEVEL).  That would mean REFNAME_FULL will not be
> necessary---it should be the only mode of operation, and the callers need
> to be adjusted accordingly.
>
> We should also introduce another function check-refspec-half-format to be
> used for the checks for left and right halves of refspec ("refs/heads/*"
> is OK but "refs/heads*" is not, perhaps).
>
> A single function trying to do both and not doing a good job in either
> case is not a pretty picture.

Whether the functionality is presented via one function with multiple 
options or multiple functions with no options is of secondary 
importance.  Let's first worry about the desired functionality, since 
even that is not clear.


There's a long weekend coming up in Germany so hopefully I'll have 
another increment of time to work on this.  I'd like to get feedback on 
the requirements.  First let me present a summary of what I know about 
refnames in git.


There are a few categories of refname that come up in the git code:

* Full refnames like "refs/foo" or "refs/bar/baz" (starting with "refs/").

* Top-level special ALL_CAPS reference names like "MERGE_HEAD".

* Refspec-style patterns with wildcards, like "refs/heads/*" or 
"refs/foo/*/bar".

* Short branch names with an implied "refs/heads/" prefix omitted; e.g., 
"master" meaning "refs/heads/master".

* Short tag names with an implied "refs/tags/" prefix omitted; e.g., 
"v1.2.3" meaning "refs/tags/v1.2.3".

* Maybe others...?


Currently, I believe that code usually handles the short branch/tag 
names via one of two mechanisms:

* Explicitly prepend "refs/heads/" or "refs/tags/" to the short name to 
make the corresponding full name, then work exclusively with the full name.

* Use the ref_rev_parse_rules based functions like dwim_ref() to guess 
which prefix can be applied to a short refname to make it into a full 
refname, then work exclusively with the full name.

If code will continue to work this way, then there might be no use for a 
function that can check short refnames; one would always convert the 
short refname into a full refname and check the latter.  On the other 
hand, a short refname check might do special things like *reject* a name 
that starts with "refs/".


In addition to the strict rules governing refnames, there are other 
policies that are enforced at varying levels of strictness or that might 
be helpful in the future; for example,

* Refnames either have to start with "refs/" or have to be ALL_CAPS.

* Refnames cannot be directly under "refs/" but should be in a namespace 
like "refs/heads/".  (This is implied by the old check-ref-format 
documentation:

> . They must contain at least one `/`. This enforces the presence of a
>   category like `heads/`, `tags/` etc. but the actual names are not
>   restricted.

This sentence also implies that check-ref-format was intended to be used 
for refnames shortened by having the "refs/" prefix omitted, which is 
curious.)

* Unlikely refnames like "refs/heads/HEAD" could be handled with 
suspicion, as you suggest.  I would also nominate constructs like 
"refs/heads/refs/heads/mybranch", which I've seen cause confusion at $WORK.


In terms of implementation:

* We want to distinguish between what is acceptable for new refnames 
versus refnames that already exist in the local repository and/or 
refnames received from a remote repo.  So there should be some kind of 
"REFNAME_RELAXED" option.

* It is important for the "old" check_refname_format() and the "new" one 
to coexist during the transition.  This is simply because I don't want 
to have to analyze and rewrite all ~30 callers of check_refname_format() 
in one big bang.  I suggest retaining a copy as 
check_refname_format_legacy() that is deprecated and removed after all 
callers have been rewritten.

* The exact current behavior of "git check-ref-format" will probably not 
map well to the new check_refname_format(), and is anyway inadequate and 
somewhat inconsistent.  We need to decide how strict we need to be about 
backwards-compatibility of this command.

* I've been using "git check-ref-format" for testing 
check_refname_format().  But I think it would be a bad idea to expose 
all of the check_refname_format() options and variants in an installed 
plumbing command.  Therefore I think that there should be a program that 
can be used to call check_refname_format() with all of its option 
combinations to be used only for testing purposes.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] fetch/push: allow refs/*:refs/*
  2012-05-15  8:49       ` Michael Haggerty
@ 2012-05-15 15:19         ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-05-15 15:19 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> * Full refnames like "refs/foo" or "refs/bar/baz" (starting with "refs/").

When reading any of these out of existing set of ref candidates (i.e. read
from readdir() recursing down from ".git/refs"), we shouldn't be rejecting
with format checks, but we may want to warn.

When creating, we have full refnames in this case, so we can just check.

> * Top-level special ALL_CAPS reference names like "MERGE_HEAD".

Similar.  These are "full refname" as far as ref format checking code is
concerned, and the format logic allows only ALL_CAPS at the top-level.
When reading any single-level out of existing set of ref candidates
(i.e. read from readdir() scanning ".git/"), malformed candidates should
be ignored without even warning (otherwise you will incorrectly complain
upon seeing ".git/config").  When creating, we have full refnames in this
case, so we can just check.

> * Refspec-style patterns with wildcards, like "refs/heads/*" or
> "refs/foo/*/bar".

As I said in the message you are replying to, the format check for these
should be a separate function (which might happen to share some code with
ref, but that is an implementation detail).

> * Short branch names with an implied "refs/heads/" prefix omitted;
> e.g., "master" meaning "refs/heads/master".
>
> * Short tag names with an implied "refs/tags/" prefix omitted; e.g.,
> "v1.2.3" meaning "refs/tags/v1.2.3".

Checking for the above two classes should conceptually happen with the
implied prefix.  When creating (i.e. git branch/git checkout -b/git tag),
you know what full refname the thing will become if you allow it to be
created.

As an optimization measure at the implementation level, being able to
check "frotz", knowing that the calling code wants to know that the name
is usable as a branch name, without having to do an allocation to hold
"refs/heads/frotz" somewhere to call the format-check function, options
like allow-onelevel is sometimes useful, but at the conceptual level it is
not strictly necessary.

> Currently, I believe that code usually handles the short branch/tag
> names via one of two mechanisms:
>
> * Explicitly prepend "refs/heads/" or "refs/tags/" to the short name
> to make the corresponding full name, then work exclusively with the
> full name.

Conceptually everybody should be doing that.

> * Use the ref_rev_parse_rules based functions like dwim_ref() to guess
> which prefix can be applied to a short refname to make it into a full
> refname, then work exclusively with the full name.

That is merely _matching_ with existing name, no?  Once you see an
existing name, you shouldn't be _rejecting_ anything.  If you see a
questionable name, you may warn.

I am not sure if these "two mechanisms" is a proper characterization of
the problem space.  Shouldn't you be separating between reading and
creating instead?

> * Refnames either have to start with "refs/" or have to be ALL_CAPS.

Sensible.

> * Refnames cannot be directly under "refs/" but should be in a
> namespace like "refs/heads/".  (This is implied by the old
> check-ref-format documentation:

Almost, but beware of stashes.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-05-15 15:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-01 22:19 [BUG] "fetch $there +refs/*:refs/*" fails if there is a stash Junio C Hamano
2012-05-02  9:15 ` Michael Haggerty
2012-05-02 16:26   ` Junio C Hamano
2012-05-04 22:30 ` [PATCH] fetch/push: allow refs/*:refs/* Junio C Hamano
2012-05-04 22:35   ` [PATCH] get_fetch_map(): tighten checks on dest refs Junio C Hamano
2012-05-05  6:03   ` [PATCH] fetch/push: allow refs/*:refs/* Michael Haggerty
2012-05-05 18:22     ` Michael Haggerty
2012-05-07 16:24     ` Junio C Hamano
2012-05-15  8:49       ` Michael Haggerty
2012-05-15 15:19         ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).