All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Permit refspec source side to parse as a sha1
@ 2008-03-21  0:54 Daniel Barkalow
  2008-03-21  4:10 ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-21  0:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

This fixes "git push origin HEAD~1:foo". "git fetch origin HEAD~1:foo"
will report "Couldn't find remote ref HEAD~1", while
"git fetch origin HEAD**1:foo" reports "Invalid refspec 'HEAD**1:foo'"

That is, HEAD~1 is something you're not allowed to ask the remote for, 
while HEAD**1 doesn't mean anything.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---
Note that this actually tries to look up the source side, so "git 
fetch origin HEAD^3:foo" usually gives a wrongish error message. But this 
only applies to error cases which nobody is likely to attempt anyway, and 
they still come out as errors regardless.

 remote.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/remote.c b/remote.c
index 9700a33..d737579 100644
--- a/remote.c
+++ b/remote.c
@@ -434,8 +434,10 @@ struct refspec *parse_ref_spec(int nr_refspec, const char **refspec)
 		rs[i].src = xstrndup(sp, ep - sp);
 
 		if (*rs[i].src) {
+			unsigned char sha1[20];
 			st = check_ref_format(rs[i].src);
-			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
+			if (st && st != CHECK_REF_FORMAT_ONELEVEL &&
+			    get_sha1(rs[i].src, sha1))
 				die("Invalid refspec '%s'", refspec[i]);
 		}
 		if (rs[i].dst && *rs[i].dst) {
-- 
1.5.4.3.610.gea6cd

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  0:54 [PATCH] Permit refspec source side to parse as a sha1 Daniel Barkalow
@ 2008-03-21  4:10 ` Junio C Hamano
  2008-03-21  4:50   ` Junio C Hamano
  2008-03-21  5:09   ` Daniel Barkalow
  0 siblings, 2 replies; 20+ messages in thread
From: Junio C Hamano @ 2008-03-21  4:10 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

Daniel Barkalow <barkalow@iabervon.org> writes:

> This fixes "git push origin HEAD~1:foo". "git fetch origin HEAD~1:foo"
> will report "Couldn't find remote ref HEAD~1", while
> "git fetch origin HEAD**1:foo" reports "Invalid refspec 'HEAD**1:foo'"
>
> That is, HEAD~1 is something you're not allowed to ask the remote for, 
> while HEAD**1 doesn't mean anything.
>
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> ---
> Note that this actually tries to look up the source side, so "git 
> fetch origin HEAD^3:foo" usually gives a wrongish error message.

This is very wrong.  When you do not have anything on your own (i.e. HEAD
points at an unborn branch), "git fetch origin HEAD:master" would barf?
We should _never_ look at the refs we have; the check is about the syntax.

Why are fetch and push refspec parsing code share the same function and
enforce exactly the same error checking to begin with?  Is it because we
are too lazy to implement two semantics that has to be different and try
to "share" code by only implementing checks for "common denominator"?

I think that is the root of the problem, because their syntax are designed
to look similar, but their semantics are different [*1*].

For a fetch refspec, LHS is the remote end and you do not even know what
refs are available over there; the _only_ thing you care about is that it
is well formed.

On the other hand, for a push refspec, LHS is the local end and _must_
name a valid commit you are pushing (unless you are pushing empty to
delete the ref).

I really think we cannot afford piling hacks on top of hacks to hide the
broken interface forever.  We have two different things to validate, and
the callers all know what they have when calling us to validate.  We
should not have a single loose validation that only catches "it cannot be
either fetch nor push refspec" breakage.

        As a side note, I have a vague recollection that we used to treat
        a refspec from the command line and a refspec from the config
        differently in some contexts, and rules applied to them might have
        been different.  I do not think of what the context was offhand,
        but we might need to further split the implementation of the
        rules.

But at least I think something like this patch would lead us in the right
direction.

---

 builtin-fetch.c |    3 +-
 remote.c        |  114 +++++++++++++++++++++++++++++++++++-------------------
 remote.h        |    3 +-
 3 files changed, 78 insertions(+), 42 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index b2b9935..a11548c 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -652,5 +652,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	signal(SIGINT, unlock_pack_on_signal);
 	atexit(unlock_pack);
-	return do_fetch(transport, parse_ref_spec(ref_nr, refs), ref_nr);
+	return do_fetch(transport,
+			parse_fetch_refspec(ref_nr, refs), ref_nr);
 }
diff --git a/remote.c b/remote.c
index 9700a33..3225f07 100644
--- a/remote.c
+++ b/remote.c
@@ -393,58 +393,92 @@ static void read_config(void)
 	alias_all_urls();
 }
 
-struct refspec *parse_ref_spec(int nr_refspec, const char **refspec)
+static struct refspec *parse_refspec_internal(int nr_refspec, const char **refspec, int fetch)
 {
 	int i;
 	int st;
 	struct refspec *rs = xcalloc(sizeof(*rs), nr_refspec);
+
 	for (i = 0; i < nr_refspec; i++) {
-		const char *sp, *ep, *gp;
-		sp = refspec[i];
-		if (*sp == '+') {
+		size_t llen, rlen;
+		int is_glob;
+		const char *lhs, *rhs;
+
+		llen = rlen = is_glob = 0;
+
+		lhs = refspec[i];
+		if (*lhs == '+') {
 			rs[i].force = 1;
-			sp++;
+			lhs++;
 		}
-		gp = strstr(sp, "/*");
-		ep = strchr(sp, ':');
-		if (gp && ep && gp > ep)
-			gp = NULL;
-		if (ep) {
-			if (ep[1]) {
-				const char *glob = strstr(ep + 1, "/*");
-				if (glob && glob[2])
-					glob = NULL;
-				if (!glob)
-					gp = NULL;
-				if (gp)
-					rs[i].dst = xstrndup(ep + 1,
-							     glob - ep - 1);
-				else
-					rs[i].dst = xstrdup(ep + 1);
-			}
+
+		rhs = strrchr(lhs, ':');
+		if (rhs && *++rhs) {
+			/* Has RHS */
+			rlen = strlen(rhs);
+			is_glob = (2 <= rlen && !strcmp(rhs + rlen - 2, "/*"));
 		} else {
-			ep = sp + strlen(sp);
+			rhs = NULL;
 		}
-		if (gp && gp + 2 != ep)
-			gp = NULL;
-		if (gp) {
-			rs[i].pattern = 1;
-			ep = gp;
+
+		llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
+		if (is_glob != (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)))
+			goto invalid;
+
+		if (is_glob) {
+			llen -= 2;
+			rlen -= 2;
 		}
-		rs[i].src = xstrndup(sp, ep - sp);
+		rs[i].pattern = is_glob;
+		if (rlen)
+			rs[i].dst = xstrndup(rhs, rlen);
+		rs[i].src = xstrndup(lhs, llen);
 
-		if (*rs[i].src) {
+		/*
+		 * Do we want to validate LHS?
+		 *
+		 * Fetch refspec must have LHS that names a valid
+		 * looking ref (unless it is empty, which defaults to
+		 * HEAD).
+		 *
+		 * Push refspec "LHS" without ":RHS" is a synonym for
+		 * "LHS:LHS", and LHS must be a valid looking ref.
+		 *
+		 * For push with explicit RHS, LHS can be any valid
+		 * extended SHA-1 expression (or even "empty" for
+		 * removal) and we cannot check for error until it
+		 * actually gets used.
+		 */
+		if ((fetch && *rs[i].src) || (!fetch && !rhs)) {
 			st = check_ref_format(rs[i].src);
 			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
-				die("Invalid refspec '%s'", refspec[i]);
+				goto invalid;
 		}
-		if (rs[i].dst && *rs[i].dst) {
+
+		/*
+		 * RHS: both push destination and fetch tracking ref
+		 * must name a valid looking ref.
+		 */
+		if (rs[i].dst) {
 			st = check_ref_format(rs[i].dst);
 			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
-				die("Invalid refspec '%s'", refspec[i]);
+				goto invalid;
 		}
 	}
 	return rs;
+
+ invalid:
+	die("Invalid refspec '%s'", refspec[i]);
+}
+
+struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec)
+{
+	return parse_refspec_internal(nr_refspec, refspec, 1);
+}
+
+struct refspec *parse_push_refspec(int nr_refspec, const char **refspec)
+{
+	return parse_refspec_internal(nr_refspec, refspec, 0);
 }
 
 static int valid_remote_nick(const char *name)
@@ -475,8 +509,8 @@ struct remote *remote_get(const char *name)
 		add_url_alias(ret, name);
 	if (!ret->url)
 		return NULL;
-	ret->fetch = parse_ref_spec(ret->fetch_refspec_nr, ret->fetch_refspec);
-	ret->push = parse_ref_spec(ret->push_refspec_nr, ret->push_refspec);
+	ret->fetch = parse_fetch_refspec(ret->fetch_refspec_nr, ret->fetch_refspec);
+	ret->push = parse_push_refspec(ret->push_refspec_nr, ret->push_refspec);
 	return ret;
 }
 
@@ -489,11 +523,11 @@ int for_each_remote(each_remote_fn fn, void *priv)
 		if (!r)
 			continue;
 		if (!r->fetch)
-			r->fetch = parse_ref_spec(r->fetch_refspec_nr,
-					r->fetch_refspec);
+			r->fetch = parse_fetch_refspec(r->fetch_refspec_nr,
+						       r->fetch_refspec);
 		if (!r->push)
-			r->push = parse_ref_spec(r->push_refspec_nr,
-					r->push_refspec);
+			r->push = parse_push_refspec(r->push_refspec_nr,
+						     r->push_refspec);
 		result = fn(r, priv);
 	}
 	return result;
@@ -824,7 +858,7 @@ int match_refs(struct ref *src, struct ref *dst, struct ref ***dst_tail,
 	       int nr_refspec, const char **refspec, int flags)
 {
 	struct refspec *rs =
-		parse_ref_spec(nr_refspec, (const char **) refspec);
+		parse_push_refspec(nr_refspec, (const char **) refspec);
 	int send_all = flags & MATCH_REFS_ALL;
 	int send_mirror = flags & MATCH_REFS_MIRROR;
 
diff --git a/remote.h b/remote.h
index f1dedf1..7e9ae79 100644
--- a/remote.h
+++ b/remote.h
@@ -67,7 +67,8 @@ void free_refs(struct ref *ref);
  */
 void ref_remove_duplicates(struct ref *ref_map);
 
-struct refspec *parse_ref_spec(int nr_refspec, const char **refspec);
+struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec);
+struct refspec *parse_push_refspec(int nr_refspec, const char **refspec);
 
 int match_refs(struct ref *src, struct ref *dst, struct ref ***dst_tail,
 	       int nr_refspec, const char **refspec, int all);

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  4:10 ` Junio C Hamano
@ 2008-03-21  4:50   ` Junio C Hamano
  2008-03-21  5:09   ` Daniel Barkalow
  1 sibling, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2008-03-21  4:50 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

Junio C Hamano <gitster@pobox.com> writes:

> I really think we cannot afford piling hacks on top of hacks to hide the
> broken interface forever.  We have two different things to validate, and
> the callers all know what they have when calling us to validate.  We
> should not have a single loose validation that only catches "it cannot be
> either fetch nor push refspec" breakage.
> ...
> But at least I think something like this patch would lead us in the right
> direction.

On top of the one I sent out, with this patch, you can:

	$ git push other ':/remote show:refs/heads/new'

to send the commit you would review with "git show ':/remote show'".

---
 builtin-send-pack.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin-send-pack.c b/builtin-send-pack.c
index 930e0fb..bb9c33a 100644
--- a/builtin-send-pack.c
+++ b/builtin-send-pack.c
@@ -537,7 +537,7 @@ static void verify_remote_names(int nr_heads, const char **heads)
 	int i;
 
 	for (i = 0; i < nr_heads; i++) {
-		const char *remote = strchr(heads[i], ':');
+		const char *remote = strrchr(heads[i], ':');
 
 		remote = remote ? (remote + 1) : heads[i];
 		switch (check_ref_format(remote)) {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  4:10 ` Junio C Hamano
  2008-03-21  4:50   ` Junio C Hamano
@ 2008-03-21  5:09   ` Daniel Barkalow
  2008-03-21  5:30     ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-21  5:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

On Thu, 20 Mar 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > This fixes "git push origin HEAD~1:foo". "git fetch origin HEAD~1:foo"
> > will report "Couldn't find remote ref HEAD~1", while
> > "git fetch origin HEAD**1:foo" reports "Invalid refspec 'HEAD**1:foo'"
> >
> > That is, HEAD~1 is something you're not allowed to ask the remote for, 
> > while HEAD**1 doesn't mean anything.
> >
> > Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> > ---
> > Note that this actually tries to look up the source side, so "git 
> > fetch origin HEAD^3:foo" usually gives a wrongish error message.
> 
> This is very wrong.  When you do not have anything on your own (i.e. HEAD
> points at an unborn branch), "git fetch origin HEAD:master" would barf?

No, HEAD is syntactically valid as a ref, so it's accepted regardless of 
anything else.

> We should _never_ look at the refs we have; the check is about the syntax.

Unfortunately, there's not (so far as I know) a handy way to check the 
syntax of a sha1 name.

> Why are fetch and push refspec parsing code share the same function and
> enforce exactly the same error checking to begin with?  Is it because we
> are too lazy to implement two semantics that has to be different and try
> to "share" code by only implementing checks for "common denominator"?

We enforce additional, purpose-specific error-checking after parsing, when 
we're actually trying to use the refspecs. The problem is that we can't 
make the syntactic check sufficient lenient to cover all valid syntax, but 
tight enough to prohibit any syntax errors, without getting tangled in a 
bit of semantics.

> I think that is the root of the problem, because their syntax are designed
> to look similar, but their semantics are different [*1*].

The syntax is identical. The semantics is not the parser's problem, and is 
more appropriate for the callers to handle.

> For a fetch refspec, LHS is the remote end and you do not even know what
> refs are available over there; the _only_ thing you care about is that it
> is well formed.
> 
> On the other hand, for a push refspec, LHS is the local end and _must_
> name a valid commit you are pushing (unless you are pushing empty to
> delete the ref).

Not really. If you're fetching, and you have a push refspec configured for 
the same ref, and that refspec has a LHS which, for the current HEAD, 
doesn't refer to a valid commit at the moment, that's fine. It's only a 
problem if you try to use it while it doesn't point at a valid commit, or 
if it isn't valid syntax for a sha1 name (like HEAD*2).

> I really think we cannot afford piling hacks on top of hacks to hide the
> broken interface forever.  We have two different things to validate, and
> the callers all know what they have when calling us to validate.  We
> should not have a single loose validation that only catches "it cannot be
> either fetch nor push refspec" breakage.

So we should call the same code to parse the string, have that code do no 
validation at all, and have the caller validate the return as appropriate. 
The parsing doesn't depend at all on whether it's for fetching or pushing.

>         As a side note, I have a vague recollection that we used to treat
>         a refspec from the command line and a refspec from the config
>         differently in some contexts, and rules applied to them might have
>         been different.  I do not think of what the context was offhand,
>         but we might need to further split the implementation of the
>         rules.

I think it was that we used to break "push = HEAD", but we support it now.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  5:09   ` Daniel Barkalow
@ 2008-03-21  5:30     ` Junio C Hamano
  2008-03-21  5:57       ` Daniel Barkalow
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-21  5:30 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

Daniel Barkalow <barkalow@iabervon.org> writes:

> So we should call the same code to parse the string, have that code do no 
> validation at all, and have the caller validate the return as appropriate. 
> The parsing doesn't depend at all on whether it's for fetching or pushing.

Obviously you did not read my patch before responding.

While I would agree that removing the check from parsing and add necessary
checks to all callers would be another way to catch the same kind of
breakages,

 (1) it would be more code to change, and I do not see a clear advantage,
     in that approach, especially given the discussion that lead to
     $gmane/77413;

 (2) the error reporting by the callers that use the parsed result will
     not be able to say "Invalid refspec '%s'" on the source string,
     simply because the source string is not available to them anymore;

 (3) worse yet, the older code even discarded part of the user input, so
     the error reporting that uses the parsed src/dst pair may not even be
     similar to the problematic input the user gave us to begin with,
     making it harder for the user to spot what we did not like and
     correct it.

In any case, don't you agree that the patch you responded to is much
easier to understand what we are (and we are not) checking than the
original code?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  5:30     ` Junio C Hamano
@ 2008-03-21  5:57       ` Daniel Barkalow
  2008-03-21  6:26         ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-21  5:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

On Thu, 20 Mar 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > So we should call the same code to parse the string, have that code do no 
> > validation at all, and have the caller validate the return as appropriate. 
> > The parsing doesn't depend at all on whether it's for fetching or pushing.
> 
> Obviously you did not read my patch before responding.
> 
> While I would agree that removing the check from parsing and add necessary
> checks to all callers would be another way to catch the same kind of
> breakages,
> 
>  (1) it would be more code to change, and I do not see a clear advantage,
>      in that approach, especially given the discussion that lead to
>      $gmane/77413;

Surely it's no more code to change all of the callers of parse_ref_spec to 
also call a function to validate it as appropriate than it is to change 
all of the callers of parse_ref_spec to call a different function instead.
 
>  (2) the error reporting by the callers that use the parsed result will
>      not be able to say "Invalid refspec '%s'" on the source string,
>      simply because the source string is not available to them anymore;

That's easy enough to keep, and likely even worthwhile; it would probably 
be nice, for example, to have "git fetch origin typo/*:something/*" give 
an error when the pattern doesn't match anything, and that error would 
need the refspec string.

>  (3) worse yet, the older code even discarded part of the user input, so
>      the error reporting that uses the parsed src/dst pair may not even be
>      similar to the problematic input the user gave us to begin with,
>      making it harder for the user to spot what we did not like and
>      correct it.

That was a bug that was fixed in the intermediate version.

> In any case, don't you agree that the patch you responded to is much
> easier to understand what we are (and we are not) checking than the
> original code?

No, I think it's much more complicated. It mixes the semantics of what an 
empty side means for a particular use of refspecs into the parsing of the 
string. At the very least, the checks should be outside of _internal() in 
the functions for particular uses.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  5:57       ` Daniel Barkalow
@ 2008-03-21  6:26         ` Junio C Hamano
  2008-03-21 16:08           ` Daniel Barkalow
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-21  6:26 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

Daniel Barkalow <barkalow@iabervon.org> writes:

> On Thu, 20 Mar 2008, Junio C Hamano wrote:
> ...
>> In any case, don't you agree that the patch you responded to is much
>> easier to understand what we are (and we are not) checking than the
>> original code?
>
> No, I think it's much more complicated. It mixes the semantics of what an 
> empty side means for a particular use of refspecs into the parsing of the 
> string. At the very least, the checks should be outside of _internal() in 
> the functions for particular uses.

The thing is, the syntax is the same between fetch and push only to a
degree.  They are both <LHS> ':' <RHS>.  What is allowed on LHS and RHS
are quite different even at the syntactic level.

I already know our criteria when judging if a particular code is clean or
complex are _vastly_ different, from the experience working with you in
other parts of the system (namely, read-tree 3-way rules and
unpack_trees() rewrite that happened quite a long time ago).

While I would note that you thought my version is more complex to read, I
would not argue about this issue with you anymore, except saying that I
strongly disagree.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] Permit refspec source side to parse as a sha1
  2008-03-21  6:26         ` Junio C Hamano
@ 2008-03-21 16:08           ` Daniel Barkalow
  2008-03-21 22:17             ` [PATCH] remote.c: Fix overtight refspec validation Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-21 16:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

On Thu, 20 Mar 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > On Thu, 20 Mar 2008, Junio C Hamano wrote:
> > ...
> >> In any case, don't you agree that the patch you responded to is much
> >> easier to understand what we are (and we are not) checking than the
> >> original code?
> >
> > No, I think it's much more complicated. It mixes the semantics of what an 
> > empty side means for a particular use of refspecs into the parsing of the 
> > string. At the very least, the checks should be outside of _internal() in 
> > the functions for particular uses.
> 
> The thing is, the syntax is the same between fetch and push only to a
> degree.  They are both <LHS> ':' <RHS>.  What is allowed on LHS and RHS
> are quite different even at the syntactic level.

There's also the optional + at the beginning and the way of forming 
patterns, and the need to distinguish not having a colon with not having 
anything on one or the other side of the colon.

> I already know our criteria when judging if a particular code is clean or
> complex are _vastly_ different, from the experience working with you in
> other parts of the system (namely, read-tree 3-way rules and
> unpack_trees() rewrite that happened quite a long time ago).

Agreed.

And I do think that the rewrite of the part of the function before the 
semantic checks is clearer than the corresponding original (although the 
patch text made it hard to see; we really need some magic to make diff not 
bother to save 4 identifier-free lines and just show it as a complete 
replacement).

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] remote.c: Fix overtight refspec validation
  2008-03-21 16:08           ` Daniel Barkalow
@ 2008-03-21 22:17             ` Junio C Hamano
  2008-03-21 23:12               ` Daniel Barkalow
  2008-03-26  1:45               ` Linus Torvalds
  0 siblings, 2 replies; 20+ messages in thread
From: Junio C Hamano @ 2008-03-21 22:17 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

We tightened the refspec validation code in an earlier ef00d15 (Tighten
refspec processing, 2008-03-17) per my suggestion, but the suggestion was
misguided to begin with and it broke this usage:

    $ git push origin HEAD~12:master

The syntax of push refspecs and fetch refspecs are similar in that they
are both colon separated LHS and RHS (possibly prefixed with a + to
force), but the similarity ends there.  For example, LHS in a push refspec
can be anything that evaluates to a valid object name at runtime (except
when colon and RHS is missing, or it is a glob), while it must be a
valid-looking refname in a fetch refspec.  To validate them correctly, the
caller needs to be able to say which kind of refspecs they are.  It is
unreasonable to keep a single interface that cannot tell which kind it is
dealing with, and ask it to behave sensibly.

This commit separates the parsing of the two into different functions, and
clarifies the code to implement the parsing proper (i.e. splitting into
two parts, making sure both sides are wildcard or neither side is).

This happens to also allow pushing a commit named with the esoteric "look
for that string" syntax:

    $ git push ../test.git ':/remote.c: Fix overtight refspec:master'

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

  Daniel Barkalow <barkalow@iabervon.org> writes:

  > ... (although the 
  > patch text made it hard to see; we really need some magic to make diff not 
  > bother to save 4 identifier-free lines and just show it as a complete 
  > replacement).

  Yeah, I tend to agree and some people seem to favor "diff -c" for this
  reason.

  In any case, here is an updated patch with a better description and
  necessary tests.  The "validly looking extended SHA-1 expression" check
  is wholly outside the scope of 1.5.5 regression fix so I did not even
  attempt to address it, but what this commit fixes needs to be addressed
  before -rc1.

 builtin-fetch.c     |    3 +-
 builtin-send-pack.c |    2 +-
 remote.c            |  122 ++++++++++++++++++++++++++++++++++-----------------
 remote.h            |    3 +-
 t/t5511-refspec.sh  |   67 ++++++++++++++++++++++++++++
 5 files changed, 153 insertions(+), 44 deletions(-)
 create mode 100755 t/t5511-refspec.sh

diff --git a/builtin-fetch.c b/builtin-fetch.c
index b2b9935..a11548c 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -652,5 +652,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	signal(SIGINT, unlock_pack_on_signal);
 	atexit(unlock_pack);
-	return do_fetch(transport, parse_ref_spec(ref_nr, refs), ref_nr);
+	return do_fetch(transport,
+			parse_fetch_refspec(ref_nr, refs), ref_nr);
 }
diff --git a/builtin-send-pack.c b/builtin-send-pack.c
index 930e0fb..bb9c33a 100644
--- a/builtin-send-pack.c
+++ b/builtin-send-pack.c
@@ -537,7 +537,7 @@ static void verify_remote_names(int nr_heads, const char **heads)
 	int i;
 
 	for (i = 0; i < nr_heads; i++) {
-		const char *remote = strchr(heads[i], ':');
+		const char *remote = strrchr(heads[i], ':');
 
 		remote = remote ? (remote + 1) : heads[i];
 		switch (check_ref_format(remote)) {
diff --git a/remote.c b/remote.c
index 9700a33..4117bfc 100644
--- a/remote.c
+++ b/remote.c
@@ -393,58 +393,98 @@ static void read_config(void)
 	alias_all_urls();
 }
 
-struct refspec *parse_ref_spec(int nr_refspec, const char **refspec)
+static struct refspec *parse_refspec_internal(int nr_refspec, const char **refspec, int fetch)
 {
 	int i;
 	int st;
 	struct refspec *rs = xcalloc(sizeof(*rs), nr_refspec);
+
 	for (i = 0; i < nr_refspec; i++) {
-		const char *sp, *ep, *gp;
-		sp = refspec[i];
-		if (*sp == '+') {
+		size_t llen, rlen;
+		int is_glob;
+		const char *lhs, *rhs;
+
+		llen = rlen = is_glob = 0;
+
+		lhs = refspec[i];
+		if (*lhs == '+') {
 			rs[i].force = 1;
-			sp++;
+			lhs++;
 		}
-		gp = strstr(sp, "/*");
-		ep = strchr(sp, ':');
-		if (gp && ep && gp > ep)
-			gp = NULL;
-		if (ep) {
-			if (ep[1]) {
-				const char *glob = strstr(ep + 1, "/*");
-				if (glob && glob[2])
-					glob = NULL;
-				if (!glob)
-					gp = NULL;
-				if (gp)
-					rs[i].dst = xstrndup(ep + 1,
-							     glob - ep - 1);
-				else
-					rs[i].dst = xstrdup(ep + 1);
-			}
-		} else {
-			ep = sp + strlen(sp);
+
+		rhs = strrchr(lhs, ':');
+		if (rhs) {
+			rhs++;
+			rlen = strlen(rhs);
+			is_glob = (2 <= rlen && !strcmp(rhs + rlen - 2, "/*"));
+			rs[i].dst = xstrndup(rhs, rlen - is_glob * 2);
 		}
-		if (gp && gp + 2 != ep)
-			gp = NULL;
-		if (gp) {
-			rs[i].pattern = 1;
-			ep = gp;
+
+		llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
+		if (is_glob != (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)))
+			goto invalid;
+
+		if (is_glob) {
+			llen -= 2;
+			rlen -= 2;
 		}
-		rs[i].src = xstrndup(sp, ep - sp);
+		rs[i].pattern = is_glob;
+		rs[i].src = xstrndup(lhs, llen);
 
-		if (*rs[i].src) {
+		/*
+		 * Do we want to validate LHS?
+		 *
+		 * - Fetch refspec must have LHS that names a valid
+		 *   looking ref (unless it is empty, which defaults to
+		 *   HEAD).
+		 *
+		 * - Push refspec "LHS" without ":RHS" is a synonym for
+		 *   "LHS:LHS", and LHS must be a valid looking ref
+		 *   (cannot be empty).
+		 *
+		 * - Push refspec "LHS:RHS" that is a glob must be
+		 *   a valid looking ref (unless it is empty to remove)
+		 *
+		 * Hence we check non-empty LHS for fetch, and
+		 * colonless or glob LHS for push here.
+		 *
+		 * Note that push refspec "LHS:RHS" that is not a glob
+		 * can have any valid extended SHA-1 expression (or
+		 * "empty" for removal) in LHS, and we cannot check
+		 * for error until it actually gets used.
+		 */
+		if (fetch ? (*rs[i].src) : (!rhs || is_glob)) {
 			st = check_ref_format(rs[i].src);
 			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
-				die("Invalid refspec '%s'", refspec[i]);
+				goto invalid;
 		}
-		if (rs[i].dst && *rs[i].dst) {
+
+		/*
+		 * RHS: both push destination and fetch tracking ref
+		 * must name a valid looking ref, but in fetch, colon
+		 * followed by emptiness is the same as not having the
+		 * colon nor RHS.
+		 */
+		if (rs[i].dst && (!fetch || *rs[i].dst)) {
 			st = check_ref_format(rs[i].dst);
 			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
-				die("Invalid refspec '%s'", refspec[i]);
+				goto invalid;
 		}
 	}
 	return rs;
+
+ invalid:
+	die("Invalid refspec '%s'", refspec[i]);
+}
+
+struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec)
+{
+	return parse_refspec_internal(nr_refspec, refspec, 1);
+}
+
+struct refspec *parse_push_refspec(int nr_refspec, const char **refspec)
+{
+	return parse_refspec_internal(nr_refspec, refspec, 0);
 }
 
 static int valid_remote_nick(const char *name)
@@ -475,8 +515,8 @@ struct remote *remote_get(const char *name)
 		add_url_alias(ret, name);
 	if (!ret->url)
 		return NULL;
-	ret->fetch = parse_ref_spec(ret->fetch_refspec_nr, ret->fetch_refspec);
-	ret->push = parse_ref_spec(ret->push_refspec_nr, ret->push_refspec);
+	ret->fetch = parse_fetch_refspec(ret->fetch_refspec_nr, ret->fetch_refspec);
+	ret->push = parse_push_refspec(ret->push_refspec_nr, ret->push_refspec);
 	return ret;
 }
 
@@ -489,11 +529,11 @@ int for_each_remote(each_remote_fn fn, void *priv)
 		if (!r)
 			continue;
 		if (!r->fetch)
-			r->fetch = parse_ref_spec(r->fetch_refspec_nr,
-					r->fetch_refspec);
+			r->fetch = parse_fetch_refspec(r->fetch_refspec_nr,
+						       r->fetch_refspec);
 		if (!r->push)
-			r->push = parse_ref_spec(r->push_refspec_nr,
-					r->push_refspec);
+			r->push = parse_push_refspec(r->push_refspec_nr,
+						     r->push_refspec);
 		result = fn(r, priv);
 	}
 	return result;
@@ -824,7 +864,7 @@ int match_refs(struct ref *src, struct ref *dst, struct ref ***dst_tail,
 	       int nr_refspec, const char **refspec, int flags)
 {
 	struct refspec *rs =
-		parse_ref_spec(nr_refspec, (const char **) refspec);
+		parse_push_refspec(nr_refspec, (const char **) refspec);
 	int send_all = flags & MATCH_REFS_ALL;
 	int send_mirror = flags & MATCH_REFS_MIRROR;
 
diff --git a/remote.h b/remote.h
index f1dedf1..7e9ae79 100644
--- a/remote.h
+++ b/remote.h
@@ -67,7 +67,8 @@ void free_refs(struct ref *ref);
  */
 void ref_remove_duplicates(struct ref *ref_map);
 
-struct refspec *parse_ref_spec(int nr_refspec, const char **refspec);
+struct refspec *parse_fetch_refspec(int nr_refspec, const char **refspec);
+struct refspec *parse_push_refspec(int nr_refspec, const char **refspec);
 
 int match_refs(struct ref *src, struct ref *dst, struct ref ***dst_tail,
 	       int nr_refspec, const char **refspec, int all);
diff --git a/t/t5511-refspec.sh b/t/t5511-refspec.sh
new file mode 100755
index 0000000..7360311
--- /dev/null
+++ b/t/t5511-refspec.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+
+test_description='refspec parsing'
+
+. ./test-lib.sh
+
+test_refspec () {
+
+	kind=$1 refspec=$2 expect=$3
+	git config remote.frotz.url "." &&
+	git config --remove-section remote.frotz &&
+	git config remote.frotz.url "." &&
+	git config "remote.frotz.$kind" "$refspec" &&
+	if test "$expect" != invalid
+	then
+		title="$kind $refspec"
+		test='git ls-remote frotz'
+	else
+		title="$kind $refspec (invalid)"
+		test='test_must_fail git ls-remote frotz'
+	fi
+	test_expect_success "$title" "$test"
+}
+
+
+test_refspec push 'refs/heads/*:refs/remotes/frotz/*'
+test_refspec push 'refs/heads/*:refs/remotes/frotz'		invalid
+test_refspec push 'refs/heads:refs/remotes/frotz/*'		invalid
+test_refspec push 'refs/heads/master:refs/remotes/frotz/xyzzy'
+
+
+# These have invalid LHS, but we do not have a formal "valid sha-1
+# expression syntax checker" so they are not checked with the current
+# code.  They will be caught downstream anyway, but we may want to
+# have tighter check later...
+
+: test_refspec push 'refs/heads/master::refs/remotes/frotz/xyzzy'	invalid
+: test_refspec push 'refs/heads/maste :refs/remotes/frotz/xyzzy'	invalid
+
+test_refspec fetch 'refs/heads/*:refs/remotes/frotz/*'
+test_refspec fetch 'refs/heads/*:refs/remotes/frotz'		invalid
+test_refspec fetch 'refs/heads:refs/remotes/frotz/*'		invalid
+test_refspec fetch 'refs/heads/master:refs/remotes/frotz/xyzzy'
+test_refspec fetch 'refs/heads/master::refs/remotes/frotz/xyzzy'	invalid
+test_refspec fetch 'refs/heads/maste :refs/remotes/frotz/xyzzy'	invalid
+
+test_refspec push 'master~1:refs/remotes/frotz/backup'
+test_refspec fetch 'master~1:refs/remotes/frotz/backup'		invalid
+test_refspec push 'HEAD~4:refs/remotes/frotz/new'
+test_refspec fetch 'HEAD~4:refs/remotes/frotz/new'		invalid
+
+test_refspec push 'HEAD'
+test_refspec fetch 'HEAD'
+test_refspec push 'refs/heads/ nitfol'				invalid
+test_refspec fetch 'refs/heads/ nitfol'				invalid
+
+test_refspec push 'HEAD:'					invalid
+test_refspec fetch 'HEAD:'
+test_refspec push 'refs/heads/ nitfol:'				invalid
+test_refspec fetch 'refs/heads/ nitfol:'			invalid
+
+test_refspec push ':refs/remotes/frotz/deleteme'
+test_refspec fetch ':refs/remotes/frotz/HEAD-to-me'
+test_refspec push ':refs/remotes/frotz/delete me'		invalid
+test_refspec fetch ':refs/remotes/frotz/HEAD to me'		invalid
+
+test_done
-- 
1.5.5.rc0.139.g9315d

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-21 22:17             ` [PATCH] remote.c: Fix overtight refspec validation Junio C Hamano
@ 2008-03-21 23:12               ` Daniel Barkalow
  2008-03-21 23:59                 ` Junio C Hamano
  2008-03-26  1:45               ` Linus Torvalds
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-21 23:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

On Fri, 21 Mar 2008, Junio C Hamano wrote:

> We tightened the refspec validation code in an earlier ef00d15 (Tighten
> refspec processing, 2008-03-17) per my suggestion, but the suggestion was
> misguided to begin with and it broke this usage:
> 
>     $ git push origin HEAD~12:master
> 
> The syntax of push refspecs and fetch refspecs are similar in that they
> are both colon separated LHS and RHS (possibly prefixed with a + to
> force), but the similarity ends there.  For example, LHS in a push refspec
> can be anything that evaluates to a valid object name at runtime (except
> when colon and RHS is missing, or it is a glob), while it must be a
> valid-looking refname in a fetch refspec.  To validate them correctly, the
> caller needs to be able to say which kind of refspecs they are.  It is
> unreasonable to keep a single interface that cannot tell which kind it is
> dealing with, and ask it to behave sensibly.
> 
> This commit separates the parsing of the two into different functions, and
> clarifies the code to implement the parsing proper (i.e. splitting into
> two parts, making sure both sides are wildcard or neither side is).
> 
> This happens to also allow pushing a commit named with the esoteric "look
> for that string" syntax:
> 
>     $ git push ../test.git ':/remote.c: Fix overtight refspec:master'


> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> 
>   Daniel Barkalow <barkalow@iabervon.org> writes:
> 
>   > ... (although the 
>   > patch text made it hard to see; we really need some magic to make diff not 
>   > bother to save 4 identifier-free lines and just show it as a complete 
>   > replacement).
> 
>   Yeah, I tend to agree and some people seem to favor "diff -c" for this
>   reason.
> 
>   In any case, here is an updated patch with a better description and
>   necessary tests.  The "validly looking extended SHA-1 expression" check
>   is wholly outside the scope of 1.5.5 regression fix so I did not even
>   attempt to address it, but what this commit fixes needs to be addressed
>   before -rc1.
> 
>  builtin-fetch.c     |    3 +-
>  builtin-send-pack.c |    2 +-
>  remote.c            |  122 ++++++++++++++++++++++++++++++++++-----------------
>  remote.h            |    3 +-
>  t/t5511-refspec.sh  |   67 ++++++++++++++++++++++++++++
>  5 files changed, 153 insertions(+), 44 deletions(-)
>  create mode 100755 t/t5511-refspec.sh
> 
> diff --git a/builtin-fetch.c b/builtin-fetch.c
> index b2b9935..a11548c 100644
> --- a/builtin-fetch.c
> +++ b/builtin-fetch.c
> @@ -652,5 +652,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  
>  	signal(SIGINT, unlock_pack_on_signal);
>  	atexit(unlock_pack);
> -	return do_fetch(transport, parse_ref_spec(ref_nr, refs), ref_nr);
> +	return do_fetch(transport,
> +			parse_fetch_refspec(ref_nr, refs), ref_nr);
>  }
> diff --git a/builtin-send-pack.c b/builtin-send-pack.c
> index 930e0fb..bb9c33a 100644
> --- a/builtin-send-pack.c
> +++ b/builtin-send-pack.c
> @@ -537,7 +537,7 @@ static void verify_remote_names(int nr_heads, const char **heads)
>  	int i;
>  
>  	for (i = 0; i < nr_heads; i++) {
> -		const char *remote = strchr(heads[i], ':');
> +		const char *remote = strrchr(heads[i], ':');
>  
>  		remote = remote ? (remote + 1) : heads[i];
>  		switch (check_ref_format(remote)) {
> diff --git a/remote.c b/remote.c
> index 9700a33..4117bfc 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -393,58 +393,98 @@ static void read_config(void)
>  	alias_all_urls();
>  }
>  
> -struct refspec *parse_ref_spec(int nr_refspec, const char **refspec)
> +static struct refspec *parse_refspec_internal(int nr_refspec, const char **refspec, int fetch)
>  {
>  	int i;
>  	int st;
>  	struct refspec *rs = xcalloc(sizeof(*rs), nr_refspec);
> +
>  	for (i = 0; i < nr_refspec; i++) {
> -		const char *sp, *ep, *gp;
> -		sp = refspec[i];
> -		if (*sp == '+') {
> +		size_t llen, rlen;
> +		int is_glob;
> +		const char *lhs, *rhs;
> +
> +		llen = rlen = is_glob = 0;
> +
> +		lhs = refspec[i];
> +		if (*lhs == '+') {
>  			rs[i].force = 1;
> -			sp++;
> +			lhs++;
>  		}
> -		gp = strstr(sp, "/*");
> -		ep = strchr(sp, ':');
> -		if (gp && ep && gp > ep)
> -			gp = NULL;
> -		if (ep) {
> -			if (ep[1]) {
> -				const char *glob = strstr(ep + 1, "/*");
> -				if (glob && glob[2])
> -					glob = NULL;
> -				if (!glob)
> -					gp = NULL;
> -				if (gp)
> -					rs[i].dst = xstrndup(ep + 1,
> -							     glob - ep - 1);
> -				else
> -					rs[i].dst = xstrdup(ep + 1);
> -			}
> -		} else {
> -			ep = sp + strlen(sp);
> +
> +		rhs = strrchr(lhs, ':');
> +		if (rhs) {
> +			rhs++;
> +			rlen = strlen(rhs);
> +			is_glob = (2 <= rlen && !strcmp(rhs + rlen - 2, "/*"));
> +			rs[i].dst = xstrndup(rhs, rlen - is_glob * 2);
>  		}
> -		if (gp && gp + 2 != ep)
> -			gp = NULL;
> -		if (gp) {
> -			rs[i].pattern = 1;
> -			ep = gp;
> +
> +		llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
> +		if (is_glob != (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)))
> +			goto invalid;
> +
> +		if (is_glob) {
> +			llen -= 2;
> +			rlen -= 2;
>  		}
> -		rs[i].src = xstrndup(sp, ep - sp);
> +		rs[i].pattern = is_glob;
> +		rs[i].src = xstrndup(lhs, llen);
>  
> -		if (*rs[i].src) {
> +		/*
> +		 * Do we want to validate LHS?
> +		 *
> +		 * - Fetch refspec must have LHS that names a valid
> +		 *   looking ref (unless it is empty, which defaults to
> +		 *   HEAD).
> +		 *
> +		 * - Push refspec "LHS" without ":RHS" is a synonym for
> +		 *   "LHS:LHS", and LHS must be a valid looking ref
> +		 *   (cannot be empty).
> +		 *
> +		 * - Push refspec "LHS:RHS" that is a glob must be
> +		 *   a valid looking ref (unless it is empty to remove)
> +		 *
> +		 * Hence we check non-empty LHS for fetch, and
> +		 * colonless or glob LHS for push here.
> +		 *
> +		 * Note that push refspec "LHS:RHS" that is not a glob
> +		 * can have any valid extended SHA-1 expression (or
> +		 * "empty" for removal) in LHS, and we cannot check
> +		 * for error until it actually gets used.
> +		 */

Wouldn't this be clearer and not meaningfully harder in 
parse_fetch_refspec and parse_push_refspec? There are, between the two 
types of checking and the two sides to check, a total of 6 expressions 
used to determine whether something has to be valid as a head, and only 
one pair of them are the same (rs[i].dst), and that's trivial (you don't 
check the part after the colon if there's no colon).

> +		if (fetch ? (*rs[i].src) : (!rhs || is_glob)) {

This is an odd combination of locals and struct members.
                                         : (!rs[i].dst || rs[i].pattern) {

>  			st = check_ref_format(rs[i].src);
>  			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
> -				die("Invalid refspec '%s'", refspec[i]);
> +				goto invalid;
>  		}
> -		if (rs[i].dst && *rs[i].dst) {
> +
> +		/*
> +		 * RHS: both push destination and fetch tracking ref
> +		 * must name a valid looking ref, but in fetch, colon
> +		 * followed by emptiness is the same as not having the
> +		 * colon nor RHS.
> +		 */
> +		if (rs[i].dst && (!fetch || *rs[i].dst)) {
>  			st = check_ref_format(rs[i].dst);
>  			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
> -				die("Invalid refspec '%s'", refspec[i]);
> +				goto invalid;
>  		}
>  	}
>  	return rs;
> +
> + invalid:
> +	die("Invalid refspec '%s'", refspec[i]);
> +}

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-21 23:12               ` Daniel Barkalow
@ 2008-03-21 23:59                 ` Junio C Hamano
  2008-03-22  0:36                   ` Daniel Barkalow
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-21 23:59 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

Daniel Barkalow <barkalow@iabervon.org> writes:

>> +		/*
>> +		 * Do we want to validate LHS?
>> + ...
>> +		 * Hence we check non-empty LHS for fetch, and
>> +		 * colonless or glob LHS for push here.
>> +		 */
>
> Wouldn't this be clearer and not meaningfully harder in 
> parse_fetch_refspec and parse_push_refspec?

Do you mean you want the callers of this internal implementation to also
loop over the input set of refs?  I think that would be more complex code
but I do not see much gain by doing so.

>> +		if (fetch ? (*rs[i].src) : (!rhs || is_glob)) {
>
> This is an odd combination of locals and struct members.
>                                        : (!rs[i].dst || rs[i].pattern) {

Sorry, I do not understand what's wrong about it.

	!!rhs === (did we see a colon) === !!rs[].dst
        is_glob === (did they both end with "/*") === rs[].pattern

They are equivalent, and local variables are primarily what the logic
works on and bases its decisions to store what in rs[] structures.

Ahh...  do you mean:

	(*rs[i].src) === (is lhs non empty?) === !!llen

I guess using "llen" there is more consistent and is moderately cleaner.

Perhaps squash this as a clean-up?

diff --git a/remote.c b/remote.c
index 4117bfc..86113b7 100644
--- a/remote.c
+++ b/remote.c
@@ -453,7 +453,7 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 		 * "empty" for removal) in LHS, and we cannot check
 		 * for error until it actually gets used.
 		 */
-		if (fetch ? (*rs[i].src) : (!rhs || is_glob)) {
+		if (fetch ? llen : (!rhs || is_glob)) {
 			st = check_ref_format(rs[i].src);
 			if (st && st != CHECK_REF_FORMAT_ONELEVEL)
 				goto invalid;

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-21 23:59                 ` Junio C Hamano
@ 2008-03-22  0:36                   ` Daniel Barkalow
  2008-03-22 19:48                     ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-22  0:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

On Fri, 21 Mar 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> >> +		/*
> >> +		 * Do we want to validate LHS?
> >> + ...
> >> +		 * Hence we check non-empty LHS for fetch, and
> >> +		 * colonless or glob LHS for push here.
> >> +		 */
> >
> > Wouldn't this be clearer and not meaningfully harder in 
> > parse_fetch_refspec and parse_push_refspec?
> 
> Do you mean you want the callers of this internal implementation to also
> loop over the input set of refs?  I think that would be more complex code
> but I do not see much gain by doing so.

I think it's more breadth but less depth. It would make the internal 
implementation not depend on fetch, and put the checks that only apply to 
fetch out of the push code path and vice versa.

Or just have a section

	if (fetch) {
		// checks for fetch LHS
		// checks for fetch RHS
	} else {
		// checks for push LHS
		// checks for push RHS
	}

The body of the condition is only four lines, after all.

> >> +		if (fetch ? (*rs[i].src) : (!rhs || is_glob)) {
> >
> > This is an odd combination of locals and struct members.
> >                                        : (!rs[i].dst || rs[i].pattern) {
> 
> Sorry, I do not understand what's wrong about it.
> 
> 	!!rhs === (did we see a colon) === !!rs[].dst
>         is_glob === (did they both end with "/*") === rs[].pattern
> 
> They are equivalent, and local variables are primarily what the logic
> works on and bases its decisions to store what in rs[] structures.

Considering that it's already stored values into the struct, it might as 
well use those, and those are presumably more familiar to the average 
reader, because all of the git code that uses refspecs other than this 
function uses them.

> Ahh...  do you mean:
> 
> 	(*rs[i].src) === (is lhs non empty?) === !!llen
> 
> I guess using "llen" there is more consistent and is moderately cleaner.

I'd go the other way, but having them all from the same set makes more 
sense than one from one set and two from the other, regardless of which 
way. If you go this way, you should probably also include the the rhs 
checks, and the argument to check_ref_format().

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-22  0:36                   ` Daniel Barkalow
@ 2008-03-22 19:48                     ` Junio C Hamano
  2008-03-22 20:45                       ` Daniel Barkalow
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-22 19:48 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Samuel Tardieu

Daniel Barkalow <barkalow@iabervon.org> writes:

> On Fri, 21 Mar 2008, Junio C Hamano wrote:
> ...
>> Do you mean you want the callers of this internal implementation to also
>> loop over the input set of refs?  I think that would be more complex code
>> but I do not see much gain by doing so.
>
> I think it's more breadth but less depth. It would make the internal 
> implementation not depend on fetch, and put the checks that only apply to 
> fetch out of the push code path and vice versa.
>
> Or just have a section
>
> 	if (fetch) {
> 		// checks for fetch LHS
> 		// checks for fetch RHS
> 	} else {
> 		// checks for push LHS
> 		// checks for push RHS
> 	}
>
> The body of the condition is only four lines, after all.

There are two commits on jc/refspec-fix branch merged to 'pu'.  The
earlier one is my version, and the one on top is based on the above
suggestion.  I do not know which one is clearer, more readable and
maintainable.

>> Ahh...  do you mean:
>> 
>> 	(*rs[i].src) === (is lhs non empty?) === !!llen
>> 
>> I guess using "llen" there is more consistent and is moderately cleaner.
>
> I'd go the other way, but having them all from the same set makes more 
> sense than one from one set and two from the other, regardless of which 
> way. If you go this way, you should probably also include the the rhs 
> checks, and the argument to check_ref_format().

It is very much sensible to look at either the local variables it used
during the parsing and the tentative result it produced while deciding
what to check and how, and it also is very sensible to validate what it
gives back to the user and/or what it knows is equal to what it gives back
to the user.  When they are equivalent, it is mostly a matter of taste
which to use for deciding and which to use for validating.  But it makes
more sense to prefer its local variables for logic to decide what to do
and how, and validate what we are actually going to give back.

If you mean to suggest to change parameter given to check_ref_format()
from rs[i].{src,dst} to something else based on the local variables we
used, that's backwards.

But we have already agreed that our brains are wired differently when it
comes to taste; I'd prefer not pursuing bikeshedding any further.

Unless your suggestion _isn't_ bikeshedding, that is.

What I did so far was to spend time to respond with code that fixes
existing breakages, while what you did so far was to kibitz instead of
showing code.  Showing code might make me realize that the way you may
want to go is more than bikeshedding and actual improvements to
readability and maintainability, but honestly speaking I am tired of
looking at this part of the code for now, so...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-22 19:48                     ` Junio C Hamano
@ 2008-03-22 20:45                       ` Daniel Barkalow
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-22 20:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Samuel Tardieu

On Sat, 22 Mar 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> > On Fri, 21 Mar 2008, Junio C Hamano wrote:
> > ...
> >> Do you mean you want the callers of this internal implementation to also
> >> loop over the input set of refs?  I think that would be more complex code
> >> but I do not see much gain by doing so.
> >
> > I think it's more breadth but less depth. It would make the internal 
> > implementation not depend on fetch, and put the checks that only apply to 
> > fetch out of the push code path and vice versa.
> >
> > Or just have a section
> >
> > 	if (fetch) {
> > 		// checks for fetch LHS
> > 		// checks for fetch RHS
> > 	} else {
> > 		// checks for push LHS
> > 		// checks for push RHS
> > 	}
> >
> > The body of the condition is only four lines, after all.
> 
> There are two commits on jc/refspec-fix branch merged to 'pu'.  The
> earlier one is my version, and the one on top is based on the above
> suggestion.  I do not know which one is clearer, more readable and
> maintainable.

I like the second one quite a bit. If nothing else, it's got, as the
comments, the best documentation of refspecs I've seen, and then it's got 
code which clearly implements those rules. I think the explicitness is 
worth the extra lines.

Thanks for taking care of this; as a result of starting a new job, I 
haven't had much in the way of coding attention for other stuff the past 
couple of weeks.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-21 22:17             ` [PATCH] remote.c: Fix overtight refspec validation Junio C Hamano
  2008-03-21 23:12               ` Daniel Barkalow
@ 2008-03-26  1:45               ` Linus Torvalds
  2008-03-26  3:31                 ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Linus Torvalds @ 2008-03-26  1:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, git, Samuel Tardieu



On Fri, 21 Mar 2008, Junio C Hamano wrote:
>
> We tightened the refspec validation code in an earlier ef00d15 (Tighten
> refspec processing, 2008-03-17) per my suggestion, but the suggestion was
> misguided to begin with and it broke this usage:

It seems to have also broken "git push --tags".

I now get 

	fatal: Invalid refspec 'refs/tags/*'

which seems a bit sad.

			Linus

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-26  1:45               ` Linus Torvalds
@ 2008-03-26  3:31                 ` Junio C Hamano
  2008-03-26  4:11                   ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-26  3:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Barkalow, git, Samuel Tardieu

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Fri, 21 Mar 2008, Junio C Hamano wrote:
>>
>> We tightened the refspec validation code in an earlier ef00d15 (Tighten
>> refspec processing, 2008-03-17) per my suggestion, but the suggestion was
>> misguided to begin with and it broke this usage:
>
> It seems to have also broken "git push --tags".
>
> I now get 
>
> 	fatal: Invalid refspec 'refs/tags/*'
>
> which seems a bit sad.

Yeah that indeed is very sad.  Given that Andrew identified another
regression on the fetch side today, I am beginning to suspect that 1.5.4
was still too early to merge the C rewrite.  On the other hand, until we
tag, new features are not tested in the field, so...

Especially that "refs/tags/*" is sad in that it is leaking an internal
implementation detail.  I do not think the original code ever used
wildcards on the push side, and it probably was a good idea to allow
wildcards when the code was rewritten.

In any case, this should fix it.

 builtin-push.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin-push.c b/builtin-push.c
index b68c681..5316d6d 100644
--- a/builtin-push.c
+++ b/builtin-push.c
@@ -120,7 +120,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 	if (verbose)
 		flags |= TRANSPORT_PUSH_VERBOSE;
 	if (tags)
-		add_refspec("refs/tags/*");
+		add_refspec("refs/tags/*:refs/tags/*");
 	if (all)
 		flags |= TRANSPORT_PUSH_ALL;
 	if (mirror)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-26  3:31                 ` Junio C Hamano
@ 2008-03-26  4:11                   ` Junio C Hamano
  2008-03-26  5:42                     ` Daniel Barkalow
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-26  4:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Daniel Barkalow, git, Samuel Tardieu

Junio C Hamano <gitster@pobox.com> writes:

> Especially that "refs/tags/*" is sad in that it is leaking an internal
> implementation detail.  I do not think the original code ever used
> wildcards on the push side, and it probably was a good idea to allow
> wildcards when the code was rewritten.

Having thought about this a bit more, I think this patch would be more
useful.  Please discard the previous patch to builtin-push.c and replace
with this patch.

Now it allows you to say:

	[remote "neigh"]
        	url = ../neighbour
                push = refs/tags/*

to propagate all tags one-to-one, without having to say
"refs/tags/*:refs/tags/*".

This however has unintended side effect of allowing 

	[remote "bour"]
        	url = ../neighbour
                fetch = refs/heads/*

at the syntax level.  I do not know offhand the fetch backends are
prepared to deal with such wildcard patterns.

Daniel?

---
 remote.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/remote.c b/remote.c
index 40ed246..c39d831 100644
--- a/remote.c
+++ b/remote.c
@@ -417,17 +417,21 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 			rhs++;
 			rlen = strlen(rhs);
 			is_glob = (2 <= rlen && !strcmp(rhs + rlen - 2, "/*"));
-			rs[i].dst = xstrndup(rhs, rlen - is_glob * 2);
+			if (is_glob)
+				rlen -= 2;
+			rs[i].dst = xstrndup(rhs, rlen);
 		}
 
 		llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
-		if (is_glob != (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)))
-			goto invalid;
-
-		if (is_glob) {
+		if (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)) {
+			if (rhs && !is_glob)
+				goto invalid;
+			is_glob = 1;
 			llen -= 2;
-			rlen -= 2;
+		} else if (rhs && is_glob) {
+			goto invalid;
 		}
+
 		rs[i].pattern = is_glob;
 		rs[i].src = xstrndup(lhs, llen);
 
@@ -446,7 +450,7 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 			}
 			/*
 			 * RHS
-			 * - missing is allowed.
+			 * - missing is ok, and is same as empty.
 			 * - empty is ok; it means not to store.
 			 * - otherwise it must be a valid looking ref.
 			 */

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-26  4:11                   ` Junio C Hamano
@ 2008-03-26  5:42                     ` Daniel Barkalow
  2008-03-26  5:46                       ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2008-03-26  5:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git, Samuel Tardieu

On Tue, 25 Mar 2008, Junio C Hamano wrote:

> This however has unintended side effect of allowing 
> 
> 	[remote "bour"]
>         	url = ../neighbour
>                 fetch = refs/heads/*
> 
> at the syntax level.  I do not know offhand the fetch backends are
> prepared to deal with such wildcard patterns.
> 
> Daniel?

It's not a matter of the backends, which don't implement any of the 
control flow in the fetch direction; it's get_expanded_map(), which needs 
to be told that you can have something match a pattern but not have a 
local tracking ref.

OTOH, the only use for such a pattern is an octopus merge of whatever 
branches a remote happens to have, right? I remember thinking this was a 
non-useful refspec when I was dealing with the fetch code (and then 
forgetting that it was useful for push). It might be better to just 
disallow it in the direction-specific semantic checks.

Here's the patch to make it work, anyway:
----

commit 6a8bcb917e1aa9b3c972f14f618ab573e457ebee
Author: Daniel Barkalow <barkalow@iabervon.org>
Date:   Wed Mar 26 01:39:07 2008 -0400

    Support fetching refspecs like "refs/heads/*"
    
    Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

diff --git a/remote.c b/remote.c
index a027bca..f8f4b34 100644
--- a/remote.c
+++ b/remote.c
@@ -998,22 +998,24 @@ static struct ref *get_expanded_map(const struct ref *remote_refs,
 	struct ref **tail = &ret;
 
 	int remote_prefix_len = strlen(refspec->src);
-	int local_prefix_len = strlen(refspec->dst);
+	int local_prefix_len = refspec->dst ? strlen(refspec->dst) : 0;
 
 	for (ref = remote_refs; ref; ref = ref->next) {
 		if (strchr(ref->name, '^'))
 			continue; /* a dereference item */
 		if (!prefixcmp(ref->name, refspec->src)) {
-			const char *match;
 			struct ref *cpy = copy_ref(ref);
-			match = ref->name + remote_prefix_len;
-
-			cpy->peer_ref = alloc_ref(local_prefix_len +
-						  strlen(match) + 1);
-			sprintf(cpy->peer_ref->name, "%s%s",
-				refspec->dst, match);
-			if (refspec->force)
-				cpy->peer_ref->force = 1;
+
+			if (refspec->dst) {
+				const char *match = ref->name + 
+					remote_prefix_len;
+				cpy->peer_ref = alloc_ref(local_prefix_len +
+							  strlen(match) + 1);
+				sprintf(cpy->peer_ref->name, "%s%s",
+					refspec->dst, match);
+				if (refspec->force)
+					cpy->peer_ref->force = 1;
+			}
 			*tail = cpy;
 			tail = &cpy->next;
 		}

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-26  5:42                     ` Daniel Barkalow
@ 2008-03-26  5:46                       ` Junio C Hamano
  2008-03-26  6:22                         ` Jeff King
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2008-03-26  5:46 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Linus Torvalds, git, Samuel Tardieu

Daniel Barkalow <barkalow@iabervon.org> writes:

> OTOH, the only use for such a pattern is an octopus merge of whatever 
> branches a remote happens to have, right? I remember thinking this was a 
> non-useful refspec when I was dealing with the fetch code (and then 
> forgetting that it was useful for push). It might be better to just 
> disallow it in the direction-specific semantic checks.

I agree.  refs/*:refs/* could have been useful before --mirror, but refs/*
(fetch but not store) is not useful at all, so what I plan to commit looks
like this (i.e. "the third time lucky" edition).

-- >8 --
[PATCH] refspec: allow colon-less wildcard "refs/category/*"

"git push --tags elsewhere" is implemented in terms of wildcarded refspec
"refs/tags/*" these days, and the user wants to push the tags under the
same name to the other branch.  This resurrects the support for it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 remote.c |   18 +++++++++++-------
 1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/remote.c b/remote.c
index 40ed246..04f7521 100644
--- a/remote.c
+++ b/remote.c
@@ -417,17 +417,21 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 			rhs++;
 			rlen = strlen(rhs);
 			is_glob = (2 <= rlen && !strcmp(rhs + rlen - 2, "/*"));
-			rs[i].dst = xstrndup(rhs, rlen - is_glob * 2);
+			if (is_glob)
+				rlen -= 2;
+			rs[i].dst = xstrndup(rhs, rlen);
 		}
 
 		llen = (rhs ? (rhs - lhs - 1) : strlen(lhs));
-		if (is_glob != (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)))
-			goto invalid;
-
-		if (is_glob) {
+		if (2 <= llen && !memcmp(lhs + llen - 2, "/*", 2)) {
+			if ((rhs && !is_glob) || (!rhs && fetch))
+				goto invalid;
+			is_glob = 1;
 			llen -= 2;
-			rlen -= 2;
+		} else if (rhs && is_glob) {
+			goto invalid;
 		}
j+
 		rs[i].pattern = is_glob;
 		rs[i].src = xstrndup(lhs, llen);
 
@@ -446,7 +450,7 @@ static struct refspec *parse_refspec_internal(int nr_refspec, const char **refsp
 			}
 			/*
 			 * RHS
-			 * - missing is allowed.
+			 * - missing is ok, and is same as empty.
 			 * - empty is ok; it means not to store.
 			 * - otherwise it must be a valid looking ref.
 			 */
-- 
1.5.5.rc1.128.g340c

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH] remote.c: Fix overtight refspec validation
  2008-03-26  5:46                       ` Junio C Hamano
@ 2008-03-26  6:22                         ` Jeff King
  0 siblings, 0 replies; 20+ messages in thread
From: Jeff King @ 2008-03-26  6:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, Linus Torvalds, git, Samuel Tardieu

On Tue, Mar 25, 2008 at 10:46:23PM -0700, Junio C Hamano wrote:

> I agree.  refs/*:refs/* could have been useful before --mirror, but refs/*

It still is, unless there is a way to specify --mirror in your config
file.

-Peff

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2008-03-26  6:23 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-03-21  0:54 [PATCH] Permit refspec source side to parse as a sha1 Daniel Barkalow
2008-03-21  4:10 ` Junio C Hamano
2008-03-21  4:50   ` Junio C Hamano
2008-03-21  5:09   ` Daniel Barkalow
2008-03-21  5:30     ` Junio C Hamano
2008-03-21  5:57       ` Daniel Barkalow
2008-03-21  6:26         ` Junio C Hamano
2008-03-21 16:08           ` Daniel Barkalow
2008-03-21 22:17             ` [PATCH] remote.c: Fix overtight refspec validation Junio C Hamano
2008-03-21 23:12               ` Daniel Barkalow
2008-03-21 23:59                 ` Junio C Hamano
2008-03-22  0:36                   ` Daniel Barkalow
2008-03-22 19:48                     ` Junio C Hamano
2008-03-22 20:45                       ` Daniel Barkalow
2008-03-26  1:45               ` Linus Torvalds
2008-03-26  3:31                 ` Junio C Hamano
2008-03-26  4:11                   ` Junio C Hamano
2008-03-26  5:42                     ` Daniel Barkalow
2008-03-26  5:46                       ` Junio C Hamano
2008-03-26  6:22                         ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.