All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u}
@ 2014-01-14 23:04 Keith Derrick
  2014-01-14 23:45 ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Keith Derrick @ 2014-01-14 23:04 UTC (permalink / raw)
  To: git

I couldn't find a duplicate in the JIRA instance.

According to the documentation of check-ref-format, a branch name such 
as @mybranch is valid. Yet 'git check-ref-format --branch @mybranch@{u}' 
claims @mybranch is an invalid branch name. yet I am able to create the 
branch (which would seem the obvious place to check for an invalid 
branch name)

Pick any repository with an upstream remote (I cloned git itself from 
github). The following will reproduce the bug
> $cd ~/play/gitsrc/
> $
> $git status
> On branch master
> Your branch is up-to-date with 'origin/master'.
>
> nothing to commit, working directory clean
> $git checkout -t -b @mybranch origin/master
> Branch @mybranch set up to track remote branch master from origin.
> Switched to a new branch '@mybranch'
> $git branch -av
> * @mybranch             14598b9 Sync with 1.8.5.3
>   master                14598b9 Sync with 1.8.5.3
>   remotes/origin/HEAD   -> origin/master
>   remotes/origin/maint  4224916 Git 1.8.5.3
>   remotes/origin/master 14598b9 Sync with 1.8.5.3
>   remotes/origin/next   b139ac2 Sync with master
>   remotes/origin/pu     3d58c42 Merge branch 'ab/subtree-doc' into pu
>   remotes/origin/todo   1066e10 What's cooking (2014/01 #02)
> $git check-ref-format --branch @mybranch
> @mybranch
> $git check-ref-format --branch @mybranch@{u}
> fatal: '@mybranch@{u}' is not a valid branch name
> $git rev-parse --abbrev-ref @mybranch
> @mybranch
> $git rev-parse --abbrev-ref @mybranch@{u}
> @mybranch@{u}
> fatal: ambiguous argument '@mybranch@{u}': unknown revision or path 
> not in the working tree.
> Use '--' to separate paths from revisions, like this:
> 'git <command> [<revision>...] -- [<file>...]'
> $

The same problem appears if the single '@' is anywhere in the branch name.

I *think* the problem lies in 'interpret_empty_at' in sha1_name.c with 
these statements

>     if (next != name + 1)
>         return -1;
Which is trying to allow the sequence '@@' but I'm not certain enough of 
unintended consequences to suggest a patch.

In our real-world example, we wanted to use a naming convention where a 
branch name beginning with @ was intended to be a long-lived branch (for 
example, a support branch for an official release). Thus, our sequence 
above actually begins with something like 'git checkout -t 
origin/@release1'.

We hit this problem some considerable time after initiating the naming 
convention, so it's too late to turn back for us.

As a work around, if you have the branch checked out, then using HEAD 
instead of the branch name works
> $git checkout @mybranch
> Switched to branch '@mybranch'
> Your branch is up-to-date with 'origin/master'.
> $git check-ref-format --branch HEAD@{u}
> origin/master
> $git rev-parse --abbrev-ref HEAD@{u}
> origin/master
> $

Envronment:

> $git version
> git version 1.8.5.2
> $lsb_release -a
> No LSB modules are available.
> Distributor ID:    Ubuntu
> Description:    Ubuntu 12.04.3 LTS
> Release:    12.04
> Codename:    precise

Keith Derrick

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u}
  2014-01-14 23:04 BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u} Keith Derrick
@ 2014-01-14 23:45 ` Junio C Hamano
  2014-01-15  5:00   ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2014-01-14 23:45 UTC (permalink / raw)
  To: Keith Derrick; +Cc: git

Keith Derrick <keith.derrick@lge.com> writes:

> I couldn't find a duplicate in the JIRA instance.

Don't worry, we do not use any JIRA instance ;-)

> According to the documentation of check-ref-format, a branch name such 
> as @mybranch is valid.

Correct.

> Yet 'git check-ref-format --branch @mybranch@{u}' 
> claims @mybranch is an invalid branch name.

I do not think it claims any such thing.

    $ git check-ref-format --branch @foo@{u}; echo $?
    fatal: '@foo@{u}' is not a valid branch name
    128
    $ git check-ref-format --branch @foo; echo $?
    @foo
    0

The former asks "Is the string '@foo@{u}' a suitable name to give a
branch?" and the answer is no.  The latter asks the same question
about the string '@foo', and the answer is yes.

So I do not see any bug here.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u}
  2014-01-14 23:45 ` Junio C Hamano
@ 2014-01-15  5:00   ` Jeff King
  2014-01-15  7:46     ` Junio C Hamano
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
  0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  5:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

On Tue, Jan 14, 2014 at 03:45:27PM -0800, Junio C Hamano wrote:

> > Yet 'git check-ref-format --branch @mybranch@{u}' 
> > claims @mybranch is an invalid branch name.
> 
> I do not think it claims any such thing.
> 
>     $ git check-ref-format --branch @foo@{u}; echo $?
>     fatal: '@foo@{u}' is not a valid branch name
>     128
>     $ git check-ref-format --branch @foo; echo $?
>     @foo
>     0
> 
> The former asks "Is the string '@foo@{u}' a suitable name to give a
> branch?" and the answer is no.  The latter asks the same question
> about the string '@foo', and the answer is yes.

Is that what "--branch" does? I have never used it, but the manpage
seems to suggest it is about _parsing_ (which, IMHO, means it probably
should have been an option to rev-parse, but that is another issue
altogether).

So a more interesting output than the above is:
 
  $ git checkout -t -b @mybranch origin/master
  Branch @mybranch set up to track remote branch master from origin.
  Switched to a new branch '@mybranch'

  $ git check-ref-format --branch @mybranch@{u}; echo $?
  fatal: '@mybranch@{u}' is not a valid branch name
  128

  $ git check-ref-format --branch HEAD@{u}; echo $?
  origin/master
  0

So "check-ref-format --branch" does understand @{u} syntax. But it
doesn't like @mybranch@{u}. You can see the same problem with rev-parse:

  $ git rev-parse --symbolic-full-name HEAD@{u}
  refs/remotes/origin/master
  $ git rev-parse --symbolic-full-name @mybranch@{u}
  @mybranch@{u}
  fatal: ambiguous argument '@mybranch@{u}': unknown revision or path
  not in the working tree.

So I do think there is a bug. The interpret_branch_name parser somehow
gets confused by the "@" in the name.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u}
  2014-01-15  5:00   ` Jeff King
@ 2014-01-15  7:46     ` Junio C Hamano
  2014-01-15  7:47       ` Jeff King
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2014-01-15  7:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Keith Derrick, git

Jeff King <peff@peff.net> writes:

> Is that what "--branch" does? I have never used it, but the manpage
> seems to suggest it is about _parsing_ (which, IMHO, means it probably
> should have been an option to rev-parse, but that is another issue
> altogether).

Ahh, of course you are right.  I never use it, and somehow thought
it was just prepending refs/heads/ to its arguments, but it seems to
want to do a lot more than that.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u}
  2014-01-15  7:46     ` Junio C Hamano
@ 2014-01-15  7:47       ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  7:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

On Tue, Jan 14, 2014 at 11:46:58PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Is that what "--branch" does? I have never used it, but the manpage
> > seems to suggest it is about _parsing_ (which, IMHO, means it probably
> > should have been an option to rev-parse, but that is another issue
> > altogether).
> 
> Ahh, of course you are right.  I never use it, and somehow thought
> it was just prepending refs/heads/ to its arguments, but it seems to
> want to do a lot more than that.

I am just about done with a patch series to address this, and a few
other related bugs I found. So don't look too hard; I should have
something out in a few minutes.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/5] interpret_branch_name bug potpourri
  2014-01-15  5:00   ` Jeff King
  2014-01-15  7:46     ` Junio C Hamano
@ 2014-01-15  8:25     ` Jeff King
  2014-01-15  8:26       ` [PATCH 1/5] interpret_branch_name: factor out upstream handling Jeff King
                         ` (5 more replies)
  1 sibling, 6 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  8:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

On Wed, Jan 15, 2014 at 12:00:03AM -0500, Jeff King wrote:

>   $ git rev-parse --symbolic-full-name HEAD@{u}
>   refs/remotes/origin/master
>   $ git rev-parse --symbolic-full-name @mybranch@{u}
>   @mybranch@{u}
>   fatal: ambiguous argument '@mybranch@{u}': unknown revision or path
>   not in the working tree.
> 
> So I do think there is a bug. The interpret_branch_name parser somehow
> gets confused by the "@" in the name.

The "somehow" is because we only look for the first "@", and never
consider any possible marks after that. The series below fixes it, along
with two other bugs I found while looking at this code. Ugh. Remind me
never to look at our object name parser ever again.

I feel pretty good that this is fixing real bugs and not regressing
anything else. I would not be surprised if there are other weird things
lurking, though. See the discussion in patch 4.

  [1/5]: interpret_branch_name: factor out upstream handling
  [2/5]: interpret_branch_name: rename "cp" variable to "at"
  [3/5]: interpret_branch_name: always respect "namelen" parameter
  [4/5]: interpret_branch_name: avoid @{upstream} past colon
  [5/5]: interpret_branch_name: find all possible @-marks

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] interpret_branch_name: factor out upstream handling
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
@ 2014-01-15  8:26       ` Jeff King
  2014-01-15  8:27       ` [PATCH 2/5] interpret_branch_name: rename "cp" variable to "at" Jeff King
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  8:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

This function checks a few different @{}-constructs. The
early part checks for and dispatches us to helpers for each
construct, but the code for handling @{upstream} is inline.

Let's factor this out into its own function. This makes
interpret_branch_name more readable, and will make it much
simpler to further refactor the function in future patches.

While we're at it, let's also break apart the refactored
code into a few helper functions. These will be useful if we
eventually implement similar @{upstream}-like constructs.

Signed-off-by: Jeff King <peff@peff.net>
---
This is identical to the cleanup I posted recently for the @{publish}
topic, though I did tweak the commit message a bit to make more sense in
the context of this series.

 sha1_name.c | 83 ++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 52 insertions(+), 31 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index a5578f7..5db742b 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -1046,6 +1046,54 @@ static int reinterpret(const char *name, int namelen, int len, struct strbuf *bu
 	return ret - used + len;
 }
 
+static void set_shortened_ref(struct strbuf *buf, const char *ref)
+{
+	char *s = shorten_unambiguous_ref(ref, 0);
+	strbuf_reset(buf);
+	strbuf_addstr(buf, s);
+	free(s);
+}
+
+static const char *get_upstream_branch(const char *branch_buf, int len)
+{
+	char *branch = xstrndup(branch_buf, len);
+	struct branch *upstream = branch_get(*branch ? branch : NULL);
+
+	/*
+	 * Upstream can be NULL only if branch refers to HEAD and HEAD
+	 * points to something different than a branch.
+	 */
+	if (!upstream)
+		die(_("HEAD does not point to a branch"));
+	if (!upstream->merge || !upstream->merge[0]->dst) {
+		if (!ref_exists(upstream->refname))
+			die(_("No such branch: '%s'"), branch);
+		if (!upstream->merge) {
+			die(_("No upstream configured for branch '%s'"),
+				upstream->name);
+		}
+		die(
+			_("Upstream branch '%s' not stored as a remote-tracking branch"),
+			upstream->merge[0]->src);
+	}
+	free(branch);
+
+	return upstream->merge[0]->dst;
+}
+
+static int interpret_upstream_mark(const char *name, int namelen,
+				   int at, struct strbuf *buf)
+{
+	int len;
+
+	len = upstream_mark(name + at, namelen - at);
+	if (!len)
+		return -1;
+
+	set_shortened_ref(buf, get_upstream_branch(name, at));
+	return len + at;
+}
+
 /*
  * This reads short-hand syntax that not only evaluates to a commit
  * object name, but also can act as if the end user spelled the name
@@ -1070,9 +1118,7 @@ static int reinterpret(const char *name, int namelen, int len, struct strbuf *bu
 int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 {
 	char *cp;
-	struct branch *upstream;
 	int len = interpret_nth_prior_checkout(name, buf);
-	int tmp_len;
 
 	if (!namelen)
 		namelen = strlen(name);
@@ -1094,36 +1140,11 @@ int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 	if (len > 0)
 		return reinterpret(name, namelen, len, buf);
 
-	tmp_len = upstream_mark(cp, namelen - (cp - name));
-	if (!tmp_len)
-		return -1;
+	len = interpret_upstream_mark(name, namelen, cp - name, buf);
+	if (len > 0)
+		return len;
 
-	len = cp + tmp_len - name;
-	cp = xstrndup(name, cp - name);
-	upstream = branch_get(*cp ? cp : NULL);
-	/*
-	 * Upstream can be NULL only if cp refers to HEAD and HEAD
-	 * points to something different than a branch.
-	 */
-	if (!upstream)
-		die(_("HEAD does not point to a branch"));
-	if (!upstream->merge || !upstream->merge[0]->dst) {
-		if (!ref_exists(upstream->refname))
-			die(_("No such branch: '%s'"), cp);
-		if (!upstream->merge) {
-			die(_("No upstream configured for branch '%s'"),
-				upstream->name);
-		}
-		die(
-			_("Upstream branch '%s' not stored as a remote-tracking branch"),
-			upstream->merge[0]->src);
-	}
-	free(cp);
-	cp = shorten_unambiguous_ref(upstream->merge[0]->dst, 0);
-	strbuf_reset(buf);
-	strbuf_addstr(buf, cp);
-	free(cp);
-	return len;
+	return -1;
 }
 
 int strbuf_branchname(struct strbuf *sb, const char *name)
-- 
1.8.5.2.500.g8060133

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] interpret_branch_name: rename "cp" variable to "at"
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
  2014-01-15  8:26       ` [PATCH 1/5] interpret_branch_name: factor out upstream handling Jeff King
@ 2014-01-15  8:27       ` Jeff King
  2014-01-15  8:31       ` [PATCH 3/5] interpret_branch_name: always respect "namelen" parameter Jeff King
                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  8:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

In the original version of this function, "cp" acted as a
pointer to many different things. Since the refactoring in
the last patch, it only marks the at-sign in the string.
Let's use a more descriptive variable name.

Signed-off-by: Jeff King <peff@peff.net>
---
Obviously can be squashed with the prior refactoring, but I think
splitting it makes the diffs easier to read.

 sha1_name.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 5db742b..958aa2e 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -1117,7 +1117,7 @@ static int interpret_upstream_mark(const char *name, int namelen,
  */
 int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 {
-	char *cp;
+	char *at;
 	int len = interpret_nth_prior_checkout(name, buf);
 
 	if (!namelen)
@@ -1132,15 +1132,15 @@ int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 			return reinterpret(name, namelen, len, buf);
 	}
 
-	cp = strchr(name, '@');
-	if (!cp)
+	at = strchr(name, '@');
+	if (!at)
 		return -1;
 
-	len = interpret_empty_at(name, namelen, cp - name, buf);
+	len = interpret_empty_at(name, namelen, at - name, buf);
 	if (len > 0)
 		return reinterpret(name, namelen, len, buf);
 
-	len = interpret_upstream_mark(name, namelen, cp - name, buf);
+	len = interpret_upstream_mark(name, namelen, at - name, buf);
 	if (len > 0)
 		return len;
 
-- 
1.8.5.2.500.g8060133

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] interpret_branch_name: always respect "namelen" parameter
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
  2014-01-15  8:26       ` [PATCH 1/5] interpret_branch_name: factor out upstream handling Jeff King
  2014-01-15  8:27       ` [PATCH 2/5] interpret_branch_name: rename "cp" variable to "at" Jeff King
@ 2014-01-15  8:31       ` Jeff King
  2014-01-15  8:37       ` [PATCH 4/5] interpret_branch_name: avoid @{upstream} past colon Jeff King
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  8:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

interpret_branch_name gets passed a "name" buffer to parse,
along with a "namelen" parameter representing its length. If
"namelen" is zero, we fallback to the NUL-terminated
string-length of "name".

However, it does not necessarily follow that if we have
gotten a non-zero "namelen", it is the NUL-terminated
string-length of "name". E.g., when get_sha1() is parsing
"foo:bar", we will be asked to operate only on the first
three characters.

Yet in interpret_branch_name and its helpers, we use string
functions like strchr() to operate on "name", looking past
the length we were given.  This can result in us mis-parsing
object names.  We should instead be limiting our search to
"namelen" bytes.

There are three distinct types of object names this patch
addresses:

  - The intrepret_empty_at helper uses strchr to find the
    next @-expression after our potential empty-at.  In an
    expression like "@:foo@bar", it erroneously thinks that
    the second "@" is relevant, even if we were asked only
    to look at the first character. This case is easy to
    trigger (and we test it in this patch).

  - When finding the initial @-mark for @{upstream}, we use
    strchr.  This means we might treat "foo:@{upstream}" as
    the upstream for "foo:", even though we were asked only
    to look at "foo". We cannot test this one in practice,
    because it is masked by another bug (which is fixed in
    the next patch).

  - The interpret_nth_prior_checkout helper did not receive
    the name length at all. This turns out not to be a
    problem in practice, though, because its parsing is so
    limited: it always starts from the far-left of the
    string, and will not tolerate a colon (which is
    currently the only way to get a smaller-than-strlen
    "namelen"). However, it's still worth fixing to make the
    code more obviously correct, and to future-proof us
    against callers with more exotic buffers.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1_name.c                | 17 ++++++++++-------
 t/t1508-at-combinations.sh | 15 ++++++++++++++-
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 958aa2e..6d5038d 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -430,7 +430,7 @@ static inline int upstream_mark(const char *string, int len)
 }
 
 static int get_sha1_1(const char *name, int len, unsigned char *sha1, unsigned lookup_flags);
-static int interpret_nth_prior_checkout(const char *name, struct strbuf *buf);
+static int interpret_nth_prior_checkout(const char *name, int namelen, struct strbuf *buf);
 
 static int get_sha1_basic(const char *str, int len, unsigned char *sha1)
 {
@@ -492,7 +492,7 @@ static int get_sha1_basic(const char *str, int len, unsigned char *sha1)
 		struct strbuf buf = STRBUF_INIT;
 		int detached;
 
-		if (interpret_nth_prior_checkout(str, &buf) > 0) {
+		if (interpret_nth_prior_checkout(str, len, &buf) > 0) {
 			detached = (buf.len == 40 && !get_sha1_hex(buf.buf, sha1));
 			strbuf_release(&buf);
 			if (detached)
@@ -929,7 +929,8 @@ static int grab_nth_branch_switch(unsigned char *osha1, unsigned char *nsha1,
  * Parse @{-N} syntax, return the number of characters parsed
  * if successful; otherwise signal an error with negative value.
  */
-static int interpret_nth_prior_checkout(const char *name, struct strbuf *buf)
+static int interpret_nth_prior_checkout(const char *name, int namelen,
+					struct strbuf *buf)
 {
 	long nth;
 	int retval;
@@ -937,9 +938,11 @@ static int interpret_nth_prior_checkout(const char *name, struct strbuf *buf)
 	const char *brace;
 	char *num_end;
 
+	if (namelen < 4)
+		return -1;
 	if (name[0] != '@' || name[1] != '{' || name[2] != '-')
 		return -1;
-	brace = strchr(name, '}');
+	brace = memchr(name, '}', namelen);
 	if (!brace)
 		return -1;
 	nth = strtol(name + 3, &num_end, 10);
@@ -1012,7 +1015,7 @@ static int interpret_empty_at(const char *name, int namelen, int len, struct str
 		return -1;
 
 	/* make sure it's a single @, or @@{.*}, not @foo */
-	next = strchr(name + len + 1, '@');
+	next = memchr(name + len + 1, '@', namelen - len - 1);
 	if (next && next[1] != '{')
 		return -1;
 	if (!next)
@@ -1118,7 +1121,7 @@ static int interpret_upstream_mark(const char *name, int namelen,
 int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 {
 	char *at;
-	int len = interpret_nth_prior_checkout(name, buf);
+	int len = interpret_nth_prior_checkout(name, namelen, buf);
 
 	if (!namelen)
 		namelen = strlen(name);
@@ -1132,7 +1135,7 @@ int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 			return reinterpret(name, namelen, len, buf);
 	}
 
-	at = strchr(name, '@');
+	at = memchr(name, '@', namelen);
 	if (!at)
 		return -1;
 
diff --git a/t/t1508-at-combinations.sh b/t/t1508-at-combinations.sh
index ceb8449..078e119 100755
--- a/t/t1508-at-combinations.sh
+++ b/t/t1508-at-combinations.sh
@@ -9,8 +9,11 @@ check() {
 		if test '$2' = 'commit'
 		then
 			git log -1 --format=%s '$1' >actual
-		else
+		elif test '$2' = 'ref'
+		then
 			git rev-parse --symbolic-full-name '$1' >actual
+		else
+			git cat-file -p '$1' >actual
 		fi &&
 		test_cmp expect actual
 	"
@@ -82,4 +85,14 @@ check HEAD ref refs/heads/old-branch
 check "HEAD@{1}" commit new-two
 check "@{1}" commit old-one
 
+test_expect_success 'create path with @' '
+	echo content >normal &&
+	echo content >fun@ny &&
+	git add normal fun@ny &&
+	git commit -m "funny path"
+'
+
+check "@:normal" blob content
+check "@:fun@ny" blob content
+
 test_done
-- 
1.8.5.2.500.g8060133

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] interpret_branch_name: avoid @{upstream} past colon
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
                         ` (2 preceding siblings ...)
  2014-01-15  8:31       ` [PATCH 3/5] interpret_branch_name: always respect "namelen" parameter Jeff King
@ 2014-01-15  8:37       ` Jeff King
  2014-01-15  8:40       ` [PATCH 5/5] interpret_branch_name: find all possible @-marks Jeff King
  2014-01-15 21:03       ` [PATCH 0/5] interpret_branch_name bug potpourri Junio C Hamano
  5 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  8:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

get_sha1() cannot currently parse a valid object name like
"HEAD:@{upstream}" (assuming that such an oddly named file
exists in the HEAD commit). It takes two passes to parse the
string:

  1. It first considers the whole thing as a ref, which
     results in looking for the upstream of "HEAD:".

  2. It finds the colon, parses "HEAD" as a tree-ish, and then
     finds the path "@{upstream}" in the tree.

For a path that looks like a normal reflog (e.g.,
"HEAD:@{yesterday}"), the first pass is a no-op. We try to
dwim_ref("HEAD:"), that returns zero refs, and we proceed
with colon-parsing.

For "HEAD:@{upstream}", though, the first pass ends up in
interpret_upstream_mark, which tries to find the branch
"HEAD:". When it sees that the branch does not exist, it
actually dies rather than returning an error to the caller.
As a result, we never make it to the second pass.

One obvious way of fixing this would be to teach
interpret_upstream_mark to simply report "no, this isn't an
upstream" in such a case. However, that would make the
error-reporting for legitimate upstream cases significantly
worse. Something like "bogus@{upstream}" would simply report
"unknown revision: bogus@{upstream}", while the current code
diagnoses a wide variety of possible misconfigurations (no
such branch, branch exists but does not have upstream, etc).

However, we can take advantage of the fact that a branch
name cannot contain a colon. Therefore even if we find an
upstream mark, any prefix with a colon must mean that
the upstream mark we found is actually a pathname, and
should be disregarded completely. This patch implements that
logic.

Signed-off-by: Jeff King <peff@peff.net>
---
I think this would actually be cleaner if get_sha1() simply did the
colon-parsing first, and omitted the first pass completely. Then
sub-functions would not have to care about arbitrary junk that can come
in paths; they would always be parsing just the revision-specifier.

However, given the way this code has developed over the years and its
general fragility, I would be entirely unsurprised if there is some case
that relies on the current scheme. So I went with the simple fix here,
which should be much less likely to have any fallout. And I could not
come up with an example that is actually broken under the current code
(we just do some suboptimal things, like looking for "foo:bar" as a ref
in the filesystem, even though it is syntactically bogus).

 sha1_name.c                   |  3 +++
 t/t1507-rev-parse-upstream.sh | 16 ++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/sha1_name.c b/sha1_name.c
index 6d5038d..b253a88 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -1093,6 +1093,9 @@ static int interpret_upstream_mark(const char *name, int namelen,
 	if (!len)
 		return -1;
 
+	if (memchr(name, ':', at))
+		return -1;
+
 	set_shortened_ref(buf, get_upstream_branch(name, at));
 	return len + at;
 }
diff --git a/t/t1507-rev-parse-upstream.sh b/t/t1507-rev-parse-upstream.sh
index 2a19e79..cace1ca 100755
--- a/t/t1507-rev-parse-upstream.sh
+++ b/t/t1507-rev-parse-upstream.sh
@@ -210,4 +210,20 @@ test_expect_success 'log -g other@{u}@{now}' '
 	test_cmp expect actual
 '
 
+test_expect_success '@{reflog}-parsing does not look beyond colon' '
+	echo content >@{yesterday} &&
+	git add @{yesterday} &&
+	git commit -m "funny reflog file" &&
+	git hash-object @{yesterday} >expect &&
+	git rev-parse HEAD:@{yesterday} >actual
+'
+
+test_expect_success '@{upstream}-parsing does not look beyond colon' '
+	echo content >@{upstream} &&
+	git add @{upstream} &&
+	git commit -m "funny upstream file" &&
+	git hash-object @{upstream} >expect &&
+	git rev-parse HEAD:@{upstream} >actual
+'
+
 test_done
-- 
1.8.5.2.500.g8060133

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] interpret_branch_name: find all possible @-marks
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
                         ` (3 preceding siblings ...)
  2014-01-15  8:37       ` [PATCH 4/5] interpret_branch_name: avoid @{upstream} past colon Jeff King
@ 2014-01-15  8:40       ` Jeff King
  2014-01-15 21:03       ` [PATCH 0/5] interpret_branch_name bug potpourri Junio C Hamano
  5 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2014-01-15  8:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Keith Derrick, git

When we parse a string like "foo@{upstream}", we look for
the first "@"-sign, and check to see if it is an upstream
mark. However, since branch names can contain an @, we may
also see "@foo@{upstream}". In this case, we check only the
first @, and ignore the second. As a result, we do not find
the upstream.

We can solve this by iterating through all @-marks in the
string, and seeing if any is a legitimate upstream or
empty-at mark.

Another strategy would be to parse from the right-hand side
of the string. However, that does not work for the
"empty_at" case, which allows "@@{upstream}". We need to
find the left-most one in this case (and we then recurse as
"HEAD@{upstream}").

Signed-off-by: Jeff King <peff@peff.net>
---
And this one actually fixes Keith's bug.

The diff is noisy due to indentation changes; try it with "-b" for
increased reading pleasure.

 sha1_name.c                   | 20 +++++++++++---------
 t/t1507-rev-parse-upstream.sh | 21 +++++++++++++++++++++
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index b253a88..6fca869 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -1124,6 +1124,7 @@ static int interpret_upstream_mark(const char *name, int namelen,
 int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 {
 	char *at;
+	const char *start;
 	int len = interpret_nth_prior_checkout(name, namelen, buf);
 
 	if (!namelen)
@@ -1138,17 +1139,18 @@ int interpret_branch_name(const char *name, int namelen, struct strbuf *buf)
 			return reinterpret(name, namelen, len, buf);
 	}
 
-	at = memchr(name, '@', namelen);
-	if (!at)
-		return -1;
+	for (start = name;
+	     (at = memchr(start, '@', namelen - (start - name)));
+	     start = at + 1) {
 
-	len = interpret_empty_at(name, namelen, at - name, buf);
-	if (len > 0)
-		return reinterpret(name, namelen, len, buf);
+		len = interpret_empty_at(name, namelen, at - name, buf);
+		if (len > 0)
+			return reinterpret(name, namelen, len, buf);
 
-	len = interpret_upstream_mark(name, namelen, at - name, buf);
-	if (len > 0)
-		return len;
+		len = interpret_upstream_mark(name, namelen, at - name, buf);
+		if (len > 0)
+			return len;
+	}
 
 	return -1;
 }
diff --git a/t/t1507-rev-parse-upstream.sh b/t/t1507-rev-parse-upstream.sh
index cace1ca..178694e 100755
--- a/t/t1507-rev-parse-upstream.sh
+++ b/t/t1507-rev-parse-upstream.sh
@@ -17,6 +17,9 @@ test_expect_success 'setup' '
 	 test_commit 4 &&
 	 git branch --track my-side origin/side &&
 	 git branch --track local-master master &&
+	 git branch --track fun@ny origin/side &&
+	 git branch --track @funny origin/side &&
+	 git branch --track funny@ origin/side &&
 	 git remote add -t master master-only .. &&
 	 git fetch master-only &&
 	 git branch bad-upstream &&
@@ -54,6 +57,24 @@ test_expect_success 'my-side@{upstream} resolves to correct full name' '
 	test refs/remotes/origin/side = "$(full_name my-side@{u})"
 '
 
+test_expect_success 'upstream of branch with @ in middle' '
+	full_name fun@ny@{u} >actual &&
+	echo refs/remotes/origin/side >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'upstream of branch with @ at start' '
+	full_name @funny@{u} >actual &&
+	echo refs/remotes/origin/side >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'upstream of branch with @ at end' '
+	full_name funny@@{u} >actual &&
+	echo refs/remotes/origin/side >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'refs/heads/my-side@{upstream} does not resolve to my-side{upstream}' '
 	test_must_fail full_name refs/heads/my-side@{upstream}
 '
-- 
1.8.5.2.500.g8060133

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/5] interpret_branch_name bug potpourri
  2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
                         ` (4 preceding siblings ...)
  2014-01-15  8:40       ` [PATCH 5/5] interpret_branch_name: find all possible @-marks Jeff King
@ 2014-01-15 21:03       ` Junio C Hamano
  5 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2014-01-15 21:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Keith Derrick, git

Jeff King <peff@peff.net> writes:

> On Wed, Jan 15, 2014 at 12:00:03AM -0500, Jeff King wrote:
>
>>   $ git rev-parse --symbolic-full-name HEAD@{u}
>>   refs/remotes/origin/master
>>   $ git rev-parse --symbolic-full-name @mybranch@{u}
>>   @mybranch@{u}
>>   fatal: ambiguous argument '@mybranch@{u}': unknown revision or path
>>   not in the working tree.
>> 
>> So I do think there is a bug. The interpret_branch_name parser somehow
>> gets confused by the "@" in the name.
>
> The "somehow" is because we only look for the first "@", and never
> consider any possible marks after that. The series below fixes it, along
> with two other bugs I found while looking at this code. Ugh. Remind me
> never to look at our object name parser ever again.
>
> I feel pretty good that this is fixing real bugs and not regressing
> anything else. I would not be surprised if there are other weird things
> lurking, though. See the discussion in patch 4.
>
>   [1/5]: interpret_branch_name: factor out upstream handling
>   [2/5]: interpret_branch_name: rename "cp" variable to "at"
>   [3/5]: interpret_branch_name: always respect "namelen" parameter
>   [4/5]: interpret_branch_name: avoid @{upstream} past colon
>   [5/5]: interpret_branch_name: find all possible @-marks
>
> -Peff

All the steps looked very sensible.  Thanks for a pleasant read.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-01-15 21:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-14 23:04 BUG: check-ref-format and rev-parse can not handle branches with an @ in their name combined with @{u} Keith Derrick
2014-01-14 23:45 ` Junio C Hamano
2014-01-15  5:00   ` Jeff King
2014-01-15  7:46     ` Junio C Hamano
2014-01-15  7:47       ` Jeff King
2014-01-15  8:25     ` [PATCH 0/5] interpret_branch_name bug potpourri Jeff King
2014-01-15  8:26       ` [PATCH 1/5] interpret_branch_name: factor out upstream handling Jeff King
2014-01-15  8:27       ` [PATCH 2/5] interpret_branch_name: rename "cp" variable to "at" Jeff King
2014-01-15  8:31       ` [PATCH 3/5] interpret_branch_name: always respect "namelen" parameter Jeff King
2014-01-15  8:37       ` [PATCH 4/5] interpret_branch_name: avoid @{upstream} past colon Jeff King
2014-01-15  8:40       ` [PATCH 5/5] interpret_branch_name: find all possible @-marks Jeff King
2014-01-15 21:03       ` [PATCH 0/5] interpret_branch_name bug potpourri Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.