git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] disallow refs containing successive slashes
@ 2009-10-10 17:49 Jens Lehmann
  2009-10-10 21:50 ` Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Lehmann @ 2009-10-10 17:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

When creating branches using names starting with '/' or containing a '//',
a leading slash would just vanish and successive slashes were 'compressed'
into just one slash.

This behaviour comes from the refs being stored as files in subdirectories
where the filesystem compresses each '//' in a pathname to a '/' . So
disallowing refs to contain '//' fixes these problems and prohibits
branch names to start with a '/' or to contain '//'.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---

I became aware of this issue while looking into problems occuring
when a user created a branch starting with a '/' in git gui (e.g.
"/foo"). Strange things happen, while git gui shows the current
branch as "/foo" under the hood a branch "foo" (without the slash)
had been created. But then you can't delete "/foo" from git gui,
because a branch of that name doesn't exist.

IMHO branch names should not be allowed to start with a '/' or
contain a '//' (branches with trailing slashes are not allowed since
commit 03feddd6). So this implies that refs should never contain a
'//' in the first place.

This behaviour has been changed the last time by commit 03feddd.
The version before that commit didn't allow '//' to be part of a
ref (but still allowed a trailing '/'). The newer version explicitly
states "/* tolerate duplicated slashes */" in the added code. But i
couldn't find any hints on the list or on the interweb *why* these
were allowed. So this patch might break something i can't see right
now, even though the test suite runs fine ...

Thoughts? Objections?


 Documentation/git-check-ref-format.txt |    2 ++
 refs.c                                 |    7 +++++--
 t/t1600-check-ref-format.sh            |   19 +++++++++++++++++++
 3 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100755 t/t1600-check-ref-format.sh

diff --git a/Documentation/git-check-ref-format.txt b/Documentation/git-check-ref-format.txt
index 0b7982e..180ffbb 100644
--- a/Documentation/git-check-ref-format.txt
+++ b/Documentation/git-check-ref-format.txt
@@ -38,6 +38,8 @@ imposes the following rules on how references are named:

 . They cannot end with a slash `/` nor a dot `.`.

+. They cannot contain two successive slashes `//`.
+
 . They cannot end with the sequence `.lock`.

 . They cannot contain a sequence `@{`.
diff --git a/refs.c b/refs.c
index 808f56b..cedd768 100644
--- a/refs.c
+++ b/refs.c
@@ -687,6 +687,7 @@ int for_each_rawref(each_ref_fn fn, void *cb_data)
  * - it has double dots "..", or
  * - it has ASCII control character, "~", "^", ":" or SP, anywhere, or
  * - it ends with a "/".
+ * - it contains '//'
  * - it ends with ".lock"
  * - it contains a "\" (backslash)
  */
@@ -712,8 +713,10 @@ int check_ref_format(const char *ref)

 	level = 0;
 	while (1) {
-		while ((ch = *cp++) == '/')
-			; /* tolerate duplicated slashes */
+		if ((ch = *cp++) == '/')
+			/* Don't tolerate successive slashes */
+			return CHECK_REF_FORMAT_ERROR;
+
 		if (!ch)
 			/* should not end with slashes */
 			return CHECK_REF_FORMAT_ERROR;
diff --git a/t/t1600-check-ref-format.sh b/t/t1600-check-ref-format.sh
new file mode 100755
index 0000000..05fb9f9
--- /dev/null
+++ b/t/t1600-check-ref-format.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+
+test_description='git check-ref-format'
+
+. ./test-lib.sh
+
+test_expect_success 'refs/heads/master is ok' '
+	git check-ref-format refs/heads/master
+'
+
+test_expect_success 'no successive slashes allowed' '
+	test_must_fail git check-ref-format a//b
+'
+
+test_expect_success 'no trailing slashes allowed' '
+	test_must_fail git check-ref-format a/b/
+'
+
+test_done
-- 
1.6.5.rc2.205.g1185a

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] disallow refs containing successive slashes
  2009-10-10 17:49 [PATCH] disallow refs containing successive slashes Jens Lehmann
@ 2009-10-10 21:50 ` Junio C Hamano
  2009-10-11 10:42   ` Jens Lehmann
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2009-10-10 21:50 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git

Jens Lehmann <Jens.Lehmann@web.de> writes:

> When creating branches using names starting with '/' or containing a '//',
> a leading slash would just vanish and successive slashes were 'compressed'
> into just one slash.

Hmm.  We already do that without your patch.

    $ git branch /foo//bar
    $ git for-each-ref --format='%(refname)'
    refs/heads/foo/bar
    refs/heads/master
    $ git branch -d /foo//bar
    Deleted branch /foo//bar (was deadbeef)
    $ git for-each-ref --format='%(refname)'
    refs/heads/master

> Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
> ---
>
> I became aware of this issue while looking into problems occuring
> when a user created a branch starting with a '/' in git gui (e.g.
> "/foo"). Strange things happen, while git gui shows the current
> branch as "/foo" under the hood a branch "foo" (without the slash)
> had been created. But then you can't delete "/foo" from git gui,
> because a branch of that name doesn't exist.

Perhaps an interface to give a cleaned-up version, e.g.

    $ git check-ref-format --print refs/heads//foo/bar
    refs/heads/foo/bar

is what you want in order to fix git-gui?  I dunno.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] disallow refs containing successive slashes
  2009-10-10 21:50 ` Junio C Hamano
@ 2009-10-11 10:42   ` Jens Lehmann
  2009-10-11 18:52     ` Junio C Hamano
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
  1 sibling, 1 reply; 17+ messages in thread
From: Jens Lehmann @ 2009-10-11 10:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano schrieb:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> When creating branches using names starting with '/' or containing a '//',
>> a leading slash would just vanish and successive slashes were 'compressed'
>> into just one slash.
> 
> Hmm.  We already do that without your patch.
> 
>     $ git branch /foo//bar
>     $ git for-each-ref --format='%(refname)'
>     refs/heads/foo/bar
>     refs/heads/master
>     $ git branch -d /foo//bar
>     Deleted branch /foo//bar (was deadbeef)
>     $ git for-each-ref --format='%(refname)'
>     refs/heads/master

Yes, but git changes the name silently from '/foo//bar' to 'foo/bar'!

The automagical removal of some slashes leads to strange behaviour when
using such names:

    $ git checkout -b foo/bar
    Switched to a new branch 'foo/bar'
    $ git checkout -b /foo//bar
    fatal: A branch named '/foo//bar' already exists.
    $ git for-each-ref --format='%(refname)'
    refs/heads/foo/bar
    refs/heads/master

git claims wrongly that "A branch named '/foo//bar' already exists.".

And that same happens to tag and repo names, leading to other strange
effects:

    $ git tag /foo//bar
    $ git tag
    foo/bar

That is not the tagname the user provided.

    $ git remote add /foo//bar git://git.kernel.org/pub/scm/git/git.git
    $ git remote show
    warning: Config remote shorthand cannot begin with '/': /foo//bar.url
    warning: Config remote shorthand cannot begin with '/': /foo//bar.fetch
    $ git fetch /foo//bar
    warning: Config remote shorthand cannot begin with '/': /foo//bar.url
    warning: Config remote shorthand cannot begin with '/': /foo//bar.fetch
    fatal: 'foo/bar' does not appear to be a git repository
    fatal: The remote end hung up unexpectedly

Note: git fetch uses both versions of the remote name in its output.


>> I became aware of this issue while looking into problems occuring
>> when a user created a branch starting with a '/' in git gui (e.g.
>> "/foo"). Strange things happen, while git gui shows the current
>> branch as "/foo" under the hood a branch "foo" (without the slash)
>> had been created. But then you can't delete "/foo" from git gui,
>> because a branch of that name doesn't exist.
> 
> Perhaps an interface to give a cleaned-up version, e.g.
> 
>     $ git check-ref-format --print refs/heads//foo/bar
>     refs/heads/foo/bar
> 
> is what you want in order to fix git-gui?  I dunno.

Yes, one solution could be to fix every application handling branch, tag
or repo names to mimic the namechange done in the bowels of git. But i
think it is not worth the hassle. Every application i tested (git gui,
gitk and a handful of git commands) just assumes - and i think rightfully
so - that refnames will not change while they are being created (And they
do call "git check-ref-format" or strbuf_check_branch_ref() and friends
to make sure they have a valid refname, but it gets changed nonetheless).

With this patch you get an error every time you try to create such a
refname in the first place and the strange effects and the ambiguity
of refnames ('/foo//bar' is a synonym for 'foo/bar' right now) just
go away.

The motivation for this patch was a confused user at my dayjob. IMHO
most people don't use '/foo//bar' on purpose but the extra slashes get
there through typos, copy & paste errors and such. Then generating an
error might just be the Right Thing to do to avoid the problems with
the changed refname afterwards.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] disallow refs containing successive slashes
  2009-10-11 10:42   ` Jens Lehmann
@ 2009-10-11 18:52     ` Junio C Hamano
  2009-10-12  0:31       ` Jonathan Nieder
  2009-10-12  2:47       ` Nanako Shiraishi
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2009-10-11 18:52 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git

Jens Lehmann <Jens.Lehmann@web.de> writes:

>>> I became aware of this issue while looking into problems occuring
>>> when a user created a branch starting with a '/' in git gui (e.g.
>>> "/foo"). Strange things happen, while git gui shows the current
>>> branch as "/foo" under the hood a branch "foo" (without the slash)
>>> had been created. But then you can't delete "/foo" from git gui,
>>> because a branch of that name doesn't exist.
>> 
>> Perhaps an interface to give a cleaned-up version, e.g.
>> 
>>     $ git check-ref-format --print refs/heads//foo/bar
>>     refs/heads/foo/bar
>> 
>> is what you want in order to fix git-gui?  I dunno.
>
> Yes, one solution could be to fix every application handling branch, tag
> or repo names to mimic the namechange done in the bowels of git. But i
> think it is not worth the hassle.

That cuts both ways.

When the users make typoes (e.g. /foo//bar) you can accept the only sane
correction (e.g. foo/bar) instead of rejecting, since the only thing the
user can do after getting such a rejection is to correct it to that
corrected name (e.g. foo/bar) himself and re-issue the command anyway.
You can push the "hassle" down to the user, or you can fix the tool to
remove the hassle from the user.

Besides, by rejecting what we used to accept you are breaking people's
expectations.  So I am moderately negative, unless you can say your "every
application" is literally _tons_.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] disallow refs containing successive slashes
  2009-10-11 18:52     ` Junio C Hamano
@ 2009-10-12  0:31       ` Jonathan Nieder
  2009-10-12  2:47       ` Nanako Shiraishi
  1 sibling, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  0:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git

Junio C Hamano wrote:
 
> When the users make typoes (e.g. /foo//bar) you can accept the only sane
> correction (e.g. foo/bar) instead of rejecting, since the only thing the
> user can do after getting such a rejection is to correct it to that
> corrected name (e.g. foo/bar) himself and re-issue the command anyway.
> You can push the "hassle" down to the user, or you can fix the tool to
> remove the hassle from the user.

Yes, making check-ref-format stricter without changing its users to
massage their input would be a regression.

The problem Jens described is a git-gui bug.  In lib/branch_delete.tcl,
line 57, the list of deletable branches is populated as follows:

	foreach h [load_all_heads] {
		if {$h ne $current_branch} {
			$w_heads insert end $h
		}
	}

Since slashes coalesce, a user-supplied new branch name is not canonical
and checking "$h eq $current_branch" does not actually check if they are
the same branch.  git-gui should be using the branch name as output by
"git check-ref-format --branch" after the branch is created.

But what about other scripts that assume each branch has only one
possible name?  Maybe they could be forced to fix up the name early on
by making check-ref-format reject names with "//" in them and providing
a "git check-ref-format --print" to help.

Upside: scripts would complain loudly instead of failing subtly on input
with extra slashes if they forget to use "git check-ref-format --print"
when appropriate.  Downside: users, including the 22 callers of
check_ref_format() in git, would have to be checked and probably changed
to avoid regressions.

Never having made that typo, I find it hard to consider such a change
worth the trouble, but it sounds doable.

Regards,
Jonathan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] disallow refs containing successive slashes
  2009-10-11 18:52     ` Junio C Hamano
  2009-10-12  0:31       ` Jonathan Nieder
@ 2009-10-12  2:47       ` Nanako Shiraishi
  1 sibling, 0 replies; 17+ messages in thread
From: Nanako Shiraishi @ 2009-10-12  2:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git

Quoting Junio C Hamano <gitster@pobox.com> writes:

> Jens Lehmann <Jens.Lehmann@web.de> writes:
>> Yes, one solution could be to fix every application handling branch, tag
>> or repo names to mimic the namechange done in the bowels of git. But i
>> think it is not worth the hassle.
> Besides, by rejecting what we used to accept you are breaking people's
> expectations.  So I am moderately negative, unless you can say your "every
> application" is literally _tons_.

Isn't 1.7.0 a good release to break such people's expectations?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH/RFC 0/4] plumbing to help fix git-gui
  2009-10-10 21:50 ` Junio C Hamano
  2009-10-11 10:42   ` Jens Lehmann
@ 2009-10-12  5:25   ` Jonathan Nieder
  2009-10-12  5:27     ` [PATCH 1/4] Add tests for git check-ref-format Jonathan Nieder
                       ` (4 more replies)
  1 sibling, 5 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  5:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

Junio C Hamano wrote:
> Jens Lehmann <Jens.Lehmann@web.de> writes:

>> I became aware of this issue while looking into problems occuring
>> when a user created a branch starting with a '/' in git gui (e.g.
>> "/foo"). Strange things happen, while git gui shows the current
>> branch as "/foo" under the hood a branch "foo" (without the slash)
>> had been created. But then you can't delete "/foo" from git gui,
>> because a branch of that name doesn't exist.
> 
> Perhaps an interface to give a cleaned-up version, e.g.
> 
>     $ git check-ref-format --print refs/heads//foo/bar
>     refs/heads/foo/bar
> 
> is what you want in order to fix git-gui?  I dunno.

The following packages do exactly that.  I tried to use it to fix
git gui, but my Tcl-fu was not up to it, at least for tonight.

It is not obvious to me that this interface is the one that would be
most helpful for git gui: probably a command to just munge the branch
name would be more convenient, though less general.

Is the --print option useful?  Right now, it seems that its main use is
documentation.  But in the future, it would be very nice if this command
could be used to transform Unicode ref names to some appropriate
normalization form, to make Unicode ref names usable in Mac OS X and
less confusing everywhere.

I look forward to your thoughts.

Jonathan Nieder (4):
  Add tests for git check-ref-format
  Documentation: describe check-ref-format --branch in more detail
  check-ref-format: add option to print canonical name
  check-ref-format: Simplify --print implementation

 Documentation/git-check-ref-format.txt |   32 ++++++++++++----
 builtin-check-ref-format.c             |   31 ++++++++++++++++
 t/t1402-check-ref-format.sh            |   61 ++++++++++++++++++++++++++++++++
 3 files changed, 116 insertions(+), 8 deletions(-)
 create mode 100644 t/t1402-check-ref-format.sh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] Add tests for git check-ref-format
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
@ 2009-10-12  5:27     ` Jonathan Nieder
  2009-10-12  5:28     ` [PATCH 2/4] Documentation: describe check-ref-format --branch Jonathan Nieder
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  5:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

The "git check-ref-format" command is a basic command various
porcelains rely on.  Test its functionality to make sure it does
not unintentionally change.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 t/t1402-check-ref-format.sh |   44 +++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 44 insertions(+), 0 deletions(-)
 create mode 100644 t/t1402-check-ref-format.sh

diff --git a/t/t1402-check-ref-format.sh b/t/t1402-check-ref-format.sh
new file mode 100644
index 0000000..382bc6e
--- /dev/null
+++ b/t/t1402-check-ref-format.sh
@@ -0,0 +1,44 @@
+#!/bin/sh
+
+test_description='Test git check-ref-format'
+
+. ./test-lib.sh
+
+valid_ref() {
+	test_expect_success "ref name '$1' is valid" \
+		"git check-ref-format '$1'"
+}
+invalid_ref() {
+	test_expect_success "ref name '$1' is not valid" \
+		"test_must_fail git check-ref-format '$1'"
+}
+
+valid_ref 'heads/foo'
+invalid_ref 'foo'
+valid_ref 'foo/bar/baz'
+valid_ref 'refs///heads/foo'
+invalid_ref 'heads/foo/'
+invalid_ref './foo'
+invalid_ref '.refs/foo'
+invalid_ref 'heads/foo..bar'
+invalid_ref 'heads/foo?bar'
+valid_ref 'foo./bar'
+invalid_ref 'heads/foo.lock'
+valid_ref 'heads/foo@bar'
+invalid_ref 'heads/v@{ation'
+invalid_ref 'heads/foo\bar'
+
+test_expect_success "check-ref-format --branch @{-1}" '
+	T=$(git write-tree) &&
+	sha1=$(echo A | git commit-tree $T) &&
+	git update-ref refs/heads/master $sha1 &&
+	git update-ref refs/remotes/origin/master $sha1
+	git checkout master &&
+	git checkout origin/master &&
+	git checkout master &&
+	refname=$(git check-ref-format --branch @{-1}) &&
+	test "$refname" = "$sha1" &&
+	refname2=$(git check-ref-format --branch @{-2}) &&
+	test "$refname2" = master'
+
+test_done
-- 
1.6.5.rc1.199.g596ec

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] Documentation: describe check-ref-format --branch
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
  2009-10-12  5:27     ` [PATCH 1/4] Add tests for git check-ref-format Jonathan Nieder
@ 2009-10-12  5:28     ` Jonathan Nieder
  2009-10-12  5:31     ` [PATCH/RFC 3/4] git check-ref-format --print Jonathan Nieder
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  5:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

Unless one already knew, it was not obvious what sort of shorthand
"git check-ref-format --branch" expands.  Explain it.

The --branch argument is not optional.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Documentation/git-check-ref-format.txt |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-check-ref-format.txt b/Documentation/git-check-ref-format.txt
index 0b7982e..e9b3b40 100644
--- a/Documentation/git-check-ref-format.txt
+++ b/Documentation/git-check-ref-format.txt
@@ -9,7 +9,7 @@ SYNOPSIS
 --------
 [verse]
 'git check-ref-format' <refname>
-'git check-ref-format' [--branch] <branchname-shorthand>
+'git check-ref-format' --branch <branchname-shorthand>
 
 DESCRIPTION
 -----------
@@ -63,8 +63,11 @@ reference name expressions (see linkgit:git-rev-parse[1]):
 
 . at-open-brace `@{` is used as a notation to access a reflog entry.
 
-With the `--branch` option, it expands a branch name shorthand and
-prints the name of the branch the shorthand refers to.
+With the `--branch` option, it expands the ``previous branch syntax''
+`@{-n}`.  For example, `@{-1}` is a way to refer the last branch you
+were on.  This option should be used by porcelains to accept this
+syntax anywhere a branch name is expected, so they can act as if you
+typed the branch name.
 
 EXAMPLE
 -------
-- 
1.6.5.rc1.199.g596ec

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH/RFC 3/4] git check-ref-format --print
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
  2009-10-12  5:27     ` [PATCH 1/4] Add tests for git check-ref-format Jonathan Nieder
  2009-10-12  5:28     ` [PATCH 2/4] Documentation: describe check-ref-format --branch Jonathan Nieder
@ 2009-10-12  5:31     ` Jonathan Nieder
  2009-10-12 14:39       ` Shawn O. Pearce
  2009-10-12 23:36       ` Junio C Hamano
  2009-10-12  5:33     ` [PATCH/RFC 4/4] check-ref-format: simplify --print implementation Jonathan Nieder
  2009-10-12  5:45     ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
  4 siblings, 2 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  5:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

Tolerating empty path components in ref names means each ref does
not have a unique name.  This creates difficulty for porcelains
that want to see if two branches are equal.  Add a helper associating
to each ref a canonical name.

If a user asks a porcelain to create a ref "refs/heads//master",
the porcelain can run "git check-ref-format --print refs/heads//master"
and only deal with "refs/heads/master" from then on.

In the future, it would be very nice if this command could be
modified to transform Unicode ref names to some appropriate
normalization form, to make Unicode ref names usable in Mac OS X,
too, and less confusing everywhere.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Documentation/git-check-ref-format.txt |   25 +++++++++++++++++++------
 builtin-check-ref-format.c             |   10 ++++++++++
 t/t1402-check-ref-format.sh            |   17 +++++++++++++++++
 3 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-check-ref-format.txt b/Documentation/git-check-ref-format.txt
index e9b3b40..0aeef24 100644
--- a/Documentation/git-check-ref-format.txt
+++ b/Documentation/git-check-ref-format.txt
@@ -9,6 +9,7 @@ SYNOPSIS
 --------
 [verse]
 'git check-ref-format' <refname>
+'git check-ref-format' --print <refname>
 'git check-ref-format' --branch <branchname-shorthand>
 
 DESCRIPTION
@@ -63,19 +64,31 @@ reference name expressions (see linkgit:git-rev-parse[1]):
 
 . at-open-brace `@{` is used as a notation to access a reflog entry.
 
+With the `--print` option, if 'refname' is acceptable, it prints the
+canonicalized name of a hypothetical reference with that name.  That is,
+it prints 'refname' with any extra `/` characters removed.
+
 With the `--branch` option, it expands the ``previous branch syntax''
 `@{-n}`.  For example, `@{-1}` is a way to refer the last branch you
 were on.  This option should be used by porcelains to accept this
 syntax anywhere a branch name is expected, so they can act as if you
 typed the branch name.
 
-EXAMPLE
--------
-
-git check-ref-format --branch @{-1}::
-
-Print the name of the previous branch.
+EXAMPLES
+--------
 
+* Print the name of the previous branch:
++
+------------
+$ git check-ref-format --branch @{-1}
+------------
+
+* Determine the reference name to use for a new branch:
++
+------------
+$ ref=$(git check-ref-format --print "refs/heads/$newbranch") ||
+die "we do not like '$newbranch' as a branch name."
+------------
 
 GIT
 ---
diff --git a/builtin-check-ref-format.c b/builtin-check-ref-format.c
index f9381e0..b97b61a 100644
--- a/builtin-check-ref-format.c
+++ b/builtin-check-ref-format.c
@@ -17,6 +17,16 @@ int cmd_check_ref_format(int argc, const char **argv, const char *prefix)
 		printf("%s\n", sb.buf + 11);
 		exit(0);
 	}
+	if (argc == 3 && !strcmp(argv[1], "--print")) {
+		char *refname = xmalloc(strlen(argv[2]) + 1);
+
+		if (check_ref_format(argv[2]))
+			exit(1);
+		if (normalize_path_copy(refname, argv[2]))
+			die("Could not normalize ref name '%s'", argv[2]);
+		printf("%s\n", refname);
+		exit(0);
+	}
 	if (argc != 2)
 		usage("git check-ref-format refname");
 	return !!check_ref_format(argv[1]);
diff --git a/t/t1402-check-ref-format.sh b/t/t1402-check-ref-format.sh
index 382bc6e..eb45afb 100644
--- a/t/t1402-check-ref-format.sh
+++ b/t/t1402-check-ref-format.sh
@@ -41,4 +41,21 @@ test_expect_success "check-ref-format --branch @{-1}" '
 	refname2=$(git check-ref-format --branch @{-2}) &&
 	test "$refname2" = master'
 
+valid_ref_normalized() {
+	test_expect_success "ref name '$1' simplifies to '$2'" "
+		refname=\$(git check-ref-format --print '$1') &&
+		test \"\$refname\" = '$2'"
+}
+invalid_ref_normalized() {
+	test_expect_success "check-ref-format --print rejects '$1'" "
+		test_must_fail git check-ref-format --print '$1'"
+}
+
+valid_ref_normalized 'heads/foo' 'heads/foo'
+valid_ref_normalized 'refs///heads/foo' 'refs/heads/foo'
+invalid_ref_normalized 'foo'
+invalid_ref_normalized 'heads/foo/../bar'
+invalid_ref_normalized 'heads/./foo'
+invalid_ref_normalized 'heads\foo'
+
 test_done
-- 
1.6.5.rc1.199.g596ec

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH/RFC 4/4] check-ref-format: simplify --print implementation
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
                       ` (2 preceding siblings ...)
  2009-10-12  5:31     ` [PATCH/RFC 3/4] git check-ref-format --print Jonathan Nieder
@ 2009-10-12  5:33     ` Jonathan Nieder
  2009-10-12  5:45     ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  5:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

normalize_path_copy() is a complicated function, but most of its
functionality will never apply to a ref name that has been checked
with check_ref_format().

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 builtin-check-ref-format.c |   25 +++++++++++++++++++++++--
 1 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/builtin-check-ref-format.c b/builtin-check-ref-format.c
index b97b61a..e439136 100644
--- a/builtin-check-ref-format.c
+++ b/builtin-check-ref-format.c
@@ -7,6 +7,28 @@
 #include "builtin.h"
 #include "strbuf.h"
 
+/*
+ * Replace each run of adjacent slashes in src with a single slash,
+ * and write the result to dst.
+ *
+ * This function is similar to normalize_path_copy(), but stripped down
+ * to meet check_ref_format's simpler needs.
+ */
+static void collapse_slashes(char *dst, const char *src)
+{
+	char ch;
+	char prev = '\0';
+
+	while (ch = *src++) {
+		if (prev == '/' && ch == prev)
+			continue;
+
+		*dst++ = ch;
+		prev = ch;
+	}
+	*dst = '\0';
+}
+
 int cmd_check_ref_format(int argc, const char **argv, const char *prefix)
 {
 	if (argc == 3 && !strcmp(argv[1], "--branch")) {
@@ -22,8 +44,7 @@ int cmd_check_ref_format(int argc, const char **argv, const char *prefix)
 
 		if (check_ref_format(argv[2]))
 			exit(1);
-		if (normalize_path_copy(refname, argv[2]))
-			die("Could not normalize ref name '%s'", argv[2]);
+		collapse_slashes(refname, argv[2]);
 		printf("%s\n", refname);
 		exit(0);
 	}
-- 
1.6.5.rc1.199.g596ec

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 0/4] plumbing to help fix git-gui
  2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
                       ` (3 preceding siblings ...)
  2009-10-12  5:33     ` [PATCH/RFC 4/4] check-ref-format: simplify --print implementation Jonathan Nieder
@ 2009-10-12  5:45     ` Jonathan Nieder
  4 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-12  5:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

Jonathan Nieder wrote:
> Junio C Hamano wrote:

>> Perhaps an interface to give a cleaned-up version, e.g.
>> 
>>     $ git check-ref-format --print refs/heads//foo/bar
>>     refs/heads/foo/bar
>> 
>> is what you want in order to fix git-gui?  I dunno.
> 
> The following packages do exactly that.

s/packages/patches/

I better sleep...

Sorry for the confusion,
Jonathan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 3/4] git check-ref-format --print
  2009-10-12  5:31     ` [PATCH/RFC 3/4] git check-ref-format --print Jonathan Nieder
@ 2009-10-12 14:39       ` Shawn O. Pearce
  2009-10-12 21:06         ` Junio C Hamano
  2009-10-12 23:36       ` Junio C Hamano
  1 sibling, 1 reply; 17+ messages in thread
From: Shawn O. Pearce @ 2009-10-12 14:39 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Jens Lehmann, git

Jonathan Nieder <jrnieder@gmail.com> wrote:
> +valid_ref_normalized 'heads/foo' 'heads/foo'
> +valid_ref_normalized 'refs///heads/foo' 'refs/heads/foo'
> +invalid_ref_normalized 'foo'
> +invalid_ref_normalized 'heads/foo/../bar'
> +invalid_ref_normalized 'heads/./foo'
> +invalid_ref_normalized 'heads\foo'

What about '/refs/heads/foo'?  Shouldn't that drop the leading /?

I actually had someone enter that into Gerrit Code Review once,
exposing a bug I have yet to fix that permits that as a valid
branch name.  *sigh*

FWIW, I think this is heading in the right direction.  Rather
than teaching the UIs how clean up a name, give us a tool to
do the normalization and validation, and let us call it when
we get user input.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 3/4] git check-ref-format --print
  2009-10-12 14:39       ` Shawn O. Pearce
@ 2009-10-12 21:06         ` Junio C Hamano
  2009-10-12 23:26           ` Shawn O. Pearce
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2009-10-12 21:06 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Jonathan Nieder, Junio C Hamano, Jens Lehmann, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Jonathan Nieder <jrnieder@gmail.com> wrote:
>> +valid_ref_normalized 'heads/foo' 'heads/foo'
>> +valid_ref_normalized 'refs///heads/foo' 'refs/heads/foo'
>> +invalid_ref_normalized 'foo'
>> +invalid_ref_normalized 'heads/foo/../bar'
>> +invalid_ref_normalized 'heads/./foo'
>> +invalid_ref_normalized 'heads\foo'
>
> What about '/refs/heads/foo'?  Shouldn't that drop the leading /?
>
> I actually had someone enter that into Gerrit Code Review once,
> exposing a bug I have yet to fix that permits that as a valid
> branch name.  *sigh*
>
> FWIW, I think this is heading in the right direction.  Rather
> than teaching the UIs how clean up a name, give us a tool to
> do the normalization and validation, and let us call it when
> we get user input.

I understand that you prefer the latter between "there is no tool; the
caller is responsibile to make sure it feeds us canonical representation"
and "there is a tool that makes a slightly malformed string into canonical
form for the callers to use before calling us."  And that would be my
preference between these two as well.

But that is based on the current behaviour of accepting slightly malformed
and silently making it canonical.  If we throw a third alternative,
Jonathan's original patch, that did "we reject malformed string", in the
mix, what would be your preference?

I moderately favour the "tool to canonicalize is given, and it would be a
bug for the caller not to use it" approach this series takes primarily
because that approach won't break scripts that do something like this to
make a new branch, optionally grouped by the owner (e.g. 'sp/smart-http'
or 'next'):

	owner=$1 topic=$2
	branch=$owner/$topic
        git branch "$branch"

This currently works as expected as long as it does not later try to
compare for-each-ref output with "refs/heads/$branch" and expects a match.
With the third approach, the optional username grouping becomes mandatory
for such a script, as 'git branch' will error out.  To keep the grouping
still optinal, such a script needs to be written perhaps like:

	if test -z "$2"
        then
		branch="$1"
	else
        	branch="$1/$2"
	fi

and that would be a hassle for the scripters, but this _could_ be a kind
of backward incompatible tightening we might want to consider for 1.7.0,
as somebody suggested earlier.

But now I have spelled this out, I do not see much upside for rejecting,
and more importantly, I think it would be an independent issue.  We can
reject or just keep normalizing silently, and a tool to show the
normalized name would be useful and necessary regardless of that.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 3/4] git check-ref-format --print
  2009-10-12 21:06         ` Junio C Hamano
@ 2009-10-12 23:26           ` Shawn O. Pearce
  0 siblings, 0 replies; 17+ messages in thread
From: Shawn O. Pearce @ 2009-10-12 23:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Jens Lehmann, git

Junio C Hamano <gitster@pobox.com> wrote:
> I understand that you prefer the latter between "there is no tool; the
> caller is responsibile to make sure it feeds us canonical representation"
> and "there is a tool that makes a slightly malformed string into canonical
> form for the callers to use before calling us."  And that would be my
> preference between these two as well.
...
> But now I have spelled this out, I do not see much upside for rejecting,
> and more importantly, I think it would be an independent issue.  We can
> reject or just keep normalizing silently, and a tool to show the
> normalized name would be useful and necessary regardless of that.

I agree with the last paragraph here, we shouldn't reject, but
instead keep the current state but encourage tools to use the new
canonical print tool to clean up a name if they want to hang onto the
string the user entered and it needs to exactly match for-each-ref
sort of output.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 3/4] git check-ref-format --print
  2009-10-12  5:31     ` [PATCH/RFC 3/4] git check-ref-format --print Jonathan Nieder
  2009-10-12 14:39       ` Shawn O. Pearce
@ 2009-10-12 23:36       ` Junio C Hamano
  2009-10-13  4:49         ` Jonathan Nieder
  1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2009-10-12 23:36 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Jens Lehmann, git, Shawn O. Pearce

Jonathan Nieder <jrnieder@gmail.com> writes:

> In the future, it would be very nice if this command could be
> modified to transform Unicode ref names to some appropriate
> normalization form, to make Unicode ref names usable in Mac OS X,
> too, and less confusing everywhere.

I do not disagree with a desire to help fixing the unicode insanity on
that platform, but I suspect that check-ref-format is a wrong place to
tackle the issue.  You would need a similar filter for outputs from the
likes of ls-files and "diff --name-only", iow, anything that deal with
pathnames, no?

It would have be something like "check-ref-format --print | iconv ..."
pipeline (conceptually, if not forcing the pipeline to the end users, that
is).

Also are people happy with "--print"?  I was waiting for others to reword
it to "--normalize" or something like that.  In your documentation you
explain it to "canonicalize", and in your tests you name the output
"normalized".  I am fine with either wording, but would like to see us
using only one, not both.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 3/4] git check-ref-format --print
  2009-10-12 23:36       ` Junio C Hamano
@ 2009-10-13  4:49         ` Jonathan Nieder
  0 siblings, 0 replies; 17+ messages in thread
From: Jonathan Nieder @ 2009-10-13  4:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jens Lehmann, git, Shawn O. Pearce

Junio C Hamano wrote:

> I do not disagree with a desire to help fixing the unicode insanity on
> that platform, but I suspect that check-ref-format is a wrong place to
> tackle the issue.  You would need a similar filter for outputs from the
> likes of ls-files and "diff --name-only", iow, anything that deal with
> pathnames, no?
> 
> It would have be something like "check-ref-format --print | iconv ..."
> pipeline (conceptually, if not forcing the pipeline to the end users, that
> is).

GNU iconv does not write the various Unicode normalization forms, so it
would have to be something like "check-ref-format --print | charlint ..."
instead.  Regardless of the filesystem, it seems reasonable to consider é
(U+00e9) and é (U+0065 + U+0301) the same character when it appears in a
ref name, and one way to achieve this would be to pick one normalization
form and stick to it.  This does not seem so different from stripping out
empty path components.

As a side effect, that would deal with OS X’s strange handling of unicode
filenames for .git/refs/*.  Now that I think about it, if fighting OS X
were the only problem that needed to be solved, I don’t think I’d like
this solution so much.  The analogous solution to the also unsolved issue
of case insensitive filesystems is to force all ref names to lowercase.
Do we want to do that?  (The case insensitivity issue might not be as bad,
since the relevant filesystems will at least _preserve_ the case of
filenames in .git/refs.  Should we copy them and smudge new ref names to
match known ones that differ only in case?  Just thinking about the
problem makes me cringe.)

Coping with the unicode filename craziness in the working tree is a
separate issue, though probably a more important one.  I think Linus set
up a framework for solving it in
<http://thread.gmane.org/gmane.comp.version-control.git/77827>.

Regards,
Jonathan

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2009-10-13  4:43 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-10 17:49 [PATCH] disallow refs containing successive slashes Jens Lehmann
2009-10-10 21:50 ` Junio C Hamano
2009-10-11 10:42   ` Jens Lehmann
2009-10-11 18:52     ` Junio C Hamano
2009-10-12  0:31       ` Jonathan Nieder
2009-10-12  2:47       ` Nanako Shiraishi
2009-10-12  5:25   ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder
2009-10-12  5:27     ` [PATCH 1/4] Add tests for git check-ref-format Jonathan Nieder
2009-10-12  5:28     ` [PATCH 2/4] Documentation: describe check-ref-format --branch Jonathan Nieder
2009-10-12  5:31     ` [PATCH/RFC 3/4] git check-ref-format --print Jonathan Nieder
2009-10-12 14:39       ` Shawn O. Pearce
2009-10-12 21:06         ` Junio C Hamano
2009-10-12 23:26           ` Shawn O. Pearce
2009-10-12 23:36       ` Junio C Hamano
2009-10-13  4:49         ` Jonathan Nieder
2009-10-12  5:33     ` [PATCH/RFC 4/4] check-ref-format: simplify --print implementation Jonathan Nieder
2009-10-12  5:45     ` [PATCH/RFC 0/4] plumbing to help fix git-gui Jonathan Nieder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).