* Possible bug with branch names and case sensitivity
@ 2011-11-19 20:08 Gerd Knops
  2011-11-21 19:18 ` Jay Soffian
  0 siblings, 1 reply; 11+ messages in thread
From: Gerd Knops @ 2011-11-19 20:08 UTC (permalink / raw)
  To: git

Hi All,

On Mac OS X with a case-insensitive file system (not sure if that matters) git gets confused with branch names that differ only in case. Here is what happened as far as I can reconstruct:

While in a branch named "foundry" I accidentally

	git checkout Crucible

instead of "crucible". This appears to have made a local branch "Crucible" of the remote tracking "crucible" branch (had I done this on purpose, I would have expected "Crucible" to be a branch of "foundry" instead of a branch of "crucible").

I made some changes, committed, and pushed, only to be puzzled that no changes were pushed upstream.

At this point "git branch -a" showed:

	* Crucible
	  foundry
	  master
	  remotes/origin/DAExceptions
	  remotes/origin/HEAD -> origin/master
	  remotes/origin/centerSectionOptimizer
	  remotes/origin/crucible
	  remotes/origin/foundry
	  remotes/origin/ipad
	  remotes/origin/master

So naturally I proceeded with

	git checkout crucible
	git merge Crucible

only to see "Already up-to-date."

Not sure if any of this is expected behavior, but to me it didn't feel like it.

Thanks

Gerd

PS: here is how I "fixed" this:

	git checkout Crucible
	git reset --soft HEAD^
	git stash
	git stash apply
	
added, committed, pushed. BTW now "git branch -a" shows:

	* crucible
	  foundry
	  master
	  remotes/origin/DAExceptions
	  remotes/origin/HEAD -> origin/master
	  remotes/origin/centerSectionOptimizer
	  remotes/origin/crucible
	  remotes/origin/foundry
	  remotes/origin/ipad
	  remotes/origin/master

No trace of the "Crucible" branch.


* Re: Possible bug with branch names and case sensitivity
  2011-11-19 20:08 Possible bug with branch names and case sensitivity Gerd Knops
@ 2011-11-21 19:18 ` Jay Soffian
  2011-11-22  5:21   ` Michael Haggerty
  0 siblings, 1 reply; 11+ messages in thread
From: Jay Soffian @ 2011-11-21 19:18 UTC (permalink / raw)
  To: Gerd Knops; +Cc: git

On Sat, Nov 19, 2011 at 3:08 PM, Gerd Knops <gerti@bitart.com> wrote:
> On Mac OS X with a case-insensitive file system (not sure if that matters) git gets confused with branch names that differ only in case.

This is true. The branch code assumes a case-sensitive filesystem. I
started working on a fix, but it was more involved than I first
thought it would be. See my local WIP commit below, apologies if Gmail
line-wraps it.

j.

commit dfa86073b7
Author: Jay Soffian <jaysoffian@gmail.com>
Date:   Thu Oct 6 14:51:15 2011 -0400

    Try not to confuse branch foo with branch Foo (WIP)

    This probably needs to canonicalize the branch name instead. Sigh.

diff --git a/builtin/checkout.c b/builtin/checkout.c
index a41c818a7c..0e7362345d 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -363,7 +363,7 @@ static void setup_branch_path(struct branch_info *branch)
 	struct strbuf buf = STRBUF_INIT;

 	strbuf_branchname(&buf, branch->name);
-	if (strcmp(buf.buf, branch->name))
+	if (strcmp_icase(buf.buf, branch->name))
 		branch->name = xstrdup(buf.buf);
 	strbuf_splice(&buf, 0, 0, "refs/heads/", 11);
 	branch->path = strbuf_detach(&buf, NULL);
@@ -523,7 +523,7 @@ static void record_checkout(const char *name, const char *new_work_tree)
 	} else { /* release name if we reserved it */
 		struct branch *branch = branch_get(name);
 		if (branch->work_tree &&
-		    !strcmp(branch->work_tree, get_git_work_tree()))
+		    !strcmp_icase(branch->work_tree, get_git_work_tree()))
 			git_config_set(key.buf, "");
 	}
 	strbuf_release(&key);
@@ -567,7 +567,7 @@ static void update_refs_for_switch(struct checkout_opts *opts,
 	strbuf_addf(&msg, "checkout: moving from %s to %s",
 		    old_desc ? old_desc : "(invalid)", new->name);

-	if (!strcmp(new->name, "HEAD") && !new->path && !opts->force_detach) {
+	if (!strcmp_icase(new->name, "HEAD") && !new->path && !opts->force_detach) {
 		/* Nothing to do. */
 	} else if (opts->force_detach || !new->path) {	/* No longer on any branch. */
 		update_ref(msg.buf, "HEAD", new->commit->object.sha1, NULL,
@@ -582,7 +582,7 @@ static void update_refs_for_switch(struct checkout_opts *opts,
 	} else if (new->path) {	/* Switch branches. */
 		create_symref("HEAD", new->path, msg.buf);
 		if (!opts->quiet) {
-			if (old->path && !strcmp(new->path, old->path)) {
+			if (old->path && !strcmp_icase(new->path, old->path)) {
 				fprintf(stderr, _("Already on '%s'\n"),
 					new->name);
 			} else if (opts->new_branch) {
@@ -612,7 +612,7 @@ static void update_refs_for_switch(struct checkout_opts *opts,
 	remove_branch_state();
 	strbuf_release(&msg);
 	if (!opts->quiet &&
-	    (new->path || (!opts->force_detach && !strcmp(new->name, "HEAD"))))
+	    (new->path || (!opts->force_detach && !strcmp_icase(new->name, "HEAD"))))
 		report_tracking(new);
 }

@@ -719,7 +719,7 @@ static void check_if_checked_out(struct checkout_opts *opts, const char *name)
 {
 	struct branch *branch = branch_get(name);
 	if (branch->work_tree && strlen(branch->work_tree) &&
-	    strcmp(branch->work_tree, get_git_work_tree())) {
+	    strcmp_icase(branch->work_tree, get_git_work_tree())) {
 		if (opts->force)
 			warning(_("branch '%s' is currently checked out"
 				  " in '%s'"), name, branch->work_tree);
diff --git a/remote.c b/remote.c
index 283b2121bd..1fba1c7fa3 100644
--- a/remote.c
+++ b/remote.c
@@ -166,9 +166,9 @@ static struct branch *make_branch(const char *name, int len)
 	char *refname;

 	for (i = 0; i < branches_nr; i++) {
-		if (len ? (!strncmp(name, branches[i]->name, len) &&
+		if (len ? (!strncmp_icase(name, branches[i]->name, len) &&
 			   !branches[i]->name[len]) :
-		    !strcmp(name, branches[i]->name))
+		    !strcmp_icase(name, branches[i]->name))
 			return branches[i];
 	}

@@ -829,7 +829,7 @@ static int query_refspecs(struct refspec *refs, int ref_count, struct refspec *q
 				query->force = refspec->force;
 				return 0;
 			}
-		} else if (!strcmp(needle, key)) {
+		} else if (!strcmp_icase(needle, key)) {
 			*result = xstrdup(value);
 			query->force = refspec->force;
 			return 0;


* Re: Possible bug with branch names and case sensitivity
  2011-11-21 19:18 ` Jay Soffian
@ 2011-11-22  5:21   ` Michael Haggerty
  2011-11-22 17:31     ` Jay Soffian
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Michael Haggerty @ 2011-11-22  5:21 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Gerd Knops, git

On 11/21/2011 08:18 PM, Jay Soffian wrote:
> On Sat, Nov 19, 2011 at 3:08 PM, Gerd Knops <gerti@bitart.com>
> wrote:
>> On Mac OS X with a case-insensitive file system (not sure if that
>> matters) git gets confused with branch names that differ only in
>> case.
> 
> This is true. The branch code assumes a case-sensitive filesystem. I 
> started working on a fix, but it was more involved than I first 
> thought it would be. See my local WIP commit below, apologies if
> Gmail line-wraps it.

Is it obvious how references *should* be handled on case-insensitive
filesystems?  It's certainly not obvious to me (has it been discussed
elsewhere?)  I don't think it is a good idea to "fix" this one problem
without defining an overall policy.

Currently git handles reference names case-sensitively and allows
multiple reference names that differ only in case.  If this behavior is
to be preserved on case-insensitive filesystems, then either loose
references must be stored differently (e.g., multiple references in the
same file) or ambiguous references need always to be packed.  Moreover,
given a refname, we would need to be careful not to just try to open a
file with that name and assume that it is the correct reference; rather,
we would have to ask the filesystem for the name of the file in its
original case and make sure that it agrees with the case of the refname
that we seek.
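
The check described above (compare the on-disk spelling rather than
trusting that an open succeeds) might look roughly like this in shell.
This is a minimal sketch, not something git does: exact_case_exists is a
made-up helper name, and it inspects a single directory level only, so a
nested refname such as refs/heads/topic/x would need one call per path
component.

```shell
# Hypothetical helper, not part of git: succeed only if DIR contains an
# entry whose name matches NAME byte for byte, including case.
exact_case_exists() {
	dir=$1
	name=$2
	# ls reports names as they are stored on disk; grep -F -x demands an
	# exact whole-line match, so "foo" does not match a stored "Foo".
	ls -A -- "$dir" 2>/dev/null | grep -F -x -q -e "$name"
}
```

With a loose ref stored as Crucible, "exact_case_exists .git/refs/heads
Crucible" succeeds, while the same call with "crucible" fails, even on a
case-insensitive filesystem where opening .git/refs/heads/crucible would
succeed.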

By the way, this could have ramifications for the recently-added test
that top-level refnames should be in ALL_CAPS.

If we want to consider bending git's behavior, there are a number of
ways we could go:

1. Remain case-sensitive but prohibit refnames that differ only in case.

2. Remain case-sensitive but prohibit refnames that differ only in case
*when running on a case-insensitive filesystem*.

3. Change the handling of refnames to be case-insensitive but
case-preserving.

The above all assumes a case-insensitive filesystem that is
*case-preserving*.  If we want to support filesystems that do not
preserve case, things get even more complicated.

And if we want to pretend to support non-ASCII refnames, then the issue
of encodings is another nasty can of worms...
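
Options 1 and 2 both depend on finding refnames that differ only in
case. A minimal sketch of that detection follows, assuming ASCII-only
case folding. case_collisions is a hypothetical helper name; git has no
such command.

```shell
# Hypothetical helper: read one refname per line on stdin and print every
# name that has a twin differing only in ASCII case.
# LC_ALL=C keeps tolower() and sort byte-oriented, so non-ASCII bytes are
# compared as-is and never folded.
case_collisions() {
	LC_ALL=C sort -u |
	LC_ALL=C awk '
	{
		key = tolower($0)
		count[key]++
		# Collect all spellings that share one folded key.
		names[key] = (count[key] > 1) ? names[key] "\n" $0 : $0
	}
	END {
		for (key in count)
			if (count[key] > 1)
				print names[key]
	}' |
	LC_ALL=C sort
}
```

It could be fed from an existing repository with
"git for-each-ref --format='%(refname)' | case_collisions"; empty output
means no two refs collide under ASCII case folding.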

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/


* Re: Possible bug with branch names and case sensitivity
  2011-11-22  5:21   ` Michael Haggerty
@ 2011-11-22 17:31     ` Jay Soffian
  2011-11-23  8:54       ` Michael Haggerty
  2011-11-23 20:50       ` Ævar Arnfjörð Bjarmason
  2011-11-22 17:49     ` Junio C Hamano
  2011-11-22 18:01     ` Junio C Hamano
  2 siblings, 2 replies; 11+ messages in thread
From: Jay Soffian @ 2011-11-22 17:31 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Gerd Knops, git

On Tue, Nov 22, 2011 at 12:21 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Is it obvious how references *should* be handled on case-insensitive
> filesystems?  It's certainly not obvious to me (has it been discussed
> elsewhere?)  I don't think it is a good idea to "fix" this one problem
> without defining an overall policy.

Indeed, I hadn't thought this through very well at all. My initial
take was just that if I were on a case-insensitive file system, I
don't get to have references that differ only in case. This is of
course quite short-sighted in a distributed VCS. :-(

> Currently git handles reference names case-sensitively and allows
> multiple reference names that differ only in case.  If this behavior is
> to be preserved on case-insensitive filesystems, then either loose
> references must be stored differently (e.g., multiple references in the
> same file) or ambiguous references need always to be packed.  Moreover,
> given a refname, we would need to be careful not to just try to open a
> file with that name and assume that it is the correct reference; rather,
> we would have to ask the filesystem for the name of the file in its
> original case and make sure that it agrees with the case of the refname
> that we seek.

I wonder what the downside would be of always using packed refs on
case-insensitive file systems. This would seem analogous to how git no
longer uses symlinks.

> By the way, this could have ramifications for the recently-added test
> that top-level refnames should be in ALL_CAPS.
>
> If we want to consider bending git's behavior, there are a number of
> ways we could go:
>
> 1. Remain case-sensitive but prohibit refnames that differ only in case.
>
> 2. Remain case-sensitive but prohibit refnames that differ only in case
> *when running on a case-insensitive filesystem*.
>
> 3. Change the handling of refnames to be case-insensitive but
> case-preserving.
>
> The above all assumes a case-insensitive filesystem that is
> *case-preserving*.  If we want to support filesystems that do not
> preserve case, things get even more complicated.
>
> And if we want to pretend to support non-ASCII refnames, then the issue
> of encodings is another nasty can of worms...

These all seem like sub-optimal things to do if we can just always
use packed refs.

j.


* Re: Possible bug with branch names and case sensitivity
  2011-11-22  5:21   ` Michael Haggerty
  2011-11-22 17:31     ` Jay Soffian
@ 2011-11-22 17:49     ` Junio C Hamano
  2011-11-23  9:22       ` Michael Haggerty
  2011-11-22 18:01     ` Junio C Hamano
  2 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2011-11-22 17:49 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jay Soffian, Gerd Knops, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Is it obvious how references *should* be handled on case-insensitive
> filesystems?  It's certainly not obvious to me (has it been discussed
> elsewhere?)  I don't think it is a good idea to "fix" this one problem
> without defining an overall policy.

Thanks for a very sane comment.

> Currently git handles reference names case-sensitively and allows
> multiple reference names that differ only in case.

We do the same for in-tree paths, by the way.  Ultimately, I think the
sane thing to do is to appeal to the user's common sense.  In a project
where its participants may use, or in a project that is about, a platform
where a case-folding filesystem is the default choice, the project would
avoid in-tree paths that are different only in case and would not have
xt_TCPMSS.c and xt_tcpmss.c at the same time.  Even though Git allows you
on such a platform to add a case-conflicting pair of paths by using
"update-index --cacheinfo", people would not do that, because it is not a
useful thing to do. And Git by default does not forbid recording such a pair
of paths, as projects may for whatever reason want to use such a pair of
paths if they know their participants can deal with case sensitivity just
fine.

I think refnames have exactly the same issue. In theory, you could have
"Master" and "master" branches, and nothing stops you from trying to do
so, but in practice, if it is not useful for you and your project, and
if it is equally fine to use some other name instead of "Master" for the
purpose of you and your project, then there is no strong reason for doing
so, unless you are trying to irritate users on case folding platforms.


* Re: Possible bug with branch names and case sensitivity
  2011-11-22  5:21   ` Michael Haggerty
  2011-11-22 17:31     ` Jay Soffian
  2011-11-22 17:49     ` Junio C Hamano
@ 2011-11-22 18:01     ` Junio C Hamano
  2 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2011-11-22 18:01 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jay Soffian, Gerd Knops, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> If we want to consider bending git's behavior, there are a number of
> ways we could go:
>
> 1. Remain case-sensitive but prohibit refnames that differ only in case.

I do not see a strong enough reason to be that draconian.

> 2. Remain case-sensitive but prohibit refnames that differ only in case
> *when running on a case-insensitive filesystem*.

If you make it conditional, it should be per-project, not per-repository.
You may be participating in a cross platform project and you may happen to
be on a case-sensitive system, but the absence of such a check for you may
end up hurting other participants who work on a case-insensitive one.

> 3. Change the handling of refnames to be case-insensitive but
> case-preserving.

I do not see that it is worth the effort. If you were to expend that much
effort, then I could see that in the longer term (now I am talking about
Git 2.0 in this paragraph) one solution is to remove the on-filesystem
$GIT_DIR/refs/ hierarchy and put it in a trivial database of some sort,
keyed with case-sensitive strings.

The transfer of refs over the wire will stay case sensitive, so such a
change would be purely local to the repository, and transition would only
matter if you network mount a new style repository and attempt to use it
with an older version of Git.

If you go that route, we still would need to think about how to deal with
the $GIT_DIR/logs/ hierarchy, though.


* Re: Possible bug with branch names and case sensitivity
  2011-11-22 17:31     ` Jay Soffian
@ 2011-11-23  8:54       ` Michael Haggerty
  2011-11-23 20:50       ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 11+ messages in thread
From: Michael Haggerty @ 2011-11-23  8:54 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Gerd Knops, git, Junio C Hamano

On 11/22/2011 06:31 PM, Jay Soffian wrote:
> I wonder what the downside would be of always using packed refs on
> case-insensitive file systems. This would seem analogous to how git no
> longer uses symlinks.

The theoretical downside is that when the total number of packed refs is
very large, it is more expensive to access or change a single ref if it
is packed than if it is loose (because the whole packed refs file has to
be read, parsed, then rewritten, and thus scales like O(N)).  OTOH the
number of references must be quite large before loose references win,
because the constant factor for loose references is much larger than
that for packed references.  I also believe that there is still scope
for optimizing the handling of packed references to make them yet faster
and perhaps even improve their scaling.
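
To illustrate the linear scan mentioned above, here is a rough sketch of
looking up one ref in a packed-refs style file with an exact,
case-sensitive string comparison. packed_ref_lookup is a hypothetical
helper. The sketch assumes the usual layout of one "<object name>
<refname>" pair per line, with "#" header lines and "^" peeled-tag lines
skipped, and says nothing about how git's own reader is implemented.

```shell
# Hypothetical helper: print the object name recorded for REFNAME in a
# packed-refs style FILE; fail if no line matches exactly.
packed_ref_lookup() {
	file=$1
	refname=$2
	awk -v want="$refname" '
		/^#/  { next }	# header comment
		/^\^/ { next }	# peeled value of the preceding tag
		# Force a string comparison so case is always significant.
		($2 "") == (want "") { print $1; found = 1; exit }
		END { exit !found }
	' "$file"
}
```

Because the comparison happens on file contents rather than file names,
refs/heads/Crucible and refs/heads/crucible stay distinct regardless of
the filesystem, at the price of reading lines until the match is found.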

But I think that a lot of code would have to change to make this happen.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/


* Re: Possible bug with branch names and case sensitivity
  2011-11-22 17:49     ` Junio C Hamano
@ 2011-11-23  9:22       ` Michael Haggerty
  2011-11-23 18:59         ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Haggerty @ 2011-11-23  9:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jay Soffian, Gerd Knops, git

On 11/22/2011 06:49 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> Currently git handles reference names case-sensitively and allows
>> multiple reference names that differ only in case.
> 
> We do the same for in-tree paths, by the way.  Ultimately, I think the
> sane thing to do is to appeal to the user's common sense.  [...common
> sense aka "if it hurts don't do it" omitted...]
> 
> I think refnames have exactly the same issue. In theory, you could have
> "Master" and "master" branches, and nothing stops you from trying to do
> so, but in practice, if it is not useful for you and your project, and
> if it is equally fine to use some other name instead of "Master" for the
> purpose of you and your project, then there is no strong reason for doing
> so, unless you are trying to irritate users on case folding platforms.

I agree.

But git could nevertheless help users (1) by providing config settings
or hook scripts or something that could be configured in a repository to
prevent case-conflicts from entering the project history; (2) by
emitting an error when such a conflict arises rather than getting so
confused.
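
Point (1) could be approximated with a hook script. What follows is only
a sketch under assumptions: reject_case_twin is a made-up helper, the
folding is ASCII-only, and the caller is expected to supply the existing
refnames on stdin.

```shell
# Hypothetical helper for a hook: given a candidate refname as $1 and the
# existing refnames on stdin, fail if any existing ref differs from the
# candidate only in ASCII case. An identical name is not a conflict.
reject_case_twin() {
	candidate=$1
	LC_ALL=C awk -v want="$candidate" '
		($0 "") != (want "") && tolower($0) == tolower(want) {
			printf "refusing %s: differs only in case from %s\n", want, $0
			bad = 1
		}
		END { exit bad + 0 }
	' >&2
}
```

In an update hook, which receives the name of the ref being updated as
its first argument, it might be called as
"git for-each-ref --format='%(refname)' | reject_case_twin "$1" || exit 1".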

Note that Unicode encoding differences can cause very similar problems
(even assuming utf8, there can be multiple ways to encode the same
string) and should maybe be addressed similarly.

By the way, I'm not volunteering for this project; case-sensitive
ASCII's good enough for me :-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/


* Re: Possible bug with branch names and case sensitivity
  2011-11-23  9:22       ` Michael Haggerty
@ 2011-11-23 18:59         ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2011-11-23 18:59 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jay Soffian, Gerd Knops, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 11/22/2011 06:49 PM, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>> Currently git handles reference names case-sensitively and allows
>>> multiple reference names that differ only in case.
>> 
>> We do the same for in-tree paths, by the way.  Ultimately, I think the
>> sane thing to do is to appeal to the user's common sense.  [...common
>> sense aka "if it hurts don't do it" omitted...]
>> 
>> I think refnames have exactly the same issue. In theory, you could have
>> "Master" and "master" branches, and nothing stops you from trying to do
>> so, but in practice, if it is not useful for you and your project, and
>> if it is equally fine to use some other name instead of "Master" for the
>> purpose of you and your project, then there is no strong reason for doing
>> so, unless you are trying to irritate users on case folding platforms.
>
> I agree.
>
> But git could nevertheless help users (1) by providing config settings
> or hook scripts or something that could be configured in a repository to
> prevent case-conflicts from entering the project history; (2) by
> emitting an error when such a conflict arises rather than getting so
> confused.

Yeah, and you didn't have to say "But"; we are in agreement (see my other
message in response to the same message from you).


* Re: Possible bug with branch names and case sensitivity
  2011-11-22 17:31     ` Jay Soffian
  2011-11-23  8:54       ` Michael Haggerty
@ 2011-11-23 20:50       ` Ævar Arnfjörð Bjarmason
  2011-11-23 22:08         ` Joshua Jensen
  1 sibling, 1 reply; 11+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2011-11-23 20:50 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Michael Haggerty, Gerd Knops, git

On Tue, Nov 22, 2011 at 18:31, Jay Soffian <jaysoffian@gmail.com> wrote:
> I wonder what the downside would be of always using packed refs on
> case-insensitive file systems. This would seem analogous to how git no
> longer uses symlinks.

Note that Git doesn't only have confusing behavior with refs on
case-insensitive filesystems. The other day HFS+ users @ work had
issues because of a case collision in the checked out tree, which
confused git status et al.

Note that HFS+ in particular is case-insensitive *but* case
preserving. E.g.:

    $ touch Foo; perl -wle 'opendir my $d, "."; print while readdir $d; -f and print "yes" for qw(foo Foo FOO)'
    .
    ..
    Foo
    yes
    yes
    yes

On case-insensitive and not-case-preserving systems the third line
would usually print either "foo" or "FOO", but on HFS+ the system
preserves the original name.

This means that you can in some cases figure out what's going on by
doing a readdir() in addition to a stat() as you could do on
POSIX-compliant systems.


* Re: Possible bug with branch names and case sensitivity
  2011-11-23 20:50       ` Ævar Arnfjörð Bjarmason
@ 2011-11-23 22:08         ` Joshua Jensen
  0 siblings, 0 replies; 11+ messages in thread
From: Joshua Jensen @ 2011-11-23 22:08 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jay Soffian, Michael Haggerty, Gerd Knops, git

----- Original Message -----
From: Ævar Arnfjörð Bjarmason
Date: 11/23/2011 1:50 PM
>
>
> Note that Git doesn't only have confusing behavior with refs on
> case-insensitive filesystems. The other day HFS+ users @ work had
> issues because of a case collision in the checked out tree, which
> confused git status et al.
Is core.ignorecase set to true?  Is the repository shared with a
case-sensitive file system?
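
For context on the second question: git-init and git-clone probe the
filesystem when a repository is created and set core.ignorecase
accordingly, so a repository later copied or shared onto a different
kind of filesystem can carry a stale value. A hand-rolled probe along
the same lines might look like the sketch below. fs_ignores_case is a
hypothetical helper, not git's implementation, and it assumes DIR is
writable and holds no file that already matches the probe name.

```shell
# Hypothetical probe: return 0 if DIR is on a filesystem that treats
# names differing only in case as the same file, 1 if it does not, and
# 2 if the probe file could not be created.
fs_ignores_case() {
	dir=$1
	probe="$dir/CaseProbe.$$"
	: > "$probe" || return 2
	# Look the same file up under an all-uppercase spelling.
	if [ -e "$dir/CASEPROBE.$$" ]; then
		result=0
	else
		result=1
	fi
	rm -f "$probe"
	return "$result"
}
```

Comparing its result for the working tree with the output of
"git config --get core.ignorecase" would show whether the recorded
setting still matches the filesystem the repository lives on.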

I have a patch sitting around for 'git update-index --add' that fixes 
some case insensitivity issues, especially when using Git Gui.  This 
patch complements the core.ignorecase patches I sent in the past.

-Josh
