* OS X and umlauts in file names
@ 2009-11-23 16:37 Thomas Singer
  2009-11-23 17:45 ` Thomas Rast
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Thomas Singer @ 2009-11-23 16:37 UTC (permalink / raw)
  To: git

I'm on an English OS X 10.6.2 and I created a sample file with umlauts in
its name (Überlänge.txt). When I try to stage the file in the terminal, I
can't complete the file name by typing the Ü and hitting the tab key, but I
can complete it by typing an U and hitting the tab key. Unfortunately, after
executing

 git stage Überlänge.txt

I invoked

 git status

and it still shows the file as new file. Should I set some environment
variable to be able to work with files containing umlauts in the name?

Thanks in advance,
Tom

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 16:37 OS X and umlauts in file names Thomas Singer
@ 2009-11-23 17:45 ` Thomas Rast
  2009-11-23 18:10   ` Thomas Singer
  2009-11-23 20:26 ` Daniel Barkalow
  2009-11-26 17:23 ` Jay Soffian
  2 siblings, 1 reply; 20+ messages in thread
From: Thomas Rast @ 2009-11-23 17:45 UTC (permalink / raw)
  To: Thomas Singer; +Cc: git

Thomas Singer wrote:
> I'm on an English OS X 10.6.2 and I created a sample file with umlauts in
> its name (Überlänge.txt). When I try to stage the file in the terminal, I
> can't complete the file name by typing the Ü and hitting the tab key, but I
> can complete it by typing an U and hitting the tab key. Unfortunately, after
> executing
> 
>  git stage Überlänge.txt

This is because of OS X's unicode normalisation.  Try any of the
many threads on the topic, e.g.,

  http://thread.gmane.org/gmane.comp.version-control.git/70688

The short version is that this Ü is in fact stored decomposed, as a
U followed by a combining diaeresis.
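
The decomposition can be seen directly in a short Python sketch. This is
only an illustration and not something from the thread; it assumes nothing
beyond Python 3's standard unicodedata module, and the variable names are
made up for the example.

```python
import unicodedata

composed = "\u00dc"        # Ü as one precomposed code point
decomposed = "U\u0308"     # U followed by COMBINING DIAERESIS (U+0308)

# The two strings render alike but are different code point sequences.
print(composed == decomposed)

# Normalizing converts one spelling into the other.
print(unicodedata.normalize("NFD", composed) == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)

# The UTF-8 bytes differ as well; 0xCC 0x88 is what git prints as \314\210.
print(composed.encode("utf-8"))
print(decomposed.encode("utf-8"))
```

The last line is the byte sequence that 'git status' shows in octal escapes
for such a file name.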

Considering that this leads to endless fun[*] not just with git, and
that we German speakers have an easy way out (Ueberlaenge), I can only
suggest that you avoid umlauts wherever possible to preserve
the sanity of your users.


[*] I once had an SVN repo with two different directories both called
Übungen.  Took me a while to figure out what was going on.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 17:45 ` Thomas Rast
@ 2009-11-23 18:10   ` Thomas Singer
  2009-11-23 18:23     ` Johannes Schindelin
  2009-11-23 18:29     ` Martin Langhoff
  0 siblings, 2 replies; 20+ messages in thread
From: Thomas Singer @ 2009-11-23 18:10 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git

Hi Thomas,

Thanks for the feedback. I know the problem from SVN, too, but I had
hoped that Git was smarter than SVN on this topic. IIRC, one could get SVN
working "somehow" with umlauts on OS X by setting some environment variable.
Unfortunately, I don't remember the details any more.

Basically, getting it "somehow" to work on OS X is just one minor step. IMHO
Git should standardize on file names in the repository and do the
platform-specific conversion independent of any locale setting, if needed.
Then and only then it would be possible to get the same characters out of
the repository, no matter whether the file was added or checked out on OS X,
Linux or Windows.

At the moment we've got a problem report regarding our SmartGit GUI client:
the user says, on command line it[1] works (German OS X) but not with
SmartGit, for me it doesn't even work on the command line (English OS X). As
you may know, Java uses characters for file names, the Java runtime
internally converts from the platform-specific byte-representation on disk
to characters. I can't simply tunnel the file name as byte array to the
invoked Git command - I simply don't know how to transform the characters of
the file name to a representation the Git command line client will
understand[2].

Tom

[1] e.g. to stage or commit files with umlauts in the file name
[2] executing an external command in Java also "only" works with strings
(aka characters), not with byte sequences


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 18:10   ` Thomas Singer
@ 2009-11-23 18:23     ` Johannes Schindelin
  2009-11-23 20:31       ` Thomas Singer
  2009-11-23 23:31       ` Jakub Narebski
  2009-11-23 18:29     ` Martin Langhoff
  1 sibling, 2 replies; 20+ messages in thread
From: Johannes Schindelin @ 2009-11-23 18:23 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Thomas Rast, git

Hi,

On Mon, 23 Nov 2009, Thomas Singer wrote:

> Basically, getting it "somehow" to work on OS X is just one minor step. 
> IMHO Git should standardize on file names in the repository and do the 
> platform-specific conversion independent of any locale setting, if 
> needed.

That is contrary to the design of Git which honors content (byte-wise!) as 
much as possible, and treats file names very much as content.

There were beginnings of supporting OSX' brain-damaged filename mangling, 
but an obnoxious OSX fan worked very hard on trying to defend the OSX 
design and to decry Git's respect for the raw bytes on this list, so hard 
that even the nicest developers had no fun working on this issue anymore.

This little background may help you understand why there is no solution 
implemented in Git yet.  And maybe quite a few developers are reluctant to 
discuss the issue and possible solutions due to said sad story, too.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 18:10   ` Thomas Singer
  2009-11-23 18:23     ` Johannes Schindelin
@ 2009-11-23 18:29     ` Martin Langhoff
  1 sibling, 0 replies; 20+ messages in thread
From: Martin Langhoff @ 2009-11-23 18:29 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Thomas Rast, git

On Mon, Nov 23, 2009 at 7:10 PM, Thomas Singer
<thomas.singer@syntevo.com> wrote:
> I can't simply tunnel the file name as byte array to the
> invoked Git command - I simply don't know how to transform the characters of
> the file name to a representation the Git command line client will
> understand[2].

Ouch - so git is respecting whatever name the user and/or OS have
picked, but Java wants to canonicalize it,  and whatever scheme it
uses does not match OSX? That must hurt Java usage on OSX a lot. Surely
they have a workaround...?

Suggestions:

1 - Configure Java to canonicalize in the same style as OSX. Actually,
OSX's canonicalization is somewhat arbitrary so I think it exposes a
call to canonicalize a string "the right way".

2 - Many git calls accept filenames via STDIN - Java will surely write
binary there...

3 - xargs with its -0 parameter can complement #2

hth,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 16:37 OS X and umlauts in file names Thomas Singer
  2009-11-23 17:45 ` Thomas Rast
@ 2009-11-23 20:26 ` Daniel Barkalow
  2009-11-25  8:50   ` Thomas Singer
  2009-11-26 17:23 ` Jay Soffian
  2 siblings, 1 reply; 20+ messages in thread
From: Daniel Barkalow @ 2009-11-23 20:26 UTC (permalink / raw)
  To: Thomas Singer; +Cc: git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1860 bytes --]

On Mon, 23 Nov 2009, Thomas Singer wrote:

> I'm on an English OS X 10.6.2 and I created a sample file with umlauts in
> its name (Überlänge.txt). When I try to stage the file in the terminal, I
> can't complete the file name by typing the Ü and hitting the tab key, but I
> can complete it by typing an U and hitting the tab key.

You've already got a bug before involving git at all. You create a 
file "Überlänge.txt", but OS X writes "U:berla:nge.txt" (typing the 
combining character umlaut as : so that you can see the difference), and 
the directory listing doesn't contain any files that start with Ü, so the 
terminal already can't find the file you created. Obviously, git is going 
to have all the problems that the OS-provided readline library has, and 
you're not going to be able to get predictable results in any case where 
user-supplied filenames are compared with directory listings.

Part of the problem is that OS X does a canonicalization that is not what 
anybody else does, so you hit the problem every single time, but the 
fundamental issue is that there isn't any way to tell, when you create a 
file, what name that file will be listed under.

Note that this isn't a matter of characters to byte sequences. OS X 
actually uses different characters for the filename in its listings than 
you've used.

If there's a difference between German and English versions, I suspect 
that it's actually that you're not using a German keyboard with a key 
that, under OS X, produces the two-character sequence U:, but using some 
method that produces the single character Ü. I'd guess that your SmartGit 
problem is that Java is converting the U: that the user typed into Ü, and 
passing it to the OS, which turns it back into U: and then doesn't list 
the file that Java thinks the user asked for.
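
One way to make such a comparison predictable is to normalize both sides
before comparing. The following Python sketch is only an illustration under
that assumption; find_listed_name and the sample listing are hypothetical
and not part of git, Java or SmartGit.

```python
import unicodedata

def find_listed_name(listing, wanted):
    """Return the entry of `listing` that is canonically equal to `wanted`.

    Both sides are normalized to NFC before comparing, so a composed
    "Ü" matches a decomposed "U" + U+0308. Returns None if nothing matches.
    """
    key = unicodedata.normalize("NFC", wanted)
    for listed in listing:
        if unicodedata.normalize("NFC", listed) == key:
            return listed
    return None

# Hypothetical listing, spelled the way OS X reports it (decomposed).
listing = ["U\u0308berla\u0308nge.txt", "README"]
typed = "\u00dcberl\u00e4nge.txt"   # composed, as a user might type it

print(typed in listing)                                  # plain comparison
print(find_listed_name(listing, typed) == listing[0])    # normalized comparison
```

In real use the listing would come from something like os.listdir(), and the
returned spelling is the one to pass on to other tools.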

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 18:23     ` Johannes Schindelin
@ 2009-11-23 20:31       ` Thomas Singer
  2009-11-23 23:31       ` Jakub Narebski
  1 sibling, 0 replies; 20+ messages in thread
From: Thomas Singer @ 2009-11-23 20:31 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Thomas Rast, git

I agree that getting it done correctly could be a long and hard (maybe
incompatible) road. Neglecting the problem or blaming platform-specific
"anomalies" does not help to solve that serious real-world problem for the
end-user. IMHO, telling a user to not use non-US-ASCII characters in file
names to stay platform-independent seems not to be state-of-the-art and
would exclude a lot of users world-wide. Finally, everyone would expect Git
to be better than CVS, also at this point.

But what about getting it working "somehow" on OS X in a few minutes? What
should I do to be able to stage/commit/work with files containing umlauts in
their name on my English OS X (by specifying the file names) as it seems to
work magically on a German OS X? Is this topic already /documented/
somewhere (I couldn't find anything)?

Thanks in advance,
Tom


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 18:23     ` Johannes Schindelin
  2009-11-23 20:31       ` Thomas Singer
@ 2009-11-23 23:31       ` Jakub Narebski
  1 sibling, 0 replies; 20+ messages in thread
From: Jakub Narebski @ 2009-11-23 23:31 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Thomas Singer, Thomas Rast, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Mon, 23 Nov 2009, Thomas Singer wrote:
> 
> > Basically, getting it "somehow" to work on OS X is just one minor step. 
> > IMHO Git should standardize on file names in the repository and do the 
> > platform-specific conversion independent of any locale setting, if 
> > needed.
> 
> That is contrary to the design of Git which honors content (byte-wise!) as 
> much as possible, and treats file names very much as content.
> 
> There were beginnings of supporting OSX' brain-damaged filename mangling, 
> but an obnoxious OSX fan worked very hard on trying to defend the OSX 
> design and to decry Git's respect for the raw bytes on this list, so hard 
> that even the nicest developers had no fun working on this issue anymore.
> 
> This little background may help you understand why there is no solution 
> implemented in Git yet.  And maybe quite a few developers are reluctant to 
> discuss the issue and possible solutions due to said sad story, too.

To be more exact, the problem is not that MacOS X uses a decomposed
form (i.e. mangles file names).  That would be a problem only in
cross-platform development (where some developers work on a
different operating system).

The problem is that the name under which Git creates a file is different
from the name under which MacOS X lists the file (in readdir etc.).

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 20:26 ` Daniel Barkalow
@ 2009-11-25  8:50   ` Thomas Singer
  2009-11-25  9:51     ` B Smith-Mannschott
                       ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Thomas Singer @ 2009-11-25  8:50 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git

I did the following:

 toms-mac-mini:git-umlauts tom$ ls
 Überlänge.txt
 toms-mac-mini:git-umlauts tom$ git status
 # On branch master
 #
 # Initial commit
 #
 # Changes to be committed:
 #   (use "git rm --cached <file>..." to unstage)
 #
  #	new file:   "U\314\210berla\314\210nge.txt"
 #
 toms-mac-mini:git-umlauts tom$ git stage "U\314\210berla\314\210nge.txt"
 fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files

Note that I copy-pasted the file name which 'git status' showed into the
stage command. IMHO, this should work, especially because several people
said Git would treat the file name as a byte array without interpreting it in
any way.
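
The name printed by 'git status' is a quoted form in which each non-ASCII
byte is written as a backslash followed by three octal digits, so it has to
be decoded before it names the file again. A minimal Python sketch of that
decoding follows; unquote_git_path is a hypothetical helper, not a git
function, and it assumes the path is UTF-8 and that git uses only the
C-style escapes listed in the code.

```python
import re

# Single-character escapes assumed to appear in quoted paths.
_SIMPLE = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
           '"': 34, "\\": 92}
# A backslash followed by three octal digits or one escape letter.
_ESCAPE = re.compile(rb'\\([0-3][0-7]{2}|[abtnvfr"\\])')

def unquote_git_path(quoted):
    """Turn a path as printed by 'git status' back into a plain string."""
    if not (len(quoted) >= 2 and quoted.startswith('"') and quoted.endswith('"')):
        return quoted                      # unquoted paths are printed as-is
    raw = quoted[1:-1].encode("utf-8")

    def replace(match):
        esc = match.group(1).decode("ascii")
        if esc[0].isdigit():
            return bytes([int(esc, 8)])    # \314 -> byte 0xCC
        return bytes([_SIMPLE[esc]])

    return _ESCAPE.sub(replace, raw).decode("utf-8")

name = unquote_git_path('"U\\314\\210berla\\314\\210nge.txt"')
print(name == "U\u0308berla\u0308nge.txt")
```

The decoded result is the decomposed spelling, i.e. U followed by U+0308,
which is the name the file system actually lists.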

From the user with the German OS X (for which the staging is said to work),
I've got the output of 'env' and hence also tried

 export LANG=de_DE.UTF-8

before doing the above steps, but with the same results. :(

-- 
Tom


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-25  8:50   ` Thomas Singer
@ 2009-11-25  9:51     ` B Smith-Mannschott
  2009-11-25 10:07     ` Martin Langhoff
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: B Smith-Mannschott @ 2009-11-25  9:51 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Daniel Barkalow, git

On Wed, Nov 25, 2009 at 09:50, Thomas Singer <thomas.singer@syntevo.com> wrote:
> I did the following:
>
>  toms-mac-mini:git-umlauts tom$ ls
>  Überlänge.txt
>  toms-mac-mini:git-umlauts tom$ git status
>  # On branch master
>  #
>  # Initial commit
>  #
>  # Changes to be committed:
>  #   (use "git rm --cached <file>..." to unstage)
>  #
>  #     new file:   "U\314\210berla\314\210nge.txt"
>  #
>  toms-mac-mini:git-umlauts tom$ git stage "U\314\210berla\314\210nge.txt"
>  fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files
>
> Note that I copy-pasted the file name which 'git status' showed into the
> stage command. IMHO, this should work, especially because several people
> said Git would treat the file name as a byte array without interpreting it in
> any way.
>
> From the user with the German OS X (for which the staging is said to work),
> I've got the output of 'env' and hence also tried
>
>  export LANG=de_DE.UTF-8
>
> before doing the above steps, but with the same results. :(

The problem you are having is not because of the *encoding*, it's the
Normalization form that's messing things up. The fact is that in
Unicode there are two ways to represent many -- but not all --
accented characters.

- "composed": one code point for the accented character
- "decomposed": two or more code points: one for the base letter, then one
or more combining characters for the accents.

The composed code points are really just there for backward compatibility
with legacy encodings (like LATIN-1). If you want to actually support
(rather than just tolerate) unicode you have to know how to deal with
the decomposed form, and once you can do that there's little point
beyond backward compatibility in continuing to use composed form
internally.

The Subversion people have run into this same problem because they
made the same error of assuming that any given sequence of glyphs has
only one possible representation as unicode code points and thus only
one representation as UTF-8 bytes. Dionisos has done written up the
issues involved here:

http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames

// Ben

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-25  8:50   ` Thomas Singer
  2009-11-25  9:51     ` B Smith-Mannschott
@ 2009-11-25 10:07     ` Martin Langhoff
  2009-11-25 10:19       ` Martin Langhoff
  2009-11-25 22:43     ` Andreas Schwab
  2009-11-26 17:27     ` Jay Soffian
  3 siblings, 1 reply; 20+ messages in thread
From: Martin Langhoff @ 2009-11-25 10:07 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Daniel Barkalow, git

On Wed, Nov 25, 2009 at 9:50 AM, Thomas Singer
<thomas.singer@syntevo.com> wrote:
>  toms-mac-mini:git-umlauts tom$ git stage "U\314\210berla\314\210nge.txt"
>  fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files

does a find * | xargs git add work?

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-25 10:07     ` Martin Langhoff
@ 2009-11-25 10:19       ` Martin Langhoff
  0 siblings, 0 replies; 20+ messages in thread
From: Martin Langhoff @ 2009-11-25 10:19 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Daniel Barkalow, git

On Wed, Nov 25, 2009 at 11:07 AM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Wed, Nov 25, 2009 at 9:50 AM, Thomas Singer
> <thomas.singer@syntevo.com> wrote:
>>  toms-mac-mini:git-umlauts tom$ git stage "U\314\210berla\314\210nge.txt"
>>  fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files
>
> does a find * | xargs git add work?

Also, you can try with `find * -print0 | git-update-index --add
--stdin -z `. Find should report the exact filename that the OS has,
and git should add it as it is.
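
For a program driving git rather than a shell pipeline, the same idea can be
sketched in Python. This is an illustration only: build_index_payload and
add_paths are hypothetical names, and the sketch assumes that git is on the
PATH, that repo_dir is a git work tree, and that the paths are spelled
exactly as the OS lists them.

```python
import subprocess

def build_index_payload(paths):
    """Encode paths as NUL-terminated UTF-8 records for '--stdin -z'."""
    return b"".join(path.encode("utf-8") + b"\0" for path in paths)

def add_paths(repo_dir, paths):
    """Feed the paths to 'git update-index --add -z --stdin' as raw bytes."""
    subprocess.run(
        ["git", "update-index", "--add", "-z", "--stdin"],
        input=build_index_payload(paths),
        cwd=repo_dir,
        check=True,        # raise if git reports an error
    )

# The payload for a decomposed name contains the bytes 0xCC 0x88 unchanged.
print(build_index_payload(["U\u0308berla\u0308nge.txt"]))
```

Because the names travel over stdin as bytes, no shell quoting or argument
conversion gets a chance to alter them.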

Background: git-add used to be a trivial shell script wrapping around
git-update-index. If you have a git checkout, try:

 git show f25933987f29070e9cd79dfddf03018010e82e80:git-add.sh

If git cannot track this file in a pure OSX world, there is a good
chance it's a bug in git.

In this narrow test case (single machine, running OSX) git *must*
be able to do the right thing. If you work on multi-platform projects
however, there is a good chance a Windows or Linux user will commit a
file with a name that _when you checkout on OSX_, OSX will save with a
different (but "equivalent") name due to its funny decomposition
rules. And all sorts of "fun" will ensue.

cheers,


m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-25  8:50   ` Thomas Singer
  2009-11-25  9:51     ` B Smith-Mannschott
  2009-11-25 10:07     ` Martin Langhoff
@ 2009-11-25 22:43     ` Andreas Schwab
  2009-11-26  8:28       ` Thomas Singer
  2009-11-26 17:27     ` Jay Soffian
  3 siblings, 1 reply; 20+ messages in thread
From: Andreas Schwab @ 2009-11-25 22:43 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Daniel Barkalow, git

Thomas Singer <thomas.singer@syntevo.com> writes:

> I did the following:
>
>  toms-mac-mini:git-umlauts tom$ ls
>  Überlänge.txt
>  toms-mac-mini:git-umlauts tom$ git status
>  # On branch master
>  #
>  # Initial commit
>  #
>  # Changes to be committed:
>  #   (use "git rm --cached <file>..." to unstage)
>  #
>   #	new file:   "U\314\210berla\314\210nge.txt"
>  #
>  toms-mac-mini:git-umlauts tom$ git stage "U\314\210berla\314\210nge.txt"
>  fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files

Try $'U\314\210berla\314\210nge.txt' instead.
"U\314\210berla\314\210nge.txt" is the same as
"U\\314\\210berla\\314\\210nge.txt" to the shell.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-25 22:43     ` Andreas Schwab
@ 2009-11-26  8:28       ` Thomas Singer
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas Singer @ 2009-11-26  8:28 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Daniel Barkalow, git

Hi Andreas,

Thank you for this hint. When trying

 toms-mac-mini:git-umlauts tom$ git stage $'U\314\210berla\314\210nge.txt'

git shows me no error or other output, but invoking 'git status' again shows
no difference, the file is still showing up as new file.

I've also tried to use double backslashes, but I could not enter a backslash
in the OS X Terminal (works fine in other applications). :(

-- 
Tom

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-23 16:37 OS X and umlauts in file names Thomas Singer
  2009-11-23 17:45 ` Thomas Rast
  2009-11-23 20:26 ` Daniel Barkalow
@ 2009-11-26 17:23 ` Jay Soffian
  2 siblings, 0 replies; 20+ messages in thread
From: Jay Soffian @ 2009-11-26 17:23 UTC (permalink / raw)
  To: Thomas Singer; +Cc: git

On Mon, Nov 23, 2009 at 11:37 AM, Thomas Singer
<thomas.singer@syntevo.com> wrote:
> I'm on an English OS X 10.6.2 and I created a sample file with umlauts in
> its name (Überlänge.txt). When I try to stage the file in the terminal, I
> can't complete the file name by typing the Ü and hitting the tab key, but I
> can complete it by typing an U and hitting the tab key. Unfortunately, after
> executing
>
>  git stage Überlänge.txt
>
> I invoked
>
>  git status
>
> and it still shows the file as new file. Should I set some environment
> variable to be able to work with files containing umlauts in the name?

Works for me on 10.6.2:

kore:~/foo (master)$ echo Überlänge.txt > Überlänge.txt
kore:~/foo (master)$ git stage Überlänge.txt
kore:~/foo (master)$ git st
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   "U\314\210berla\314\210nge.txt"
#
kore:~/foo (master)$ git commit -m initial
[master (root-commit) f23e23f] initial
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 "U\314\210berla\314\210nge.txt"
kore:~/foo (master)$ git st
# On branch master
nothing to commit (working directory clean)

Doesn't matter whether LANG and/or LC_* are set or not for me.

j.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-25  8:50   ` Thomas Singer
                       ` (2 preceding siblings ...)
  2009-11-25 22:43     ` Andreas Schwab
@ 2009-11-26 17:27     ` Jay Soffian
  2009-11-27 10:01       ` Thomas Singer
  3 siblings, 1 reply; 20+ messages in thread
From: Jay Soffian @ 2009-11-26 17:27 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Daniel Barkalow, git

On Wed, Nov 25, 2009 at 3:50 AM, Thomas Singer
<thomas.singer@syntevo.com> wrote:
> I did the following:
>
>  toms-mac-mini:git-umlauts tom$ ls
>  Überlänge.txt
>  toms-mac-mini:git-umlauts tom$ git status
>  # On branch master
>  #
>  # Initial commit
>  #
>  # Changes to be committed:
>  #   (use "git rm --cached <file>..." to unstage)
>  #
>  #     new file:   "U\314\210berla\314\210nge.txt"
>  #

Wait, what's the problem here? It's staged according to the above,
just commit it.

j.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-26 17:27     ` Jay Soffian
@ 2009-11-27 10:01       ` Thomas Singer
  2009-11-27 10:20         ` Thomas Singer
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Singer @ 2009-11-27 10:01 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Daniel Barkalow, git

Jay Soffian wrote:
>>  toms-mac-mini:git-umlauts tom$ git status
>>  # On branch master
>>  #
>>  # Initial commit
>>  #
>>  # Changes to be committed:
>>  #   (use "git rm --cached <file>..." to unstage)
>>  #
>>  #     new file:   "U\314\210berla\314\210nge.txt"
>>  #
> 
> Wait, what's the problem here? It's staged according to the above,
> just commit it.

You are completely right and I feel quite foolish.

What about this one:

toms-mac-mini:git-umlauts tom$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   "U\314\210berla\314\210nge.txt"
#
toms-mac-mini:git-umlauts tom$ git rm --cached "U\314\210berla\314\210nge.txt"
fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files

-- 
Thanks in advance,
Tom

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-27 10:01       ` Thomas Singer
@ 2009-11-27 10:20         ` Thomas Singer
  2009-11-27 10:56           ` Martin Langhoff
  0 siblings, 1 reply; 20+ messages in thread
From: Thomas Singer @ 2009-11-27 10:20 UTC (permalink / raw)
  To: Jay Soffian; +Cc: Daniel Barkalow, git

Thomas Singer wrote:
> Jay Soffian wrote:
>>>  toms-mac-mini:git-umlauts tom$ git status
>>>  # On branch master
>>>  #
>>>  # Initial commit
>>>  #
>>>  # Changes to be committed:
>>>  #   (use "git rm --cached <file>..." to unstage)
>>>  #
>>>  #     new file:   "U\314\210berla\314\210nge.txt"
>>>  #
>> Wait, what's the problem here? It's staged according to the above,
>> just commit it.
> 
> You are completely right and I feel quite foolish.
> 
> What about this one:
> 
> toms-mac-mini:git-umlauts tom$ git status
> # On branch master
> #
> # Initial commit
> #
> # Changes to be committed:
> #   (use "git rm --cached <file>..." to unstage)
> #
> #	new file:   "U\314\210berla\314\210nge.txt"
> #
> toms-mac-mini:git-umlauts tom$ git rm --cached "U\314\210berla\314\210nge.txt"
> fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files

OK, I've found it. This works (I have to complete the file name after having
typed an U):

toms-mac-mini:git-umlauts tom$ git rm --cached Überlänge.txt

-- 
Tom

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-27 10:20         ` Thomas Singer
@ 2009-11-27 10:56           ` Martin Langhoff
  2009-11-27 18:35             ` Thomas Singer
  0 siblings, 1 reply; 20+ messages in thread
From: Martin Langhoff @ 2009-11-27 10:56 UTC (permalink / raw)
  To: Thomas Singer; +Cc: Jay Soffian, Daniel Barkalow, git

On Fri, Nov 27, 2009 at 11:20 AM, Thomas Singer
<thomas.singer@syntevo.com> wrote:
>> toms-mac-mini:git-umlauts tom$ git rm --cached "U\314\210berla\314\210nge.txt"
>> fatal: pathspec 'U\314\210berla\314\210nge.txt' did not match any files
>
> OK, I've found it. This works (I have to complete the file name after having
> typed an U):
>
> toms-mac-mini:git-umlauts tom$ git rm --cached Überlänge.txt

Tom,

have you tried calling git-update-index --add
--stdin -z? Your original email stated

> we've got a problem report regarding our SmartGit GUI client

so it sounds like you are building a porcelain. In that case, the
sanest approach is to invoke git-update-index and write to its stdin.

cheers,



m
-- 
 martin.langhoff@gmail.com
 martin@laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: OS X and umlauts in file names
  2009-11-27 10:56           ` Martin Langhoff
@ 2009-11-27 18:35             ` Thomas Singer
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas Singer @ 2009-11-27 18:35 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Jay Soffian, Daniel Barkalow, git

Martin Langhoff wrote:
> have you tried calling git-update-index --add
> --stdin -z? Your original email stated

No, we don't do such a massive change immediately before a release.

>> we've got a problem report regarding our SmartGit GUI client
> 
> so it sounds like you are building a porcelain. In that case, the
> sanest approach is to invoke git-update-index and write to its stdin.

We will try this out after release.

For those who are interested: I've got it working on OS X, and the problem
was not Git but Java. Some time ago, directory.list() and
directory.listFiles() returned the file names with decomposed characters (as
they are stored on disk on OS X). Now (I don't know which Java update
introduced this change) these methods return file names with composed
characters, so I had to decompose them before handing them to the git
executable call.

Nevertheless, the cross-platform problem remains: if you add files with
umlauts in their names on a non-OS X system, you will not be able to use
them on OS X.
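
Both points can be sketched in Python (an illustration only, not the
SmartGit code, which is Java). The function names are hypothetical, and
standard NFD is used as an approximation: the decomposition OS X applies to
file names is close to NFD but not identical to it for all characters.

```python
import unicodedata

def to_decomposed(name):
    """Approximate the spelling OS X lists: NFD-decompose the name."""
    return unicodedata.normalize("NFD", name)

def spellings_by_canonical_form(paths):
    """Group paths that differ only in composed vs. decomposed spelling.

    Returns {NFC form: set of spellings} for every group that has more
    than one spelling; such paths look identical but are distinct to git.
    """
    groups = {}
    for path in paths:
        key = unicodedata.normalize("NFC", path)
        groups.setdefault(key, set()).add(path)
    return {key: names for key, names in groups.items() if len(names) > 1}

# Hypothetical repository listing with one name added on each platform.
paths = ["\u00dcbungen", "U\u0308bungen", "README"]
print(to_decomposed("\u00dcbungen") == "U\u0308bungen")
print(len(spellings_by_canonical_form(paths)))
```

A check like the second function, run over the paths in a repository, would
flag names that are likely to cause trouble once the tree is checked out on
OS X.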

-- 
Tom

^ permalink raw reply	[flat|nested] 20+ messages in thread

Thread overview: 20+ messages
2009-11-23 16:37 OS X and umlauts in file names Thomas Singer
2009-11-23 17:45 ` Thomas Rast
2009-11-23 18:10   ` Thomas Singer
2009-11-23 18:23     ` Johannes Schindelin
2009-11-23 20:31       ` Thomas Singer
2009-11-23 23:31       ` Jakub Narebski
2009-11-23 18:29     ` Martin Langhoff
2009-11-23 20:26 ` Daniel Barkalow
2009-11-25  8:50   ` Thomas Singer
2009-11-25  9:51     ` B Smith-Mannschott
2009-11-25 10:07     ` Martin Langhoff
2009-11-25 10:19       ` Martin Langhoff
2009-11-25 22:43     ` Andreas Schwab
2009-11-26  8:28       ` Thomas Singer
2009-11-26 17:27     ` Jay Soffian
2009-11-27 10:01       ` Thomas Singer
2009-11-27 10:20         ` Thomas Singer
2009-11-27 10:56           ` Martin Langhoff
2009-11-27 18:35             ` Thomas Singer
2009-11-26 17:23 ` Jay Soffian
