All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH/RFC] core.precomposeunicode is true by default
@ 2013-07-27  1:21 Torsten Bögershausen
  2013-07-27 15:23 ` Duy Nguyen
  2013-07-29 17:20 ` Junio C Hamano
  0 siblings, 2 replies; 10+ messages in thread
From: Torsten Bögershausen @ 2013-07-27  1:21 UTC (permalink / raw)
  To: git, tboegi

When core.precomposeunicode was introduced, it was set to false
by default, to be compatible with older versions of Git.

Whenever UTF-8 file names are used in a mixed environment,
the Mac OS users need to find out that this configuration exist
and set it to true manually.

There is no measurable performance impact between false and true.
A smoother workflow can be achieved for new Git users,
so change the default to true:

- Remove the auto-sensing
- Rename the internal variable into precompose_unicode,
  and set it to 1 meaning true.
- Adjust and clean up test cases

The configuration core.precomposeunicode is still supported.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
 Documentation/config.txt     |  2 +-
 builtin/init-db.c            |  1 -
 cache.h                      |  2 +-
 compat/precompose_utf8.c     | 28 ++--------------------------
 compat/precompose_utf8.h     |  1 -
 config.c                     |  2 +-
 environment.c                |  2 +-
 git-compat-util.h            |  1 -
 t/t0050-filesystem.sh        |  1 +
 t/t3910-mac-os-precompose.sh | 14 --------------
 t/t7400-submodule-basic.sh   |  1 -
 11 files changed, 7 insertions(+), 48 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e0b923f..abe4cfa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -240,7 +240,7 @@ core.precomposeunicode::
 	This option is only used by Mac OS implementation of Git.
 	When core.precomposeunicode=true, Git reverts the unicode decomposition
 	of filenames done by Mac OS. This is useful when sharing a repository
-	between Mac OS and Linux or Windows.
+	between Mac OS and Linux or Windows. True by default.
 	(Git for Windows 1.7.10 or higher is needed, or Git under cygwin 1.7).
 	When false, file names are handled fully transparent by Git,
 	which is backward compatible with older versions of Git.
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 78aa387..08854d5 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -290,7 +290,6 @@ static int create_default_files(const char *template_path)
 		strcpy(path + len, "CoNfIg");
 		if (!access(path, F_OK))
 			git_config_set("core.ignorecase", "true");
-		probe_utf8_pathname_composition(path, len);
 	}
 
 	return reinit;
diff --git a/cache.h b/cache.h
index 4c606ce..abb2cad 100644
--- a/cache.h
+++ b/cache.h
@@ -593,7 +593,7 @@ extern int read_replace_refs;
 extern int fsync_object_files;
 extern int core_preload_index;
 extern int core_apply_sparse_checkout;
-extern int precomposed_unicode;
+extern int precompose_unicode;
 
 /*
  * The character that begins a commented line in user-editable file
diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index 7980abd..5396b91 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -36,30 +36,6 @@ static size_t has_non_ascii(const char *s, size_t maxlen, size_t *strlen_c)
 }
 
 
-void probe_utf8_pathname_composition(char *path, int len)
-{
-	static const char *auml_nfc = "\xc3\xa4";
-	static const char *auml_nfd = "\x61\xcc\x88";
-	int output_fd;
-	if (precomposed_unicode != -1)
-		return; /* We found it defined in the global config, respect it */
-	strcpy(path + len, auml_nfc);
-	output_fd = open(path, O_CREAT|O_EXCL|O_RDWR, 0600);
-	if (output_fd >= 0) {
-		close(output_fd);
-		strcpy(path + len, auml_nfd);
-		/* Indicate to the user, that we can configure it to true */
-		if (!access(path, R_OK))
-			git_config_set("core.precomposeunicode", "false");
-		/* To be backward compatible, set precomposed_unicode to 0 */
-		precomposed_unicode = 0;
-		strcpy(path + len, auml_nfc);
-		if (unlink(path))
-			die_errno(_("failed to unlink '%s'"), path);
-	}
-}
-
-
 void precompose_argv(int argc, const char **argv)
 {
 	int i = 0;
@@ -67,7 +43,7 @@ void precompose_argv(int argc, const char **argv)
 	char *newarg;
 	iconv_t ic_precompose;
 
-	if (precomposed_unicode != 1)
+	if (!precompose_unicode)
 		return;
 
 	ic_precompose = iconv_open(repo_encoding, path_encoding);
@@ -130,7 +106,7 @@ struct dirent_prec_psx *precompose_utf8_readdir(PREC_DIR *prec_dir)
 		prec_dir->dirent_nfc->d_ino  = res->d_ino;
 		prec_dir->dirent_nfc->d_type = res->d_type;
 
-		if ((precomposed_unicode == 1) && has_non_ascii(res->d_name, (size_t)-1, NULL)) {
+		if (precompose_unicode && has_non_ascii(res->d_name, (size_t)-1, NULL)) {
 			if (prec_dir->ic_precompose == (iconv_t)-1) {
 				die("iconv_open(%s,%s) failed, but needed:\n"
 						"    precomposed unicode is not supported.\n"
diff --git a/compat/precompose_utf8.h b/compat/precompose_utf8.h
index 3b73585..488f81d 100644
--- a/compat/precompose_utf8.h
+++ b/compat/precompose_utf8.h
@@ -27,7 +27,6 @@ typedef struct {
 } PREC_DIR;
 
 void precompose_argv(int argc, const char **argv);
-void probe_utf8_pathname_composition(char *, int);
 
 PREC_DIR *precompose_utf8_opendir(const char *dirname);
 struct dirent_prec_psx *precompose_utf8_readdir(PREC_DIR *dirp);
diff --git a/config.c b/config.c
index e13a7b6..c181767 100644
--- a/config.c
+++ b/config.c
@@ -827,7 +827,7 @@ static int git_default_core_config(const char *var, const char *value)
 	}
 
 	if (!strcmp(var, "core.precomposeunicode")) {
-		precomposed_unicode = git_config_bool(var, value);
+		precompose_unicode = git_config_bool(var, value);
 		return 0;
 	}
 
diff --git a/environment.c b/environment.c
index 0cb67b2..b98f14b 100644
--- a/environment.c
+++ b/environment.c
@@ -58,7 +58,7 @@ char *notes_ref_name;
 int grafts_replace_parents = 1;
 int core_apply_sparse_checkout;
 int merge_log_config = -1;
-int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
+int precompose_unicode = 1; /* default: true */
 struct startup_info *startup_info;
 unsigned long pack_size_limit_cfg;
 
diff --git a/git-compat-util.h b/git-compat-util.h
index cc4ba4d..3f6f27c 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -185,7 +185,6 @@ typedef unsigned long uintptr_t;
 #else
 #define precompose_str(in,i_nfd2nfc)
 #define precompose_argv(c,v)
-#define probe_utf8_pathname_composition(a,b)
 #endif
 
 #ifdef NEEDS_CLIPPED_WRITE
diff --git a/t/t0050-filesystem.sh b/t/t0050-filesystem.sh
index 05d78d2..6b3cedc 100755
--- a/t/t0050-filesystem.sh
+++ b/t/t0050-filesystem.sh
@@ -91,6 +91,7 @@ test_expect_failure CASE_INSENSITIVE_FS 'add (with different case)' '
 test_expect_success "setup unicode normalization tests" '
 	test_create_repo unicode &&
 	cd unicode &&
+	git config core.precomposeunicode false &&
 	touch "$aumlcdiar" &&
 	git add "$aumlcdiar" &&
 	git commit -m initial &&
diff --git a/t/t3910-mac-os-precompose.sh b/t/t3910-mac-os-precompose.sh
index 5fe57c5..b853d6a 100755
--- a/t/t3910-mac-os-precompose.sh
+++ b/t/t3910-mac-os-precompose.sh
@@ -34,11 +34,6 @@ Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc           #50 Byte
 Alongc=$Alongc$Alongc$Alongc$Alongc$Alongc           #250 Byte
 Alongc=$Alongc$AEligatu$AEligatu                     #254 Byte
 
-test_expect_success "detect if nfd needed" '
-	precomposeunicode=`git config core.precomposeunicode` &&
-	test "$precomposeunicode" = false &&
-	git config core.precomposeunicode true
-'
 test_expect_success "setup" '
 	>x &&
 	git add x &&
@@ -140,14 +135,5 @@ test_expect_success "Add long precomposed filename" '
 	git add * &&
 	git commit -m "Long filename"
 '
-# Test if the global core.precomposeunicode stops autosensing
-# Must be the last test case
-test_expect_success "respect git config --global core.precomposeunicode" '
-	git config --global core.precomposeunicode true &&
-	rm -rf .git &&
-	git init &&
-	precomposeunicode=`git config core.precomposeunicode` &&
-	test "$precomposeunicode" = "true"
-'
 
 test_done
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 5ee97b0..f0f8cde 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -958,7 +958,6 @@ test_expect_success 'submodule with UTF-8 name' '
 		git add sub &&
 		git commit -m "init sub"
 	) &&
-	test_config core.precomposeunicode true &&
 	git submodule add ./"$svname" &&
 	git submodule >&2 &&
 	test -n "$(git submodule | grep "$svname")"
-- 
1.8.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-07-27  1:21 [PATCH/RFC] core.precomposeunicode is true by default Torsten Bögershausen
@ 2013-07-27 15:23 ` Duy Nguyen
  2013-07-27 22:53   ` Torsten Bögershausen
  2013-07-29 17:20 ` Junio C Hamano
  1 sibling, 1 reply; 10+ messages in thread
From: Duy Nguyen @ 2013-07-27 15:23 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Git Mailing List

On Sat, Jul 27, 2013 at 8:21 AM, Torsten Bögershausen <tboegi@web.de> wrote:
> When core.precomposeunicode was introduced, it was set to false
> by default, to be compatible with older versions of Git.
>
> Whenever UTF-8 file names are used in a mixed environment,
> the Mac OS users need to find out that this configuration exist
> and set it to true manually.
>
> There is no measurable performance impact between false and true.
> A smoother workflow can be achieved for new Git users,
> so change the default to true:
>
> - Remove the auto-sensing
> - Rename the internal variable into precompose_unicode,
>   and set it to 1 meaning true.
> - Adjust and clean up test cases
>
> The configuration core.precomposeunicode is still supported.

Does this have any effects on non-utf8 users? I'm on utf-8, so this is
not really my concern, that is unless it changes something on LANG=C..
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-07-27 15:23 ` Duy Nguyen
@ 2013-07-27 22:53   ` Torsten Bögershausen
  2013-07-28  4:45     ` Duy Nguyen
  0 siblings, 1 reply; 10+ messages in thread
From: Torsten Bögershausen @ 2013-07-27 22:53 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Torsten Bögershausen, Git Mailing List

On 2013-07-27 17.23, Duy Nguyen wrote:
> On Sat, Jul 27, 2013 at 8:21 AM, Torsten Bögershausen <tboegi@web.de> wrote:
>> When core.precomposeunicode was introduced, it was set to false
>> by default, to be compatible with older versions of Git.
>>
>> Whenever UTF-8 file names are used in a mixed environment,
>> the Mac OS users need to find out that this configuration exist
>> and set it to true manually.
>>
>> There is no measurable performance impact between false and true.
>> A smoother workflow can be achieved for new Git users,
>> so change the default to true:
>>
>> - Remove the auto-sensing
>> - Rename the internal variable into precompose_unicode,
>>   and set it to 1 meaning true.
>> - Adjust and clean up test cases
>>
>> The configuration core.precomposeunicode is still supported.
> 
> Does this have any effects on non-utf8 users? I'm on utf-8, so this is
> not really my concern, that is unless it changes something on LANG=C..
> 
Not sure if I fully understand the question.

Mac OS will always use UTF-8, and we can choose between
precomposesd and decomposed.

Windows ("Git for Windows" == msysgit) uses UTF-8 (precomposed)
Git under cygwin 1.7 uses UTF-8, precomposed.
Git under cygwin 1.5 or git compiled under mingw does not use
UTF-8, but a Windows code page"

Linux may use UTF-8 or ISO-8859 or whatever you configure.

This change affects only Mac OS, 
(should this be stated better in the commit MSG?)

And if somebody wants to change a repo between Linux, Windows
and/or Mac OS, everybody should use UTF-8 (precomposed) for filenames,
directories and branches.
(or stick to ASCII)
Does this answer the question?
/Torsten


 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-07-27 22:53   ` Torsten Bögershausen
@ 2013-07-28  4:45     ` Duy Nguyen
  0 siblings, 0 replies; 10+ messages in thread
From: Duy Nguyen @ 2013-07-28  4:45 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: Git Mailing List

On Sun, Jul 28, 2013 at 5:53 AM, Torsten Bögershausen <tboegi@web.de> wrote:
>> Does this have any effects on non-utf8 users? I'm on utf-8, so this is
>> not really my concern, that is unless it changes something on LANG=C..
>>
> Not sure if I fully understand the question.
>
> Mac OS will always use UTF-8, and we can choose between
> precomposesd and decomposed.
>
> Windows ("Git for Windows" == msysgit) uses UTF-8 (precomposed)
> Git under cygwin 1.7 uses UTF-8, precomposed.
> Git under cygwin 1.5 or git compiled under mingw does not use
> UTF-8, but a Windows code page"
>
> Linux may use UTF-8 or ISO-8859 or whatever you configure.
>
> This change affects only Mac OS,
> (should this be stated better in the commit MSG?)

Ah yes, I see precompose_unicode changes default value from -1 to 1,
but I did not see that the whole precompose unicode code is only
active in Mac OS.

>
> And if somebody wants to change a repo between Linux, Windows
> and/or Mac OS, everybody should use UTF-8 (precomposed) for filenames,
> directories and branches.
> (or stick to ASCII)
> Does this answer the question?

It does, thanks.
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-07-27  1:21 [PATCH/RFC] core.precomposeunicode is true by default Torsten Bögershausen
  2013-07-27 15:23 ` Duy Nguyen
@ 2013-07-29 17:20 ` Junio C Hamano
  2013-08-27 13:45   ` Torsten Bögershausen
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-07-29 17:20 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

> When core.precomposeunicode was introduced, it was set to false
> by default, to be compatible with older versions of Git.
>
> Whenever UTF-8 file names are used in a mixed environment,
> the Mac OS users need to find out that this configuration exist
> and set it to true manually.
>
> There is no measurable performance impact between false and true.

The real reason we default it to auto-sensing in the current code is
for correctness, I think. the new precompose code could be buggy,
and by auto-sensing, we hoped that we would enable it only on
filesystems that the codepath matters.

> A smoother workflow can be achieved for new Git users,
> so change the default to true:
>
> - Remove the auto-sensing

Why?

> - Rename the internal variable into precompose_unicode,
>   and set it to 1 meaning true.

Why the rename?

> - Adjust and clean up test cases
>
> The configuration core.precomposeunicode is still supported.

Sorry, but I do not quite understand the change.  Is this because
the auto-sensing is not working, or after auto-sensing we do a wrong
thing?  If that is the case, perhaps that is what we should fix?

> diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
> index 7980abd..5396b91 100644
> --- a/compat/precompose_utf8.c
> +++ b/compat/precompose_utf8.c
> @@ -36,30 +36,6 @@ static size_t has_non_ascii(const char *s, size_t maxlen, size_t *strlen_c)
>  }
>  
>  
> -void probe_utf8_pathname_composition(char *path, int len)
> -{
> -	static const char *auml_nfc = "\xc3\xa4";
> -	static const char *auml_nfd = "\x61\xcc\x88";
> -	int output_fd;
> -	if (precomposed_unicode != -1)
> -		return; /* We found it defined in the global config, respect it */
> -	strcpy(path + len, auml_nfc);
> -	output_fd = open(path, O_CREAT|O_EXCL|O_RDWR, 0600);

So we try to create a path under one name, and ...

> -	if (output_fd >= 0) {
> -		close(output_fd);
> -		strcpy(path + len, auml_nfd);
> -		/* Indicate to the user, that we can configure it to true */
> -		if (!access(path, R_OK))
> -			git_config_set("core.precomposeunicode", "false");

... see if that path can be seen under its alias.  Why do we set it
to "false"?  Isn't this the true culprit?

After all, this is not in the "reinit" codepath, so we know we are
dealing with a repository that was created afresh.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-07-29 17:20 ` Junio C Hamano
@ 2013-08-27 13:45   ` Torsten Bögershausen
  2013-08-27 14:49     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Torsten Bögershausen @ 2013-08-27 13:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Torsten Bögershausen, git

(Sorry for the somewhat late reply, thanks for review)
>Torsten Bögershausen <tboegi@web.de> writes:
>
>> When core.precomposeunicode was introduced, it was set to false
>> by default, to be compatible with older versions of Git.
>>
>> Whenever UTF-8 file names are used in a mixed environment,
>> the Mac OS users need to find out that this configuration exist
>> and set it to true manually.
>>
>> There is no measurable performance impact between false and true.
>
>The real reason we default it to auto-sensing in the current code is
>for correctness, I think. the new precompose code could be buggy,
>and by auto-sensing, we hoped that we would enable it only on
>filesystems that the codepath matters.
>
>> A smoother workflow can be achieved for new Git users,
>> so change the default to true:
>>
>> - Remove the auto-sensing
>
>Why?
>
>> - Rename the internal variable into precompose_unicode,
>>   and set it to 1 meaning true.
>
>Why the rename?
>
>> - Adjust and clean up test cases
>>
>> The configuration core.precomposeunicode is still supported.
>
>Sorry, but I do not quite understand the change.  Is this because
>the auto-sensing is not working, or after auto-sensing we do a wrong
>thing?  If that is the case, perhaps that is what we should fix?
>
>> diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
>> index 7980abd..5396b91 100644
>> --- a/compat/precompose_utf8.c
>> +++ b/compat/precompose_utf8.c
>> @@ -36,30 +36,6 @@ static size_t has_non_ascii(const char *s, size_t maxlen, size_t *strlen_c)
>>  }
>>  
>>  
>> -void probe_utf8_pathname_composition(char *path, int len)
>> -{
>> -	static const char *auml_nfc = "\xc3\xa4";
>> -	static const char *auml_nfd = "\x61\xcc\x88";
>> -	int output_fd;
>> -	if (precomposed_unicode != -1)
>> -		return; /* We found it defined in the global config, respect it */
>> -	strcpy(path + len, auml_nfc);
>> -	output_fd = open(path, O_CREAT|O_EXCL|O_RDWR, 0600);
>
>So we try to create a path under one name, and ...
>
>> -	if (output_fd >= 0) {
>> -		close(output_fd);
>> -		strcpy(path + len, auml_nfd);
>> -		/* Indicate to the user, that we can configure it to true */
>> -		if (!access(path, R_OK))
>> -			git_config_set("core.precomposeunicode", "false");
>
>... see if that path can be seen under its alias.  Why do we set it
>to "false"?  Isn't this the true culprit?
>
>After all, this is not in the "reinit" codepath, so we know we are
>dealing with a repository that was created afresh.
>

There is nothing wrong with the auto-sensing as such.
The problem for many users today is that we set core.precomposeunicode
to false, when it should be true.

A patch for that comes out in a minute. But first look back and 
collect some experience with core.precomposeunicode.

Lets have a look at the variable "precomposed_unicode",
(the one I wanted to rename to be more consistant).
It is controlled by the git config files and
depending on the config it is set like this:
core.precomposeuinicode false -> precomposed_unicode = 0
core.precomposeuinicode true  -> precomposed_unicode = 1
core.precomposeuinicode <not set> -> precomposed_unicode = -1.

Let's look what precomposed_unicode does and go through a couple
of git operations.

1)
When we create a repo under Mac OS using HFS+,
we want to have precomposed_unicode = 1

2)
When we access a repo from Windows/Linux using SAMBA,
readdir() will return decomposed.
When the repo is created by nonMacOS, core.precomposeunicode is undefined.
The precomposition is off, but should be on, 
precomposed_unicode = -1, but should be = 1

3)
When we access a repo from another Mac OS system using 
SAMBA, NFS or AFP readdir() will return decomposed.
As the repo is created under Mac OS, we have the same case as (1)

4)
When we access a repo from Linux using NFS we can have
precomposed_unicode = 0 (which is technically more correct).
If Linux users do not use decomposed unicode in their file names,
(according to my understanding this is the case), we can use 1
as well as 0:
precomposing an already precomposed string is a no-op, so it doesn't
harm.


5)
When we create a repo under Linux/Windows on a USB-drive,
and run "git status" under Mac OS, we want precomposed_unicode = 1.

There are few cases where we want to use precomposed_unicode=0:
a) To work around bugs. This may be a short term solution,
  I would rather see bugs to be fixed.
  I'm not aware of any bugs, so please remind me if I missed something.

b) Working with foreign vcs:  E.g. bzr and hg use decomposed unicode,
   so it may be better to use decomposed unicode in git as well.

The simplified V2 patch looks like this (I send it in a seperate mail):

diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index 7980abd..95fe849 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -48,11 +48,8 @@ void probe_utf8_pathname_composition(char *path, int len)
 	if (output_fd >= 0) {
 		close(output_fd);
 		strcpy(path + len, auml_nfd);
-		/* Indicate to the user, that we can configure it to true */
-		if (!access(path, R_OK))
-			git_config_set("core.precomposeunicode", "false");
-		/* To be backward compatible, set precomposed_unicode to 0 */
-		precomposed_unicode = 0;
+		precomposed_unicode = access(path, R_OK) ? 0 : 1;
+		git_config_set("core.precomposeunicode", precomposed_unicode ? "true" : "false");

This will not affect existing repos, as they should have
core.precomposeunicode either true or false.
When a new repo is created, users having core.precomposeunicode in the
global config are not effected,
and the global setting switches off the auto-sensing.

The users which are new to git are affected,
and hopefully we do the right thing for them.
At least, according to my understanding, we do the best for majority
of users and use cases.

Thoughts are welcome, and arguments for and against V1 or V2.

Anybody who uses Mac OS and has experience with decomposed unicode?
I have core.precomposeunicode true in my global config.
 

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-08-27 13:45   ` Torsten Bögershausen
@ 2013-08-27 14:49     ` Junio C Hamano
  2013-08-27 15:06       ` Torsten Bögershausen
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-08-27 14:49 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

>>... see if that path can be seen under its alias.  Why do we set it
>>to "false"?  Isn't this the true culprit?
>>
>>After all, this is not in the "reinit" codepath, so we know we are
>>dealing with a repository that was created afresh.
>
> There is nothing wrong with the auto-sensing as such.
> The problem for many users today is that we set core.precomposeunicode
> to false, when it should be true.

I think we are in agreement then.

The code detects a broken filesystem just fine, but what it does
when it finds the filesystem is broken is wrong---it sets the
variable to false.  That makes the whole auto-sensing wrong, and I
think it makes sense to correct that behaviour.

> Let's look what precomposed_unicode does and go through a couple
> of git operations.
>
> 1)
> When we create a repo under Mac OS using HFS+,
> we want to have precomposed_unicode = 1

Yes.

> 2)
> When we access a repo from Windows/Linux using SAMBA,

You mean s/repo/repository that resides on HFS+/?

> readdir() will return decomposed.
> When the repo is created by nonMacOS, core.precomposeunicode is undefined.
> The precomposition is off, but should be on, 
> precomposed_unicode = -1, but should be = 1

I do not think UTF-8-MAC is widely available; even if you flip the
bit on, would it help much?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-08-27 14:49     ` Junio C Hamano
@ 2013-08-27 15:06       ` Torsten Bögershausen
  2013-08-27 16:27         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Torsten Bögershausen @ 2013-08-27 15:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 27.08.13 16:49, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
>>> ... see if that path can be seen under its alias.  Why do we set it
>>> to "false"?  Isn't this the true culprit?
>>>
>>> After all, this is not in the "reinit" codepath, so we know we are
>>> dealing with a repository that was created afresh.
>> There is nothing wrong with the auto-sensing as such.
>> The problem for many users today is that we set core.precomposeunicode
>> to false, when it should be true.
> I think we are in agreement then.
>
> The code detects a broken filesystem just fine, but what it does
> when it finds the filesystem is broken is wrong---it sets the
> variable to false.  That makes the whole auto-sensing wrong, and I
> think it makes sense to correct that behaviour.
>
>> Let's look what precomposed_unicode does and go through a couple
>> of git operations.
>>
>> 1)
>> When we create a repo under Mac OS using HFS+,
>> we want to have precomposed_unicode = 1
> Yes.
>
>> 2)
>> When we access a repo from Windows/Linux using SAMBA,
> You mean s/repo/repository that resides on HFS+/?
Sorry being unclear here, trying being clearer with an example:
I have a /data/Docs on my linux box, which is handled by git

I export /data/Docs via SAMBA, and use the Finder under Mac OS to have it
mounted on my Mac OS X box:
//tb@Linux/Docs on /Volumes/Docs (smbfs, nodev, nosuid, mounted by tb)
>> readdir() will return decomposed.
>> When the repo is created by nonMacOS, core.precomposeunicode is undefined.
>> The precomposition is off, but should be on, 
>> precomposed_unicode = -1, but should be = 1
> I do not think UTF-8-MAC is widely available; even if you flip the
> bit on, would it help much?
In the above example
/data/Docs/.git/config was created by Linux, so it does not have
core.precomposeunicode set, neither false nor true.
The Linux box does not have  UTF-8-MAC under iconv,
but will ignore core.precomposeunicode anyway (since the code is not compiled here)

The Mac OS machine sees it under /Volumes/Docs/.git/config
And here we want the precomposition, even if core.precomposeunicode
is not present in the config.

 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-08-27 15:06       ` Torsten Bögershausen
@ 2013-08-27 16:27         ` Junio C Hamano
  2013-08-27 19:34           ` Torsten Bögershausen
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2013-08-27 16:27 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git

Torsten Bögershausen <tboegi@web.de> writes:

>>> 2)
>>> When we access a repo from Windows/Linux using SAMBA,
>> You mean s/repo/repository that resides on HFS+/?
> Sorry being unclear here, trying being clearer with an example:
> I have a /data/Docs on my linux box, which is handled by git
>
> I export /data/Docs via SAMBA, and use the Finder under Mac OS to have it
> mounted on my Mac OS X box:
> //tb@Linux/Docs on /Volumes/Docs (smbfs, nodev, nosuid, mounted by tb)
>>> readdir() will return decomposed.
>>> When the repo is created by nonMacOS, core.precomposeunicode is undefined.
>>> The precomposition is off, but should be on, 
>>> precomposed_unicode = -1, but should be = 1
>> I do not think UTF-8-MAC is widely available; even if you flip the
>> bit on, would it help much?
> In the above example
> /data/Docs/.git/config was created by Linux, so it does not have
> core.precomposeunicode set, neither false nor true.
> The Linux box does not have  UTF-8-MAC under iconv,
> but will ignore core.precomposeunicode anyway (since the code is not compiled here)
>
> The Mac OS machine sees it under /Volumes/Docs/.git/config
> And here we want the precomposition, even if core.precomposeunicode
> is not present in the config.

It almost makes me wonder if you want not a per-repository but a
per-machine configuration, i.e. "Whichever repository I am
accessing, either on a local filesystem or shared over the network,
readdir() on my box reports wrong paths, and I need correction."

That, or "if it hurts, don't do the remote mount then."

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH/RFC] core.precomposeunicode is true by default
  2013-08-27 16:27         ` Junio C Hamano
@ 2013-08-27 19:34           ` Torsten Bögershausen
  0 siblings, 0 replies; 10+ messages in thread
From: Torsten Bögershausen @ 2013-08-27 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Torsten Bögershausen, git

On 2013-08-27 18.27, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
> 
>>>> 2)
>>>> When we access a repo from Windows/Linux using SAMBA,
>>> You mean s/repo/repository that resides on HFS+/?
>> Sorry being unclear here, trying being clearer with an example:
>> I have a /data/Docs on my linux box, which is handled by git
>>
>> I export /data/Docs via SAMBA, and use the Finder under Mac OS to have it
>> mounted on my Mac OS X box:
>> //tb@Linux/Docs on /Volumes/Docs (smbfs, nodev, nosuid, mounted by tb)
>>>> readdir() will return decomposed.
>>>> When the repo is created by nonMacOS, core.precomposeunicode is undefined.
>>>> The precomposition is off, but should be on, 
>>>> precomposed_unicode = -1, but should be = 1
>>> I do not think UTF-8-MAC is widely available; even if you flip the
>>> bit on, would it help much?
>> In the above example
>> /data/Docs/.git/config was created by Linux, so it does not have
>> core.precomposeunicode set, neither false nor true.
>> The Linux box does not have  UTF-8-MAC under iconv,
>> but will ignore core.precomposeunicode anyway (since the code is not compiled here)
>>
>> The Mac OS machine sees it under /Volumes/Docs/.git/config
>> And here we want the precomposition, even if core.precomposeunicode
>> is not present in the config.
> 
> It almost makes me wonder if you want not a per-repository but a
> per-machine configuration, i.e. "Whichever repository I am
> accessing, either on a local filesystem or shared over the network,
> readdir() on my box reports wrong paths, and I need correction."
> 
> That, or "if it hurts, don't do the remote mount then."
No, we don't need to be that restrictive.
We already have repository/user/system wide configuration files,
allowing tweeks and this is a good thing.

Re-Re-reading $gmane/188940: 
I am happy having the V2 patch from today being queued, thanks.

As a next step I will have a look into the advice machine.
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-08-27 19:34 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-27  1:21 [PATCH/RFC] core.precomposeunicode is true by default Torsten Bögershausen
2013-07-27 15:23 ` Duy Nguyen
2013-07-27 22:53   ` Torsten Bögershausen
2013-07-28  4:45     ` Duy Nguyen
2013-07-29 17:20 ` Junio C Hamano
2013-08-27 13:45   ` Torsten Bögershausen
2013-08-27 14:49     ` Junio C Hamano
2013-08-27 15:06       ` Torsten Bögershausen
2013-08-27 16:27         ` Junio C Hamano
2013-08-27 19:34           ` Torsten Bögershausen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.