* [patch] Makefile: Unexport LANG
@ 2009-12-24 11:13 Simon Horman
  2009-12-26  4:30 ` Masami Hiramatsu
  0 siblings, 1 reply; 34+ messages in thread
From: Simon Horman @ 2009-12-24 11:13 UTC (permalink / raw)
  To: linux-kernel, linux-kbuild
  Cc: H. Peter Anvin, Michal Marek, Roland Dreier, Sam Ravnborg,
	Masami Hiramatsu

The recent changes to setting and unexporting various LC_ variables
produce a problem on my system (Debian sid).

$ locale
LANG=ja_JP.utf8
LANGUAGE=ja_JP.utf8
LC_CTYPE="ja_JP.utf8"
LC_NUMERIC="ja_JP.utf8"
LC_TIME="ja_JP.utf8"
LC_COLLATE="ja_JP.utf8"
LC_MONETARY="ja_JP.utf8"
LC_MESSAGES="ja_JP.utf8"
LC_PAPER="ja_JP.utf8"
LC_NAME="ja_JP.utf8"
LC_ADDRESS="ja_JP.utf8"
LC_TELEPHONE="ja_JP.utf8"
LC_MEASUREMENT="ja_JP.utf8"
LC_IDENTIFICATION="ja_JP.utf8"
LC_ALL=ja_JP.utf8

Without this patch:
$ make
make[2]: ??: ?? make ? -jN ?????????: jobserver ??????????.
make[2]: ??: ?? make ? -jN ?????????: jobserver ??????????.

With this patch:
$ make
...
make[2]: warning: -jN forced in submake: disabling jobserver mode.
make[2]: warning: -jN forced in submake: disabling jobserver mode.
...

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Michal Marek <mmarek@sues.cz>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>

Index: linux-2.6/Makefile
===================================================================
--- linux-2.6.orig/Makefile	2009-12-24 22:09:29.000000000 +1100
+++ linux-2.6/Makefile	2009-12-24 22:10:58.000000000 +1100
@@ -17,6 +17,7 @@ NAME = Man-Eating Seals of Antiquity
 MAKEFLAGS += -rR --no-print-directory
 
 # Avoid funny character set dependencies
+unexport LANG
 unexport LC_ALL
 LC_CTYPE=C
 LC_COLLATE=C

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2009-12-24 11:13 [patch] Makefile: Unexport LANG Simon Horman
@ 2009-12-26  4:30 ` Masami Hiramatsu
  2009-12-26  5:14   ` H. Peter Anvin
  0 siblings, 1 reply; 34+ messages in thread
From: Masami Hiramatsu @ 2009-12-26  4:30 UTC (permalink / raw)
  To: Simon Horman
  Cc: linux-kernel, linux-kbuild, H. Peter Anvin, Michal Marek,
	Roland Dreier, Sam Ravnborg

Simon Horman wrote:
> The recent changes to setting and unexporting various LC_ variables
> produce a problem on my system (Debian sid).
> 
> $ locale
> LANG=ja_JP.utf8
> LANGUAGE=ja_JP.utf8
> LC_CTYPE="ja_JP.utf8"
> LC_NUMERIC="ja_JP.utf8"
> LC_TIME="ja_JP.utf8"
> LC_COLLATE="ja_JP.utf8"
> LC_MONETARY="ja_JP.utf8"
> LC_MESSAGES="ja_JP.utf8"
> LC_PAPER="ja_JP.utf8"
> LC_NAME="ja_JP.utf8"
> LC_ADDRESS="ja_JP.utf8"
> LC_TELEPHONE="ja_JP.utf8"
> LC_MEASUREMENT="ja_JP.utf8"
> LC_IDENTIFICATION="ja_JP.utf8"
> LC_ALL=ja_JP.utf8
> 
> Without this patch:
> $ make
> make[2]: ??: ?? make ? -jN ?????????: jobserver ??????????.
> make[2]: ??: ?? make ? -jN ?????????: jobserver ??????????.
> 
> With this patch:
> $ make
> ...
> make[2]: warning: -jN forced in submake: disabling jobserver mode.
> make[2]: warning: -jN forced in submake: disabling jobserver mode.
> ...
> 
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Michal Marek <mmarek@sues.cz>
> Cc: Roland Dreier <rdreier@cisco.com>
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Cc: Masami Hiramatsu <mhiramat@redhat.com>
> Signed-off-by: Simon Horman <horms@verge.net.au>

Tested on Fedora 11 too, and it works well. Thank you!

Tested-by: Masami Hiramatsu <mhiramat@redhat.com>

> 
> Index: linux-2.6/Makefile
> ===================================================================
> --- linux-2.6.orig/Makefile	2009-12-24 22:09:29.000000000 +1100
> +++ linux-2.6/Makefile	2009-12-24 22:10:58.000000000 +1100
> @@ -17,6 +17,7 @@ NAME = Man-Eating Seals of Antiquity
>  MAKEFLAGS += -rR --no-print-directory
>  
>  # Avoid funny character set dependencies
> +unexport LANG
>  unexport LC_ALL
>  LC_CTYPE=C
>  LC_COLLATE=C

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2009-12-26  4:30 ` Masami Hiramatsu
@ 2009-12-26  5:14   ` H. Peter Anvin
  2009-12-26 11:20     ` Simon Horman
  0 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2009-12-26  5:14 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Simon Horman, linux-kernel, linux-kbuild, Michal Marek,
	Roland Dreier, Sam Ravnborg

On 12/25/2009 08:30 PM, Masami Hiramatsu wrote:
>>  
>>  # Avoid funny character set dependencies
>> +unexport LANG
>>  unexport LC_ALL
>>  LC_CTYPE=C
>>  LC_COLLATE=C
> 

At this point, it seems to me that we should just LC_ALL=C and be done
with it (see other thread.)

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
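
A minimal sketch of why a single LC_ALL=C would settle the matter, assuming
the POSIX lookup order (LC_ALL first, then the individual LC_* variable,
then LANG). The helper name effective_locale is hypothetical; it only mimics
that lookup in plain shell and does not query the C library.

```shell
# effective_locale NAME: print the locale name that would apply to category
# NAME (for example LC_MESSAGES), using the POSIX precedence
# LC_ALL > LC_<category> > LANG. Hypothetical helper for illustration only.
effective_locale() {
    eval "category_value=\${$1:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumption: fall back to "C"; POSIX leaves this default
        # implementation-defined.
        printf 'C\n'
    fi
}

# With LC_ALL unset, each category is resolved on its own.
(
    unset LC_ALL LC_MESSAGES
    LANG=POSIX
    LC_COLLATE=C
    effective_locale LC_COLLATE    # the category variable wins over LANG
    effective_locale LC_MESSAGES   # no category variable, so LANG applies
)
```

Once LC_ALL is non-empty, the first branch is taken for every category,
which is why exporting LC_ALL=C also switches messages to English.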


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2009-12-26  5:14   ` H. Peter Anvin
@ 2009-12-26 11:20     ` Simon Horman
  2010-01-08  0:41       ` Simon Horman
  0 siblings, 1 reply; 34+ messages in thread
From: Simon Horman @ 2009-12-26 11:20 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, linux-kbuild, Michal Marek,
	Roland Dreier, Sam Ravnborg

On Fri, Dec 25, 2009 at 09:14:40PM -0800, H. Peter Anvin wrote:
> On 12/25/2009 08:30 PM, Masami Hiramatsu wrote:
> >>  
> >>  # Avoid funny character set dependencies
> >> +unexport LANG
> >>  unexport LC_ALL
> >>  LC_CTYPE=C
> >>  LC_COLLATE=C
> > 
> 
> At this point, it seems to me that we should just LC_ALL=C and be done
> with it (see other thread.)

Sure, that would also work for the case that I'm seeing.

I tested the following:

# Avoid funny character set dependencies
LC_ALL=C
export LC_ALL

Though personally I would advocate tweaking the locale as needed closer
to awk scripts and the like, rather than the high-level general change that
was made. Fall-out from a high-level change seems inevitable to me.
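
The narrower alternative described above, setting the locale only for the
tool that needs it, could look roughly like the following. This is a sketch
under assumptions: with_c_locale is a hypothetical wrapper name, and it
relies only on env(1) setting a variable for the child process.

```shell
# with_c_locale CMD [ARGS...]: run a single command under LC_ALL=C.
# Hypothetical wrapper; env(1) sets the variable for the child process only,
# so the caller's locale settings are left untouched.
with_c_locale() {
    env LC_ALL=C "$@"
}

# The child sees LC_ALL=C ...
with_c_locale sh -c 'printf "child: LC_ALL=%s\n" "$LC_ALL"'
# ... while the calling shell keeps whatever it had before.
printf 'caller: LC_ALL=%s\n' "${LC_ALL:-<unset>}"
```

The trade-off is that every locale-sensitive invocation has to be found and
wrapped individually, whereas a top-level export covers all of them at once.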


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2009-12-26 11:20     ` Simon Horman
@ 2010-01-08  0:41       ` Simon Horman
  2010-01-08  0:43         ` H. Peter Anvin
  0 siblings, 1 reply; 34+ messages in thread
From: Simon Horman @ 2010-01-08  0:41 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, linux-kbuild, Michal Marek,
	Roland Dreier, Sam Ravnborg

On Sat, Dec 26, 2009 at 10:20:07PM +1100, Simon Horman wrote:
> On Fri, Dec 25, 2009 at 09:14:40PM -0800, H. Peter Anvin wrote:
> > On 12/25/2009 08:30 PM, Masami Hiramatsu wrote:
> > >>  
> > >>  # Avoid funny character set dependencies
> > >> +unexport LANG
> > >>  unexport LC_ALL
> > >>  LC_CTYPE=C
> > >>  LC_COLLATE=C
> > > 
> > 
> > At this point, it seems to me that we should just LC_ALL=C and be done
> > with it (see other thread.)
> 
> Sure, that would also work for the case that I'm seeing.
> 
> I tested the following:
> 
> # Avoid funny character set dependencies
> LC_ALL=C
> export LC_ALL
> 
> Though personally I would advocate tweaking the locale as needed closer
> to awk scripts and the like, rather than the high-level general change that
> was made. Fall-out from a high-level change seems inevitable to me.

This seems to still be broken. Can we decide on a solution?



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2010-01-08  0:41       ` Simon Horman
@ 2010-01-08  0:43         ` H. Peter Anvin
  2010-01-08  2:45           ` Simon Horman
  0 siblings, 1 reply; 34+ messages in thread
From: H. Peter Anvin @ 2010-01-08  0:43 UTC (permalink / raw)
  To: Simon Horman
  Cc: Masami Hiramatsu, linux-kernel, linux-kbuild, Michal Marek,
	Roland Dreier, Sam Ravnborg

On 01/07/2010 04:41 PM, Simon Horman wrote:
> On Sat, Dec 26, 2009 at 10:20:07PM +1100, Simon Horman wrote:
>> On Fri, Dec 25, 2009 at 09:14:40PM -0800, H. Peter Anvin wrote:
>>> On 12/25/2009 08:30 PM, Masami Hiramatsu wrote:
>>>>>  
>>>>>  # Avoid funny character set dependencies
>>>>> +unexport LANG
>>>>>  unexport LC_ALL
>>>>>  LC_CTYPE=C
>>>>>  LC_COLLATE=C
>>>>
>>>
>>> At this point, it seems to me that we should just LC_ALL=C and be done
>>> with it (see other thread.)
>>
>> Sure, that would also work for the case that I'm seeing.
>>
>> I tested the following:
>>
>> # Avoid funny character set dependencies
>> LC_ALL=C
>> export LC_ALL
>>
>> Though personally I would advocate tweaking the locale as needed closer
>> to awk scripts and the like, rather than the high-level general change that
>> was made. Fall-out from a high-level change seems inevitable to me.
> 
> This seems to still be broken. Can we decide on a solution?
> 

I think it's up to Michal to pick the preferred solution.

It has been pointed out that one option might also be to *not*
override LC_CTYPE, and only override LC_COLLATE.

	-hpa



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2010-01-08  0:43         ` H. Peter Anvin
@ 2010-01-08  2:45           ` Simon Horman
  2010-01-08  2:59             ` Simon Horman
  0 siblings, 1 reply; 34+ messages in thread
From: Simon Horman @ 2010-01-08  2:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, linux-kbuild, Michal Marek,
	Roland Dreier, Sam Ravnborg

On Thu, Jan 07, 2010 at 04:43:55PM -0800, H. Peter Anvin wrote:
> On 01/07/2010 04:41 PM, Simon Horman wrote:
> > On Sat, Dec 26, 2009 at 10:20:07PM +1100, Simon Horman wrote:
> >> On Fri, Dec 25, 2009 at 09:14:40PM -0800, H. Peter Anvin wrote:
> >>> On 12/25/2009 08:30 PM, Masami Hiramatsu wrote:
> >>>>>  
> >>>>>  # Avoid funny character set dependencies
> >>>>> +unexport LANG
> >>>>>  unexport LC_ALL
> >>>>>  LC_CTYPE=C
> >>>>>  LC_COLLATE=C
> >>>>
> >>>
> >>> At this point, it seems to me that we should just LC_ALL=C and be done
> >>> with it (see other thread.)
> >>
> >> Sure, that would also work for the case that I'm seeing.
> >>
> >> I tested the following:
> >>
> >> # Avoid funny character set dependencies
> >> LC_ALL=C
> >> export LC_ALL
> >>
> >> Though personally I would advocate tweaking the locale as needed closer
> >> to awk scripts and the like, rather than the high-level general change that
> >> was made. Fall-out from a high-level change seems inevitable to me.
> > 
> > This seems to still be broken. Can we decide on a solution?
> > 
> 
> I think it's up to Michal to pick the preferred solution.
> 
> It has been pointed out that one option might also be to *not*
> override LC_CTYPE, and only override LC_COLLATE.

I've confirmed that both of the following allow make to give sane output
for me.  And they are better than my suggestion in the respect that the
error messages are according to the otherwise prevailing locale, not
suddenly switched to English.

# Avoid funny character set dependencies
unexport LC_ALL
LC_COLLATE=C
export LC_NUMERIC

# Avoid funny character set dependencies
unexport LC_ALL
LC_COLLATE=C
LC_NUMERIC=C
export LC_COLLATE LC_NUMERIC

I did not verify that they do something sensible for the awk concern
that originally introduced the locale change - but I think it is
unaffected by my locale settings.
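
As a rough shell illustration of what forcing only the collation category
buys, the sketch below sorts with LC_COLLATE=C while leaving the other
categories alone. The function name sort_c_collate is hypothetical; the
subshell unsets LC_ALL first because a non-empty LC_ALL would override
LC_COLLATE.

```shell
# Sort stdin with only the collation category forced to C.
# LC_ALL must be unset (or empty), otherwise it would override LC_COLLATE.
sort_c_collate() {
    (
        unset LC_ALL
        LC_COLLATE=C
        export LC_COLLATE
        sort
    )
}

# In the C locale the order is by byte value, so upper-case ASCII letters
# come before lower-case ones regardless of the user's LANG.
printf 'b\nA\na\nB\n' | sort_c_collate
```

Message language is governed by LC_MESSAGES, which this leaves untouched,
so tool diagnostics still follow the prevailing locale.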


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2010-01-08  2:45           ` Simon Horman
@ 2010-01-08  2:59             ` Simon Horman
  2010-01-08 11:57               ` Michal Marek
  0 siblings, 1 reply; 34+ messages in thread
From: Simon Horman @ 2010-01-08  2:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, linux-kbuild, Michal Marek,
	Roland Dreier, Sam Ravnborg

On Fri, Jan 08, 2010 at 01:45:56PM +1100, Simon Horman wrote:
> On Thu, Jan 07, 2010 at 04:43:55PM -0800, H. Peter Anvin wrote:
> > On 01/07/2010 04:41 PM, Simon Horman wrote:
> > > On Sat, Dec 26, 2009 at 10:20:07PM +1100, Simon Horman wrote:
> > >> On Fri, Dec 25, 2009 at 09:14:40PM -0800, H. Peter Anvin wrote:
> > >>> On 12/25/2009 08:30 PM, Masami Hiramatsu wrote:
> > >>>>>  
> > >>>>>  # Avoid funny character set dependencies
> > >>>>> +unexport LANG
> > >>>>>  unexport LC_ALL
> > >>>>>  LC_CTYPE=C
> > >>>>>  LC_COLLATE=C
> > >>>>
> > >>>
> > >>> At this point, it seems to me that we should just LC_ALL=C and be done
> > >>> with it (see other thread.)
> > >>
> > >> Sure, that would also work for the case that I'm seeing.
> > >>
> > >> I tested the following:
> > >>
> > >> # Avoid funny character set dependencies
> > >> LC_ALL=C
> > >> export LC_ALL
> > >>
> > >> Though personally I would advocate tweaking the locale as needed closer
> > >> to awk scripts and the like, rather than the high-level general change that
> > >> was made. Fall-out from a high-level change seems inevitable to me.
> > > 
> > > This seems to still be broken. Can we decide on a solution?
> > > 
> > 
> > I think it's up to Michal to pick the preferred solution.

Is it just me or is Michal's email bouncing of late?


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch] Makefile: Unexport LANG
  2010-01-08  2:59             ` Simon Horman
@ 2010-01-08 11:57               ` Michal Marek
  2010-01-08 12:16                   ` Michal Marek
  0 siblings, 1 reply; 34+ messages in thread
From: Michal Marek @ 2010-01-08 11:57 UTC (permalink / raw)
  To: Simon Horman, H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, linux-kbuild, Roland Dreier,
	Sam Ravnborg

On Thu, Jan 07, 2010 at 04:43:55PM -0800, H. Peter Anvin wrote:
> On 01/07/2010 04:41 PM, Simon Horman wrote:
> > On Sat, Dec 26, 2009 at 10:20:07PM +1100, Simon Horman wrote:
> >> On Fri, Dec 25, 2009 at 09:14:40PM -0800, H. Peter Anvin wrote:
> I think it's up to Michal to pick the preferred solution.
> 
> It has been pointed out that one option might also be to *not*
> override LC_CTYPE, and only override LC_COLLATE.

Yes, that's imo a good compromise. As I noted in another thread, the
only drawback of not setting LC_CTYPE is that it makes the behavior of
awk's tolower()/toupper() volatile, but that seems to be only used in a
single script. I'll post a patch in a separate email.


On Fri, Jan 08, 2010 at 01:59:24PM +1100, Simon Horman wrote:
> Is it just me or is Michal's email bouncing of late?

My email address was wrong, sues.cz does not exist, suse.cz does. Which
is why I overlooked this thread.

Michal
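
For the one script that depends on toupper()/tolower(), pinning the locale
on that single awk invocation keeps the case mapping fixed even when
LC_CTYPE follows the user's environment. A minimal sketch, assuming a POSIX
awk on the PATH; upper_ascii is a hypothetical helper and not part of the
kernel tree.

```shell
# Upper-case ASCII input with awk pinned to the C locale for this one call.
# upper_ascii is a hypothetical helper used only for illustration.
upper_ascii() {
    env LC_ALL=C awk '{ print toupper($0) }'
}

printf 'mach_type_example\n' | upper_ascii
```

Only the awk child process sees LC_ALL=C here; the rest of the build keeps
the locale it was started with.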

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] Makefile: do not override LC_CTYPE
  2010-01-08 11:57               ` Michal Marek
@ 2010-01-08 12:16                   ` Michal Marek
  0 siblings, 0 replies; 34+ messages in thread
From: Michal Marek @ 2010-01-08 12:16 UTC (permalink / raw)
  To: H. Peter Anvin, Simon Horman
  Cc: Masami Hiramatsu, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

Setting LC_CTYPE=C breaks localized messages in some setups. With only
LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except that
character classes and tolower()/toupper() remain locale-dependent. The
former is not a big issue, because we can assume that e.g. [:alpha:] will
always include a-zA-Z and we only ever process ASCII input. The latter
seems to affect only arch/sh/tools/gen-mach-types, which we can handle
separately.

So after this patch the meaning of ranges like [a-z], the behavior of
sort and join, etc. should be the same everywhere and at the same time
gcc should be able to print localized warning and error messages.
LC_NUMERIC=C might not be necessary, but setting it doesn't hurt.

Reported-by: Simon Horman <horms@verge.net.au>
Reported-by: Sergei Trofimovich <slyfox@inbox.ru>
Signed-off-by: Michal Marek <mmarek@suse.cz>
---

Note: if this still breaks for someone, we will simply set LC_ALL=C.

 Makefile               |    3 +--
 arch/sh/tools/Makefile |    2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 09a320f..a7b4351 100644
--- a/Makefile
+++ b/Makefile
@@ -18,10 +18,9 @@ MAKEFLAGS += -rR --no-print-directory
 
 # Avoid funny character set dependencies
 unexport LC_ALL
-LC_CTYPE=C
 LC_COLLATE=C
 LC_NUMERIC=C
-export LC_CTYPE LC_COLLATE LC_NUMERIC
+export LC_COLLATE LC_NUMERIC
 
 # We are using a recursive build, so we need to do a little thinking
 # to get the ordering right.
diff --git a/arch/sh/tools/Makefile b/arch/sh/tools/Makefile
index 558a56b..2082af1 100644
--- a/arch/sh/tools/Makefile
+++ b/arch/sh/tools/Makefile
@@ -13,4 +13,4 @@
 include/generated/machtypes.h: $(src)/gen-mach-types $(src)/mach-types
 	@echo '  Generating $@'
 	$(Q)mkdir -p $(dir $@)
-	$(Q)$(AWK) -f $^ > $@ || { rm -f $@; /bin/false; }
+	$(Q)LC_ALL=C $(AWK) -f $^ > $@ || { rm -f $@; /bin/false; }
-- 
1.6.5.3


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-08 12:16                   ` Michal Marek
@ 2010-01-08 18:50                     ` H. Peter Anvin
  -1 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2010-01-08 18:50 UTC (permalink / raw)
  To: Michal Marek
  Cc: Simon Horman, Masami Hiramatsu, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

On 01/08/2010 04:16 AM, Michal Marek wrote:
> Setting LC_CTYPE=C breaks localized messages in some setups. With only
> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except that
> character classes and tolower()/toupper() remain locale-dependent. The
> former is not a big issue, because we can assume that e.g. [:alpha:] will
> always include a-zA-Z and we only ever process ASCII input. The latter
> seems to affect only arch/sh/tools/gen-mach-types, which we can handle
> separately.
> 
> So after this patch the meaning of ranges like [a-z], the behavior of
> sort and join, etc. should be the same everywhere and at the same time
> gcc should be able to print localized warning and error messages.
> LC_NUMERIC=C might not be necessary, but setting it doesn't hurt.
> 
> Reported-by: Simon Horman <horms@verge.net.au>
> Reported-by: Sergei Trofimovich <slyfox@inbox.ru>
> Signed-off-by: Michal Marek <mmarek@suse.cz>

For what it's worth:

Acked-by: H. Peter Anvin <hpa@zytor.com>

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-08 12:16                   ` Michal Marek
@ 2010-01-09  0:00                     ` Simon Horman
  -1 siblings, 0 replies; 34+ messages in thread
From: Simon Horman @ 2010-01-09  0:00 UTC (permalink / raw)
  To: Michal Marek
  Cc: H. Peter Anvin, Masami Hiramatsu, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

Hi Michal,

sorry for messing up your email address in one of the previous threads.

On Fri, Jan 08, 2010 at 01:16:28PM +0100, Michal Marek wrote:
> Setting LC_CTYPE=C breaks localized messages in some setups. With only
> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except that
> character classes and tolower()/toupper() remain locale-dependent. The
> former is not a big issue, because we can assume that e.g. [:alpha:] will
> always include a-zA-Z and we only ever process ASCII input. The latter
> seems to affect only arch/sh/tools/gen-mach-types, which we can handle
> separately.
> 
> So after this patch the meaning of ranges like [a-z], the behavior of
> sort and join, etc. should be the same everywhere and at the same time
> gcc should be able to print localized warning and error messages.
> LC_NUMERIC=C might not be necessary, but setting it doesn't hurt.
> 
> Reported-by: Simon Horman <horms@verge.net.au>
> Reported-by: Sergei Trofimovich <slyfox@inbox.ru>
> Signed-off-by: Michal Marek <mmarek@suse.cz>

Tested-by: Simon Horman <horms@verge.net.au>

> ---
> 
> Note: if this still breaks for someone, we will simply set LC_ALL=C.

Personally I think it would be much better to set the locale explicitly
as needed, where needed, such as the LC_ALL=C sledgehammer that you
have inserted into arch/sh/tools. Or at a slightly higher level,
offer an awk-wrapper, as it seems to be the main (only?) cause of concern.

Surely the goal isn't to alter the user-experience - to the extent that a
build has a user-experience - but to force some tools to behave as desired.

Just an opinion. The patch below seems to work fine for me.

> 
>  Makefile               |    3 +--
>  arch/sh/tools/Makefile |    2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 09a320f..a7b4351 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -18,10 +18,9 @@ MAKEFLAGS += -rR --no-print-directory
>  
>  # Avoid funny character set dependencies
>  unexport LC_ALL
> -LC_CTYPE=C
>  LC_COLLATE=C
>  LC_NUMERIC=C
> -export LC_CTYPE LC_COLLATE LC_NUMERIC
> +export LC_COLLATE LC_NUMERIC
>  
>  # We are using a recursive build, so we need to do a little thinking
>  # to get the ordering right.
> diff --git a/arch/sh/tools/Makefile b/arch/sh/tools/Makefile
> index 558a56b..2082af1 100644
> --- a/arch/sh/tools/Makefile
> +++ b/arch/sh/tools/Makefile
> @@ -13,4 +13,4 @@
>  include/generated/machtypes.h: $(src)/gen-mach-types $(src)/mach-types
>  	@echo '  Generating $@'
>  	$(Q)mkdir -p $(dir $@)
> -	$(Q)$(AWK) -f $^ > $@ || { rm -f $@; /bin/false; }
> +	$(Q)LC_ALL=C $(AWK) -f $^ > $@ || { rm -f $@; /bin/false; }
> -- 
> 1.6.5.3

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:00                     ` Simon Horman
@ 2010-01-09  0:07                       ` H. Peter Anvin
  -1 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2010-01-09  0:07 UTC (permalink / raw)
  To: Simon Horman
  Cc: Michal Marek, Masami Hiramatsu, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

On 01/08/2010 04:00 PM, Simon Horman wrote:
> 
> Personally I think it would be much better to set the locale explicitly
> as needed, where needed, such as the LC_ALL=C sledgehammer that you
> have inserted into arch/sh/tools. Or at a slightly higher level,
> offer an awk-wrapper, as it seems to be the main (only?) cause of concern.
> 

awk, sed, shell scripts, etc. all have the same problem.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-08 12:16                   ` Michal Marek
@ 2010-01-09  0:09                     ` Masami Hiramatsu
  -1 siblings, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2010-01-09  0:09 UTC (permalink / raw)
  To: Michal Marek
  Cc: H. Peter Anvin, Simon Horman, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

Hi Michal,

Michal Marek wrote:
> Setting LC_CTYPE=C breaks localized messages in some setups. With only
> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except that
> character classes and tolower()/toupper() remain locale-dependent. The
> former is not a big issue, because we can assume that e.g. [:alpha:] will
> always include a-zA-Z and we only ever process ASCII input. The latter
> seems to affect only arch/sh/tools/gen-mach-types, which we can handle
> separately.

Hmm, this also affects arch/x86/tools/gen-insn-attr-x86.awk.
Could you also wrap it?

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:09                     ` Masami Hiramatsu
@ 2010-01-09  0:16                       ` H. Peter Anvin
  -1 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2010-01-09  0:16 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Michal Marek, Simon Horman, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

On 01/08/2010 04:09 PM, Masami Hiramatsu wrote:
> Hi Michal,
> 
> Michal Marek wrote:
>> Setting LC_CTYPE=C breaks localized messages in some setups. With only
>> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except for not
>> so defined character classes and tolower()/toupper(). The former is not
>> a big issue, because we can assume that e.g. [:alpha:] will always
>> include a-zA-Z and we only ever process ASCII input. The latter seems
>> only affect arch/sh/tools/gen-mach-types, which we can handle separately.
> 
> Hmm, this also affects arch/x86/tools/gen-insn-attr-x86.awk.
> Could you also wrap it?
> 

This is tolower/toupper()?  Do there exist locales where tolower/toupper
on ASCII input do weird things, or are we merely hypothesizing?

	-hpa




* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:16                       ` H. Peter Anvin
@ 2010-01-09  0:30                         ` Masami Hiramatsu
  -1 siblings, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2010-01-09  0:30 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michal Marek, Simon Horman, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

H. Peter Anvin wrote:
> On 01/08/2010 04:09 PM, Masami Hiramatsu wrote:
>> Hi Michal,
>>
>> Michal Marek wrote:
>>> Setting LC_CTYPE=C breaks localized messages in some setups. With only
>>> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except for not
>>> so defined character classes and tolower()/toupper(). The former is not
>>> a big issue, because we can assume that e.g. [:alpha:] will always
>>> include a-zA-Z and we only ever process ASCII input. The latter seems
>>> only affect arch/sh/tools/gen-mach-types, which we can handle separately.
>>
>> Hmm, this also affects arch/x86/tools/gen-insn-attr-x86.awk.
>> Could you also wrap it?
>>
> 
> This is tolower/toupper()?  Do there exist locales where tolower/toupper
> on ASCII input do weird things, or are we merely hypothesizing?

Doesn't it affect [A-Z] or [a-z]? If not, the patch looks good to me too.

Thank you,
-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:30                         ` Masami Hiramatsu
@ 2010-01-09  0:43                           ` H. Peter Anvin
  -1 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2010-01-09  0:43 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: H. Peter Anvin, Michal Marek, Simon Horman, Roland Dreier,
	Sam Ravnborg, Sergei Trofimovich, linux-kbuild, linux-kernel,
	linux-sh

> H. Peter Anvin wrote:
>> On 01/08/2010 04:09 PM, Masami Hiramatsu wrote:
>>> Hi Michal,
>>>
>>> Michal Marek wrote:
>>>> Setting LC_CTYPE=C breaks localized messages in some setups. With only
>>>> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except for
>>>> not
>>>> so defined character classes and tolower()/toupper(). The former is
>>>> not
>>>> a big issue, because we can assume that e.g. [:alpha:] will always
>>>> include a-zA-Z and we only ever process ASCII input. The latter seems
>>>> only affect arch/sh/tools/gen-mach-types, which we can handle
>>>> separately.
>>>
>>> Hmm, this also affects arch/x86/tools/gen-insn-attr-x86.awk.
>>> Could you also wrap it?
>>>
>>
>> This is tolower/toupper()?  Do there exist locales where tolower/toupper
>> on ASCII input do weird things, or are we merely hypothesizing?
>
> Doesn't it affect [A-Z] or [a-z]? If not, the patch looks good to me too.
>

[A-Z][a-z] is what LC_COLLATE is about.

    -hpa


* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:16                       ` H. Peter Anvin
@ 2010-01-09  0:53                         ` Masami Hiramatsu
  -1 siblings, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2010-01-09  0:53 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michal Marek, Simon Horman, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

H. Peter Anvin wrote:
> On 01/08/2010 04:09 PM, Masami Hiramatsu wrote:
>> Hi Michal,
>>
>> Michal Marek wrote:
>>> Setting LC_CTYPE=C breaks localized messages in some setups. With only
>>> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except for not
>>> so defined character classes and tolower()/toupper(). The former is not
>>> a big issue, because we can assume that e.g. [:alpha:] will always
>>> include a-zA-Z and we only ever process ASCII input. The latter seems
>>> only affect arch/sh/tools/gen-mach-types, which we can handle separately.
>>
>> Hmm, this also affects arch/x86/tools/gen-insn-attr-x86.awk.
>> Could you also wrap it?
>>
> 
> This is tolower/toupper()?  Do there exist locales where tolower/toupper
> on ASCII input do weird things, or are we merely hypothesizing?

Ah, sorry, I was just hypothesizing.
---
#!/bin/sh
# en_US locale sorts alphabets as AaBb...
LANG=en_US
LC_ALL=
LC_COLLATE=C
LC_NUMERIC=C
export LC_COLLATE LC_NUMERIC
awk 'BEGIN{if (match("C","[a-z]")) {print "NG"} else {print "OK"} exit;}'
---
this returns "OK". So, the patch is OK for me too.
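
A similar check can be made for sort. This is only a minimal sketch: it
assumes the same settings as the Makefile in the patch (LC_ALL unset,
LC_COLLATE=C exported), and the input letters are made up for illustration.

```shell
#!/bin/sh
# Mirror the Makefile: LC_ALL must not override the per-category setting.
unset LC_ALL
LC_COLLATE=C
export LC_COLLATE

# With C collation, sort compares bytes, so all uppercase ASCII letters
# come before all lowercase ones, whatever LANG says.
sorted=$(printf 'b\nA\na\nB\n' | sort | tr -d '\n')
echo "$sorted"    # prints ABab
```

If the output were AaBb instead, the collation order of the user's locale
would still be in effect.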

Thanks,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-08 12:16                   ` Michal Marek
@ 2010-01-09  1:07                     ` Masami Hiramatsu
  -1 siblings, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2010-01-09  1:07 UTC (permalink / raw)
  To: Michal Marek
  Cc: H. Peter Anvin, Simon Horman, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

Michal Marek wrote:
> Setting LC_CTYPE=C breaks localized messages in some setups. With only
> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except for not
> so defined character classes and tolower()/toupper(). The former is not
> a big issue, because we can assume that e.g. [:alpha:] will always
> include a-zA-Z and we only ever process ASCII input. The latter seems
> only affect arch/sh/tools/gen-mach-types, which we can handle separately.
> 
> So after this patch the meaning of ranges like [a-z], the behavior of
> sort and join, etc. should be the same everywhere and at the same time
> gcc should be able to print localized warning and error messages.
> LC_NUMERIC=C might not be necessary, but setting it doesn't hurt.
> 
> Reported-by: Simon Horman <horms@verge.net.au>
> Reported-by: Sergei Trofimovich <slyfox@inbox.ru>
> Signed-off-by: Michal Marek <mmarek@suse.cz>

I checked that this change doesn't affect arch/x86/tools/gen-insn-attr-x86.awk.

Tested-by: Masami Hiramatsu <mhiramat@redhat.com>

Thank you!


> ---
> 
> Note: if this still breaks for someone, we will simply set LC_ALL=C.
> 
>  Makefile               |    3 +--
>  arch/sh/tools/Makefile |    2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 09a320f..a7b4351 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -18,10 +18,9 @@ MAKEFLAGS += -rR --no-print-directory
>  
>  # Avoid funny character set dependencies
>  unexport LC_ALL
> -LC_CTYPE=C
>  LC_COLLATE=C
>  LC_NUMERIC=C
> -export LC_CTYPE LC_COLLATE LC_NUMERIC
> +export LC_COLLATE LC_NUMERIC
>  
>  # We are using a recursive build, so we need to do a little thinking
>  # to get the ordering right.
> diff --git a/arch/sh/tools/Makefile b/arch/sh/tools/Makefile
> index 558a56b..2082af1 100644
> --- a/arch/sh/tools/Makefile
> +++ b/arch/sh/tools/Makefile
> @@ -13,4 +13,4 @@
>  include/generated/machtypes.h: $(src)/gen-mach-types $(src)/mach-types
>  	@echo '  Generating $@'
>  	$(Q)mkdir -p $(dir $@)
> -	$(Q)$(AWK) -f $^ > $@ || { rm -f $@; /bin/false; }
> +	$(Q)LC_ALL=C $(AWK) -f $^ > $@ || { rm -f $@; /bin/false; }
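
The LC_ALL=C prefix in the hunk above is a per-command override. A minimal
sketch of how that behaves in plain sh follows; the input string is made up
and is not taken from mach-types.

```shell
#!/bin/sh
# Start from a state where LC_ALL is not set in the calling shell.
unset LC_ALL

# The prefix assignment goes into the environment of this one awk process
# only. LC_ALL takes precedence over LC_CTYPE, LC_COLLATE and LANG there.
upper=$(printf 'sh7751\n' | LC_ALL=C awk '{ print toupper($0) }')
echo "$upper"              # prints SH7751

# The calling shell is unaffected by the prefix assignment.
echo "${LC_ALL-unset}"     # prints unset
```

This keeps the C locale local to the one script that needs stable
toupper()/tolower(), while gcc and make still see the user's LC_CTYPE and
LC_MESSAGES.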

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:16                       ` H. Peter Anvin
@ 2010-01-11  9:52                         ` Michal Marek
  -1 siblings, 0 replies; 34+ messages in thread
From: Michal Marek @ 2010-01-11  9:52 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, Simon Horman, Roland Dreier, Sam Ravnborg,
	Sergei Trofimovich, linux-kbuild, linux-kernel, linux-sh

On 9.1.2010 01:16, H. Peter Anvin wrote:
> On 01/08/2010 04:09 PM, Masami Hiramatsu wrote:
>> Hi Michal,
>>
>> Michal Marek wrote:
>>> Setting LC_CTYPE=C breaks localized messages in some setups. With only
>>> LC_COLLATE=C and LC_NUMERIC=C, we get almost all we need, except for not
>>> so defined character classes and tolower()/toupper(). The former is not
>>> a big issue, because we can assume that e.g. [:alpha:] will always
>>> include a-zA-Z and we only ever process ASCII input. The latter seems
>>> only affect arch/sh/tools/gen-mach-types, which we can handle separately.
>>
>> Hmm, this also affects arch/x86/tools/gen-insn-attr-x86.awk.
>> Could you also wrap it?
>>
> 
> This is tolower/toupper()?  Do there exist locales where tolower/toupper
> on ASCII input do weird things, or are we merely hypothesizing?

In Turkish, uppercase i is İ (I with dot) and lowercase I is ı (i
without dot), see http://en.wikipedia.org/wiki/Dotted_and_dotless_I.

Michal
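
A rough way to try this, as a sketch only: tr_TR.UTF-8 is just an assumed
locale name and may not be installed, and whether awk's tolower() follows
the locale at all depends on the awk implementation, so no result is
claimed for the second command.

```shell
#!/bin/sh
# In the C locale, case conversion of ASCII letters is fixed.
lower_c=$(printf 'I\n' | LC_ALL=C awk '{ print tolower($0) }')
echo "C locale: $lower_c"    # prints C locale: i

# Only attempt the Turkish case if such a locale is listed (assumption:
# it shows up as tr_TR.utf8 or tr_TR.UTF-8 in "locale -a").
if locale -a 2>/dev/null | grep -qi '^tr_TR\.utf-\{0,1\}8$'; then
    printf 'I\n' | LC_ALL=tr_TR.UTF-8 awk '{ print tolower($0) }'
fi
```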

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-09  0:16                       ` H. Peter Anvin
                                         ` (3 preceding siblings ...)
  (?)
@ 2010-01-11 10:52                       ` Alan Cox
  2010-01-12  0:50                           ` H. Peter Anvin
  -1 siblings, 1 reply; 34+ messages in thread
From: Alan Cox @ 2010-01-11 10:52 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, Michal Marek, Simon Horman, Roland Dreier,
	Sam Ravnborg, Sergei Trofimovich, linux-kbuild, linux-kernel,
	linux-sh

> This is tolower/toupper()?  Do there exist locales where tolower/toupper
> on ASCII input do weird things, or are we merely hypothesizing?

Turkish is the famous one for this and usually causes
internationalisation chaos. So yes they exist, and there are worse more
esoteric cases. There are good reasons sed and friends support classes as
well as old C locale style ranges.

Alan

* Re: [PATCH] Makefile: do not override LC_CTYPE
  2010-01-11 10:52                       ` Alan Cox
@ 2010-01-12  0:50                           ` H. Peter Anvin
  0 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2010-01-12  0:50 UTC (permalink / raw)
  To: Alan Cox
  Cc: Masami Hiramatsu, Michal Marek, Simon Horman, Roland Dreier,
	Sam Ravnborg, Sergei Trofimovich, linux-kbuild, linux-kernel,
	linux-sh

On 01/11/2010 02:52 AM, Alan Cox wrote:
>> This is tolower/toupper()?  Do there exist locales where tolower/toupper
>> on ASCII input do weird things, or are we merely hypothesizing?
> 
> Turkish is the famous one for this and usually causes
> internationalisation chaos. So yes they exist, and there are worse more
> esoteric cases. There are good reasons sed and friends support classes as
> well as old C locale style ranges.
> 

Ah yes, forgot about Turkish.  Apparently Lithuanian and Azeri also have
special rules for the letters I and J.  Sigh.

	-hpa


end of thread, other threads:[~2010-01-12  0:51 UTC | newest]

Thread overview: 34+ messages
2009-12-24 11:13 [patch] Makefile: Unexport LANG Simon Horman
2009-12-26  4:30 ` Masami Hiramatsu
2009-12-26  5:14   ` H. Peter Anvin
2009-12-26 11:20     ` Simon Horman
2010-01-08  0:41       ` Simon Horman
2010-01-08  0:43         ` H. Peter Anvin
2010-01-08  2:45           ` Simon Horman
2010-01-08  2:59             ` Simon Horman
2010-01-08 11:57               ` Michal Marek
2010-01-08 12:16                 ` [PATCH] Makefile: do not override LC_CTYPE Michal Marek
2010-01-08 12:16                   ` Michal Marek
2010-01-08 18:50                   ` H. Peter Anvin
2010-01-08 18:50                     ` H. Peter Anvin
2010-01-09  0:00                   ` Simon Horman
2010-01-09  0:00                     ` Simon Horman
2010-01-09  0:07                     ` H. Peter Anvin
2010-01-09  0:07                       ` H. Peter Anvin
2010-01-09  0:09                   ` Masami Hiramatsu
2010-01-09  0:09                     ` Masami Hiramatsu
2010-01-09  0:16                     ` H. Peter Anvin
2010-01-09  0:16                       ` H. Peter Anvin
2010-01-09  0:30                       ` Masami Hiramatsu
2010-01-09  0:30                         ` Masami Hiramatsu
2010-01-09  0:43                         ` H. Peter Anvin
2010-01-09  0:43                           ` H. Peter Anvin
2010-01-09  0:53                       ` Masami Hiramatsu
2010-01-09  0:53                         ` Masami Hiramatsu
2010-01-11  9:52                       ` Michal Marek
2010-01-11  9:52                         ` Michal Marek
2010-01-11 10:52                       ` Alan Cox
2010-01-12  0:50                         ` H. Peter Anvin
2010-01-12  0:50                           ` H. Peter Anvin
2010-01-09  1:07                   ` Masami Hiramatsu
2010-01-09  1:07                     ` Masami Hiramatsu
