linux-kernel.vger.kernel.org archive mirror
From: "Maciej W. Rozycki" <macro@linux-mips.org>
To: Toma Tabacu <Toma.Tabacu@imgtec.com>
Cc: Daniel Sanders <Daniel.Sanders@imgtec.com>,
	Ralf Baechle <ralf@linux-mips.org>,
	Paul Burton <Paul.Burton@imgtec.com>,
	Paul Bolle <pebolle@tiscali.nl>,
	"Steven J. Hill" <Steven.Hill@imgtec.com>,
	Manuel Lauss <manuel.lauss@gmail.com>,
	Jim Quinlan <jim2101024@gmail.com>,
	"linux-mips@linux-mips.org" <linux-mips@linux-mips.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH 5/5] MIPS: LLVMLinux: Silence unicode warnings when preprocessing assembly.
Date: Thu, 5 Feb 2015 12:35:42 +0000 (GMT)
Message-ID: <alpine.LFD.2.11.1502051107150.22715@eddie.linux-mips.org>
In-Reply-To: <A614194ED15B4844BC4C9FB7F21FCD9201347BAE@hhmail02.hh.imgtec.org>

On Thu, 5 Feb 2015, Toma Tabacu wrote:

> > 2. It considers these character pairs to be unicode escapes in the first 
> >    place given that they do not follow the syntax required for such 
> >    escapes, that is `\unnnn', where `n' are hex digits.
> > 
> 
> It doesn't actually treat them as unicode escapes, but it still warns the user,
> in case they were meant to be unicode escapes. Here's the warning message:
> 
> arch/mips/include/asm/asmmacro.h:197:51: warning: \u used with no following hex digits; treating as '\' followed by identifier [-Wunicode]
>          .word  0x41000000 | (\rt << 16) | (\rd << 11) | (\u << 5) | (\sel)
>                                                           ^
> I'll add it to the summary in v2.

 Thanks, that makes things clearer.  It always makes sense to include the
exact error message produced where applicable, as otherwise people do not
necessarily know what the matter is.

> > Of course it may be reasonable for us to work this bug around as we've 
> > been doing for years with GCC, but has the issue been reported back to 
> > clang maintainers?  What was their response?
> > 
> 
> It hasn't been reported, but I don't think they would agree with removing
> unicode escape sequences from the assembler-with-cpp mode because it is
> currently being used for other languages as well, not just assembly.

 First, preprocessing rules surely have to be language specific.  The C 
language standard does not specify what the preprocessor is meant to do 
(if anything) for other languages.  GCC or clang -- that's no different.  

 The assembly language has a different syntax and `\u' has a different 
meaning in the context of assembly macro expansion than it would have in a 
name of a symbol, where such a Unicode escape sequence might indeed be
interpreted as such and the encoded character propagated to the symbol
produced.  But that's up to the assembler -- GAS for example does not
AFAIK support Unicode escape sequences in symbol names right now, but I 
suppose such a feature could be added if desired.

 Which prompts another question of course: how does the clang C compiler 
represent Unicode characters in identifiers in its assembly output?

 I have looked into the C language standard and it appears to me that the
translation phase at which universal character names are to be interpreted
has not been defined.  This is probably why the standard specifies the
result of pasting preprocessor tokens together as undefined if a universal
character name is produced this way.

 Consequently I think an important question in this context is: does
clang's preprocessor actually convert these sequences in any way before
passing them down to the compiler?  What, for example, does the C output
from a trivial example that contains such Unicode escape sequences look
like?

> One such language is Haskell (ghc, to be more specific), for which the clang
> developers had to actually stop the preprocessor from enforcing the C universal
> character name restrictions in assembler-with-cpp mode, which suggests that ghc
> wants the preprocessor to check for unicode escape sequences.
> 
> At the moment, we can either disable -Wunicode for asmmacro.h or refrain from
> using '\u' as an identifier.

 To be clear: it's `u' here that is the identifier, the leading `\' is 
merely how assembly syntax has been specified for references to macro 
arguments.  And TBH I find banning any macro arguments starting with `u' 
rather silly.  I'm leaning towards considering having -Wunicode disabled 
for all assembly sources, or maybe even for the whole Linux compilation, 
the right solution.  It's not like we have a need for Unicode identifiers.  

 What's the exact semantics of -Wunicode for clang?

  Maciej
