[v2,11/18] arm64: make mrs_s and msr_s macros work with LTO
diff mbox series

Message ID 20171115213428.22559-12-samitolvanen@google.com
State New, archived
Headers show
Series
  • Add support for clang LTO
Related show

Commit Message

Sami Tolvanen Nov. 15, 2017, 9:34 p.m. UTC
From: Alex Matveev <alxmtvv@gmail.com>

Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
in-place and workaround gcc and clang limitations on redefining macros
across different assembler blocks.

Signed-off-by: Alex Matveev <alxmtvv@gmail.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 arch/arm64/include/asm/kvm_hyp.h |  8 ++++--
 arch/arm64/include/asm/sysreg.h  | 55 +++++++++++++++++++++++++++-------------
 2 files changed, 43 insertions(+), 20 deletions(-)

Comments

Will Deacon Nov. 16, 2017, 11:54 a.m. UTC | #1
On Wed, Nov 15, 2017 at 01:34:21PM -0800, Sami Tolvanen wrote:
> From: Alex Matveev <alxmtvv@gmail.com>
> 
> Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> in-place and workaround gcc and clang limitations on redefining macros
> across different assembler blocks.

What limitations? Can you elaborate please? Is this a fix?

Will
Yury Norov Nov. 16, 2017, 1:07 p.m. UTC | #2
On Thu, Nov 16, 2017 at 11:54:33AM +0000, Will Deacon wrote:
> On Wed, Nov 15, 2017 at 01:34:21PM -0800, Sami Tolvanen wrote:
> > From: Alex Matveev <alxmtvv@gmail.com>
> > 
> > Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> > in-place and workaround gcc and clang limitations on redefining macros
> > across different assembler blocks.
> 
> What limitations? Can you elaborate please? Is this a fix?

Hi Will,

Regarding GCC.

When it joins preprocessed source files into single asm file,
mrs_s/msr_s becomes either not declared or declared multiple times.

./ccuFb68h.s:33120: Error: Macro `mrs_s' was already defined
./ccuFb68h.s:33124: Error: Macro `msr_s' was already defined

I'm not sure that GCC works correctly in this case, and I sent the
email to Linaro toolchain group to clarify it. See below.

Yury

[...]

Links:
My unfinished branch:
https://github.com/norov/linux/tree/lto
Andi Kleen tree:
https://github.com/andikleen/linux-misc/tree/lto-411-1
Sami Tolvanen's recent work for clang:
https://lkml.org/lkml/2017/11/3/606

Question we have for now:
There's mrs_s/msr_s macro that doesn't work with LTO - linker
complains very loudly that macro is either not declared, or declared
multiple times. (To reproduce - try to build my kernel branch w/o last
patch).

The same (?) problem is observed with clang, and people there
considered it as feature, not a bug.

https://bugs.llvm.org/show_bug.cgi?id=19749

We have the fix for both clang and gcc, but it looks hacky. Maybe it
worth to fix mrs/msr issue on toolchain side?
Robin Murphy Nov. 16, 2017, 1:55 p.m. UTC | #3
On 16/11/17 13:07, Yury Norov wrote:
> On Thu, Nov 16, 2017 at 11:54:33AM +0000, Will Deacon wrote:
>> On Wed, Nov 15, 2017 at 01:34:21PM -0800, Sami Tolvanen wrote:
>>> From: Alex Matveev <alxmtvv@gmail.com>
>>>
>>> Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
>>> in-place and workaround gcc and clang limitations on redefining macros
>>> across different assembler blocks.
>>
>> What limitations? Can you elaborate please? Is this a fix?
> 
> Hi Will,
> 
> Regarding GCC.
> 
> When it joins preprocessed source files into single asm file,
> mrs_s/msr_s becomes either not declared or declared multiple times.
> 
> ./ccuFb68h.s:33120: Error: Macro `mrs_s' was already defined
> ./ccuFb68h.s:33124: Error: Macro `msr_s' was already defined
> 
> I'm not sure that GCC works correctly in this case, and I sent the
> email to Linaro toolchain group to clarify it. See below.
> 
> Yury
> 
> [...]
> 
> Links:
> My unfinished branch:
> https://github.com/norov/linux/tree/lto
> Andi Kleen tree:
> https://github.com/andikleen/linux-misc/tree/lto-411-1
> Sami Tolvanen's recent work for clang:
> https://lkml.org/lkml/2017/11/3/606
> 
> Question we have for now:
> There's mrs_s/msr_s macro that doesn't work with LTO - linker
> complains very loudly that macro is either not declared, or declared
> multiple times. (To reproduce - try to build my kernel branch w/o last
> patch).
> 
> The same (?) problem is observed with clang, and people there
> considered it as feature, not a bug.
> 
> https://bugs.llvm.org/show_bug.cgi?id=19749
> 
> We have the fix for both clang and gcc, but it looks hacky. Maybe it
> worth to fix mrs/msr issue on toolchain side?

Given that this whole mrs_s infrastructure is a workaround for older 
assemblers which don't support the "S<op0>_<op1>_<Cn>_<Cm>_<op2>" syntax 
for arbitrary unnamed system registers (which IIRC was a fairly late 
addition to the architecture), the only way it could be "fixed" on the 
toolchain side is by removing all those older toolchains from existence. 
Good luck with that ;)

In *theory*, it might be possible to do something similar to what we do 
with CONFIG_BROKEN_GAS_INST to detect offending assemblers and only 
define and use these macros when necessary (hopefully Clang and other 
LTO-capable toolchains do accept the proper syntax), but I've no idea 
how invasive or difficult that might turn out to be.

Robin.
Segher Boessenkool Nov. 16, 2017, 1:56 p.m. UTC | #4
On Thu, Nov 16, 2017 at 04:07:42PM +0300, Yury Norov wrote:
> > > Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> > > in-place and workaround gcc and clang limitations on redefining macros
> > > across different assembler blocks.
> > 
> > What limitations? Can you elaborate please? Is this a fix?

> Regarding GCC.
> 
> When it joins preprocessed source files into single asm file,
> mrs_s/msr_s becomes either not declared or declared multiple times.
> 
> ./ccuFb68h.s:33120: Error: Macro `mrs_s' was already defined
> ./ccuFb68h.s:33124: Error: Macro `msr_s' was already defined
> 
> I'm not sure that GCC works correctly in this case, and I sent the
> email to Linaro toolchain group to clarify it. See below.

The compiler is free to inline code, reorder code, duplicate code, etc.
Inline assembler is not exempt from this.

The compiler is fine, the assembler is fine (and the linker has nothing
to do with it).  Your code is not fine.


Segher
Sami Tolvanen Nov. 16, 2017, 4:43 p.m. UTC | #5
On Thu, Nov 16, 2017 at 11:54:33AM +0000, Will Deacon wrote:
> What limitations? Can you elaborate please? Is this a fix?

The commit message in v1 was more informative, I'll change this back for v3:

--

Clang's integrated assembler does not allow assembly macros defined
in one inline asm block using the .macro directive to be used across
separate asm blocks. LLVM developers consider this a feature and not a
bug, recommending code refactoring:

https://bugs.llvm.org/show_bug.cgi?id=19749

As binutils doesn't allow macros to be redefined, this change adds C
preprocessor macros that define the assembly macros globally for binutils
and locally for clang's integrated assembler.

Sami
Nick Desaulniers Nov. 16, 2017, 4:44 p.m. UTC | #6
I need this patch to assemble the kernel for arm64 with clang regardless of LTO.
Sami Tolvanen Nov. 16, 2017, 4:46 p.m. UTC | #7
On Thu, Nov 16, 2017 at 07:56:50AM -0600, Segher Boessenkool wrote:
> The compiler is fine, the assembler is fine (and the linker has
> nothing to do with it).  Your code is not fine.

Would you care to elaborate? The current code assumes that macros are
visible in other inline assembly blocks, and LLVM developers seem to
feel this isn't correct behavior. This patch fixes the current code so
it works with both assemblers.

Sami
Segher Boessenkool Nov. 16, 2017, 5:01 p.m. UTC | #8
On Thu, Nov 16, 2017 at 08:46:08AM -0800, Sami Tolvanen wrote:
> On Thu, Nov 16, 2017 at 07:56:50AM -0600, Segher Boessenkool wrote:
> > The compiler is fine, the assembler is fine (and the linker has
> > nothing to do with it).  Your code is not fine.
> 
> Would you care to elaborate? The current code assumes that macros are
> visible in other inline assembly blocks, and LLVM developers seem to
> feel this isn't correct behavior. This patch fixes the current code so
> it works with both assemblers.

If you say e.g.

void f(void)
{
	asm(".macro something\n\t.endm");
}

there is nothing that prevents the compiler from emitting this more
than once.  Expecting things to be emitted in whatever order is a bad
idea, too.

The thing with .purgem can work.  Inelegant, sure, but it can work :-)
Just make sure you do the macro define, the code that uses it, and the
undefine, all in the same inline asm statement.


Segher
Sami Tolvanen Nov. 16, 2017, 5:11 p.m. UTC | #9
On Thu, Nov 16, 2017 at 11:01:44AM -0600, Segher Boessenkool wrote:
> The thing with .purgem can work.  Inelegant, sure, but it can work :-)

It works, there are already functions in the kernel that use these
macros more than once. I agree that this might not be the most elegant
solution, but at least it allows us to use the same code path for both
compilers.

> Just make sure you do the macro define, the code that uses it, and the
> undefine, all in the same inline asm statement.

Yes, that's what we're doing here. Only KVM uses msr_s/msr_s directly,
everything else uses the (read|write)_sysreg_s wrappers instead, which
means the DEFINE/UNDEFINE macros are only needed in few places.

Sami
Alex Matveev Nov. 16, 2017, 6:14 p.m. UTC | #10
On Thu, 16 Nov 2017 11:54:33 +0000
Will Deacon <will.deacon@arm.com> wrote:

> On Wed, Nov 15, 2017 at 01:34:21PM -0800, Sami Tolvanen wrote:
> > From: Alex Matveev <alxmtvv@gmail.com>
> > 
> > Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> > in-place and workaround gcc and clang limitations on redefining
> > macros across different assembler blocks.  
> 
> What limitations? Can you elaborate please? Is this a fix?

Ok, here's the detailed explanation. Current state after preprocessing
is like this:

asm volatile(".macro mXX_s ... .endm");
void f()
{
	asm volatile("mXX_s a, b");
}

With GCC, it gives macro redefinition error because sysreg.h is
included in multiple source files, and assembler code for all of them
is later combined for LTO (I've seen an intermediate file with hundreds
of identical definitions).

With clang, it gives macro undefined error because clang doesn't allow
sharing macros between inline asm statements.

I also seem to remember catching another sort of undefined error with
GCC due to reordering of macro definition asm statement and generated
asm code for function that uses the macro.

Solution with defining and undefining for each use, while certainly not
elegant, satisfies both GCC and clang, LTO and non-LTO.

Regards,
Alex
Yury Norov Nov. 16, 2017, 9:29 p.m. UTC | #11
On Thu, Nov 16, 2017 at 01:55:31PM +0000, Robin Murphy wrote:
> On 16/11/17 13:07, Yury Norov wrote:
> > On Thu, Nov 16, 2017 at 11:54:33AM +0000, Will Deacon wrote:
> > > On Wed, Nov 15, 2017 at 01:34:21PM -0800, Sami Tolvanen wrote:
> > > > From: Alex Matveev <alxmtvv@gmail.com>
> > > > 
> > > > Use UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> > > > in-place and workaround gcc and clang limitations on redefining macros
> > > > across different assembler blocks.
> > > 
> > > What limitations? Can you elaborate please? Is this a fix?
> > 
> > Hi Will,
> > 
> > Regarding GCC.
> > 
> > When it joins preprocessed source files into single asm file,
> > mrs_s/msr_s becomes either not declared or declared multiple times.
> > 
> > ./ccuFb68h.s:33120: Error: Macro `mrs_s' was already defined
> > ./ccuFb68h.s:33124: Error: Macro `msr_s' was already defined
> > 
> > I'm not sure that GCC works correctly in this case, and I sent the
> > email to Linaro toolchain group to clarify it. See below.
> > 
> > Yury
> > 
> > [...]
> > 
> > Links:
> > My unfinished branch:
> > https://github.com/norov/linux/tree/lto
> > Andi Kleen tree:
> > https://github.com/andikleen/linux-misc/tree/lto-411-1
> > Sami Tolvanen's recent work for clang:
> > https://lkml.org/lkml/2017/11/3/606
> > 
> > Question we have for now:
> > There's mrs_s/msr_s macro that doesn't work with LTO - linker
> > complains very loudly that macro is either not declared, or declared
> > multiple times. (To reproduce - try to build my kernel branch w/o last
> > patch).
> > 
> > The same (?) problem is observed with clang, and people there
> > considered it as feature, not a bug.
> > 
> > https://bugs.llvm.org/show_bug.cgi?id=19749
> > 
> > We have the fix for both clang and gcc, but it looks hacky. Maybe it
> > worth to fix mrs/msr issue on toolchain side?
> 
> Given that this whole mrs_s infrastructure is a workaround for older
> assemblers which don't support the "S<op0>_<op1>_<Cn>_<Cm>_<op2>" syntax for
> arbitrary unnamed system registers (which IIRC was a fairly late addition to
> the architecture), the only way it could be "fixed" on the toolchain side is
> by removing all those older toolchains from existence. Good luck with that
> ;)
> 
> In *theory*, it might be possible to do something similar to what we do with
> CONFIG_BROKEN_GAS_INST to detect offending assemblers and only define and
> use these macros when necessary (hopefully Clang and other LTO-capable
> toolchains do accept the proper syntax), but I've no idea how invasive or
> difficult that might turn out to be.

Thank you Robin. The "S<op0>_<op1>_<Cn>_<Cm>_<op2>" is the feature I
was looking for. I'm not going to remove old toolchain from every user
machine - I believe if I do, I'll be put into jail finally. :)

But I think we should use this syntax for {read,write}_sysreg_s
and other functions if assembler supports it. The existing mrs_s/msr_s
machinery will be deprecated then, and it will be generally good.

Sami told that LTO doesn't work with old toolchains, so I think that
it would be OK to disable LTO for old assemblers that doesn't support
all necessary features. Anyway, this is not blocking issue because we
have patch that workarounds it (again).

Yury
Alex Matveev Nov. 16, 2017, 10:54 p.m. UTC | #12
On Fri, 17 Nov 2017 00:29:20 +0300
Yury Norov <ynorov@caviumnetworks.com> wrote:

> On Thu, Nov 16, 2017 at 01:55:31PM +0000, Robin Murphy wrote:
> > Given that this whole mrs_s infrastructure is a workaround for older
> > assemblers which don't support the "S<op0>_<op1>_<Cn>_<Cm>_<op2>"
> > syntax for arbitrary unnamed system registers (which IIRC was a
> > fairly late addition to the architecture), the only way it could be
> > "fixed" on the toolchain side is by removing all those older
> > toolchains from existence. Good luck with that ;)
> > 
> > In *theory*, it might be possible to do something similar to what
> > we do with CONFIG_BROKEN_GAS_INST to detect offending assemblers
> > and only define and use these macros when necessary (hopefully
> > Clang and other LTO-capable toolchains do accept the proper
> > syntax), but I've no idea how invasive or difficult that might turn
> > out to be.  
> 
> Thank you Robin. The "S<op0>_<op1>_<Cn>_<Cm>_<op2>" is the feature I
> was looking for. I'm not going to remove old toolchain from every user
> machine - I believe if I do, I'll be put into jail finally. :)
> 
> But I think we should use this syntax for {read,write}_sysreg_s
> and other functions if assembler supports it. The existing mrs_s/msr_s
> machinery will be deprecated then, and it will be generally good.

Well, "S<op>_..." is how registers were accessed before implementation
of mrs_s/msr_s - here's the patch from mid-2014 by Catalin Marinas:

commit 72c583951526
Author: Catalin Marinas <catalin.marinas@arm.com>
Date:   Thu Jul 24 14:14:42 2014 +0100

    arm64: gicv3: Allow GICv3 compilation with older binutils
    
    GICv3 introduces new system registers accessible with the full
    msr/mrs syntax (e.g. mrs x0, Sop0_op1_CRm_CRn_op2). However, only
    recent binutils understand the new syntax. This patch introduces
    msr_s/mrs_s assembly macros which generate the equivalent
    instructions above and converts the existing GICv3 code (both
    drivers/irqchip/ and arch/arm64/kernel/).

The question is - is it OK to drop compatibility with old versions of
binutils (which were already "older" back in 2014)? It's not my call to
make. If yes, then it should be possible to make this change more
aesthetic by reverting to "S<op>" (however, it will affect more places
as now some users of register definitions expect them to be numbers, not
"S<op>" strings).

Going the way of CONFIG_BROKEN_GAS_INST and selecting between "S<op>"
and mrs_s/msr_s is also possible, but then we'd have to keep both
mechanisms around. I'm not quite sure what that would give us.

Regards,
Alex
Nick Desaulniers Dec. 4, 2017, 5:34 p.m. UTC | #13
On Thu, Nov 16, 2017 at 2:54 PM, Alex Matveev <alxmtvv@gmail.com> wrote:
> On Fri, 17 Nov 2017 00:29:20 +0300
> Yury Norov <ynorov@caviumnetworks.com> wrote:
>
>> On Thu, Nov 16, 2017 at 01:55:31PM +0000, Robin Murphy wrote:
>> > Given that this whole mrs_s infrastructure is a workaround for older
>> > assemblers which don't support the "S<op0>_<op1>_<Cn>_<Cm>_<op2>"
>> > syntax for arbitrary unnamed system registers (which IIRC was a
>> > fairly late addition to the architecture), the only way it could be
>> > "fixed" on the toolchain side is by removing all those older
>> > toolchains from existence. Good luck with that ;)
> commit 72c583951526
> Author: Catalin Marinas <catalin.marinas@arm.com>
> Date:   Thu Jul 24 14:14:42 2014 +0100
>
>     arm64: gicv3: Allow GICv3 compilation with older binutils
>
>     GICv3 introduces new system registers accessible with the full
>     msr/mrs syntax (e.g. mrs x0, Sop0_op1_CRm_CRn_op2). However, only
>     recent binutils understand the new syntax. This patch introduces
>     msr_s/mrs_s assembly macros which generate the equivalent
>     instructions above and converts the existing GICv3 code (both
>     drivers/irqchip/ and arch/arm64/kernel/).
>
> The question is - is it OK to drop compatibility with old versions of
> binutils (which were already "older" back in 2014)? It's not my call to
> make. If yes, then it should be possible to make this change more
> aesthetic by reverting to "S<op>" (however, it will affect more places
> as now some users of register definitions expect them to be numbers, not
> "S<op>" strings).

I don't think we found a resolution to the compatibility question
posed.  Given that the affected file is only in use for arm64, I think
the arm64 maintainers should make the call.  I encourage them to drop
support for old toolchains; the use of ld-version macros can help warn
users using old toolchains on newer kernel versions.

Patch
diff mbox series

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 4572a9b560fa..20bfb8e676e0 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -29,7 +29,9 @@ 
 	({								\
 		u64 reg;						\
 		asm volatile(ALTERNATIVE("mrs %0, " __stringify(r##nvh),\
-					 "mrs_s %0, " __stringify(r##vh),\
+					 DEFINE_MRS_S			\
+					 "mrs_s %0, " __stringify(r##vh) "\n"\
+					 UNDEFINE_MRS_S,		\
 					 ARM64_HAS_VIRT_HOST_EXTN)	\
 			     : "=r" (reg));				\
 		reg;							\
@@ -39,7 +41,9 @@ 
 	do {								\
 		u64 __val = (u64)(v);					\
 		asm volatile(ALTERNATIVE("msr " __stringify(r##nvh) ", %x0",\
-					 "msr_s " __stringify(r##vh) ", %x0",\
+					 DEFINE_MSR_S			\
+					 "msr_s " __stringify(r##vh) ", %x0\n"\
+					 UNDEFINE_MSR_S,		\
 					 ARM64_HAS_VIRT_HOST_EXTN)	\
 					 : : "rZ" (__val));		\
 	} while (0)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 08cc88574659..3ae147c7e160 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -584,20 +584,39 @@ 
 
 #include <linux/types.h>
 
-asm(
-"	.irp	num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
-"	.equ	.L__reg_num_x\\num, \\num\n"
-"	.endr\n"
+#define __DEFINE_MRS_MSR_S_REGNUM				\
+"	.irp	num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n" \
+"	.equ	.L__reg_num_x\\num, \\num\n"			\
+"	.endr\n"						\
 "	.equ	.L__reg_num_xzr, 31\n"
-"\n"
-"	.macro	mrs_s, rt, sreg\n"
-	__emit_inst(0xd5200000|(\\sreg)|(.L__reg_num_\\rt))
+
+#define DEFINE_MRS_S						\
+	__DEFINE_MRS_MSR_S_REGNUM				\
+"	.macro	mrs_s, rt, sreg\n"				\
+	__emit_inst(0xd5200000|(\\sreg)|(.L__reg_num_\\rt))	\
 "	.endm\n"
-"\n"
-"	.macro	msr_s, sreg, rt\n"
-	__emit_inst(0xd5000000|(\\sreg)|(.L__reg_num_\\rt))
+
+#define DEFINE_MSR_S						\
+	__DEFINE_MRS_MSR_S_REGNUM				\
+"	.macro	msr_s, sreg, rt\n"				\
+	__emit_inst(0xd5000000|(\\sreg)|(.L__reg_num_\\rt))	\
 "	.endm\n"
-);
+
+#define UNDEFINE_MRS_S						\
+"	.purgem	mrs_s\n"
+
+#define UNDEFINE_MSR_S						\
+"	.purgem	msr_s\n"
+
+#define __mrs_s(r, v)						\
+	DEFINE_MRS_S						\
+"	mrs_s %0, " __stringify(r) "\n"				\
+	UNDEFINE_MRS_S : "=r" (v)
+
+#define __msr_s(r, v)						\
+	DEFINE_MSR_S						\
+"	msr_s " __stringify(r) ", %x0\n"			\
+	UNDEFINE_MSR_S : : "rZ" (v)
 
 /*
  * Unlike read_cpuid, calls to read_sysreg are never expected to be
@@ -623,15 +642,15 @@  asm(
  * For registers without architectural names, or simply unsupported by
  * GAS.
  */
-#define read_sysreg_s(r) ({						\
-	u64 __val;							\
-	asm volatile("mrs_s %0, " __stringify(r) : "=r" (__val));	\
-	__val;								\
+#define read_sysreg_s(r) ({					\
+	u64 __val;						\
+	asm volatile(__mrs_s(r, __val));			\
+	__val;							\
 })
 
-#define write_sysreg_s(v, r) do {					\
-	u64 __val = (u64)(v);						\
-	asm volatile("msr_s " __stringify(r) ", %x0" : : "rZ" (__val));	\
+#define write_sysreg_s(v, r) do {				\
+	u64 __val = (u64)(v);					\
+	asm volatile(__msr_s(r, __val));			\
 } while (0)
 
 static inline void config_sctlr_el1(u32 clear, u32 set)