linux-kernel.vger.kernel.org archive mirror
* [PATCH] ARM: use assembly mnemonics for VFP register access
@ 2020-02-21  6:34 Stefan Agner
  2020-02-25 19:10 ` Nick Desaulniers
  2020-02-29 22:58 ` Stefan Agner
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Agner @ 2020-02-21  6:34 UTC (permalink / raw)
  To: linux
  Cc: arnd, manojgupta, jiancai, linux-arm-kernel, linux-kernel,
	clang-built-linux, Stefan Agner

Clang's integrated assembler does not allow the mcr instruction to be
used to access floating-point coprocessor registers:
arch/arm/vfp/vfpmodule.c:342:2: error: invalid operand for instruction
        fmxr(FPEXC, fpexc & ~(FPEXC_EX|FPEXC_DEX|FPEXC_FP2V|FPEXC_VV|FPEXC_TRAP_MASK));
        ^
arch/arm/vfp/vfpinstr.h:79:6: note: expanded from macro 'fmxr'
        asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr   " #_vfp_ ", %0" \
            ^
<inline asm>:1:6: note: instantiated into assembly here
        mcr p10, 7, r0, cr8, cr0, 0 @ fmxr      FPEXC, r0
            ^

The GNU assembler has supported the .fpu directive since at least 2.17
(when its documentation was added). Since Linux requires binutils 2.21
or later, it is safe to use the .fpu directive. Use the .fpu directive
and the VFP mnemonics for register access.

This allows vfpmodule.c to be built with Clang and its integrated
assembler.

Link: https://github.com/ClangBuiltLinux/linux/issues/905
Signed-off-by: Stefan Agner <stefan@agner.ch>
---
 arch/arm/vfp/vfpinstr.h | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
index 38dc154e39ff..799ccf065406 100644
--- a/arch/arm/vfp/vfpinstr.h
+++ b/arch/arm/vfp/vfpinstr.h
@@ -62,21 +62,17 @@
 #define FPSCR_C (1 << 29)
 #define FPSCR_V	(1 << 28)
 
-/*
- * Since we aren't building with -mfpu=vfp, we need to code
- * these instructions using their MRC/MCR equivalents.
- */
-#define vfpreg(_vfp_) #_vfp_
-
 #define fmrx(_vfp_) ({			\
 	u32 __v;			\
-	asm("mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx	%0, " #_vfp_	\
+	asm(".fpu	vfpv2\n"	\
+	    "vmrs	%0, " #_vfp_	\
 	    : "=r" (__v) : : "cc");	\
 	__v;				\
  })
 
 #define fmxr(_vfp_,_var_)		\
-	asm("mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr	" #_vfp_ ", %0"	\
+	asm(".fpu	vfpv2\n"	\
+	    "vmsr	" #_vfp_ ", %0"	\
 	   : : "r" (_var_) : "cc")
 
 u32 vfp_single_cpdo(u32 inst, u32 fpscr);
-- 
2.25.1
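
For readers following the macro change, here is a minimal host-side
sketch (not kernel code) of how the old and new fmxr() bodies build
their assembler template strings. It assumes FPEXC is #defined to cr8,
as the "instantiated into assembly" note above suggests; the
*_TEMPLATE macros and the two helper functions are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumption: FPEXC names coprocessor register cr8, as in the error output. */
#define FPEXC cr8

/* Same helper as the removed line: stringifies its already-expanded argument. */
#define vfpreg(_vfp_) #_vfp_

/* Old form: vfpreg() yields "cr8", while #_vfp_ keeps the spelling "FPEXC". */
#define OLD_FMXR_TEMPLATE(_vfp_) \
	"mcr p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmxr\t" #_vfp_ ", %0"

/* New form: only #_vfp_ is used, so the assembler must know the name "FPEXC". */
#define NEW_FMXR_TEMPLATE(_vfp_) \
	".fpu\tvfpv2\n" \
	"vmsr\t" #_vfp_ ", %0"

/* Hypothetical helpers that expose the generated template strings. */
const char *old_fmxr_template(void) { return OLD_FMXR_TEMPLATE(FPEXC); }
const char *new_fmxr_template(void) { return NEW_FMXR_TEMPLATE(FPEXC); }
```

The point of the sketch is that the new template hands the register
name, not its coprocessor number, to the assembler, so the assembler
has to recognise that name.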



* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-21  6:34 [PATCH] ARM: use assembly mnemonics for VFP register access Stefan Agner
@ 2020-02-25 19:10 ` Nick Desaulniers
  2020-02-25 19:33   ` Ard Biesheuvel
  2020-02-29 22:58 ` Stefan Agner
  1 sibling, 1 reply; 8+ messages in thread
From: Nick Desaulniers @ 2020-02-25 19:10 UTC (permalink / raw)
  To: Stefan Agner
  Cc: Russell King, Arnd Bergmann, Manoj Gupta, Jian Cai, Linux ARM,
	LKML, clang-built-linux

Hi Stefan,
Thanks for the patch.  Reading through:
- FMRX, FMXR, and FMSTAT:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Bcfbdihi.html
- VMRS and VMSR:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfbdihi.html

Should a macro called `fmrx` that had a comment about `fmrx` be using
`vmrs` in place of `fmrx`?

It looks like Clang treats them the same, but GCC keeps them separate:
https://godbolt.org/z/YKmSAs
Ah, this is only when streaming to assembly. Looks like they have the
same encoding, and produce the same disassembly. (Godbolt emits
assembly by default, and has the option to compile, then disassemble).
If I take my case from godbolt above:

➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
➜  /tmp llvm-objdump -dr x.o

x.o: file format elf32-arm-little


Disassembly of section .text:

00000000 bar:
       0: f1 ee 10 0a                  vmrs r0, fpscr
       4: 70 47                        bx lr
       6: 00 bf                        nop

00000008 baz:
       8: f1 ee 10 0a                  vmrs r0, fpscr
       c: 70 47                        bx lr
       e: 00 bf                        nop

So indeed a similar encoding exists for the two different assembler
instructions.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>


-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-25 19:10 ` Nick Desaulniers
@ 2020-02-25 19:33   ` Ard Biesheuvel
  2020-02-25 19:45     ` Robin Murphy
  2020-02-25 20:27     ` Nick Desaulniers
  0 siblings, 2 replies; 8+ messages in thread
From: Ard Biesheuvel @ 2020-02-25 19:33 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Stefan Agner, Arnd Bergmann, LKML, Jian Cai, clang-built-linux,
	Manoj Gupta, Russell King, Linux ARM

On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <ndesaulniers@google.com> wrote:
> Ah, this is only when streaming to assembly. Looks like they have the
> same encoding, and produce the same disassembly. (Godbolt emits
> assembly by default, and has the option to compile, then disassemble).
> If I take my case from godbolt above:
>
> ➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> ➜  /tmp llvm-objdump -dr x.o
>
> x.o: file format elf32-arm-little
>
>
> Disassembly of section .text:
>
> 00000000 bar:
>        0: f1 ee 10 0a                  vmrs r0, fpscr
>        4: 70 47                        bx lr
>        6: 00 bf                        nop
>
> 00000008 baz:
>        8: f1 ee 10 0a                  vmrs r0, fpscr
>        c: 70 47                        bx lr
>        e: 00 bf                        nop
>
> So indeed a similar encoding exists for the two different assembler
> instructions.

Does that hold for ARM (A32) instructions as well?


* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-25 19:33   ` Ard Biesheuvel
@ 2020-02-25 19:45     ` Robin Murphy
  2020-02-25 20:00       ` Stefan Agner
  2020-02-25 20:27     ` Nick Desaulniers
  1 sibling, 1 reply; 8+ messages in thread
From: Robin Murphy @ 2020-02-25 19:45 UTC (permalink / raw)
  To: Ard Biesheuvel, Nick Desaulniers
  Cc: Arnd Bergmann, LKML, Stefan Agner, Jian Cai, clang-built-linux,
	Manoj Gupta, Russell King, Linux ARM

On 2020-02-25 7:33 pm, Ard Biesheuvel wrote:
> On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <ndesaulniers@google.com> wrote:
>> Ah, this is only when streaming to assembly. Looks like they have the
>> same encoding, and produce the same disassembly. (Godbolt emits
>> assembly by default, and has the option to compile, then disassemble).
>> If I take my case from godbolt above:
>>
>> ➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
>> ➜  /tmp llvm-objdump -dr x.o
>>
>> x.o: file format elf32-arm-little
>>
>>
>> Disassembly of section .text:
>>
>> 00000000 bar:
>>         0: f1 ee 10 0a                  vmrs r0, fpscr
>>         4: 70 47                        bx lr
>>         6: 00 bf                        nop
>>
>> 00000008 baz:
>>         8: f1 ee 10 0a                  vmrs r0, fpscr
>>         c: 70 47                        bx lr
>>         e: 00 bf                        nop
>>
>> So indeed a similar encoding exists for the two different assembler
>> instructions.
> 
> Does that hold for ARM (A32) instructions as well?

It should do - they're all the same thing underneath. The UAL syntax 
just renamed all the legacy VFP mnemonics from Fxxx to Vxxx form, apart 
from a couple of things that were already deprecated. GAS still accepts 
both regardless of ".syntax unified", and as a result GCC never saw a 
reason to stop emitting the old mnemonics.

Robin.


* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-25 19:45     ` Robin Murphy
@ 2020-02-25 20:00       ` Stefan Agner
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Agner @ 2020-02-25 20:00 UTC (permalink / raw)
  To: Robin Murphy, Nick Desaulniers, Ard Biesheuvel
  Cc: Arnd Bergmann, LKML, Jian Cai, clang-built-linux, Manoj Gupta,
	Russell King, Linux ARM

On 2020-02-25 20:45, Robin Murphy wrote:
> On 2020-02-25 7:33 pm, Ard Biesheuvel wrote:
>> On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <ndesaulniers@google.com> wrote:
>>> Ah, this is only when streaming to assembly. Looks like they have the
>>> same encoding, and produce the same disassembly. (Godbolt emits
>>> assembly by default, and has the option to compile, then disassemble).
>>> If I take my case from godbolt above:
>>>
>>> ➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
>>> ➜  /tmp llvm-objdump -dr x.o
>>>
>>> x.o: file format elf32-arm-little
>>>
>>>
>>> Disassembly of section .text:
>>>
>>> 00000000 bar:
>>>         0: f1 ee 10 0a                  vmrs r0, fpscr
>>>         4: 70 47                        bx lr
>>>         6: 00 bf                        nop
>>>
>>> 00000008 baz:
>>>         8: f1 ee 10 0a                  vmrs r0, fpscr
>>>         c: 70 47                        bx lr
>>>         e: 00 bf                        nop
>>>
>>> So indeed a similar encoding exists for the two different assembler
>>> instructions.
>>
>> Does that hold for ARM (A32) instructions as well?
> 
> It should do - they're all the same thing underneath. The UAL syntax
> just renamed all the legacy VFP mnemonics from Fxxx to Vxxx form,
> apart from a couple of things that were already deprecated. GAS still
> accepts both regardless of ".syntax unified", and as a result GCC
> never saw a reason to stop emitting the old mnemonics.
> 

Yes, this is really only a mnemonic change made when the Unified
Assembler Language (UAL) was introduced; the ARM ARM has a list of
mnemonic changes in the appendix.

Just to make sure, I also compared the disassembled object file of
vfpmodule.c before and after this change.

I guess we could (should?) also change the macro name, but I guess that
should be a separate commit anyway.
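
To illustrate why the generated object code does not change, here is a
rough host-side sketch that packs the fields of an A32 "mrc"
instruction (condition AL) and checks the result against the encoding
of "vmrs r0, fpscr". It assumes FPSCR is coprocessor register cr1, and
reads the Thumb bytes "f1 ee 10 0a" quoted above as the two
little-endian halfwords 0xeef1 and 0x0a10. encode_mrc() is a
hypothetical helper, not an existing kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode "mrc <coproc>, <opc1>, <Rt>, <CRn>, <CRm>, <opc2>" with cond = AL.
 * Field layout: cond | 1110 | opc1 | 1 | CRn | Rt | coproc | opc2 | 1 | CRm
 */
uint32_t encode_mrc(unsigned coproc, unsigned opc1, unsigned rt,
		    unsigned crn, unsigned crm, unsigned opc2)
{
	return 0xEE100010u               /* cond=AL, 1110, L=1, bit 4 set */
	     | ((opc1   & 0x7u) << 21)
	     | ((crn    & 0xFu) << 16)
	     | ((rt     & 0xFu) << 12)
	     | ((coproc & 0xFu) << 8)
	     | ((opc2   & 0x7u) << 5)
	     |  (crm    & 0xFu);
}
```

With these assumptions, encode_mrc(10, 7, 0, 1, 0, 0), i.e.
"mrc p10, 7, r0, cr1, cr0, 0", yields 0xeef10a10, the same word as the
vmrs form.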

--
Stefan


* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-25 19:33   ` Ard Biesheuvel
  2020-02-25 19:45     ` Robin Murphy
@ 2020-02-25 20:27     ` Nick Desaulniers
  2020-02-25 22:46       ` Nick Desaulniers
  1 sibling, 1 reply; 8+ messages in thread
From: Nick Desaulniers @ 2020-02-25 20:27 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Stefan Agner, Arnd Bergmann, LKML, Jian Cai, clang-built-linux,
	Manoj Gupta, Russell King, Linux ARM

On Tue, Feb 25, 2020 at 11:33 AM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
>
> On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > Ah, this is only when streaming to assembly. Looks like they have the
> > same encoding, and produce the same disassembly. (Godbolt emits
> > assembly by default, and has the option to compile, then disassemble).
> > If I take my case from godbolt above:
> >
> > ➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> > ➜  /tmp llvm-objdump -dr x.o
> >
> > x.o: file format elf32-arm-little
> >
> >
> > Disassembly of section .text:
> >
> > 00000000 bar:
> >        0: f1 ee 10 0a                  vmrs r0, fpscr
> >        4: 70 47                        bx lr
> >        6: 00 bf                        nop
> >
> > 00000008 baz:
> >        8: f1 ee 10 0a                  vmrs r0, fpscr
> >        c: 70 47                        bx lr
> >        e: 00 bf                        nop
> >
> > So indeed a similar encoding exists for the two different assembler
> > instructions.
>
> Does that hold for ARM (A32) instructions as well?

TIL -mthumb is the default for arm-linux-gnueabihf-gcc -O2.

➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c -marm
➜  /tmp llvm-objdump -dr x.o

x.o: file format elf32-arm-little


Disassembly of section .text:

00000000 bar:
       0: 10 0a f1 ee                  vmrs r0, fpscr
       4: 1e ff 2f e1                  bx lr

00000008 baz:
       8: 10 0a f1 ee                  vmrs r0, fpscr
       c: 1e ff 2f e1                  bx lr

^ Just to show the matching encoding.
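
The byte order differs between the two dumps only because of how each
instruction set is stored in memory. A small host-side sketch of that,
using hypothetical helper names: A32 stores one little-endian 32-bit
word, while a 32-bit Thumb-2 instruction is stored as two
little-endian halfwords with the high halfword first.

```c
#include <assert.h>
#include <stdint.h>

/* A32: one 32-bit little-endian word. */
uint32_t a32_word(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Thumb-2 (32-bit form): two little-endian halfwords, high halfword first. */
uint32_t thumb2_word(const uint8_t b[4])
{
	uint32_t hw1 = (uint32_t)b[0] | ((uint32_t)b[1] << 8);
	uint32_t hw2 = (uint32_t)b[2] | ((uint32_t)b[3] << 8);

	return (hw1 << 16) | hw2;
}
```

Feeding the Thumb bytes f1 ee 10 0a to thumb2_word() and the A32 bytes
10 0a f1 ee to a32_word() gives the same instruction word in both
cases.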
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-25 20:27     ` Nick Desaulniers
@ 2020-02-25 22:46       ` Nick Desaulniers
  0 siblings, 0 replies; 8+ messages in thread
From: Nick Desaulniers @ 2020-02-25 22:46 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Stefan Agner, Arnd Bergmann, LKML, Jian Cai, clang-built-linux,
	Manoj Gupta, Russell King, Linux ARM, Peter Smith

On Tue, Feb 25, 2020 at 12:27 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Feb 25, 2020 at 11:33 AM Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
> >
> > On Tue, 25 Feb 2020 at 20:10, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > Ah, this is only when streaming to assembly. Looks like they have the
> > > same encoding, and produce the same disassembly. (Godbolt emits
> > > assembly by default, and has the option to compile, then disassemble).
> > > If I take my case from godbolt above:
> > >
> > > ➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c
> > > ➜  /tmp llvm-objdump -dr x.o
> > >
> > > x.o: file format elf32-arm-little
> > >
> > >
> > > Disassembly of section .text:
> > >
> > > 00000000 bar:
> > >        0: f1 ee 10 0a                  vmrs r0, fpscr
> > >        4: 70 47                        bx lr
> > >        6: 00 bf                        nop
> > >
> > > 00000008 baz:
> > >        8: f1 ee 10 0a                  vmrs r0, fpscr
> > >        c: 70 47                        bx lr
> > >        e: 00 bf                        nop
> > >
> > > So indeed a similar encoding exists for the two different assembler
> > > instructions.
> >
> > Does that hold for ARM (A32) instructions as well?
>
> TIL -mthumb is the default for arm-linux-gnueabihf-gcc -O2.
>
> ➜  /tmp arm-linux-gnueabihf-gcc -O2 -c x.c -marm
> ➜  /tmp llvm-objdump -dr x.o
>
> x.o: file format elf32-arm-little
>
>
> Disassembly of section .text:
>
> 00000000 bar:
>        0: 10 0a f1 ee                  vmrs r0, fpscr
>        4: 1e ff 2f e1                  bx lr
>
> 00000008 baz:
>        8: 10 0a f1 ee                  vmrs r0, fpscr
>        c: 1e ff 2f e1                  bx lr
>
> ^ Just to show the matching encoding.

Further, Peter just sent me this response off thread, which I thought
I'd share. Thanks Peter.  Bookmarked.
```
FWIW the Arm ARM reference manual
https://static.docs.arm.com/ddi0487/ea/DDI0487E_a_armv8_arm.pdf has a
table that maps the pre-UAL syntax to the UAL syntax.

K6.1.2 Pre-UAL instruction syntax for the A32 floating-point instructions
This has an entry mapping pre-UAL (FMRX) to UAL (VMRS)

So they are the same instruction with the modern name being VMRS. If
it is possible to use the new name it will probably confuse fewer
people, but other than that it won't do any harm.
```
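
As a small illustration of such a mapping, here is a host-side lookup
sketch covering only the system-register moves relevant to this patch
(fmrx becomes vmrs and fmxr becomes vmsr, as in the diff, plus fmstat).
The struct, table and function names are hypothetical, and the table
is nowhere near the full list in the manual.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mnemonic_map {
	const char *pre_ual;	/* legacy VFP mnemonic */
	const char *ual;	/* UAL spelling */
};

/* Only the VFP system-register moves discussed in this thread. */
static const struct mnemonic_map vfp_sysreg_map[] = {
	{ "fmrx",   "vmrs" },
	{ "fmxr",   "vmsr" },
	{ "fmstat", "vmrs APSR_nzcv, FPSCR" },
};

/* Return the UAL spelling for a pre-UAL mnemonic, or NULL if not listed. */
const char *ual_name(const char *pre_ual)
{
	size_t n = sizeof(vfp_sysreg_map) / sizeof(vfp_sysreg_map[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(vfp_sysreg_map[i].pre_ual, pre_ual) == 0)
			return vfp_sysreg_map[i].ual;
	return NULL;
}
```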
-- 
Thanks,
~Nick Desaulniers


* Re: [PATCH] ARM: use assembly mnemonics for VFP register access
  2020-02-21  6:34 [PATCH] ARM: use assembly mnemonics for VFP register access Stefan Agner
  2020-02-25 19:10 ` Nick Desaulniers
@ 2020-02-29 22:58 ` Stefan Agner
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Agner @ 2020-02-29 22:58 UTC (permalink / raw)
  To: linux
  Cc: arnd, manojgupta, jiancai, linux-arm-kernel, linux-kernel,
	clang-built-linux

I just found out that this patch fails to build with binutils 2.23.1.
Since we support binutils back to 2.21, I guess that is not OK..?

  CC      arch/arm/vfp/vfpmodule.o
/tmp/cc2Vcw98.s: Assembler messages:
/tmp/cc2Vcw98.s:920: Error: operand 1 must be a VFP extension System
Register -- `vmrs r6,FPINST'
/tmp/cc2Vcw98.s:948: Error: operand 1 must be a VFP extension System
Register -- `vmrs r6,FPINST2'

The binutils history shows that FPINST/FPINST2 have been accepted since
commit 16d02dc907c5717b5f47076bb90ae3795e73b59f
("gas/config/tc-arm.c (do_vmrs): Accept all control registers"), which
made it into binutils 2.24...

I don't have a particularly good idea how to make this work for both
Clang and GCC other than some ifdefs...
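
A rough host-side sketch of what such an ifdef could look like, shown
on the template strings only. AS_ACCEPTS_VMRS_FPINST is a hypothetical
switch, not an existing kernel symbol; it would have to be set by
whatever build-time check decides that the assembler accepts
"vmrs rX, FPINST". FPINST being cr9 is also an assumption here.

```c
#include <assert.h>
#include <string.h>

/* Assumption: FPINST names coprocessor register cr9. */
#define FPINST cr9

#define vfpreg(_vfp_) #_vfp_

/*
 * Hypothetical switch: defined only when the assembler is known to accept
 * FPINST/FPINST2 as vmrs/vmsr operands. Left undefined here, so the
 * MRC fallback is selected.
 */
#ifdef AS_ACCEPTS_VMRS_FPINST
#define FMRX_TEMPLATE(_vfp_) \
	".fpu\tvfpv2\n" \
	"vmrs\t%0, " #_vfp_
#else
#define FMRX_TEMPLATE(_vfp_) \
	"mrc p10, 7, %0, " vfpreg(_vfp_) ", cr0, 0 @ fmrx\t%0, " #_vfp_
#endif

/* Hypothetical helper that exposes the selected template string. */
const char *fmrx_fpinst_template(void) { return FMRX_TEMPLATE(FPINST); }
```

The downside is the one already implied above: both forms have to be
kept in the header for as long as the older assemblers are supported.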

--
Stefan
