* [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
@ 2015-09-03 9:47 Yousong Zhou
2015-09-04 18:52 ` Maciej W. Rozycki
0 siblings, 1 reply; 7+ messages in thread
From: Yousong Zhou @ 2015-09-03 9:47 UTC (permalink / raw)
To: Ralf Baechle; +Cc: Chen Jie, Yousong Zhou, linux-mips, linux-kernel
The nomips16 has to be added both as function attribute and assembler
directive.
When only function attribute is specified, the compiler will inline the
function with -Os optimization. The generated assembly code cannot be
correctly assembled because ISA mode switch has to be done through jump
instruction.
When only ".set nomips16" directive is used, the generated assembly code
will use MIPS32 code for the inline assembly template and MIPS16 for the
function return. The compiled binary is invalid:
00403100 <__arch_swab16>:
403100: 7c0410a0 wsbh v0,a0
403104: e820ea31 swc2 $0,-5583(at)
while correct code should be:
00402650 <__arch_swab16>:
402650: 7c0410a0 wsbh v0,a0
402654: 03e00008 jr ra
402658: 3042ffff andi v0,v0,0xffff
Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
---
arch/mips/include/uapi/asm/swab.h | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h
index 8f2d184..c4ddc4f 100644
--- a/arch/mips/include/uapi/asm/swab.h
+++ b/arch/mips/include/uapi/asm/swab.h
@@ -16,11 +16,13 @@
#if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \
defined(_MIPS_ARCH_LOONGSON3A)
-static inline __attribute_const__ __u16 __arch_swab16(__u16 x)
+static inline __attribute__((nomips16)) __attribute_const__
+ __u16 __arch_swab16(__u16 x)
{
__asm__(
" .set push \n"
" .set arch=mips32r2 \n"
+ " .set nomips16 \n"
" wsbh %0, %1 \n"
" .set pop \n"
: "=r" (x)
@@ -30,11 +32,13 @@ static inline __attribute_const__ __u16 __arch_swab16(__u16 x)
}
#define __arch_swab16 __arch_swab16
-static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
+static inline __attribute__((nomips16)) __attribute_const__
+ __u32 __arch_swab32(__u32 x)
{
__asm__(
" .set push \n"
" .set arch=mips32r2 \n"
+ " .set nomips16 \n"
" wsbh %0, %1 \n"
" rotr %0, %0, 16 \n"
" .set pop \n"
@@ -50,11 +54,13 @@ static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
* 64-bit kernel on r2 CPUs.
*/
#ifdef __mips64
-static inline __attribute_const__ __u64 __arch_swab64(__u64 x)
+static inline __attribute__((nomips16)) __attribute_const__
+ __u64 __arch_swab64(__u64 x)
{
__asm__(
" .set push \n"
" .set arch=mips64r2 \n"
+ " .set nomips16 \n"
" dsbh %0, %1 \n"
" dshd %0, %0 \n"
" .set pop \n"
--
1.7.10.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
2015-09-03 9:47 [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16 Yousong Zhou
@ 2015-09-04 18:52 ` Maciej W. Rozycki
2015-09-05 3:29 ` Yousong Zhou
0 siblings, 1 reply; 7+ messages in thread
From: Maciej W. Rozycki @ 2015-09-04 18:52 UTC (permalink / raw)
To: Yousong Zhou; +Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel
On Thu, 3 Sep 2015, Yousong Zhou wrote:
> The nomips16 has to be added both as function attribute and assembler
> directive.
>
> When only function attribute is specified, the compiler will inline the
> function with -Os optimization. The generated assembly code cannot be
> correctly assembled because ISA mode switch has to be done through jump
> instruction.
This can't be true. The compiler does not intepret the contents of an
inline asm and therefore cannot decide whether to inline a function
containing one or not based on the lone presence or the absence of an
assembly directive within.
> When only ".set nomips16" directive is used, the generated assembly code
> will use MIPS32 code for the inline assembly template and MIPS16 for the
> function return. The compiled binary is invalid:
>
> 00403100 <__arch_swab16>:
> 403100: 7c0410a0 wsbh v0,a0
> 403104: e820ea31 swc2 $0,-5583(at)
>
> while correct code should be:
>
> 00402650 <__arch_swab16>:
> 402650: 7c0410a0 wsbh v0,a0
> 402654: 03e00008 jr ra
> 402658: 3042ffff andi v0,v0,0xffff
It looks to me you're papering something over here -- when you use a
`.set nomips16' directive the assembler will happily switch your
instruction set in the middle of an instruction stream. Consequently if
this function is (for whatever reason; it should not really) inlined in
MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
will obviously be interpreted differently in the MIPS16 execution mode and
is therefore surely a recipe for disaster.
How did you test your change and made the conclusion quoted with your
patch description?
> Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
> ---
> arch/mips/include/uapi/asm/swab.h | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h
> index 8f2d184..c4ddc4f 100644
> --- a/arch/mips/include/uapi/asm/swab.h
> +++ b/arch/mips/include/uapi/asm/swab.h
> @@ -16,11 +16,13 @@
> #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \
> defined(_MIPS_ARCH_LOONGSON3A)
>
> -static inline __attribute_const__ __u16 __arch_swab16(__u16 x)
> +static inline __attribute__((nomips16)) __attribute_const__
> + __u16 __arch_swab16(__u16 x)
> {
> __asm__(
> " .set push \n"
> " .set arch=mips32r2 \n"
> + " .set nomips16 \n"
> " wsbh %0, %1 \n"
> " .set pop \n"
> : "=r" (x)
So setting aside the correctness issues discussed above, for MIPS16 code
this has to be put out of line by the compiler, with all the usual
overhead of a function call, causing a performance loss rather than a gain
expected here. Especially as switching the ISA mode requires draining the
whole pipeline so that subsequent instructions are interpreted in the new
execution mode. This is expensive in performance terms.
I'm fairly sure simply disabling these asms (#ifndef __mips16) and
letting the compiler generate the right mask and shift operations from
plain C code will be cheaper in performance terms and possibly cheaper in
code size as well, especially in otherwise leaf functions where an extra
function call will according to the ABI clobber temporaries. So I suggest
going in that direction instead.
So this is a NAK really.
Maciej
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
2015-09-04 18:52 ` Maciej W. Rozycki
@ 2015-09-05 3:29 ` Yousong Zhou
2015-09-05 3:34 ` Yousong Zhou
2015-09-05 14:25 ` Maciej W. Rozycki
0 siblings, 2 replies; 7+ messages in thread
From: Yousong Zhou @ 2015-09-05 3:29 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli
On 5 September 2015 at 02:52, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> On Thu, 3 Sep 2015, Yousong Zhou wrote:
>
>> The nomips16 has to be added both as function attribute and assembler
>> directive.
>>
>> When only function attribute is specified, the compiler will inline the
>> function with -Os optimization. The generated assembly code cannot be
>> correctly assembled because ISA mode switch has to be done through jump
>> instruction.
>
> This can't be true. The compiler does not intepret the contents of an
> inline asm and therefore cannot decide whether to inline a function
> containing one or not based on the lone presence or the absence of an
> assembly directive within.
>
Most of the time I trust my compiler and never meddle with the
toolchain. Anyway I made a patch because it really did not work for
me. No big deal. It's not the end of world. It started with a
comment from OpenWrt packages feeds [1]. Actually this "unrecognized
opcode" problem have occurred within OpenWrt quite a few times before.
[1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931
>> When only ".set nomips16" directive is used, the generated assembly code
>> will use MIPS32 code for the inline assembly template and MIPS16 for the
>> function return. The compiled binary is invalid:
>>
>> 00403100 <__arch_swab16>:
>> 403100: 7c0410a0 wsbh v0,a0
>> 403104: e820ea31 swc2 $0,-5583(at)
>>
>> while correct code should be:
>>
>> 00402650 <__arch_swab16>:
>> 402650: 7c0410a0 wsbh v0,a0
>> 402654: 03e00008 jr ra
>> 402658: 3042ffff andi v0,v0,0xffff
>
> It looks to me you're papering something over here -- when you use a
> `.set nomips16' directive the assembler will happily switch your
> instruction set in the middle of an instruction stream. Consequently if
> this function is (for whatever reason; it should not really) inlined in
> MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
> will obviously be interpreted differently in the MIPS16 execution mode and
> is therefore surely a recipe for disaster.
If by "papering" you mean "made up", then whatever. Yeah, it's
disaster, an "invalid instruction" abort.
>
> How did you test your change and made the conclusion quoted with your
> patch description?
>
Compile the following program with a MIPS 24kc big endian variant compiler with
flag "-mips32r2 -mips16 -Os".
#include <stdio.h>
#include <stdint.h>
uint16_t __attribute__((noinline)) f(uint16_t v)
{
v = __cpu_to_le16(v);
return v;
}
int main()
{
printf("%x\n", f(0xbeef));
return 0;
}
When only ".set nomips16" was specified in __arch_swab16(), this was output
from objdump.
242 004003e0 <f>:
243 4003e0: 7c0410a0 wsbh v0,a0
244 4003e4: e820ea31 swc2 $0,-5583(at)
245 4003e8: 65006500 0x65006500
246 4003ec: 65006500 0x65006500
__arch_swab16() was indeed inlined. That swc2 instruction can be got from
assembler with the following code (it's from the "-S" result of GCC).
.set mips16
.set noreorder
.set nomacro
j $31
zeh $2
When only nomips16 function attribute was specified, this time GCC failed with
unrecognized opcode
/tmp/ccaGCouL.s: Assembler messages:
/tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4'
The generated assembly was something in the following form. Looks like the
assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2"
.set mips16
.ent f
.type f, @function
f
...
.set push
.set arch=mips32r2
wsbh $2,$4
.pop
j $31
zeh $2
.end f
...
The patch was run tested on QEMU Malta and an router with Atheros
AR9331 SoC. I didn't test __arch_swab64() though. I have done many
other trial-and-error tests while preparing this patch. It was a mess
when I was sure I should expect some sensible behaviour from the
compiler while it actually just did not behave that way.
>> Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
>> ---
>> arch/mips/include/uapi/asm/swab.h | 12 +++++++++---
>> 1 file changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h
>> index 8f2d184..c4ddc4f 100644
>> --- a/arch/mips/include/uapi/asm/swab.h
>> +++ b/arch/mips/include/uapi/asm/swab.h
>> @@ -16,11 +16,13 @@
>> #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \
>> defined(_MIPS_ARCH_LOONGSON3A)
>>
>> -static inline __attribute_const__ __u16 __arch_swab16(__u16 x)
>> +static inline __attribute__((nomips16)) __attribute_const__
>> + __u16 __arch_swab16(__u16 x)
>> {
>> __asm__(
>> " .set push \n"
>> " .set arch=mips32r2 \n"
>> + " .set nomips16 \n"
>> " wsbh %0, %1 \n"
>> " .set pop \n"
>> : "=r" (x)
>
> So setting aside the correctness issues discussed above, for MIPS16 code
> this has to be put out of line by the compiler, with all the usual
> overhead of a function call, causing a performance loss rather than a gain
> expected here. Especially as switching the ISA mode requires draining the
> whole pipeline so that subsequent instructions are interpreted in the new
> execution mode. This is expensive in performance terms.
>
> I'm fairly sure simply disabling these asms (#ifndef __mips16) and
> letting the compiler generate the right mask and shift operations from
> plain C code will be cheaper in performance terms and possibly cheaper in
> code size as well, especially in otherwise leaf functions where an extra
> function call will according to the ABI clobber temporaries. So I suggest
> going in that direction instead.
I agree. Then you will provide the fix right? I am just curious
where that __mips16 should be placed or is it from compiler and
assembler?
>
> So this is a NAK really.
okay.
Cheers
yousong
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
2015-09-05 3:29 ` Yousong Zhou
@ 2015-09-05 3:34 ` Yousong Zhou
2015-09-05 14:25 ` Maciej W. Rozycki
1 sibling, 0 replies; 7+ messages in thread
From: Yousong Zhou @ 2015-09-05 3:34 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli
On 5 September 2015 at 11:29, Yousong Zhou <yszhou4tech@gmail.com> wrote:
> #include <stdio.h>
> #include <stdint.h>
>
#include <asm/byteorder.h> was missed out at the top.
yousong
> uint16_t __attribute__((noinline)) f(uint16_t v)
> {
> v = __cpu_to_le16(v);
> return v;
> }
>
> int main()
> {
> printf("%x\n", f(0xbeef));
>
> return 0;
> }
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
2015-09-05 3:29 ` Yousong Zhou
2015-09-05 3:34 ` Yousong Zhou
@ 2015-09-05 14:25 ` Maciej W. Rozycki
2015-09-06 2:42 ` Yousong Zhou
1 sibling, 1 reply; 7+ messages in thread
From: Maciej W. Rozycki @ 2015-09-05 14:25 UTC (permalink / raw)
To: Yousong Zhou
Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli
On Sat, 5 Sep 2015, Yousong Zhou wrote:
> > This can't be true. The compiler does not intepret the contents of an
> > inline asm and therefore cannot decide whether to inline a function
> > containing one or not based on the lone presence or the absence of an
> > assembly directive within.
> >
>
> Most of the time I trust my compiler and never meddle with the
> toolchain. Anyway I made a patch because it really did not work for
> me. No big deal. It's not the end of world. It started with a
> comment from OpenWrt packages feeds [1]. Actually this "unrecognized
> opcode" problem have occurred within OpenWrt quite a few times before.
>
> [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931
The bug certainly was there, it's just your analysis and consequently the
fix that are wrong in the general case for some reason, maybe a buggy
compiler.
> > It looks to me you're papering something over here -- when you use a
> > `.set nomips16' directive the assembler will happily switch your
> > instruction set in the middle of an instruction stream. Consequently if
> > this function is (for whatever reason; it should not really) inlined in
> > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
> > will obviously be interpreted differently in the MIPS16 execution mode and
> > is therefore surely a recipe for disaster.
>
> If by "papering" you mean "made up", then whatever. Yeah, it's
> disaster, an "invalid instruction" abort.
By "papering over" I mean forcing source code to compile successfully at
the risk of producing incorrect binary code.
> > How did you test your change and made the conclusion quoted with your
> > patch description?
> >
>
> Compile the following program with a MIPS 24kc big endian variant compiler with
> flag "-mips32r2 -mips16 -Os".
>
> #include <stdio.h>
> #include <stdint.h>
>
> uint16_t __attribute__((noinline)) f(uint16_t v)
> {
> v = __cpu_to_le16(v);
> return v;
> }
>
> int main()
> {
> printf("%x\n", f(0xbeef));
>
> return 0;
> }
>
> When only ".set nomips16" was specified in __arch_swab16(), this was output
> from objdump.
>
> 242 004003e0 <f>:
> 243 4003e0: 7c0410a0 wsbh v0,a0
> 244 4003e4: e820ea31 swc2 $0,-5583(at)
> 245 4003e8: 65006500 0x65006500
> 246 4003ec: 65006500 0x65006500
Quite obviously.
For the record: the first instruction has been assembled in the regular
MIPS mode and that propagated to symbol annotation. Consequently `f' is
seen by `objdump' as a regular MIPS function and disassembles it all as
such. You can put a global label immediately after the WSBH instruction
in your source code to have the rest of the function disassembled
correctly (of course this won't make this code work at the run time).
> __arch_swab16() was indeed inlined. That swc2 instruction can be got from
> assembler with the following code (it's from the "-S" result of GCC).
>
> .set mips16
> .set noreorder
> .set nomacro
> j $31
> zeh $2
>
> When only nomips16 function attribute was specified, this time GCC failed with
> unrecognized opcode
>
> /tmp/ccaGCouL.s: Assembler messages:
> /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4'
>
> The generated assembly was something in the following form. Looks like the
> assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2"
There's no switch to regular MIPS mode implied with `.set arch=mip32r2',
the directive merely switches the ISA level, affecting both the regular
MIPS and the MIPS16 mode (the MIPS32r2 ISA level adds extra MIPS16
instructions too, e.g. ZEH is a new addition).
> .set mips16
>
> .ent f
> .type f, @function
> f
> ...
> .set push
> .set arch=mips32r2
> wsbh $2,$4
> .pop
> j $31
> zeh $2
> .end f
> ...
That's exactly the papered-over buggy code scenario I've been referring
to above. This is clearly MIPS16 code: ZEH is a MIPS16 instruction only,
there's no such regular MIPS counterpart. And it also obviously fails to
assemble because on the contrary there's no MIPS16 WSBH instruction.
Now if you stick `.set nomips16' just above WSBH, then this code will
happily assemble, because this single instruction only (`.set pop' reverts
any previous `.set' directives; I'm assuming you wrote above by hand and
`.pop' is a typo) will assemble in the regular MIPS instruction mode. But
if this code is ever reached, then the processor will still execute the
machine code produced by the assembler from the WSBH instruction in the
MIPS16 mode.
For example the encoding of:
wsbh $2, $4
is (as you've shown in a dump above) 0x7c0410a0, which in the MIPS16 mode
yields (in the big-endian mode):
00000000 <f>:
0: 7c04 sd s0,32(a0)
2: 10a0 b 144 <f+0x144>
-- so this won't do what you might expect, you'll get a Reserved
Instruction exception on the SD instruction, which is not supported in the
32-bit mode, and consequently SIGILL.
> The patch was run tested on QEMU Malta and an router with Atheros
> AR9331 SoC. I didn't test __arch_swab64() though. I have done many
> other trial-and-error tests while preparing this patch. It was a mess
> when I was sure I should expect some sensible behaviour from the
> compiler while it actually just did not behave that way.
I've compiled your example provided and as I stated in the original mail
`__arch_swab16' is always produced as a separate function, whether `.set
nomips16' is present in the inline assembly placed there or not. This is
with (unreleased) GCC 6.0.0.
However if you happen to have a buggy compiler that fails to emit
`__arch_swab16' as a separate function despite the `nomips16' attribute,
then it's better if the resulting generated assembly code fails to
assemble rather than if it goes astray at the run time.
> > So setting aside the correctness issues discussed above, for MIPS16 code
> > this has to be put out of line by the compiler, with all the usual
> > overhead of a function call, causing a performance loss rather than a gain
> > expected here. Especially as switching the ISA mode requires draining the
> > whole pipeline so that subsequent instructions are interpreted in the new
> > execution mode. This is expensive in performance terms.
> >
> > I'm fairly sure simply disabling these asms (#ifndef __mips16) and
> > letting the compiler generate the right mask and shift operations from
> > plain C code will be cheaper in performance terms and possibly cheaper in
> > code size as well, especially in otherwise leaf functions where an extra
> > function call will according to the ABI clobber temporaries. So I suggest
> > going in that direction instead.
>
> I agree. Then you will provide the fix right? I am just curious
> where that __mips16 should be placed or is it from compiler and
> assembler?
No, it's your bug after all. I think the last paragraph I wrote quoted
above combined with the source code in question make it clear what to do.
I have also checked what the difference in generated MIPS16 code is
between a call to `__arch_swab16':
0000000c <f>:
c: 64c4 save 32,ra
e: 1800 0000 jal 0 <__arch_swab16>
e: R_MIPS16_26 __arch_swab16
12: ec31 zeh a0
14: 6444 restore 32,ra
16: e8a0 jrc ra
and equivalent generic code:
0000000c <f>:
c: ec31 zeh a0
e: 3280 sll v0,a0,8
10: 3482 srl a0,8
12: ea8d or v0,a0
14: e820 jr ra
16: ea31 zeh v0
so the win is I think clear.
Finally the MIPS64 `__arch_swab64' case does not matter as we have no
MIPS16 support for 64-bit code in Linux, the toolchain will simply refuse
to build it (only bare metal is supported).
Maciej
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
2015-09-05 14:25 ` Maciej W. Rozycki
@ 2015-09-06 2:42 ` Yousong Zhou
2015-09-06 23:23 ` Maciej W. Rozycki
0 siblings, 1 reply; 7+ messages in thread
From: Yousong Zhou @ 2015-09-06 2:42 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli
Hi, Maciej, first of all, thank you for your time on this,
appreciate it.
Comments inline
On 5 September 2015 at 22:25, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> On Sat, 5 Sep 2015, Yousong Zhou wrote:
>
>> > This can't be true. The compiler does not intepret the contents of an
>> > inline asm and therefore cannot decide whether to inline a function
>> > containing one or not based on the lone presence or the absence of an
>> > assembly directive within.
>> >
>>
>> Most of the time I trust my compiler and never meddle with the
>> toolchain. Anyway I made a patch because it really did not work for
>> me. No big deal. It's not the end of world. It started with a
>> comment from OpenWrt packages feeds [1]. Actually this "unrecognized
>> opcode" problem have occurred within OpenWrt quite a few times before.
>>
>> [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931
>
> The bug certainly was there, it's just your analysis and consequently the
> fix that are wrong in the general case for some reason, maybe a buggy
> compiler.
>
This is the compiler "--version",
mips-openwrt-linux-gcc (OpenWrt/Linaro GCC 4.8-2014.04 r46763) 4.8.3
>> > It looks to me you're papering something over here -- when you use a
>> > `.set nomips16' directive the assembler will happily switch your
>> > instruction set in the middle of an instruction stream. Consequently if
>> > this function is (for whatever reason; it should not really) inlined in
>> > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which
>> > will obviously be interpreted differently in the MIPS16 execution mode and
>> > is therefore surely a recipe for disaster.
>>
>> If by "papering" you mean "made up", then whatever. Yeah, it's
>> disaster, an "invalid instruction" abort.
>
> By "papering over" I mean forcing source code to compile successfully at
> the risk of producing incorrect binary code.
Learned a new phrase/idiom :)
>
>> > How did you test your change and made the conclusion quoted with your
>> > patch description?
>> >
>>
>> Compile the following program with a MIPS 24kc big endian variant compiler with
>> flag "-mips32r2 -mips16 -Os".
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> uint16_t __attribute__((noinline)) f(uint16_t v)
>> {
>> v = __cpu_to_le16(v);
>> return v;
>> }
>>
>> int main()
>> {
>> printf("%x\n", f(0xbeef));
>>
>> return 0;
>> }
>>
>> When only ".set nomips16" was specified in __arch_swab16(), this was output
>> from objdump.
>>
>> 242 004003e0 <f>:
>> 243 4003e0: 7c0410a0 wsbh v0,a0
>> 244 4003e4: e820ea31 swc2 $0,-5583(at)
>> 245 4003e8: 65006500 0x65006500
>> 246 4003ec: 65006500 0x65006500
>
> Quite obviously.
>
> For the record: the first instruction has been assembled in the regular
> MIPS mode and that propagated to symbol annotation. Consequently `f' is
> seen by `objdump' as a regular MIPS function and disassembles it all as
> such. You can put a global label immediately after the WSBH instruction
> in your source code to have the rest of the function disassembled
> correctly (of course this won't make this code work at the run time).
Thanks for the trick. Objdump really disassembled it
correctly.
>
>> __arch_swab16() was indeed inlined. That swc2 instruction can be got from
>> assembler with the following code (it's from the "-S" result of GCC).
>>
>> .set mips16
>> .set noreorder
>> .set nomacro
>> j $31
>> zeh $2
>>
>> When only nomips16 function attribute was specified, this time GCC failed with
>> unrecognized opcode
>>
>> /tmp/ccaGCouL.s: Assembler messages:
>> /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4'
>>
>> The generated assembly was something in the following form. Looks like the
>> assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2"
>
> There's no switch to regular MIPS mode implied with `.set arch=mip32r2',
> the directive merely switches the ISA level, affecting both the regular
> MIPS and the MIPS16 mode (the MIPS32r2 ISA level adds extra MIPS16
> instructions too, e.g. ZEH is a new addition).
Agreed
>
>> .set mips16
>>
>> .ent f
>> .type f, @function
>> f
>> ...
>> .set push
>> .set arch=mips32r2
>> wsbh $2,$4
>> .pop
>> j $31
>> zeh $2
>> .end f
>> ...
>
> That's exactly the papered-over buggy code scenario I've been referring
> to above. This is clearly MIPS16 code: ZEH is a MIPS16 instruction only,
> there's no such regular MIPS counterpart. And it also obviously fails to
> assemble because on the contrary there's no MIPS16 WSBH instruction.
>
Yes, it won't assemble.
> Now if you stick `.set nomips16' just above WSBH, then this code will
> happily assemble, because this single instruction only (`.set pop' reverts
> any previous `.set' directives; I'm assuming you wrote above by hand and
> `.pop' is a typo) will assemble in the regular MIPS instruction mode. But
> if this code is ever reached, then the processor will still execute the
> machine code produced by the assembler from the WSBH instruction in the
> MIPS16 mode.
Yes, I hand-copied it from the output of "gcc -S" just to
show the form/pattern (the original output is too long for
this conversation). No, that `.pop' is not a typo (I just
did a double-check).
If I stick `.set nomips16` just above WSBH, that's just what
the original patch tries to do, papering over the fact that
it did not compile/assemble without it.
>
> For example the encoding of:
>
> wsbh $2, $4
>
> is (as you've shown in a dump above) 0x7c0410a0, which in the MIPS16 mode
> yields (in the big-endian mode):
>
> 00000000 <f>:
> 0: 7c04 sd s0,32(a0)
> 2: 10a0 b 144 <f+0x144>
>
> -- so this won't do what you might expect, you'll get a Reserved
> Instruction exception on the SD instruction, which is not supported in the
> 32-bit mode, and consequently SIGILL.
Agreed.
>
>> The patch was run tested on QEMU Malta and an router with Atheros
>> AR9331 SoC. I didn't test __arch_swab64() though. I have done many
>> other trial-and-error tests while preparing this patch. It was a mess
>> when I was sure I should expect some sensible behaviour from the
>> compiler while it actually just did not behave that way.
>
> I've compiled your example provided and as I stated in the original mail
> `__arch_swab16' is always produced as a separate function, whether `.set
> nomips16' is present in the inline assembly placed there or not. This is
> with (unreleased) GCC 6.0.0.
>
> However if you happen to have a buggy compiler that fails to emit
> `__arch_swab16' as a separate function despite the `nomips16' attribute,
> then it's better if the resulting generated assembly code fails to
> assemble rather than if it goes astray at the run time.
Now, my compiler refused to emit `__arch_swab16' as a
separate function even with the `nomips16' function
attribute. This is the kind of "mess" I just meant above, again.
I expect that it should emit a separate function and call it
with a jump observing that the caller `f' is in MIPS16 mode
yet the `__arch_swab16' is noMIPS16. sigh~~
>
>> > So setting aside the correctness issues discussed above, for MIPS16 code
>> > this has to be put out of line by the compiler, with all the usual
>> > overhead of a function call, causing a performance loss rather than a gain
>> > expected here. Especially as switching the ISA mode requires draining the
>> > whole pipeline so that subsequent instructions are interpreted in the new
>> > execution mode. This is expensive in performance terms.
>> >
>> > I'm fairly sure simply disabling these asms (#ifndef __mips16) and
>> > letting the compiler generate the right mask and shift operations from
>> > plain C code will be cheaper in performance terms and possibly cheaper in
>> > code size as well, especially in otherwise leaf functions where an extra
>> > function call will according to the ABI clobber temporaries. So I suggest
>> > going in that direction instead.
>>
>> I agree. Then you will provide the fix right? I am just curious
>> where that __mips16 should be placed or is it from compiler and
>> assembler?
>
> No, it's your bug after all. I think the last paragraph I wrote quoted
> above combined with the source code in question make it clear what to do.
Okay, I will try. Most of the time when textbooks read
clearly/obviously/apparently, things go astray ;)
>
> I have also checked what the difference in generated MIPS16 code is
> between a call to `__arch_swab16':
>
> 0000000c <f>:
> c: 64c4 save 32,ra
> e: 1800 0000 jal 0 <__arch_swab16>
> e: R_MIPS16_26 __arch_swab16
> 12: ec31 zeh a0
> 14: 6444 restore 32,ra
> 16: e8a0 jrc ra
>
> and equivalent generic code:
>
> 0000000c <f>:
> c: ec31 zeh a0
> e: 3280 sll v0,a0,8
> 10: 3482 srl a0,8
> 12: ea8d or v0,a0
> 14: e820 jr ra
> 16: ea31 zeh v0
>
> so the win is I think clear.
>
> Finally the MIPS64 `__arch_swab64' case does not matter as we have no
> MIPS16 support for 64-bit code in Linux, the toolchain will simply refuse
> to build it (only bare metal is supported).
Thanks for the information. Never played with MIPS64
before.
Regards
yousong
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16.
2015-09-06 2:42 ` Yousong Zhou
@ 2015-09-06 23:23 ` Maciej W. Rozycki
0 siblings, 0 replies; 7+ messages in thread
From: Maciej W. Rozycki @ 2015-09-06 23:23 UTC (permalink / raw)
To: Yousong Zhou
Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli
On Sun, 6 Sep 2015, Yousong Zhou wrote:
> Hi, Maciej, first of all, thank you for your time on this,
> appreciate it.
You're welcome!
> > The bug certainly was there, it's just your analysis and consequently the
> > fix that are wrong in the general case for some reason, maybe a buggy
> > compiler.
> >
>
> This is the compiler "--version",
>
> mips-openwrt-linux-gcc (OpenWrt/Linaro GCC 4.8-2014.04 r46763) 4.8.3
A-ha! I've checked GCC's history and the symptom you're seeing was PR
target/55777:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777
and the bug has only been fixed pretty recently (r199328), already after
GCC 4.8 has branched. From your troubles I infer the fix has not been
backported to 4.8 and therefore is only available from GCC 4.9 up; of
course either you or OpenWrt can still backport it and apply locally.
Fortunately as I noted you need not dive into that hole as the
reasonable choice here is to avoid the asm for MIPS16 code altogether,
regardless of the GCC bug.
> > Now if you stick `.set nomips16' just above WSBH, then this code will
> > happily assemble, because this single instruction only (`.set pop' reverts
> > any previous `.set' directives; I'm assuming you wrote above by hand and
> > `.pop' is a typo) will assemble in the regular MIPS instruction mode. But
> > if this code is ever reached, then the processor will still execute the
> > machine code produced by the assembler from the WSBH instruction in the
> > MIPS16 mode.
>
> Yes, I hand-copied it from the output of "gcc -S" just to
> show the form/pattern (the original output is too long for
> this conversation). No, that `.pop' is not a typo (I just
> did a double-check).
Well, please check again then as there's no such pseudo-op:
$ cat pop.s
.pop
$ mips-mti-linux-gnu-as -o pop.o pop.s
pop.s: Assembler messages:
pop.s:1: Error: unknown pseudo-op: `.pop'
$
> > No, it's your bug after all. I think the last paragraph I wrote quoted
> > above combined with the source code in question make it clear what to do.
>
> Okay, I will try. Most of the time when textbooks read
> clearly/obviously/apparently, things go astray ;)
Well, if you really find yourself stuck with it, then come back for
more hints, however please do try figuring it out yourself first as
it'll be a good exercise.
Maciej
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-09-06 23:23 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-03 9:47 [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16 Yousong Zhou
2015-09-04 18:52 ` Maciej W. Rozycki
2015-09-05 3:29 ` Yousong Zhou
2015-09-05 3:34 ` Yousong Zhou
2015-09-05 14:25 ` Maciej W. Rozycki
2015-09-06 2:42 ` Yousong Zhou
2015-09-06 23:23 ` Maciej W. Rozycki
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.