* [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. @ 2015-09-03 9:47 Yousong Zhou 2015-09-04 18:52 ` Maciej W. Rozycki 0 siblings, 1 reply; 7+ messages in thread From: Yousong Zhou @ 2015-09-03 9:47 UTC (permalink / raw) To: Ralf Baechle; +Cc: Chen Jie, Yousong Zhou, linux-mips, linux-kernel The nomips16 has to be added both as function attribute and assembler directive. When only function attribute is specified, the compiler will inline the function with -Os optimization. The generated assembly code cannot be correctly assembled because ISA mode switch has to be done through jump instruction. When only ".set nomips16" directive is used, the generated assembly code will use MIPS32 code for the inline assembly template and MIPS16 for the function return. The compiled binary is invalid: 00403100 <__arch_swab16>: 403100: 7c0410a0 wsbh v0,a0 403104: e820ea31 swc2 $0,-5583(at) while correct code should be: 00402650 <__arch_swab16>: 402650: 7c0410a0 wsbh v0,a0 402654: 03e00008 jr ra 402658: 3042ffff andi v0,v0,0xffff Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com> --- arch/mips/include/uapi/asm/swab.h | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h index 8f2d184..c4ddc4f 100644 --- a/arch/mips/include/uapi/asm/swab.h +++ b/arch/mips/include/uapi/asm/swab.h @@ -16,11 +16,13 @@ #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \ defined(_MIPS_ARCH_LOONGSON3A) -static inline __attribute_const__ __u16 __arch_swab16(__u16 x) +static inline __attribute__((nomips16)) __attribute_const__ + __u16 __arch_swab16(__u16 x) { __asm__( " .set push \n" " .set arch=mips32r2 \n" + " .set nomips16 \n" " wsbh %0, %1 \n" " .set pop \n" : "=r" (x) @@ -30,11 +32,13 @@ static inline __attribute_const__ __u16 __arch_swab16(__u16 x) } #define __arch_swab16 __arch_swab16 -static inline __attribute_const__ __u32 __arch_swab32(__u32 x) +static inline __attribute__((nomips16)) __attribute_const__ + __u32 __arch_swab32(__u32 x) { __asm__( " .set push \n" " .set arch=mips32r2 \n" + " .set nomips16 \n" " wsbh %0, %1 \n" " rotr %0, %0, 16 \n" " .set pop \n" @@ -50,11 +54,13 @@ static inline __attribute_const__ __u32 __arch_swab32(__u32 x) * 64-bit kernel on r2 CPUs. */ #ifdef __mips64 -static inline __attribute_const__ __u64 __arch_swab64(__u64 x) +static inline __attribute__((nomips16)) __attribute_const__ + __u64 __arch_swab64(__u64 x) { __asm__( " .set push \n" " .set arch=mips64r2 \n" + " .set nomips16 \n" " dsbh %0, %1 \n" " dshd %0, %0 \n" " .set pop \n" -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. 2015-09-03 9:47 [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16 Yousong Zhou @ 2015-09-04 18:52 ` Maciej W. Rozycki 2015-09-05 3:29 ` Yousong Zhou 0 siblings, 1 reply; 7+ messages in thread From: Maciej W. Rozycki @ 2015-09-04 18:52 UTC (permalink / raw) To: Yousong Zhou; +Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel On Thu, 3 Sep 2015, Yousong Zhou wrote: > The nomips16 has to be added both as function attribute and assembler > directive. > > When only function attribute is specified, the compiler will inline the > function with -Os optimization. The generated assembly code cannot be > correctly assembled because ISA mode switch has to be done through jump > instruction. This can't be true. The compiler does not intepret the contents of an inline asm and therefore cannot decide whether to inline a function containing one or not based on the lone presence or the absence of an assembly directive within. > When only ".set nomips16" directive is used, the generated assembly code > will use MIPS32 code for the inline assembly template and MIPS16 for the > function return. The compiled binary is invalid: > > 00403100 <__arch_swab16>: > 403100: 7c0410a0 wsbh v0,a0 > 403104: e820ea31 swc2 $0,-5583(at) > > while correct code should be: > > 00402650 <__arch_swab16>: > 402650: 7c0410a0 wsbh v0,a0 > 402654: 03e00008 jr ra > 402658: 3042ffff andi v0,v0,0xffff It looks to me you're papering something over here -- when you use a `.set nomips16' directive the assembler will happily switch your instruction set in the middle of an instruction stream. Consequently if this function is (for whatever reason; it should not really) inlined in MIPS16 code, then you'll get a MIPS32 instruction in the middle, which will obviously be interpreted differently in the MIPS16 execution mode and is therefore surely a recipe for disaster. How did you test your change and made the conclusion quoted with your patch description? > Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com> > --- > arch/mips/include/uapi/asm/swab.h | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h > index 8f2d184..c4ddc4f 100644 > --- a/arch/mips/include/uapi/asm/swab.h > +++ b/arch/mips/include/uapi/asm/swab.h > @@ -16,11 +16,13 @@ > #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \ > defined(_MIPS_ARCH_LOONGSON3A) > > -static inline __attribute_const__ __u16 __arch_swab16(__u16 x) > +static inline __attribute__((nomips16)) __attribute_const__ > + __u16 __arch_swab16(__u16 x) > { > __asm__( > " .set push \n" > " .set arch=mips32r2 \n" > + " .set nomips16 \n" > " wsbh %0, %1 \n" > " .set pop \n" > : "=r" (x) So setting aside the correctness issues discussed above, for MIPS16 code this has to be put out of line by the compiler, with all the usual overhead of a function call, causing a performance loss rather than a gain expected here. Especially as switching the ISA mode requires draining the whole pipeline so that subsequent instructions are interpreted in the new execution mode. This is expensive in performance terms. I'm fairly sure simply disabling these asms (#ifndef __mips16) and letting the compiler generate the right mask and shift operations from plain C code will be cheaper in performance terms and possibly cheaper in code size as well, especially in otherwise leaf functions where an extra function call will according to the ABI clobber temporaries. So I suggest going in that direction instead. So this is a NAK really. Maciej ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. 2015-09-04 18:52 ` Maciej W. Rozycki @ 2015-09-05 3:29 ` Yousong Zhou 2015-09-05 3:34 ` Yousong Zhou 2015-09-05 14:25 ` Maciej W. Rozycki 0 siblings, 2 replies; 7+ messages in thread From: Yousong Zhou @ 2015-09-05 3:29 UTC (permalink / raw) To: Maciej W. Rozycki Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli On 5 September 2015 at 02:52, Maciej W. Rozycki <macro@linux-mips.org> wrote: > On Thu, 3 Sep 2015, Yousong Zhou wrote: > >> The nomips16 has to be added both as function attribute and assembler >> directive. >> >> When only function attribute is specified, the compiler will inline the >> function with -Os optimization. The generated assembly code cannot be >> correctly assembled because ISA mode switch has to be done through jump >> instruction. > > This can't be true. The compiler does not intepret the contents of an > inline asm and therefore cannot decide whether to inline a function > containing one or not based on the lone presence or the absence of an > assembly directive within. > Most of the time I trust my compiler and never meddle with the toolchain. Anyway I made a patch because it really did not work for me. No big deal. It's not the end of world. It started with a comment from OpenWrt packages feeds [1]. Actually this "unrecognized opcode" problem have occurred within OpenWrt quite a few times before. [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931 >> When only ".set nomips16" directive is used, the generated assembly code >> will use MIPS32 code for the inline assembly template and MIPS16 for the >> function return. The compiled binary is invalid: >> >> 00403100 <__arch_swab16>: >> 403100: 7c0410a0 wsbh v0,a0 >> 403104: e820ea31 swc2 $0,-5583(at) >> >> while correct code should be: >> >> 00402650 <__arch_swab16>: >> 402650: 7c0410a0 wsbh v0,a0 >> 402654: 03e00008 jr ra >> 402658: 3042ffff andi v0,v0,0xffff > > It looks to me you're papering something over here -- when you use a > `.set nomips16' directive the assembler will happily switch your > instruction set in the middle of an instruction stream. Consequently if > this function is (for whatever reason; it should not really) inlined in > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which > will obviously be interpreted differently in the MIPS16 execution mode and > is therefore surely a recipe for disaster. If by "papering" you mean "made up", then whatever. Yeah, it's disaster, an "invalid instruction" abort. > > How did you test your change and made the conclusion quoted with your > patch description? > Compile the following program with a MIPS 24kc big endian variant compiler with flag "-mips32r2 -mips16 -Os". #include <stdio.h> #include <stdint.h> uint16_t __attribute__((noinline)) f(uint16_t v) { v = __cpu_to_le16(v); return v; } int main() { printf("%x\n", f(0xbeef)); return 0; } When only ".set nomips16" was specified in __arch_swab16(), this was output from objdump. 242 004003e0 <f>: 243 4003e0: 7c0410a0 wsbh v0,a0 244 4003e4: e820ea31 swc2 $0,-5583(at) 245 4003e8: 65006500 0x65006500 246 4003ec: 65006500 0x65006500 __arch_swab16() was indeed inlined. That swc2 instruction can be got from assembler with the following code (it's from the "-S" result of GCC). .set mips16 .set noreorder .set nomacro j $31 zeh $2 When only nomips16 function attribute was specified, this time GCC failed with unrecognized opcode /tmp/ccaGCouL.s: Assembler messages: /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4' The generated assembly was something in the following form. Looks like the assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2" .set mips16 .ent f .type f, @function f ... .set push .set arch=mips32r2 wsbh $2,$4 .pop j $31 zeh $2 .end f ... The patch was run tested on QEMU Malta and an router with Atheros AR9331 SoC. I didn't test __arch_swab64() though. I have done many other trial-and-error tests while preparing this patch. It was a mess when I was sure I should expect some sensible behaviour from the compiler while it actually just did not behave that way. >> Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com> >> --- >> arch/mips/include/uapi/asm/swab.h | 12 +++++++++--- >> 1 file changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/arch/mips/include/uapi/asm/swab.h b/arch/mips/include/uapi/asm/swab.h >> index 8f2d184..c4ddc4f 100644 >> --- a/arch/mips/include/uapi/asm/swab.h >> +++ b/arch/mips/include/uapi/asm/swab.h >> @@ -16,11 +16,13 @@ >> #if (defined(__mips_isa_rev) && (__mips_isa_rev >= 2)) || \ >> defined(_MIPS_ARCH_LOONGSON3A) >> >> -static inline __attribute_const__ __u16 __arch_swab16(__u16 x) >> +static inline __attribute__((nomips16)) __attribute_const__ >> + __u16 __arch_swab16(__u16 x) >> { >> __asm__( >> " .set push \n" >> " .set arch=mips32r2 \n" >> + " .set nomips16 \n" >> " wsbh %0, %1 \n" >> " .set pop \n" >> : "=r" (x) > > So setting aside the correctness issues discussed above, for MIPS16 code > this has to be put out of line by the compiler, with all the usual > overhead of a function call, causing a performance loss rather than a gain > expected here. Especially as switching the ISA mode requires draining the > whole pipeline so that subsequent instructions are interpreted in the new > execution mode. This is expensive in performance terms. > > I'm fairly sure simply disabling these asms (#ifndef __mips16) and > letting the compiler generate the right mask and shift operations from > plain C code will be cheaper in performance terms and possibly cheaper in > code size as well, especially in otherwise leaf functions where an extra > function call will according to the ABI clobber temporaries. So I suggest > going in that direction instead. I agree. Then you will provide the fix right? I am just curious where that __mips16 should be placed or is it from compiler and assembler? > > So this is a NAK really. okay. Cheers yousong ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. 2015-09-05 3:29 ` Yousong Zhou @ 2015-09-05 3:34 ` Yousong Zhou 2015-09-05 14:25 ` Maciej W. Rozycki 1 sibling, 0 replies; 7+ messages in thread From: Yousong Zhou @ 2015-09-05 3:34 UTC (permalink / raw) To: Maciej W. Rozycki Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli On 5 September 2015 at 11:29, Yousong Zhou <yszhou4tech@gmail.com> wrote: > #include <stdio.h> > #include <stdint.h> > #include <asm/byteorder.h> was missed out at the top. yousong > uint16_t __attribute__((noinline)) f(uint16_t v) > { > v = __cpu_to_le16(v); > return v; > } > > int main() > { > printf("%x\n", f(0xbeef)); > > return 0; > } > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. 2015-09-05 3:29 ` Yousong Zhou 2015-09-05 3:34 ` Yousong Zhou @ 2015-09-05 14:25 ` Maciej W. Rozycki 2015-09-06 2:42 ` Yousong Zhou 1 sibling, 1 reply; 7+ messages in thread From: Maciej W. Rozycki @ 2015-09-05 14:25 UTC (permalink / raw) To: Yousong Zhou Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli On Sat, 5 Sep 2015, Yousong Zhou wrote: > > This can't be true. The compiler does not intepret the contents of an > > inline asm and therefore cannot decide whether to inline a function > > containing one or not based on the lone presence or the absence of an > > assembly directive within. > > > > Most of the time I trust my compiler and never meddle with the > toolchain. Anyway I made a patch because it really did not work for > me. No big deal. It's not the end of world. It started with a > comment from OpenWrt packages feeds [1]. Actually this "unrecognized > opcode" problem have occurred within OpenWrt quite a few times before. > > [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931 The bug certainly was there, it's just your analysis and consequently the fix that are wrong in the general case for some reason, maybe a buggy compiler. > > It looks to me you're papering something over here -- when you use a > > `.set nomips16' directive the assembler will happily switch your > > instruction set in the middle of an instruction stream. Consequently if > > this function is (for whatever reason; it should not really) inlined in > > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which > > will obviously be interpreted differently in the MIPS16 execution mode and > > is therefore surely a recipe for disaster. > > If by "papering" you mean "made up", then whatever. Yeah, it's > disaster, an "invalid instruction" abort. By "papering over" I mean forcing source code to compile successfully at the risk of producing incorrect binary code. > > How did you test your change and made the conclusion quoted with your > > patch description? > > > > Compile the following program with a MIPS 24kc big endian variant compiler with > flag "-mips32r2 -mips16 -Os". > > #include <stdio.h> > #include <stdint.h> > > uint16_t __attribute__((noinline)) f(uint16_t v) > { > v = __cpu_to_le16(v); > return v; > } > > int main() > { > printf("%x\n", f(0xbeef)); > > return 0; > } > > When only ".set nomips16" was specified in __arch_swab16(), this was output > from objdump. > > 242 004003e0 <f>: > 243 4003e0: 7c0410a0 wsbh v0,a0 > 244 4003e4: e820ea31 swc2 $0,-5583(at) > 245 4003e8: 65006500 0x65006500 > 246 4003ec: 65006500 0x65006500 Quite obviously. For the record: the first instruction has been assembled in the regular MIPS mode and that propagated to symbol annotation. Consequently `f' is seen by `objdump' as a regular MIPS function and disassembles it all as such. You can put a global label immediately after the WSBH instruction in your source code to have the rest of the function disassembled correctly (of course this won't make this code work at the run time). > __arch_swab16() was indeed inlined. That swc2 instruction can be got from > assembler with the following code (it's from the "-S" result of GCC). > > .set mips16 > .set noreorder > .set nomacro > j $31 > zeh $2 > > When only nomips16 function attribute was specified, this time GCC failed with > unrecognized opcode > > /tmp/ccaGCouL.s: Assembler messages: > /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4' > > The generated assembly was something in the following form. Looks like the > assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2" There's no switch to regular MIPS mode implied with `.set arch=mip32r2', the directive merely switches the ISA level, affecting both the regular MIPS and the MIPS16 mode (the MIPS32r2 ISA level adds extra MIPS16 instructions too, e.g. ZEH is a new addition). > .set mips16 > > .ent f > .type f, @function > f > ... > .set push > .set arch=mips32r2 > wsbh $2,$4 > .pop > j $31 > zeh $2 > .end f > ... That's exactly the papered-over buggy code scenario I've been referring to above. This is clearly MIPS16 code: ZEH is a MIPS16 instruction only, there's no such regular MIPS counterpart. And it also obviously fails to assemble because on the contrary there's no MIPS16 WSBH instruction. Now if you stick `.set nomips16' just above WSBH, then this code will happily assemble, because this single instruction only (`.set pop' reverts any previous `.set' directives; I'm assuming you wrote above by hand and `.pop' is a typo) will assemble in the regular MIPS instruction mode. But if this code is ever reached, then the processor will still execute the machine code produced by the assembler from the WSBH instruction in the MIPS16 mode. For example the encoding of: wsbh $2, $4 is (as you've shown in a dump above) 0x7c0410a0, which in the MIPS16 mode yields (in the big-endian mode): 00000000 <f>: 0: 7c04 sd s0,32(a0) 2: 10a0 b 144 <f+0x144> -- so this won't do what you might expect, you'll get a Reserved Instruction exception on the SD instruction, which is not supported in the 32-bit mode, and consequently SIGILL. > The patch was run tested on QEMU Malta and an router with Atheros > AR9331 SoC. I didn't test __arch_swab64() though. I have done many > other trial-and-error tests while preparing this patch. It was a mess > when I was sure I should expect some sensible behaviour from the > compiler while it actually just did not behave that way. I've compiled your example provided and as I stated in the original mail `__arch_swab16' is always produced as a separate function, whether `.set nomips16' is present in the inline assembly placed there or not. This is with (unreleased) GCC 6.0.0. However if you happen to have a buggy compiler that fails to emit `__arch_swab16' as a separate function despite the `nomips16' attribute, then it's better if the resulting generated assembly code fails to assemble rather than if it goes astray at the run time. > > So setting aside the correctness issues discussed above, for MIPS16 code > > this has to be put out of line by the compiler, with all the usual > > overhead of a function call, causing a performance loss rather than a gain > > expected here. Especially as switching the ISA mode requires draining the > > whole pipeline so that subsequent instructions are interpreted in the new > > execution mode. This is expensive in performance terms. > > > > I'm fairly sure simply disabling these asms (#ifndef __mips16) and > > letting the compiler generate the right mask and shift operations from > > plain C code will be cheaper in performance terms and possibly cheaper in > > code size as well, especially in otherwise leaf functions where an extra > > function call will according to the ABI clobber temporaries. So I suggest > > going in that direction instead. > > I agree. Then you will provide the fix right? I am just curious > where that __mips16 should be placed or is it from compiler and > assembler? No, it's your bug after all. I think the last paragraph I wrote quoted above combined with the source code in question make it clear what to do. I have also checked what the difference in generated MIPS16 code is between a call to `__arch_swab16': 0000000c <f>: c: 64c4 save 32,ra e: 1800 0000 jal 0 <__arch_swab16> e: R_MIPS16_26 __arch_swab16 12: ec31 zeh a0 14: 6444 restore 32,ra 16: e8a0 jrc ra and equivalent generic code: 0000000c <f>: c: ec31 zeh a0 e: 3280 sll v0,a0,8 10: 3482 srl a0,8 12: ea8d or v0,a0 14: e820 jr ra 16: ea31 zeh v0 so the win is I think clear. Finally the MIPS64 `__arch_swab64' case does not matter as we have no MIPS16 support for 64-bit code in Linux, the toolchain will simply refuse to build it (only bare metal is supported). Maciej ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. 2015-09-05 14:25 ` Maciej W. Rozycki @ 2015-09-06 2:42 ` Yousong Zhou 2015-09-06 23:23 ` Maciej W. Rozycki 0 siblings, 1 reply; 7+ messages in thread From: Yousong Zhou @ 2015-09-06 2:42 UTC (permalink / raw) To: Maciej W. Rozycki Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli Hi, Maciej, first of all, thank you for your time on this, appreciate it. Comments inline On 5 September 2015 at 22:25, Maciej W. Rozycki <macro@linux-mips.org> wrote: > On Sat, 5 Sep 2015, Yousong Zhou wrote: > >> > This can't be true. The compiler does not intepret the contents of an >> > inline asm and therefore cannot decide whether to inline a function >> > containing one or not based on the lone presence or the absence of an >> > assembly directive within. >> > >> >> Most of the time I trust my compiler and never meddle with the >> toolchain. Anyway I made a patch because it really did not work for >> me. No big deal. It's not the end of world. It started with a >> comment from OpenWrt packages feeds [1]. Actually this "unrecognized >> opcode" problem have occurred within OpenWrt quite a few times before. >> >> [1] https://github.com/openwrt/packages/commit/1e29676a8ac74f797f8ca799364681cec575ae6f#commitcomment-12901931 > > The bug certainly was there, it's just your analysis and consequently the > fix that are wrong in the general case for some reason, maybe a buggy > compiler. > This is the compiler "--version", mips-openwrt-linux-gcc (OpenWrt/Linaro GCC 4.8-2014.04 r46763) 4.8.3 >> > It looks to me you're papering something over here -- when you use a >> > `.set nomips16' directive the assembler will happily switch your >> > instruction set in the middle of an instruction stream. Consequently if >> > this function is (for whatever reason; it should not really) inlined in >> > MIPS16 code, then you'll get a MIPS32 instruction in the middle, which >> > will obviously be interpreted differently in the MIPS16 execution mode and >> > is therefore surely a recipe for disaster. >> >> If by "papering" you mean "made up", then whatever. Yeah, it's >> disaster, an "invalid instruction" abort. > > By "papering over" I mean forcing source code to compile successfully at > the risk of producing incorrect binary code. Learned a new phrase/idiom :) > >> > How did you test your change and made the conclusion quoted with your >> > patch description? >> > >> >> Compile the following program with a MIPS 24kc big endian variant compiler with >> flag "-mips32r2 -mips16 -Os". >> >> #include <stdio.h> >> #include <stdint.h> >> >> uint16_t __attribute__((noinline)) f(uint16_t v) >> { >> v = __cpu_to_le16(v); >> return v; >> } >> >> int main() >> { >> printf("%x\n", f(0xbeef)); >> >> return 0; >> } >> >> When only ".set nomips16" was specified in __arch_swab16(), this was output >> from objdump. >> >> 242 004003e0 <f>: >> 243 4003e0: 7c0410a0 wsbh v0,a0 >> 244 4003e4: e820ea31 swc2 $0,-5583(at) >> 245 4003e8: 65006500 0x65006500 >> 246 4003ec: 65006500 0x65006500 > > Quite obviously. > > For the record: the first instruction has been assembled in the regular > MIPS mode and that propagated to symbol annotation. Consequently `f' is > seen by `objdump' as a regular MIPS function and disassembles it all as > such. You can put a global label immediately after the WSBH instruction > in your source code to have the rest of the function disassembled > correctly (of course this won't make this code work at the run time). Thanks for the trick. Objdump really disassembled it correctly. > >> __arch_swab16() was indeed inlined. That swc2 instruction can be got from >> assembler with the following code (it's from the "-S" result of GCC). >> >> .set mips16 >> .set noreorder >> .set nomacro >> j $31 >> zeh $2 >> >> When only nomips16 function attribute was specified, this time GCC failed with >> unrecognized opcode >> >> /tmp/ccaGCouL.s: Assembler messages: >> /tmp/ccaGCouL.s:21: Error: unrecognized opcode `wsbh $2,$4' >> >> The generated assembly was something in the following form. Looks like the >> assembler did not automatically switch to MIPS32 mode when ".set arch=mip32r2" > > There's no switch to regular MIPS mode implied with `.set arch=mip32r2', > the directive merely switches the ISA level, affecting both the regular > MIPS and the MIPS16 mode (the MIPS32r2 ISA level adds extra MIPS16 > instructions too, e.g. ZEH is a new addition). Agreed > >> .set mips16 >> >> .ent f >> .type f, @function >> f >> ... >> .set push >> .set arch=mips32r2 >> wsbh $2,$4 >> .pop >> j $31 >> zeh $2 >> .end f >> ... > > That's exactly the papered-over buggy code scenario I've been referring > to above. This is clearly MIPS16 code: ZEH is a MIPS16 instruction only, > there's no such regular MIPS counterpart. And it also obviously fails to > assemble because on the contrary there's no MIPS16 WSBH instruction. > Yes, it won't assemble. > Now if you stick `.set nomips16' just above WSBH, then this code will > happily assemble, because this single instruction only (`.set pop' reverts > any previous `.set' directives; I'm assuming you wrote above by hand and > `.pop' is a typo) will assemble in the regular MIPS instruction mode. But > if this code is ever reached, then the processor will still execute the > machine code produced by the assembler from the WSBH instruction in the > MIPS16 mode. Yes, I hand-copied it from the output of "gcc -S" just to show the form/pattern (the original output is too long for this conversation). No, that `.pop' is not a typo (I just did a double-check). If I stick `.set nomips16` just above WSBH, that's just what the original patch tries to do, papering over the fact that it did not compile/assemble without it. > > For example the encoding of: > > wsbh $2, $4 > > is (as you've shown in a dump above) 0x7c0410a0, which in the MIPS16 mode > yields (in the big-endian mode): > > 00000000 <f>: > 0: 7c04 sd s0,32(a0) > 2: 10a0 b 144 <f+0x144> > > -- so this won't do what you might expect, you'll get a Reserved > Instruction exception on the SD instruction, which is not supported in the > 32-bit mode, and consequently SIGILL. Agreed. > >> The patch was run tested on QEMU Malta and an router with Atheros >> AR9331 SoC. I didn't test __arch_swab64() though. I have done many >> other trial-and-error tests while preparing this patch. It was a mess >> when I was sure I should expect some sensible behaviour from the >> compiler while it actually just did not behave that way. > > I've compiled your example provided and as I stated in the original mail > `__arch_swab16' is always produced as a separate function, whether `.set > nomips16' is present in the inline assembly placed there or not. This is > with (unreleased) GCC 6.0.0. > > However if you happen to have a buggy compiler that fails to emit > `__arch_swab16' as a separate function despite the `nomips16' attribute, > then it's better if the resulting generated assembly code fails to > assemble rather than if it goes astray at the run time. Now, my compiler refused to emit `__arch_swab16' as a separate function even with the `nomips16' function attribute. This is the kind of "mess" I just meant above, again. I expect that it should emit a separate function and call it with a jump observing that the caller `f' is in MIPS16 mode yet the `__arch_swab16' is noMIPS16. sigh~~ > >> > So setting aside the correctness issues discussed above, for MIPS16 code >> > this has to be put out of line by the compiler, with all the usual >> > overhead of a function call, causing a performance loss rather than a gain >> > expected here. Especially as switching the ISA mode requires draining the >> > whole pipeline so that subsequent instructions are interpreted in the new >> > execution mode. This is expensive in performance terms. >> > >> > I'm fairly sure simply disabling these asms (#ifndef __mips16) and >> > letting the compiler generate the right mask and shift operations from >> > plain C code will be cheaper in performance terms and possibly cheaper in >> > code size as well, especially in otherwise leaf functions where an extra >> > function call will according to the ABI clobber temporaries. So I suggest >> > going in that direction instead. >> >> I agree. Then you will provide the fix right? I am just curious >> where that __mips16 should be placed or is it from compiler and >> assembler? > > No, it's your bug after all. I think the last paragraph I wrote quoted > above combined with the source code in question make it clear what to do. Okay, I will try. Most of the time when textbooks read clearly/obviously/apparently, things go astray ;) > > I have also checked what the difference in generated MIPS16 code is > between a call to `__arch_swab16': > > 0000000c <f>: > c: 64c4 save 32,ra > e: 1800 0000 jal 0 <__arch_swab16> > e: R_MIPS16_26 __arch_swab16 > 12: ec31 zeh a0 > 14: 6444 restore 32,ra > 16: e8a0 jrc ra > > and equivalent generic code: > > 0000000c <f>: > c: ec31 zeh a0 > e: 3280 sll v0,a0,8 > 10: 3482 srl a0,8 > 12: ea8d or v0,a0 > 14: e820 jr ra > 16: ea31 zeh v0 > > so the win is I think clear. > > Finally the MIPS64 `__arch_swab64' case does not matter as we have no > MIPS16 support for 64-bit code in Linux, the toolchain will simply refuse > to build it (only bare metal is supported). Thanks for the information. Never played with MIPS64 before. Regards yousong ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16. 2015-09-06 2:42 ` Yousong Zhou @ 2015-09-06 23:23 ` Maciej W. Rozycki 0 siblings, 0 replies; 7+ messages in thread From: Maciej W. Rozycki @ 2015-09-06 23:23 UTC (permalink / raw) To: Yousong Zhou Cc: Ralf Baechle, Chen Jie, linux-mips, linux-kernel, Florian Fainelli On Sun, 6 Sep 2015, Yousong Zhou wrote: > Hi, Maciej, first of all, thank you for your time on this, > appreciate it. You're welcome! > > The bug certainly was there, it's just your analysis and consequently the > > fix that are wrong in the general case for some reason, maybe a buggy > > compiler. > > > > This is the compiler "--version", > > mips-openwrt-linux-gcc (OpenWrt/Linaro GCC 4.8-2014.04 r46763) 4.8.3 A-ha! I've checked GCC's history and the symptom you're seeing was PR target/55777: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777 and the bug has only been fixed pretty recently (r199328), already after GCC 4.8 has branched. From your troubles I infer the fix has not been backported to 4.8 and therefore is only available from GCC 4.9 up; of course either you or OpenWrt can still backport it and apply locally. Fortunately as I noted you need not dive into that hole as the reasonable choice here is to avoid the asm for MIPS16 code altogether, regardless of the GCC bug. > > Now if you stick `.set nomips16' just above WSBH, then this code will > > happily assemble, because this single instruction only (`.set pop' reverts > > any previous `.set' directives; I'm assuming you wrote above by hand and > > `.pop' is a typo) will assemble in the regular MIPS instruction mode. But > > if this code is ever reached, then the processor will still execute the > > machine code produced by the assembler from the WSBH instruction in the > > MIPS16 mode. > > Yes, I hand-copied it from the output of "gcc -S" just to > show the form/pattern (the original output is too long for > this conversation). No, that `.pop' is not a typo (I just > did a double-check). Well, please check again then as there's no such pseudo-op: $ cat pop.s .pop $ mips-mti-linux-gnu-as -o pop.o pop.s pop.s: Assembler messages: pop.s:1: Error: unknown pseudo-op: `.pop' $ > > No, it's your bug after all. I think the last paragraph I wrote quoted > > above combined with the source code in question make it clear what to do. > > Okay, I will try. Most of the time when textbooks read > clearly/obviously/apparently, things go astray ;) Well, if you really find yourself stuck with it, then come back for more hints, however please do try figuring it out yourself first as it'll be a good exercise. Maciej ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-09-06 23:23 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-09-03 9:47 [PATCH] MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16 Yousong Zhou 2015-09-04 18:52 ` Maciej W. Rozycki 2015-09-05 3:29 ` Yousong Zhou 2015-09-05 3:34 ` Yousong Zhou 2015-09-05 14:25 ` Maciej W. Rozycki 2015-09-06 2:42 ` Yousong Zhou 2015-09-06 23:23 ` Maciej W. Rozycki
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.