All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] MIPS: fix code generation for non-DSP capable CPUs
@ 2013-03-14 13:18 Florian Fainelli
  2013-03-14 14:22 ` Steven J. Hill
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Florian Fainelli @ 2013-03-14 13:18 UTC (permalink / raw)
  To: ralf; +Cc: blogic, Steven.Hill, linux-mips, Florian Fainelli

Commit 32a7ede (MIPS: dsp: Add assembler support for DSP ASEs) has
enabled the use of DSP ASE specific instructions such as rddsp and wrdsp
under the idea that all code path that will make use of these two
instructions are properly checking for cpu_has_dsp to ensure that the
particular CPU we are running on *actually* supports DSP ASE.

This commit actually causes the following oops on QEMU Malta emulating a
MIPS 24Kc without the DSP ASE implemented:

[    7.960000] Reserved instruction in kernel
[    7.960000] Cpu 0
[    7.960000] $ 0   : 00000000 00000000 00000014 00000005
[    7.960000] $ 4   : 8fc2de48 00000001 00000000 8f59ddb0
[    7.960000] $ 8   : 8f5ceec4 00000018 00000c00 00800000
[    7.960000] $12   : 00000100 00000200 00000000 00457b84
[    7.960000] $16   : 00000000 8fc2ba78 8f4ec980 00000001
[    7.960000] $20   : 80418f90 00000000 00000000 000002dd
[    7.960000] $24   : 0000009c 7730d7b8
[    7.960000] $28   : 8f59c000 8f59dd38 00000001 80104248
[    7.960000] Hi    : 0000001d
[    7.960000] Lo    : 0000000b
[    7.960000] epc   : 801041ec thread_saved_pc+0x2c/0x38
[    7.960000]     Not tainted
[    7.960000] ra    : 80104248 get_wchan+0x48/0xac
[    7.960000] Status: 1000b703    KERNEL EXL IE
[    7.960000] Cause : 10800028
[    7.960000] PrId  : 00019300 (MIPS 24Kc)
[    7.960000] Modules linked in:
[    7.960000] Process killall (pid: 1574, threadinfo=8f59c000,
task=8fd14558, tls=773aa440)
[    7.960000] Stack : 8fc2ba78 8012b008 0000000c 0000001d 00000000
00000000 8f58a380
                  8f58a380 8fc2ba78 80202668 8f59de78 8f468600 8f59de28
801b2a3c 8f59df00 8f98ba20 74696e69
                  8f468600 8f59de28 801b7308 0081c007 00000000 00000000
00000000 00000000 00000000 00000000
                  00000000 8fc2bbb4 00000001 0000001d 0000000b 77f038cc
7fe80648 ffffffff ffffffff 00000000
                  00000001 0016e000 00000000 ...
[    7.960000] Call Trace:
[    7.960000] [<801041ec>] thread_saved_pc+0x2c/0x38
[    7.960000] [<80104248>] get_wchan+0x48/0xac

The disassembly of thread_saved_pc points to the following:
000006d0 <thread_saved_pc>:
 6d0:   8c820208        lw      v0,520(a0)
 6d4:   3c030000        lui     v1,0x0
 6d8:   24630000        addiu   v1,v1,0
 6dc:   10430008        beq     v0,v1,700 <thread_saved_pc+0x30>
 6e0:   00000000        nop
 6e4:   3c020000        lui     v0,0x0
 6e8:   8c43000c        lw      v1,12(v0)
 6ec:   04620004        bltzl   v1,700 <thread_saved_pc+0x30>
 6f0:   00001021        move    v0,zero
 6f4:   8c840200        lw      a0,512(a0)
 6f8:   00031080        sll     v0,v1,0x2
 6fc:   7c44100a        lwx     v0,a0(v0)   <------------
 700:   03e00008        jr      ra
 704:   00000000        nop

If we specifically disable -mdsp/-mdspr2 for arch/mips/kernel/process.o,
we get the following (non-crashing) assembly:

00000708 <thread_saved_pc>:
 708:   8c820208        lw      v0,520(a0)
 70c:   3c030000        lui     v1,0x0
 710:   24630000        addiu   v1,v1,0
 714:   10430009        beq     v0,v1,73c <thread_saved_pc+0x34>
 718:   00000000        nop
 71c:   3c020000        lui     v0,0x0
 720:   8c42000c        lw      v0,12(v0)
 724:   04420005        bltzl   v0,73c <thread_saved_pc+0x34>
 728:   00001021        move    v0,zero
 72c:   8c830200        lw      v1,512(a0)
 730:   00021080        sll     v0,v0,0x2
 734:   00431021        addu    v0,v0,v1
 738:   8c420000        lw      v0,0(v0)
 73c:   03e00008        jr      ra
 740:   00000000        nop

The specific line that leads a different assembly being produced is:

unsigned long thread_saved_pc(struct task_struct *tsk)
...
	return ((unsigned long *)t->reg29)[schedule_mfi.pc_offset]; <---

The problem here is that the compiler was given the right to use DSP
instructions with the -mdsp / -mdspr2 command-line switches and
performed some optimization for us and used DSP ASE instructions where
we are not checking that the running CPU actually supports DSP ASE.

This patch fixes the issue by partially reverting commit 32a7ede for
arch/mips/kernel/Makefile in order to remove the -mdsp / -mdspr2
compiler command-line switches such that we are now guaranteed that the
compiler will not optimize using DSP ASE reserved instructions. We also
need to fixup the rddsp/wrdsp and m{t,h}{hi,lo}{0,1,2,3} macros in
arch/mips/include/asm/mipsregs.h to tell the assembler that we are going
to explicitely use DSP ASE reserved instructions.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
---
Ralf, John,

This should be part of your 3.9-rc3 pull request if I may ;)

 arch/mips/include/asm/mipsregs.h |  209 ++++++++++++++++++++++++++++++++++----
 arch/mips/kernel/Makefile        |   14 ---
 2 files changed, 190 insertions(+), 33 deletions(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 12b70c2..0da44d4 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -1166,7 +1166,10 @@ do {									\
 	unsigned int __dspctl;						\
 									\
 	__asm__ __volatile__(						\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
 	"	rddsp	%0, %x1					\n"	\
+	"	.set pop					\n"	\
 	: "=r" (__dspctl)						\
 	: "i" (mask));							\
 	__dspctl;							\
@@ -1175,30 +1178,198 @@ do {									\
 #define wrdsp(val, mask)						\
 do {									\
 	__asm__ __volatile__(						\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
 	"	wrdsp	%0, %x1					\n"	\
+	"	.set pop					\n"	\
 	:								\
 	: "r" (val), "i" (mask));					\
 } while (0)
 
-#define mflo0() ({ long mflo0; __asm__("mflo %0, $ac0" : "=r" (mflo0)); mflo0;})
-#define mflo1() ({ long mflo1; __asm__("mflo %0, $ac1" : "=r" (mflo1)); mflo1;})
-#define mflo2() ({ long mflo2; __asm__("mflo %0, $ac2" : "=r" (mflo2)); mflo2;})
-#define mflo3() ({ long mflo3; __asm__("mflo %0, $ac3" : "=r" (mflo3)); mflo3;})
-
-#define mfhi0() ({ long mfhi0; __asm__("mfhi %0, $ac0" : "=r" (mfhi0)); mfhi0;})
-#define mfhi1() ({ long mfhi1; __asm__("mfhi %0, $ac1" : "=r" (mfhi1)); mfhi1;})
-#define mfhi2() ({ long mfhi2; __asm__("mfhi %0, $ac2" : "=r" (mfhi2)); mfhi2;})
-#define mfhi3() ({ long mfhi3; __asm__("mfhi %0, $ac3" : "=r" (mfhi3)); mfhi3;})
-
-#define mtlo0(x) __asm__("mtlo %0, $ac0" ::"r" (x))
-#define mtlo1(x) __asm__("mtlo %0, $ac1" ::"r" (x))
-#define mtlo2(x) __asm__("mtlo %0, $ac2" ::"r" (x))
-#define mtlo3(x) __asm__("mtlo %0, $ac3" ::"r" (x))
-
-#define mthi0(x) __asm__("mthi %0, $ac0" ::"r" (x))
-#define mthi1(x) __asm__("mthi %0, $ac1" ::"r" (x))
-#define mthi2(x) __asm__("mthi %0, $ac2" ::"r" (x))
-#define mthi3(x) __asm__("mthi %0, $ac3" ::"r" (x))
+#define mflo0()								\
+({									\
+	long mflo0;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac0					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo0)); 						\
+	mflo0;								\
+})
+
+#define mflo1()								\
+({									\
+	long mflo1;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac1					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo1)); 						\
+	mflo1;								\
+})
+
+#define mflo2()								\
+({									\
+	long mflo2;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac2					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo2)); 						\
+	mflo2;								\
+})
+
+#define mflo3()								\
+({									\
+	long mflo3;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac3					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo3)); 						\
+	mflo3;								\
+})
+
+#define mfhi0()								\
+({									\
+	long mfhi0;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac0					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi0)); 						\
+	mfhi0;								\
+})
+
+#define mfhi1()								\
+({									\
+	long mfhi1;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac1					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi1)); 						\
+	mfhi1;								\
+})
+
+#define mfhi2()								\
+({									\
+	long mfhi2;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac2					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi2)); 						\
+	mfhi2;								\
+})
+
+#define mfhi3()								\
+({									\
+	long mfhi3;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac3					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi3)); 						\
+	mfhi3;								\
+})
+
+
+#define mtlo0(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac0					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mtlo1(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac1					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mtlo2(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac2					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mtlo3(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac3					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi0(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac0					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi1(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac1					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi2(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac2					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi3(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac3					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
 
 #else
 
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index f81d98f..2e8a9c1 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -109,20 +109,6 @@ obj-$(CONFIG_JUMP_LABEL)	+= jump_label.o
 ifeq ($(CONFIG_CPU_MIPSR2), y)
 CFLAGS_DSP 			= -DHAVE_AS_DSP
 
-#
-# Check if assembler supports DSP ASE
-#
-ifeq ($(call cc-option-yn,-mdsp), y)
-CFLAGS_DSP			+= -mdsp
-endif
-
-#
-# Check if assembler supports DSP ASE Rev2
-#
-ifeq ($(call cc-option-yn,-mdspr2), y)
-CFLAGS_DSP			+= -mdspr2
-endif
-
 CFLAGS_signal.o			= $(CFLAGS_DSP)
 CFLAGS_signal32.o		= $(CFLAGS_DSP)
 CFLAGS_process.o		= $(CFLAGS_DSP)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [PATCH] MIPS: fix code generation for non-DSP capable CPUs
  2013-03-14 13:18 [PATCH] MIPS: fix code generation for non-DSP capable CPUs Florian Fainelli
@ 2013-03-14 14:22 ` Steven J. Hill
  2013-03-16 14:06   ` Florian Fainelli
  2013-03-16 20:52 ` Steven J. Hill
  2013-03-18 16:56 ` [PATCH v2] " Florian Fainelli
  2 siblings, 1 reply; 6+ messages in thread
From: Steven J. Hill @ 2013-03-14 14:22 UTC (permalink / raw)
  To: Florian Fainelli, ralf; +Cc: blogic, linux-mips

Florian,

Let me test this patch out with our DSP testsuite and then I can give you an ACK or NACK. Thanks!

-Steve

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: fix code generation for non-DSP capable CPUs
  2013-03-14 14:22 ` Steven J. Hill
@ 2013-03-16 14:06   ` Florian Fainelli
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2013-03-16 14:06 UTC (permalink / raw)
  To: linux-mips; +Cc: Steven J. Hill, ralf, blogic

Hey Steven,

Le jeudi 14 mars 2013 15:22:28, Steven J. Hill a écrit :
> Florian,
> 
> Let me test this patch out with our DSP testsuite and then I can give you
> an ACK or NACK. Thanks!

Have you been able to give this a try?
-- 
Florian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH] MIPS: fix code generation for non-DSP capable CPUs
  2013-03-14 13:18 [PATCH] MIPS: fix code generation for non-DSP capable CPUs Florian Fainelli
  2013-03-14 14:22 ` Steven J. Hill
@ 2013-03-16 20:52 ` Steven J. Hill
  2013-03-17 14:26   ` Florian Fainelli
  2013-03-18 16:56 ` [PATCH v2] " Florian Fainelli
  2 siblings, 1 reply; 6+ messages in thread
From: Steven J. Hill @ 2013-03-16 20:52 UTC (permalink / raw)
  To: Florian Fainelli, ralf; +Cc: blogic, linux-mips

Florian,

I just tested this patch with our DSP testsuite and everything works. I also disassembled the vmlinux ELF binary to make sure the instructions were being generated correctly.


Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] MIPS: fix code generation for non-DSP capable CPUs
  2013-03-16 20:52 ` Steven J. Hill
@ 2013-03-17 14:26   ` Florian Fainelli
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2013-03-17 14:26 UTC (permalink / raw)
  To: Steven J. Hill; +Cc: ralf, blogic, linux-mips

On Saturday 16 March 2013 20:52:44 Steven J. Hill wrote:
> Florian,
> 
> I just tested this patch with our DSP testsuite and everything works. I also 
disassembled the vmlinux ELF binary to make sure the instructions were being 
generated correctly.
> 
> 
> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>

Great, thanks Steven!
-- 
Florian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2] MIPS: fix code generation for non-DSP capable CPUs
  2013-03-14 13:18 [PATCH] MIPS: fix code generation for non-DSP capable CPUs Florian Fainelli
  2013-03-14 14:22 ` Steven J. Hill
  2013-03-16 20:52 ` Steven J. Hill
@ 2013-03-18 16:56 ` Florian Fainelli
  2 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2013-03-18 16:56 UTC (permalink / raw)
  To: ralf; +Cc: linux-mips, blogic, Florian Fainelli

Commit 32a7ede (MIPS: dsp: Add assembler support for DSP ASEs) has
enabled the use of DSP ASE specific instructions such as rddsp and wrdsp
under the idea that all code path that will make use of these two
instructions are properly checking for cpu_has_dsp to ensure that the
particular CPU we are running on *actually* supports DSP ASE.

This commit actually causes the following oops on QEMU Malta emulating a
MIPS 24Kc without the DSP ASE implemented:

[    7.960000] Reserved instruction in kernel
[    7.960000] Cpu 0
[    7.960000] $ 0   : 00000000 00000000 00000014 00000005
[    7.960000] $ 4   : 8fc2de48 00000001 00000000 8f59ddb0
[    7.960000] $ 8   : 8f5ceec4 00000018 00000c00 00800000
[    7.960000] $12   : 00000100 00000200 00000000 00457b84
[    7.960000] $16   : 00000000 8fc2ba78 8f4ec980 00000001
[    7.960000] $20   : 80418f90 00000000 00000000 000002dd
[    7.960000] $24   : 0000009c 7730d7b8
[    7.960000] $28   : 8f59c000 8f59dd38 00000001 80104248
[    7.960000] Hi    : 0000001d
[    7.960000] Lo    : 0000000b
[    7.960000] epc   : 801041ec thread_saved_pc+0x2c/0x38
[    7.960000]     Not tainted
[    7.960000] ra    : 80104248 get_wchan+0x48/0xac
[    7.960000] Status: 1000b703    KERNEL EXL IE
[    7.960000] Cause : 10800028
[    7.960000] PrId  : 00019300 (MIPS 24Kc)
[    7.960000] Modules linked in:
[    7.960000] Process killall (pid: 1574, threadinfo=8f59c000,
task=8fd14558, tls=773aa440)
[    7.960000] Stack : 8fc2ba78 8012b008 0000000c 0000001d 00000000
00000000 8f58a380
                  8f58a380 8fc2ba78 80202668 8f59de78 8f468600 8f59de28
801b2a3c 8f59df00 8f98ba20 74696e69
                  8f468600 8f59de28 801b7308 0081c007 00000000 00000000
00000000 00000000 00000000 00000000
                  00000000 8fc2bbb4 00000001 0000001d 0000000b 77f038cc
7fe80648 ffffffff ffffffff 00000000
                  00000001 0016e000 00000000 ...
[    7.960000] Call Trace:
[    7.960000] [<801041ec>] thread_saved_pc+0x2c/0x38
[    7.960000] [<80104248>] get_wchan+0x48/0xac

The disassembly of thread_saved_pc points to the following:
000006d0 <thread_saved_pc>:
 6d0:   8c820208        lw      v0,520(a0)
 6d4:   3c030000        lui     v1,0x0
 6d8:   24630000        addiu   v1,v1,0
 6dc:   10430008        beq     v0,v1,700 <thread_saved_pc+0x30>
 6e0:   00000000        nop
 6e4:   3c020000        lui     v0,0x0
 6e8:   8c43000c        lw      v1,12(v0)
 6ec:   04620004        bltzl   v1,700 <thread_saved_pc+0x30>
 6f0:   00001021        move    v0,zero
 6f4:   8c840200        lw      a0,512(a0)
 6f8:   00031080        sll     v0,v1,0x2
 6fc:   7c44100a        lwx     v0,a0(v0)   <------------
 700:   03e00008        jr      ra
 704:   00000000        nop

If we specifically disable -mdsp/-mdspr2 for arch/mips/kernel/process.o,
we get the following (non-crashing) assembly:

00000708 <thread_saved_pc>:
 708:   8c820208        lw      v0,520(a0)
 70c:   3c030000        lui     v1,0x0
 710:   24630000        addiu   v1,v1,0
 714:   10430009        beq     v0,v1,73c <thread_saved_pc+0x34>
 718:   00000000        nop
 71c:   3c020000        lui     v0,0x0
 720:   8c42000c        lw      v0,12(v0)
 724:   04420005        bltzl   v0,73c <thread_saved_pc+0x34>
 728:   00001021        move    v0,zero
 72c:   8c830200        lw      v1,512(a0)
 730:   00021080        sll     v0,v0,0x2
 734:   00431021        addu    v0,v0,v1
 738:   8c420000        lw      v0,0(v0)
 73c:   03e00008        jr      ra
 740:   00000000        nop

The specific line that leads a different assembly being produced is:

unsigned long thread_saved_pc(struct task_struct *tsk)
...
	return ((unsigned long *)t->reg29)[schedule_mfi.pc_offset]; <---

The problem here is that the compiler was given the right to use DSP
instructions with the -mdsp / -mdspr2 command-line switches and
performed some optimization for us and used DSP ASE instructions where
we are not checking that the running CPU actually supports DSP ASE.

This patch fixes the issue by partially reverting commit 32a7ede for
arch/mips/kernel/Makefile in order to remove the -mdsp / -mdspr2
compiler command-line switches such that we are now guaranteed that the
compiler will not optimize using DSP ASE reserved instructions. We also
need to fixup the rddsp/wrdsp and m{t,h}{hi,lo}{0,1,2,3} macros in
arch/mips/include/asm/mipsregs.h to tell the assembler that we are going
to explicitely use DSP ASE reserved instructions. The comment in
arch/mips/kernel/Makefile is also updated to reflect that.

Acked-by: Steven J. Hill <Steven.Hill@imgtec.com>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
---
Changes since v1:
- updated comment in arch/mips/kernel/Makefile
- added Steven's Acked-by tag

 arch/mips/include/asm/mipsregs.h |  209 ++++++++++++++++++++++++++++++++++----
 arch/mips/kernel/Makefile        |   25 ++---
 2 files changed, 196 insertions(+), 38 deletions(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 12b70c2..0da44d4 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -1166,7 +1166,10 @@ do {									\
 	unsigned int __dspctl;						\
 									\
 	__asm__ __volatile__(						\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
 	"	rddsp	%0, %x1					\n"	\
+	"	.set pop					\n"	\
 	: "=r" (__dspctl)						\
 	: "i" (mask));							\
 	__dspctl;							\
@@ -1175,30 +1178,198 @@ do {									\
 #define wrdsp(val, mask)						\
 do {									\
 	__asm__ __volatile__(						\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
 	"	wrdsp	%0, %x1					\n"	\
+	"	.set pop					\n"	\
 	:								\
 	: "r" (val), "i" (mask));					\
 } while (0)
 
-#define mflo0() ({ long mflo0; __asm__("mflo %0, $ac0" : "=r" (mflo0)); mflo0;})
-#define mflo1() ({ long mflo1; __asm__("mflo %0, $ac1" : "=r" (mflo1)); mflo1;})
-#define mflo2() ({ long mflo2; __asm__("mflo %0, $ac2" : "=r" (mflo2)); mflo2;})
-#define mflo3() ({ long mflo3; __asm__("mflo %0, $ac3" : "=r" (mflo3)); mflo3;})
-
-#define mfhi0() ({ long mfhi0; __asm__("mfhi %0, $ac0" : "=r" (mfhi0)); mfhi0;})
-#define mfhi1() ({ long mfhi1; __asm__("mfhi %0, $ac1" : "=r" (mfhi1)); mfhi1;})
-#define mfhi2() ({ long mfhi2; __asm__("mfhi %0, $ac2" : "=r" (mfhi2)); mfhi2;})
-#define mfhi3() ({ long mfhi3; __asm__("mfhi %0, $ac3" : "=r" (mfhi3)); mfhi3;})
-
-#define mtlo0(x) __asm__("mtlo %0, $ac0" ::"r" (x))
-#define mtlo1(x) __asm__("mtlo %0, $ac1" ::"r" (x))
-#define mtlo2(x) __asm__("mtlo %0, $ac2" ::"r" (x))
-#define mtlo3(x) __asm__("mtlo %0, $ac3" ::"r" (x))
-
-#define mthi0(x) __asm__("mthi %0, $ac0" ::"r" (x))
-#define mthi1(x) __asm__("mthi %0, $ac1" ::"r" (x))
-#define mthi2(x) __asm__("mthi %0, $ac2" ::"r" (x))
-#define mthi3(x) __asm__("mthi %0, $ac3" ::"r" (x))
+#define mflo0()								\
+({									\
+	long mflo0;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac0					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo0)); 						\
+	mflo0;								\
+})
+
+#define mflo1()								\
+({									\
+	long mflo1;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac1					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo1)); 						\
+	mflo1;								\
+})
+
+#define mflo2()								\
+({									\
+	long mflo2;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac2					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo2)); 						\
+	mflo2;								\
+})
+
+#define mflo3()								\
+({									\
+	long mflo3;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mflo %0, $ac3					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mflo3)); 						\
+	mflo3;								\
+})
+
+#define mfhi0()								\
+({									\
+	long mfhi0;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac0					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi0)); 						\
+	mfhi0;								\
+})
+
+#define mfhi1()								\
+({									\
+	long mfhi1;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac1					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi1)); 						\
+	mfhi1;								\
+})
+
+#define mfhi2()								\
+({									\
+	long mfhi2;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac2					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi2)); 						\
+	mfhi2;								\
+})
+
+#define mfhi3()								\
+({									\
+	long mfhi3;							\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mfhi %0, $ac3					\n"	\
+	"	.set pop					\n" 	\
+	: "=r" (mfhi3)); 						\
+	mfhi3;								\
+})
+
+
+#define mtlo0(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac0					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mtlo1(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac1					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mtlo2(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac2					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mtlo3(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mtlo %0, $ac3					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi0(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac0					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi1(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac1					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi2(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac2					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
+
+#define mthi3(x)							\
+({									\
+	__asm__(							\
+	"	.set push					\n"	\
+	"	.set dsp					\n"	\
+	"	mthi %0, $ac3					\n"	\
+	"	.set pop					\n"	\
+	:								\
+	: "r" (x));							\
+})
 
 #else
 
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index f81d98f..de75fb5 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -100,29 +100,16 @@ obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event_mipsxx.o
 obj-$(CONFIG_JUMP_LABEL)	+= jump_label.o
 
 #
-# DSP ASE supported for MIPS32 or MIPS64 Release 2 cores only. It is safe
-# to enable DSP assembler support here even if the MIPS Release 2 CPU we
-# are targetting does not support DSP because all code-paths making use of
-# it properly check that the running CPU *actually does* support these
-# instructions.
+# DSP ASE supported for MIPS32 or MIPS64 Release 2 cores only. It is not
+# safe to unconditionnaly use the assembler -mdsp / -mdspr2 switches
+# here because the compiler may use DSP ASE instructions (such as lwx) in
+# code paths where we cannot check that the CPU we are running on supports it.
+# Proper abstraction using HAVE_AS_DSP and macros is done in
+# arch/mips/include/asm/mipsregs.h.
 #
 ifeq ($(CONFIG_CPU_MIPSR2), y)
 CFLAGS_DSP 			= -DHAVE_AS_DSP
 
-#
-# Check if assembler supports DSP ASE
-#
-ifeq ($(call cc-option-yn,-mdsp), y)
-CFLAGS_DSP			+= -mdsp
-endif
-
-#
-# Check if assembler supports DSP ASE Rev2
-#
-ifeq ($(call cc-option-yn,-mdspr2), y)
-CFLAGS_DSP			+= -mdspr2
-endif
-
 CFLAGS_signal.o			= $(CFLAGS_DSP)
 CFLAGS_signal32.o		= $(CFLAGS_DSP)
 CFLAGS_process.o		= $(CFLAGS_DSP)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-03-18 17:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-14 13:18 [PATCH] MIPS: fix code generation for non-DSP capable CPUs Florian Fainelli
2013-03-14 14:22 ` Steven J. Hill
2013-03-16 14:06   ` Florian Fainelli
2013-03-16 20:52 ` Steven J. Hill
2013-03-17 14:26   ` Florian Fainelli
2013-03-18 16:56 ` [PATCH v2] " Florian Fainelli

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.