* [PATCH 0/4] MIPS: MSA fixes
From: James Hogan @ 2016-04-15  9:07 UTC
  To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable

Here are some miscellaneous fixes for MSA (MIPS SIMD Architecture)
support:
1) Fix the MSA build with recent toolchains
2) Fix 32-bit pointer additions on 64-bit kernels with a non-MSA-capable
   toolchain
3) Fix the MSA + 64-bit + lockdep build failure due to large immediate
   offsets
4) Fix some MSA assembler warnings due to a missing .set fp=64

James Hogan (3):
  MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
  MIPS: Fix MSA assembly with big thread offsets
  MIPS: Fix MSA assembly warnings

Paul Burton (1):
  MIPS: Use copy_s.fmt rather than copy_u.fmt

 arch/mips/include/asm/asmmacro.h | 193 ++++++++++++++++++++++-----------------
 arch/mips/kernel/r4k_fpu.S       |  10 +-
 2 files changed, 113 insertions(+), 90 deletions(-)

Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org>
-- 
2.4.10


* [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
From: James Hogan @ 2016-04-15  9:07 UTC
  To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable

From: Paul Burton <paul.burton@imgtec.com>

In revision 1.12 of the MSA specification, the copy_u.w instruction has
been removed for MIPS32 and the copy_u.d instruction has been removed for
MIPS64. Newer toolchains (e.g. Codescape SDK essentials 2015.10) reject
these instructions with an error like this:

arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'

Since we always copy to the width of a GPR, sign extension and zero
extension give the same result, so simply use copy_s instead of copy_u
to fix this.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.1.x-
---
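Note (not part of the commit message): a minimal C sketch of why the
swap should be safe. It assumes copy_u zero-extends and copy_s
sign-extends the element into the GPR, and it models only that extension
step, not the instructions themselves. All helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of copy_u.w: zero-extend a 32-bit element to the GPR width. */
uint32_t copy_u_w_gpr32(uint32_t elem) { return elem; }
uint64_t copy_u_w_gpr64(uint32_t elem) { return (uint64_t)elem; }

/*
 * Model of copy_s.w: sign-extend a 32-bit element to the GPR width.
 * The uint32_t -> int32_t cast assumes the usual two's complement
 * conversion.
 */
uint32_t copy_s_w_gpr32(uint32_t elem) { return (uint32_t)(int32_t)elem; }
uint64_t copy_s_w_gpr64(uint32_t elem)
{
	return (uint64_t)(int64_t)(int32_t)elem;
}

void check_copy_extension(void)
{
	uint32_t elem = 0x80000000u;	/* top bit set, so extension matters */

	/* Element as wide as the GPR: nothing to extend, results match. */
	assert(copy_u_w_gpr32(elem) == copy_s_w_gpr32(elem));

	/* Element narrower than the GPR: the two variants differ. */
	assert(copy_u_w_gpr64(elem) == 0x0000000080000000ull);
	assert(copy_s_w_gpr64(elem) == 0xffffffff80000000ull);
}
```

The second case is why the width matters: the patch only ever uses the
.w form on 32-bit kernels and the .d form on 64-bit kernels, so the
element always fills the whole GPR.
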
 arch/mips/include/asm/asmmacro.h | 24 ++++++++++++------------
 arch/mips/kernel/r4k_fpu.S       | 10 +++++-----
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index 867f924b05c7..b99b38862fcb 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -298,21 +298,21 @@
 	.set	pop
 	.endm
 
-	.macro	copy_u_w	ws, n
+	.macro	copy_s_w	ws, n
 	.set	push
 	.set	mips32r2
 	.set	fp=64
 	.set	msa
-	copy_u.w $1, $w\ws[\n]
+	copy_s.w $1, $w\ws[\n]
 	.set	pop
 	.endm
 
-	.macro	copy_u_d	ws, n
+	.macro	copy_s_d	ws, n
 	.set	push
 	.set	mips64r2
 	.set	fp=64
 	.set	msa
-	copy_u.d $1, $w\ws[\n]
+	copy_s.d $1, $w\ws[\n]
 	.set	pop
 	.endm
 
@@ -346,8 +346,8 @@
 #define STH_MSA_INSN		0x5800081f
 #define STW_MSA_INSN		0x5800082f
 #define STD_MSA_INSN		0x5800083f
-#define COPY_UW_MSA_INSN	0x58f00056
-#define COPY_UD_MSA_INSN	0x58f80056
+#define COPY_SW_MSA_INSN	0x58b00056
+#define COPY_SD_MSA_INSN	0x58b80056
 #define INSERT_W_MSA_INSN	0x59300816
 #define INSERT_D_MSA_INSN	0x59380816
 #else
@@ -361,8 +361,8 @@
 #define STH_MSA_INSN		0x78000825
 #define STW_MSA_INSN		0x78000826
 #define STD_MSA_INSN		0x78000827
-#define COPY_UW_MSA_INSN	0x78f00059
-#define COPY_UD_MSA_INSN	0x78f80059
+#define COPY_SW_MSA_INSN	0x78b00059
+#define COPY_SD_MSA_INSN	0x78b80059
 #define INSERT_W_MSA_INSN	0x79300819
 #define INSERT_D_MSA_INSN	0x79380819
 #endif
@@ -461,21 +461,21 @@
 	.set	pop
 	.endm
 
-	.macro	copy_u_w	ws, n
+	.macro	copy_s_w	ws, n
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
 	.insn
-	.word	COPY_UW_MSA_INSN | (\n << 16) | (\ws << 11)
+	.word	COPY_SW_MSA_INSN | (\n << 16) | (\ws << 11)
 	.set	pop
 	.endm
 
-	.macro	copy_u_d	ws, n
+	.macro	copy_s_d	ws, n
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
 	.insn
-	.word	COPY_UD_MSA_INSN | (\n << 16) | (\ws << 11)
+	.word	COPY_SD_MSA_INSN | (\n << 16) | (\ws << 11)
 	.set	pop
 	.endm
 
diff --git a/arch/mips/kernel/r4k_fpu.S b/arch/mips/kernel/r4k_fpu.S
index 17732f876eff..56d86b09c917 100644
--- a/arch/mips/kernel/r4k_fpu.S
+++ b/arch/mips/kernel/r4k_fpu.S
@@ -244,17 +244,17 @@ LEAF(\name)
 	.set	push
 	.set	noat
 #ifdef CONFIG_64BIT
-	copy_u_d \wr, 1
+	copy_s_d \wr, 1
 	EX sd	$1, \off(\base)
 #elif defined(CONFIG_CPU_LITTLE_ENDIAN)
-	copy_u_w \wr, 2
+	copy_s_w \wr, 2
 	EX sw	$1, \off(\base)
-	copy_u_w \wr, 3
+	copy_s_w \wr, 3
 	EX sw	$1, (\off+4)(\base)
 #else /* CONFIG_CPU_BIG_ENDIAN */
-	copy_u_w \wr, 2
+	copy_s_w \wr, 2
 	EX sw	$1, (\off+4)(\base)
-	copy_u_w \wr, 3
+	copy_s_w \wr, 3
 	EX sw	$1, \off(\base)
 #endif
 	.set	pop
-- 
2.4.10
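
Note for readers: the fallback macros for toolchains without MSA support
build the instruction word by hand. The sketch below mirrors the
".word COPY_SW_MSA_INSN | (\n << 16) | (\ws << 11)" expression from the
patch in C, using the second (#else) set of opcode constants. The function
names are hypothetical, and nothing is assumed about the field layout
beyond what that expression shows.

```c
#include <assert.h>
#include <stdint.h>

/* Base opcodes copied from the #else branch of the patch. */
#define COPY_UW_MSA_INSN	0x78f00059u	/* removed by the patch */
#define COPY_SW_MSA_INSN	0x78b00059u	/* added by the patch */

/* Mirrors: .word COPY_SW_MSA_INSN | (\n << 16) | (\ws << 11) */
uint32_t encode_copy_s_w(unsigned int ws, unsigned int n)
{
	return COPY_SW_MSA_INSN | ((uint32_t)n << 16) | ((uint32_t)ws << 11);
}

void check_copy_encoding(void)
{
	/* Operands from the error message: copy_?.w $1, $w26[3] */
	assert(encode_copy_s_w(26, 3) == 0x78b3d059u);

	/* The old and new base opcodes differ in a single bit (bit 22). */
	assert((COPY_UW_MSA_INSN ^ COPY_SW_MSA_INSN) == 0x00400000u);
}
```
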


* [PATCH 2/4] MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
From: James Hogan @ 2016-04-15  9:07 UTC
  To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable

The MSA ld_*/st_* assembler macros used when the toolchain doesn't
support MSA offset the base address with addu. However, the base is a
virtual memory pointer, so use PTR_ADDU instead, which expands to daddu
on 64-bit kernels.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.3.y-
---
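Note (not part of the commit message): a rough C model of the difference.
It assumes addu behaves as a 32-bit add with a sign-extended result;
architecturally the result is unpredictable when the operands are not
sign-extended 32-bit values, so this shows one plausible outcome only.
The function names and both addresses are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified addu on a 64-bit CPU: 32-bit add, sign-extended result. */
uint64_t model_addu(uint64_t base, uint64_t off)
{
	uint32_t sum = (uint32_t)base + (uint32_t)off;

	return (uint64_t)(int64_t)(int32_t)sum;
}

/* daddu: a full 64-bit add, what PTR_ADDU expands to on 64-bit kernels. */
uint64_t model_daddu(uint64_t base, uint64_t off)
{
	return base + off;
}

void check_ptr_addu(void)
{
	/* Illustrative addresses only. */
	uint64_t compat = 0xffffffff80100000ull; /* sign-extended 32-bit */
	uint64_t wide   = 0x9800000001000000ull; /* needs all 64 bits */

	/* A sign-extended 32-bit pointer happens to survive addu ... */
	assert(model_addu(compat, 0x40) == model_daddu(compat, 0x40));

	/* ... but a true 64-bit pointer loses its upper half. */
	assert(model_addu(wide, 0x40) == 0x0000000001000040ull);
	assert(model_daddu(wide, 0x40) == 0x9800000001000040ull);
}
```
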
 arch/mips/include/asm/asmmacro.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index b99b38862fcb..e689b894353c 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -393,7 +393,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	LDB_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -402,7 +402,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	LDH_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -411,7 +411,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	LDW_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -420,7 +420,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	LDD_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -429,7 +429,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	STB_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -438,7 +438,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	STH_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -447,7 +447,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	STW_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
@@ -456,7 +456,7 @@
 	.set	push
 	.set	noat
 	SET_HARDFLOAT
-	addu	$1, \base, \off
+	PTR_ADDU $1, \base, \off
 	.word	STD_MSA_INSN | (\wd << 6)
 	.set	pop
 	.endm
-- 
2.4.10


* [PATCH 3/4] MIPS: Fix MSA assembly with big thread offsets
From: James Hogan @ 2016-04-15  9:07 UTC
  To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan

When lockdep is enabled on a 64-bit kernel, the FPR offsets into the
thread structure exceed the maximum range of the MSA ld.d/st.d
instructions. For example THREAD_FPR31 = 4644 (instead of 2448), while
the signed immediate field is only 10 bits with an implicit multiply by
8, giving a maximum offset of 511*8 = 4088.

This isn't a problem when the toolchain doesn't support MSA, as the
ld_*/st_* macros perform the addition separately into $1 with [d]addiu,
which has a 16-bit signed immediate field.

Fix the case where the toolchain does support MSA by doing a single
addition of THREAD_FPR0 into $1 with [d]addiu, and doing the ld_*/st_*
relative to that.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
---
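Note (not part of the commit message): the range arithmetic above can be
checked with a small C sketch. The helper names are hypothetical, and the
16-byte spacing between consecutive FPR slots is an assumption (one
128-bit MSA register per slot), not something stated in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* ld.d/st.d: signed 10-bit immediate, implicitly multiplied by 8. */
#define MSA_D_IMM_MIN	(-512)
#define MSA_D_IMM_MAX	511
#define MSA_D_SCALE	8

/* ASSUMPTION: each FPR slot is 16 bytes (one 128-bit MSA register). */
#define ASSUMED_FPR_STRIDE	16

/* True if a byte offset can be encoded directly in ld.d/st.d. */
bool msa_d_offset_fits(long off)
{
	if (off < MSA_D_IMM_MIN * MSA_D_SCALE ||
	    off > MSA_D_IMM_MAX * MSA_D_SCALE)
		return false;
	return off % MSA_D_SCALE == 0;
}

/* True if a value fits the 16-bit signed immediate of [d]addiu. */
bool addiu_imm_fits(long imm)
{
	return imm >= -32768 && imm <= 32767;
}

void check_offsets(void)
{
	assert(msa_d_offset_fits(4088));	/* 511 * 8, the largest offset */
	assert(msa_d_offset_fits(2448));	/* THREAD_FPR31 without lockdep */
	assert(!msa_d_offset_fits(4644));	/* THREAD_FPR31 with lockdep */
	assert(addiu_imm_fits(4644));		/* still fine for [d]addiu */

	/* Relative to THREAD_FPR0, FPR31 is only 31 slots away. */
	assert(msa_d_offset_fits(31 * ASSUMED_FPR_STRIDE));
}
```

Under that assumption the largest offset left after rebasing onto
THREAD_FPR0 is 496 bytes, well inside the ld.d/st.d range however far
the FPR array sits from the start of the thread structure.
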
 arch/mips/include/asm/asmmacro.h | 147 ++++++++++++++++++++++-----------------
 1 file changed, 82 insertions(+), 65 deletions(-)

diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index e689b894353c..637fccab5604 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -496,41 +496,52 @@
 	.endm
 #endif
 
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+#define FPR_BASE_OFFS	THREAD_FPR0
+#define FPR_BASE	$1
+#else
+#define FPR_BASE_OFFS	0
+#define FPR_BASE	\thread
+#endif
+
 	.macro	msa_save_all	thread
-	st_d	0, THREAD_FPR0, \thread
-	st_d	1, THREAD_FPR1, \thread
-	st_d	2, THREAD_FPR2, \thread
-	st_d	3, THREAD_FPR3, \thread
-	st_d	4, THREAD_FPR4, \thread
-	st_d	5, THREAD_FPR5, \thread
-	st_d	6, THREAD_FPR6, \thread
-	st_d	7, THREAD_FPR7, \thread
-	st_d	8, THREAD_FPR8, \thread
-	st_d	9, THREAD_FPR9, \thread
-	st_d	10, THREAD_FPR10, \thread
-	st_d	11, THREAD_FPR11, \thread
-	st_d	12, THREAD_FPR12, \thread
-	st_d	13, THREAD_FPR13, \thread
-	st_d	14, THREAD_FPR14, \thread
-	st_d	15, THREAD_FPR15, \thread
-	st_d	16, THREAD_FPR16, \thread
-	st_d	17, THREAD_FPR17, \thread
-	st_d	18, THREAD_FPR18, \thread
-	st_d	19, THREAD_FPR19, \thread
-	st_d	20, THREAD_FPR20, \thread
-	st_d	21, THREAD_FPR21, \thread
-	st_d	22, THREAD_FPR22, \thread
-	st_d	23, THREAD_FPR23, \thread
-	st_d	24, THREAD_FPR24, \thread
-	st_d	25, THREAD_FPR25, \thread
-	st_d	26, THREAD_FPR26, \thread
-	st_d	27, THREAD_FPR27, \thread
-	st_d	28, THREAD_FPR28, \thread
-	st_d	29, THREAD_FPR29, \thread
-	st_d	30, THREAD_FPR30, \thread
-	st_d	31, THREAD_FPR31, \thread
 	.set	push
 	.set	noat
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+	PTR_ADDU FPR_BASE, \thread, FPR_BASE_OFFS
+#endif
+	st_d	 0, THREAD_FPR0  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 1, THREAD_FPR1  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 2, THREAD_FPR2  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 3, THREAD_FPR3  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 4, THREAD_FPR4  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 5, THREAD_FPR5  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 6, THREAD_FPR6  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 7, THREAD_FPR7  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 8, THREAD_FPR8  - FPR_BASE_OFFS, FPR_BASE
+	st_d	 9, THREAD_FPR9  - FPR_BASE_OFFS, FPR_BASE
+	st_d	10, THREAD_FPR10 - FPR_BASE_OFFS, FPR_BASE
+	st_d	11, THREAD_FPR11 - FPR_BASE_OFFS, FPR_BASE
+	st_d	12, THREAD_FPR12 - FPR_BASE_OFFS, FPR_BASE
+	st_d	13, THREAD_FPR13 - FPR_BASE_OFFS, FPR_BASE
+	st_d	14, THREAD_FPR14 - FPR_BASE_OFFS, FPR_BASE
+	st_d	15, THREAD_FPR15 - FPR_BASE_OFFS, FPR_BASE
+	st_d	16, THREAD_FPR16 - FPR_BASE_OFFS, FPR_BASE
+	st_d	17, THREAD_FPR17 - FPR_BASE_OFFS, FPR_BASE
+	st_d	18, THREAD_FPR18 - FPR_BASE_OFFS, FPR_BASE
+	st_d	19, THREAD_FPR19 - FPR_BASE_OFFS, FPR_BASE
+	st_d	20, THREAD_FPR20 - FPR_BASE_OFFS, FPR_BASE
+	st_d	21, THREAD_FPR21 - FPR_BASE_OFFS, FPR_BASE
+	st_d	22, THREAD_FPR22 - FPR_BASE_OFFS, FPR_BASE
+	st_d	23, THREAD_FPR23 - FPR_BASE_OFFS, FPR_BASE
+	st_d	24, THREAD_FPR24 - FPR_BASE_OFFS, FPR_BASE
+	st_d	25, THREAD_FPR25 - FPR_BASE_OFFS, FPR_BASE
+	st_d	26, THREAD_FPR26 - FPR_BASE_OFFS, FPR_BASE
+	st_d	27, THREAD_FPR27 - FPR_BASE_OFFS, FPR_BASE
+	st_d	28, THREAD_FPR28 - FPR_BASE_OFFS, FPR_BASE
+	st_d	29, THREAD_FPR29 - FPR_BASE_OFFS, FPR_BASE
+	st_d	30, THREAD_FPR30 - FPR_BASE_OFFS, FPR_BASE
+	st_d	31, THREAD_FPR31 - FPR_BASE_OFFS, FPR_BASE
 	SET_HARDFLOAT
 	_cfcmsa	$1, MSA_CSR
 	sw	$1, THREAD_MSA_CSR(\thread)
@@ -543,41 +554,47 @@
 	SET_HARDFLOAT
 	lw	$1, THREAD_MSA_CSR(\thread)
 	_ctcmsa	MSA_CSR, $1
-	.set	pop
-	ld_d	0, THREAD_FPR0, \thread
-	ld_d	1, THREAD_FPR1, \thread
-	ld_d	2, THREAD_FPR2, \thread
-	ld_d	3, THREAD_FPR3, \thread
-	ld_d	4, THREAD_FPR4, \thread
-	ld_d	5, THREAD_FPR5, \thread
-	ld_d	6, THREAD_FPR6, \thread
-	ld_d	7, THREAD_FPR7, \thread
-	ld_d	8, THREAD_FPR8, \thread
-	ld_d	9, THREAD_FPR9, \thread
-	ld_d	10, THREAD_FPR10, \thread
-	ld_d	11, THREAD_FPR11, \thread
-	ld_d	12, THREAD_FPR12, \thread
-	ld_d	13, THREAD_FPR13, \thread
-	ld_d	14, THREAD_FPR14, \thread
-	ld_d	15, THREAD_FPR15, \thread
-	ld_d	16, THREAD_FPR16, \thread
-	ld_d	17, THREAD_FPR17, \thread
-	ld_d	18, THREAD_FPR18, \thread
-	ld_d	19, THREAD_FPR19, \thread
-	ld_d	20, THREAD_FPR20, \thread
-	ld_d	21, THREAD_FPR21, \thread
-	ld_d	22, THREAD_FPR22, \thread
-	ld_d	23, THREAD_FPR23, \thread
-	ld_d	24, THREAD_FPR24, \thread
-	ld_d	25, THREAD_FPR25, \thread
-	ld_d	26, THREAD_FPR26, \thread
-	ld_d	27, THREAD_FPR27, \thread
-	ld_d	28, THREAD_FPR28, \thread
-	ld_d	29, THREAD_FPR29, \thread
-	ld_d	30, THREAD_FPR30, \thread
-	ld_d	31, THREAD_FPR31, \thread
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+	PTR_ADDU FPR_BASE, \thread, FPR_BASE_OFFS
+#endif
+	ld_d	 0, THREAD_FPR0  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 1, THREAD_FPR1  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 2, THREAD_FPR2  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 3, THREAD_FPR3  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 4, THREAD_FPR4  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 5, THREAD_FPR5  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 6, THREAD_FPR6  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 7, THREAD_FPR7  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 8, THREAD_FPR8  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	 9, THREAD_FPR9  - FPR_BASE_OFFS, FPR_BASE
+	ld_d	10, THREAD_FPR10 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	11, THREAD_FPR11 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	12, THREAD_FPR12 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	13, THREAD_FPR13 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	14, THREAD_FPR14 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	15, THREAD_FPR15 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	16, THREAD_FPR16 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	17, THREAD_FPR17 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	18, THREAD_FPR18 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	19, THREAD_FPR19 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	20, THREAD_FPR20 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	21, THREAD_FPR21 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	22, THREAD_FPR22 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	23, THREAD_FPR23 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	24, THREAD_FPR24 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	25, THREAD_FPR25 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	26, THREAD_FPR26 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	27, THREAD_FPR27 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	28, THREAD_FPR28 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	29, THREAD_FPR29 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	30, THREAD_FPR30 - FPR_BASE_OFFS, FPR_BASE
+	ld_d	31, THREAD_FPR31 - FPR_BASE_OFFS, FPR_BASE
+	.set pop
 	.endm
 
+#undef FPR_BASE_OFFS
+#undef FPR_BASE
+
 	.macro	msa_init_upper wd
 #ifdef CONFIG_64BIT
 	insert_d \wd, 1
-- 
2.4.10

* [PATCH 4/4] MIPS: Fix MSA assembly warnings
From: James Hogan @ 2016-04-15  9:07 UTC
  To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan

Building an MSA-capable kernel with a toolchain that supports MSA
produces warnings such as this:

arch/mips/kernel/r4k_fpu.S:229: Warning: the `msa' extension requires 64-bit FPRs

This is due to ".set msa" being used without ".set fp=64" in the
non-doubleword MSA load/store macros; MSA requires the 64-bit FPU
registers (FR=1). Add the missing fp=64 to these macros to silence the
warnings.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
---
 arch/mips/include/asm/asmmacro.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index 637fccab5604..6741673c92ca 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -235,6 +235,7 @@
 	.macro	ld_b	wd, off, base
 	.set	push
 	.set	mips32r2
+	.set	fp=64
 	.set	msa
 	ld.b	$w\wd, \off(\base)
 	.set	pop
@@ -243,6 +244,7 @@
 	.macro	ld_h	wd, off, base
 	.set	push
 	.set	mips32r2
+	.set	fp=64
 	.set	msa
 	ld.h	$w\wd, \off(\base)
 	.set	pop
@@ -251,6 +253,7 @@
 	.macro	ld_w	wd, off, base
 	.set	push
 	.set	mips32r2
+	.set	fp=64
 	.set	msa
 	ld.w	$w\wd, \off(\base)
 	.set	pop
@@ -268,6 +271,7 @@
 	.macro	st_b	wd, off, base
 	.set	push
 	.set	mips32r2
+	.set	fp=64
 	.set	msa
 	st.b	$w\wd, \off(\base)
 	.set	pop
@@ -276,6 +280,7 @@
 	.macro	st_h	wd, off, base
 	.set	push
 	.set	mips32r2
+	.set	fp=64
 	.set	msa
 	st.h	$w\wd, \off(\base)
 	.set	pop
@@ -284,6 +289,7 @@
 	.macro	st_w	wd, off, base
 	.set	push
 	.set	mips32r2
+	.set	fp=64
 	.set	msa
 	st.w	$w\wd, \off(\base)
 	.set	pop
-- 
2.4.10

* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
  2016-04-15  9:07   ` James Hogan
  (?)
@ 2016-04-15  9:59   ` Ralf Baechle
  2016-04-15 10:15       ` James Hogan
  -1 siblings, 1 reply; 15+ messages in thread
From: Ralf Baechle @ 2016-04-15  9:59 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Paul Burton, stable

On Fri, Apr 15, 2016 at 10:07:23AM +0100, James Hogan wrote:

> From: Paul Burton <paul.burton@imgtec.com>
> 
> In revision 1.12 of the MSA specification, the copy_u.w instruction has
> been removed for MIPS32 & the copy_u.d instruction has been removed for
> MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
> complain about this like so:
> 
> arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
> processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
> 
> Since we always copy to the width of a GPR, simply use copy_s instead of
> copy_u to fix this.
> 
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: linux-mips@linux-mips.org
> Cc: <stable@vger.kernel.org> # 4.1.x-

Looking good but seems to apply only to 4.3+

  Ralf
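
The argument in the quoted commit message, that copy_s can stand in for
copy_u when the element is as wide as the destination GPR, can be sketched
in C. This is a minimal model and not kernel code: extend_to_gpr32(),
copies_agree() and check_copy_widths() are hypothetical names, and the
32-bit register width is an assumption chosen to mirror the MIPS32 case.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of copying a vector element into a 32-bit GPR.
 * The low `width` bits of `elem` (width must be 1..32) are either
 * sign-extended (copy_s style) or zero-extended (copy_u style).
 */
uint32_t extend_to_gpr32(uint32_t elem, unsigned width, int is_signed)
{
    uint32_t mask = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);
    uint32_t value = elem & mask;

    /* Only a narrower element has upper bits left to fill. */
    if (is_signed && width < 32 && ((value >> (width - 1)) & 1u))
        value |= ~mask; /* replicate the sign bit into the upper bits */

    return value;
}

/* Returns 1 when the signed and unsigned copies give the same GPR value. */
int copies_agree(uint32_t elem, unsigned width)
{
    return extend_to_gpr32(elem, width, 1) == extend_to_gpr32(elem, width, 0);
}

/* Small self-check of the claim for full-width and narrower elements. */
void check_copy_widths(void)
{
    assert(copies_agree(0x80000001u, 32)); /* element fills the GPR */
    assert(!copies_agree(0x8001u, 16));    /* narrower, top bit set */
    assert(copies_agree(0x7fffu, 16));     /* narrower, top bit clear */
}
```

At a width of 32 both variants return the element unchanged, which is why
the swap is safe there; at 16 bits they diverge whenever the top bit of the
element is set.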

* Re: [PATCH 0/4] MIPS: MSA fixes
  2016-04-15  9:07 ` James Hogan
                   ` (4 preceding siblings ...)
  (?)
@ 2016-04-15 10:11 ` Ralf Baechle
  -1 siblings, 0 replies; 15+ messages in thread
From: Ralf Baechle @ 2016-04-15 10:11 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Paul Burton, stable

On Fri, Apr 15, 2016 at 10:07:22AM +0100, James Hogan wrote:

> Here are some miscellaneous fixes for MSA (MIPS SIMD Architecture)
> support:
> 1) Fix MSA build with recent toolchains
> 2) Fix 32-bit pointer additions on 64-bit with non-MSA capable
>    toolchain.
> 3) Fix MSA + 64-bit + lockdep build due to large immediate offsets
> 4) Fix some MSA assembler warnings due to missing .set fp=64
> 
> James Hogan (3):
>   MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
>   MIPS: Fix MSA assembly with big thread offsets
>   MIPS: Fix MSA assembly warnings
> 
> Paul Burton (1):
>   MIPS: Use copy_s.fmt rather than copy_u.fmt

Thanks, whole series applied.

  Ralf

* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
@ 2016-04-15 10:15       ` James Hogan
  0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 10:15 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips, Paul Burton, stable

On Fri, Apr 15, 2016 at 11:59:41AM +0200, Ralf Baechle wrote:
> On Fri, Apr 15, 2016 at 10:07:23AM +0100, James Hogan wrote:
> 
> > From: Paul Burton <paul.burton@imgtec.com>
> > 
> > In revision 1.12 of the MSA specification, the copy_u.w instruction has
> > been removed for MIPS32 & the copy_u.d instruction has been removed for
> > MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
> > complain about this like so:
> > 
> > arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
> > processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
> > 
> > Since we always copy to the width of a GPR, simply use copy_s instead of
> > copy_u to fix this.
> > 
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Signed-off-by: James Hogan <james.hogan@imgtec.com>
> > Cc: Ralf Baechle <ralf@linux-mips.org>
> > Cc: linux-mips@linux-mips.org
> > Cc: <stable@vger.kernel.org> # 4.1.x-
> 
> Looking good but seems to apply only to 4.3+
> 
>   Ralf

Yes, sorry. Without bf82cb30c7e58b3a9742f0a45962ebdf51befac7 I figured
the changes in r4k_fpu.S can be easily skipped, but actually I should
have looked deeper as the macros aren't even used until that commit.

Could you change the stable tag to 4.3 please.

Thanks!
James

* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
  2016-04-15 10:15       ` James Hogan
  (?)
@ 2016-04-15 10:28       ` Ralf Baechle
  -1 siblings, 0 replies; 15+ messages in thread
From: Ralf Baechle @ 2016-04-15 10:28 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Paul Burton, stable

On Fri, Apr 15, 2016 at 11:15:36AM +0100, James Hogan wrote:

> Could you change the stable tag to 4.3 please.

Done, but I will push the new tree in an hour or two so as not to disturb
buildbot, which doesn't like rapid pushes.

  Ralf

Thread overview: 15+ messages
2016-04-15  9:07 [PATCH 0/4] MIPS: MSA fixes James Hogan
2016-04-15  9:07 ` James Hogan
2016-04-15  9:07 ` [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt James Hogan
2016-04-15  9:07   ` James Hogan
2016-04-15  9:59   ` Ralf Baechle
2016-04-15 10:15     ` James Hogan
2016-04-15 10:15       ` James Hogan
2016-04-15 10:28       ` Ralf Baechle
2016-04-15  9:07 ` [PATCH 2/4] MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU James Hogan
2016-04-15  9:07   ` James Hogan
2016-04-15  9:07 ` [PATCH 3/4] MIPS: Fix MSA assembly with big thread offsets James Hogan
2016-04-15  9:07   ` James Hogan
2016-04-15  9:07 ` [PATCH 4/4] MIPS: Fix MSA assembly warnings James Hogan
2016-04-15  9:07   ` James Hogan
2016-04-15 10:11 ` [PATCH 0/4] MIPS: MSA fixes Ralf Baechle
