* [PATCH 0/4] MIPS: MSA fixes
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable
Here are some miscellaneous fixes for MSA (MIPS SIMD Architecture)
support:
1) Fix MSA build with recent toolchains
2) Fix 32-bit pointer additions on 64-bit with non-MSA capable
toolchain.
3) Fix MSA + 64-bit + lockdep build due to large immediate offsets
4) Fix some MSA assembler warnings due to missing .set fp=64
James Hogan (3):
MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
MIPS: Fix MSA assembly with big thread offsets
MIPS: Fix MSA assembly warnings
Paul Burton (1):
MIPS: Use copy_s.fmt rather than copy_u.fmt
arch/mips/include/asm/asmmacro.h | 193 ++++++++++++++++++++++-----------------
arch/mips/kernel/r4k_fpu.S | 10 +-
2 files changed, 113 insertions(+), 90 deletions(-)
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org>
--
2.4.10
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 0/4] MIPS: MSA fixes
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable
Here are some miscellaneous fixes for MSA (MIPS SIMD Architecture)
support:
1) Fix MSA build with recent toolchains
2) Fix 32-bit pointer additions on 64-bit with non-MSA capable
toolchain.
3) Fix MSA + 64-bit + lockdep build due to large immediate offsets
4) Fix some MSA assembler warnings due to missing .set fp=64
James Hogan (3):
MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
MIPS: Fix MSA assembly with big thread offsets
MIPS: Fix MSA assembly warnings
Paul Burton (1):
MIPS: Use copy_s.fmt rather than copy_u.fmt
arch/mips/include/asm/asmmacro.h | 193 ++++++++++++++++++++++-----------------
arch/mips/kernel/r4k_fpu.S | 10 +-
2 files changed, 113 insertions(+), 90 deletions(-)
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org>
--
2.4.10
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable
From: Paul Burton <paul.burton@imgtec.com>
In revision 1.12 of the MSA specification, the copy_u.w instruction has
been removed for MIPS32 & the copy_u.d instruction has been removed for
MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
complain about this like so:
arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
Since we always copy to the width of a GPR, simply use copy_s instead of
copy_u to fix this.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.1.x-
---
arch/mips/include/asm/asmmacro.h | 24 ++++++++++++------------
arch/mips/kernel/r4k_fpu.S | 10 +++++-----
2 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index 867f924b05c7..b99b38862fcb 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -298,21 +298,21 @@
.set pop
.endm
- .macro copy_u_w ws, n
+ .macro copy_s_w ws, n
.set push
.set mips32r2
.set fp=64
.set msa
- copy_u.w $1, $w\ws[\n]
+ copy_s.w $1, $w\ws[\n]
.set pop
.endm
- .macro copy_u_d ws, n
+ .macro copy_s_d ws, n
.set push
.set mips64r2
.set fp=64
.set msa
- copy_u.d $1, $w\ws[\n]
+ copy_s.d $1, $w\ws[\n]
.set pop
.endm
@@ -346,8 +346,8 @@
#define STH_MSA_INSN 0x5800081f
#define STW_MSA_INSN 0x5800082f
#define STD_MSA_INSN 0x5800083f
-#define COPY_UW_MSA_INSN 0x58f00056
-#define COPY_UD_MSA_INSN 0x58f80056
+#define COPY_SW_MSA_INSN 0x58b00056
+#define COPY_SD_MSA_INSN 0x58b80056
#define INSERT_W_MSA_INSN 0x59300816
#define INSERT_D_MSA_INSN 0x59380816
#else
@@ -361,8 +361,8 @@
#define STH_MSA_INSN 0x78000825
#define STW_MSA_INSN 0x78000826
#define STD_MSA_INSN 0x78000827
-#define COPY_UW_MSA_INSN 0x78f00059
-#define COPY_UD_MSA_INSN 0x78f80059
+#define COPY_SW_MSA_INSN 0x78b00059
+#define COPY_SD_MSA_INSN 0x78b80059
#define INSERT_W_MSA_INSN 0x79300819
#define INSERT_D_MSA_INSN 0x79380819
#endif
@@ -461,21 +461,21 @@
.set pop
.endm
- .macro copy_u_w ws, n
+ .macro copy_s_w ws, n
.set push
.set noat
SET_HARDFLOAT
.insn
- .word COPY_UW_MSA_INSN | (\n << 16) | (\ws << 11)
+ .word COPY_SW_MSA_INSN | (\n << 16) | (\ws << 11)
.set pop
.endm
- .macro copy_u_d ws, n
+ .macro copy_s_d ws, n
.set push
.set noat
SET_HARDFLOAT
.insn
- .word COPY_UD_MSA_INSN | (\n << 16) | (\ws << 11)
+ .word COPY_SD_MSA_INSN | (\n << 16) | (\ws << 11)
.set pop
.endm
diff --git a/arch/mips/kernel/r4k_fpu.S b/arch/mips/kernel/r4k_fpu.S
index 17732f876eff..56d86b09c917 100644
--- a/arch/mips/kernel/r4k_fpu.S
+++ b/arch/mips/kernel/r4k_fpu.S
@@ -244,17 +244,17 @@ LEAF(\name)
.set push
.set noat
#ifdef CONFIG_64BIT
- copy_u_d \wr, 1
+ copy_s_d \wr, 1
EX sd $1, \off(\base)
#elif defined(CONFIG_CPU_LITTLE_ENDIAN)
- copy_u_w \wr, 2
+ copy_s_w \wr, 2
EX sw $1, \off(\base)
- copy_u_w \wr, 3
+ copy_s_w \wr, 3
EX sw $1, (\off+4)(\base)
#else /* CONFIG_CPU_BIG_ENDIAN */
- copy_u_w \wr, 2
+ copy_s_w \wr, 2
EX sw $1, (\off+4)(\base)
- copy_u_w \wr, 3
+ copy_s_w \wr, 3
EX sw $1, \off(\base)
#endif
.set pop
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable
From: Paul Burton <paul.burton@imgtec.com>
In revision 1.12 of the MSA specification, the copy_u.w instruction has
been removed for MIPS32 & the copy_u.d instruction has been removed for
MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
complain about this like so:
arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
Since we always copy to the width of a GPR, simply use copy_s instead of
copy_u to fix this.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.1.x-
---
arch/mips/include/asm/asmmacro.h | 24 ++++++++++++------------
arch/mips/kernel/r4k_fpu.S | 10 +++++-----
2 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index 867f924b05c7..b99b38862fcb 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -298,21 +298,21 @@
.set pop
.endm
- .macro copy_u_w ws, n
+ .macro copy_s_w ws, n
.set push
.set mips32r2
.set fp=64
.set msa
- copy_u.w $1, $w\ws[\n]
+ copy_s.w $1, $w\ws[\n]
.set pop
.endm
- .macro copy_u_d ws, n
+ .macro copy_s_d ws, n
.set push
.set mips64r2
.set fp=64
.set msa
- copy_u.d $1, $w\ws[\n]
+ copy_s.d $1, $w\ws[\n]
.set pop
.endm
@@ -346,8 +346,8 @@
#define STH_MSA_INSN 0x5800081f
#define STW_MSA_INSN 0x5800082f
#define STD_MSA_INSN 0x5800083f
-#define COPY_UW_MSA_INSN 0x58f00056
-#define COPY_UD_MSA_INSN 0x58f80056
+#define COPY_SW_MSA_INSN 0x58b00056
+#define COPY_SD_MSA_INSN 0x58b80056
#define INSERT_W_MSA_INSN 0x59300816
#define INSERT_D_MSA_INSN 0x59380816
#else
@@ -361,8 +361,8 @@
#define STH_MSA_INSN 0x78000825
#define STW_MSA_INSN 0x78000826
#define STD_MSA_INSN 0x78000827
-#define COPY_UW_MSA_INSN 0x78f00059
-#define COPY_UD_MSA_INSN 0x78f80059
+#define COPY_SW_MSA_INSN 0x78b00059
+#define COPY_SD_MSA_INSN 0x78b80059
#define INSERT_W_MSA_INSN 0x79300819
#define INSERT_D_MSA_INSN 0x79380819
#endif
@@ -461,21 +461,21 @@
.set pop
.endm
- .macro copy_u_w ws, n
+ .macro copy_s_w ws, n
.set push
.set noat
SET_HARDFLOAT
.insn
- .word COPY_UW_MSA_INSN | (\n << 16) | (\ws << 11)
+ .word COPY_SW_MSA_INSN | (\n << 16) | (\ws << 11)
.set pop
.endm
- .macro copy_u_d ws, n
+ .macro copy_s_d ws, n
.set push
.set noat
SET_HARDFLOAT
.insn
- .word COPY_UD_MSA_INSN | (\n << 16) | (\ws << 11)
+ .word COPY_SD_MSA_INSN | (\n << 16) | (\ws << 11)
.set pop
.endm
diff --git a/arch/mips/kernel/r4k_fpu.S b/arch/mips/kernel/r4k_fpu.S
index 17732f876eff..56d86b09c917 100644
--- a/arch/mips/kernel/r4k_fpu.S
+++ b/arch/mips/kernel/r4k_fpu.S
@@ -244,17 +244,17 @@ LEAF(\name)
.set push
.set noat
#ifdef CONFIG_64BIT
- copy_u_d \wr, 1
+ copy_s_d \wr, 1
EX sd $1, \off(\base)
#elif defined(CONFIG_CPU_LITTLE_ENDIAN)
- copy_u_w \wr, 2
+ copy_s_w \wr, 2
EX sw $1, \off(\base)
- copy_u_w \wr, 3
+ copy_s_w \wr, 3
EX sw $1, (\off+4)(\base)
#else /* CONFIG_CPU_BIG_ENDIAN */
- copy_u_w \wr, 2
+ copy_s_w \wr, 2
EX sw $1, (\off+4)(\base)
- copy_u_w \wr, 3
+ copy_s_w \wr, 3
EX sw $1, \off(\base)
#endif
.set pop
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/4] MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable
The MSA ld_*/st_* assembler macros for when the toolchain doesn't
support MSA use addu to offset the base address. However it is a virtual
memory pointer so fix it to use PTR_ADDU which expands to daddu for
64-bit kernels.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.3.y-
---
arch/mips/include/asm/asmmacro.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index b99b38862fcb..e689b894353c 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -393,7 +393,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDB_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -402,7 +402,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDH_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -411,7 +411,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDW_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -420,7 +420,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDD_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -429,7 +429,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STB_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -438,7 +438,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STH_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -447,7 +447,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STW_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -456,7 +456,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STD_MSA_INSN | (\wd << 6)
.set pop
.endm
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/4] MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan, stable
The MSA ld_*/st_* assembler macros for when the toolchain doesn't
support MSA use addu to offset the base address. However it is a virtual
memory pointer so fix it to use PTR_ADDU which expands to daddu for
64-bit kernels.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.3.y-
---
arch/mips/include/asm/asmmacro.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index b99b38862fcb..e689b894353c 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -393,7 +393,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDB_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -402,7 +402,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDH_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -411,7 +411,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDW_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -420,7 +420,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word LDD_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -429,7 +429,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STB_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -438,7 +438,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STH_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -447,7 +447,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STW_MSA_INSN | (\wd << 6)
.set pop
.endm
@@ -456,7 +456,7 @@
.set push
.set noat
SET_HARDFLOAT
- addu $1, \base, \off
+ PTR_ADDU $1, \base, \off
.word STD_MSA_INSN | (\wd << 6)
.set pop
.endm
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/4] MIPS: Fix MSA assembly with big thread offsets
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan
When lockdep is enabled on a 64-bit kernel the FPR offset into the
thread structure exceeds the maximum range of the MSA ld.d/st.d
instructions. For example THREAD_FPR31 = 4644 (instead of 2448), while
the signed immediate field is only 10 bits with an implicit multiply by
8, giving a maximum offset of 511*8 = 4088.
This isn't a problem when the toolchain doesn't support MSA as the
ld_*/st_* macros perform the addition separately into $1 with [d]addui
which has a 16bit signed immediate field.
Fix the case where the toolchain does support MSA by doing a single
addition of THREAD_FPR0 into $1 with [d]addui, and doing the ld_*/st_*
relative to that.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
---
arch/mips/include/asm/asmmacro.h | 147 ++++++++++++++++++++++-----------------
1 file changed, 82 insertions(+), 65 deletions(-)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index e689b894353c..637fccab5604 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -496,41 +496,52 @@
.endm
#endif
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+#define FPR_BASE_OFFS THREAD_FPR0
+#define FPR_BASE $1
+#else
+#define FPR_BASE_OFFS 0
+#define FPR_BASE \thread
+#endif
+
.macro msa_save_all thread
- st_d 0, THREAD_FPR0, \thread
- st_d 1, THREAD_FPR1, \thread
- st_d 2, THREAD_FPR2, \thread
- st_d 3, THREAD_FPR3, \thread
- st_d 4, THREAD_FPR4, \thread
- st_d 5, THREAD_FPR5, \thread
- st_d 6, THREAD_FPR6, \thread
- st_d 7, THREAD_FPR7, \thread
- st_d 8, THREAD_FPR8, \thread
- st_d 9, THREAD_FPR9, \thread
- st_d 10, THREAD_FPR10, \thread
- st_d 11, THREAD_FPR11, \thread
- st_d 12, THREAD_FPR12, \thread
- st_d 13, THREAD_FPR13, \thread
- st_d 14, THREAD_FPR14, \thread
- st_d 15, THREAD_FPR15, \thread
- st_d 16, THREAD_FPR16, \thread
- st_d 17, THREAD_FPR17, \thread
- st_d 18, THREAD_FPR18, \thread
- st_d 19, THREAD_FPR19, \thread
- st_d 20, THREAD_FPR20, \thread
- st_d 21, THREAD_FPR21, \thread
- st_d 22, THREAD_FPR22, \thread
- st_d 23, THREAD_FPR23, \thread
- st_d 24, THREAD_FPR24, \thread
- st_d 25, THREAD_FPR25, \thread
- st_d 26, THREAD_FPR26, \thread
- st_d 27, THREAD_FPR27, \thread
- st_d 28, THREAD_FPR28, \thread
- st_d 29, THREAD_FPR29, \thread
- st_d 30, THREAD_FPR30, \thread
- st_d 31, THREAD_FPR31, \thread
.set push
.set noat
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+ PTR_ADDU FPR_BASE, \thread, FPR_BASE_OFFS
+#endif
+ st_d 0, THREAD_FPR0 - FPR_BASE_OFFS, FPR_BASE
+ st_d 1, THREAD_FPR1 - FPR_BASE_OFFS, FPR_BASE
+ st_d 2, THREAD_FPR2 - FPR_BASE_OFFS, FPR_BASE
+ st_d 3, THREAD_FPR3 - FPR_BASE_OFFS, FPR_BASE
+ st_d 4, THREAD_FPR4 - FPR_BASE_OFFS, FPR_BASE
+ st_d 5, THREAD_FPR5 - FPR_BASE_OFFS, FPR_BASE
+ st_d 6, THREAD_FPR6 - FPR_BASE_OFFS, FPR_BASE
+ st_d 7, THREAD_FPR7 - FPR_BASE_OFFS, FPR_BASE
+ st_d 8, THREAD_FPR8 - FPR_BASE_OFFS, FPR_BASE
+ st_d 9, THREAD_FPR9 - FPR_BASE_OFFS, FPR_BASE
+ st_d 10, THREAD_FPR10 - FPR_BASE_OFFS, FPR_BASE
+ st_d 11, THREAD_FPR11 - FPR_BASE_OFFS, FPR_BASE
+ st_d 12, THREAD_FPR12 - FPR_BASE_OFFS, FPR_BASE
+ st_d 13, THREAD_FPR13 - FPR_BASE_OFFS, FPR_BASE
+ st_d 14, THREAD_FPR14 - FPR_BASE_OFFS, FPR_BASE
+ st_d 15, THREAD_FPR15 - FPR_BASE_OFFS, FPR_BASE
+ st_d 16, THREAD_FPR16 - FPR_BASE_OFFS, FPR_BASE
+ st_d 17, THREAD_FPR17 - FPR_BASE_OFFS, FPR_BASE
+ st_d 18, THREAD_FPR18 - FPR_BASE_OFFS, FPR_BASE
+ st_d 19, THREAD_FPR19 - FPR_BASE_OFFS, FPR_BASE
+ st_d 20, THREAD_FPR20 - FPR_BASE_OFFS, FPR_BASE
+ st_d 21, THREAD_FPR21 - FPR_BASE_OFFS, FPR_BASE
+ st_d 22, THREAD_FPR22 - FPR_BASE_OFFS, FPR_BASE
+ st_d 23, THREAD_FPR23 - FPR_BASE_OFFS, FPR_BASE
+ st_d 24, THREAD_FPR24 - FPR_BASE_OFFS, FPR_BASE
+ st_d 25, THREAD_FPR25 - FPR_BASE_OFFS, FPR_BASE
+ st_d 26, THREAD_FPR26 - FPR_BASE_OFFS, FPR_BASE
+ st_d 27, THREAD_FPR27 - FPR_BASE_OFFS, FPR_BASE
+ st_d 28, THREAD_FPR28 - FPR_BASE_OFFS, FPR_BASE
+ st_d 29, THREAD_FPR29 - FPR_BASE_OFFS, FPR_BASE
+ st_d 30, THREAD_FPR30 - FPR_BASE_OFFS, FPR_BASE
+ st_d 31, THREAD_FPR31 - FPR_BASE_OFFS, FPR_BASE
SET_HARDFLOAT
_cfcmsa $1, MSA_CSR
sw $1, THREAD_MSA_CSR(\thread)
@@ -543,41 +554,47 @@
SET_HARDFLOAT
lw $1, THREAD_MSA_CSR(\thread)
_ctcmsa MSA_CSR, $1
- .set pop
- ld_d 0, THREAD_FPR0, \thread
- ld_d 1, THREAD_FPR1, \thread
- ld_d 2, THREAD_FPR2, \thread
- ld_d 3, THREAD_FPR3, \thread
- ld_d 4, THREAD_FPR4, \thread
- ld_d 5, THREAD_FPR5, \thread
- ld_d 6, THREAD_FPR6, \thread
- ld_d 7, THREAD_FPR7, \thread
- ld_d 8, THREAD_FPR8, \thread
- ld_d 9, THREAD_FPR9, \thread
- ld_d 10, THREAD_FPR10, \thread
- ld_d 11, THREAD_FPR11, \thread
- ld_d 12, THREAD_FPR12, \thread
- ld_d 13, THREAD_FPR13, \thread
- ld_d 14, THREAD_FPR14, \thread
- ld_d 15, THREAD_FPR15, \thread
- ld_d 16, THREAD_FPR16, \thread
- ld_d 17, THREAD_FPR17, \thread
- ld_d 18, THREAD_FPR18, \thread
- ld_d 19, THREAD_FPR19, \thread
- ld_d 20, THREAD_FPR20, \thread
- ld_d 21, THREAD_FPR21, \thread
- ld_d 22, THREAD_FPR22, \thread
- ld_d 23, THREAD_FPR23, \thread
- ld_d 24, THREAD_FPR24, \thread
- ld_d 25, THREAD_FPR25, \thread
- ld_d 26, THREAD_FPR26, \thread
- ld_d 27, THREAD_FPR27, \thread
- ld_d 28, THREAD_FPR28, \thread
- ld_d 29, THREAD_FPR29, \thread
- ld_d 30, THREAD_FPR30, \thread
- ld_d 31, THREAD_FPR31, \thread
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+ PTR_ADDU FPR_BASE, \thread, FPR_BASE_OFFS
+#endif
+ ld_d 0, THREAD_FPR0 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 1, THREAD_FPR1 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 2, THREAD_FPR2 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 3, THREAD_FPR3 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 4, THREAD_FPR4 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 5, THREAD_FPR5 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 6, THREAD_FPR6 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 7, THREAD_FPR7 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 8, THREAD_FPR8 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 9, THREAD_FPR9 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 10, THREAD_FPR10 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 11, THREAD_FPR11 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 12, THREAD_FPR12 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 13, THREAD_FPR13 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 14, THREAD_FPR14 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 15, THREAD_FPR15 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 16, THREAD_FPR16 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 17, THREAD_FPR17 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 18, THREAD_FPR18 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 19, THREAD_FPR19 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 20, THREAD_FPR20 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 21, THREAD_FPR21 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 22, THREAD_FPR22 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 23, THREAD_FPR23 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 24, THREAD_FPR24 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 25, THREAD_FPR25 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 26, THREAD_FPR26 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 27, THREAD_FPR27 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 28, THREAD_FPR28 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 29, THREAD_FPR29 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 30, THREAD_FPR30 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 31, THREAD_FPR31 - FPR_BASE_OFFS, FPR_BASE
+ .set pop
.endm
+#undef FPR_BASE_OFFS
+#undef FPR_BASE
+
.macro msa_init_upper wd
#ifdef CONFIG_64BIT
insert_d \wd, 1
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/4] MIPS: Fix MSA assembly with big thread offsets
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan
When lockdep is enabled on a 64-bit kernel the FPR offset into the
thread structure exceeds the maximum range of the MSA ld.d/st.d
instructions. For example THREAD_FPR31 = 4644 (instead of 2448), while
the signed immediate field is only 10 bits with an implicit multiply by
8, giving a maximum offset of 511*8 = 4088.
This isn't a problem when the toolchain doesn't support MSA as the
ld_*/st_* macros perform the addition separately into $1 with [d]addui
which has a 16bit signed immediate field.
Fix the case where the toolchain does support MSA by doing a single
addition of THREAD_FPR0 into $1 with [d]addui, and doing the ld_*/st_*
relative to that.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
---
arch/mips/include/asm/asmmacro.h | 147 ++++++++++++++++++++++-----------------
1 file changed, 82 insertions(+), 65 deletions(-)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index e689b894353c..637fccab5604 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -496,41 +496,52 @@
.endm
#endif
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+#define FPR_BASE_OFFS THREAD_FPR0
+#define FPR_BASE $1
+#else
+#define FPR_BASE_OFFS 0
+#define FPR_BASE \thread
+#endif
+
.macro msa_save_all thread
- st_d 0, THREAD_FPR0, \thread
- st_d 1, THREAD_FPR1, \thread
- st_d 2, THREAD_FPR2, \thread
- st_d 3, THREAD_FPR3, \thread
- st_d 4, THREAD_FPR4, \thread
- st_d 5, THREAD_FPR5, \thread
- st_d 6, THREAD_FPR6, \thread
- st_d 7, THREAD_FPR7, \thread
- st_d 8, THREAD_FPR8, \thread
- st_d 9, THREAD_FPR9, \thread
- st_d 10, THREAD_FPR10, \thread
- st_d 11, THREAD_FPR11, \thread
- st_d 12, THREAD_FPR12, \thread
- st_d 13, THREAD_FPR13, \thread
- st_d 14, THREAD_FPR14, \thread
- st_d 15, THREAD_FPR15, \thread
- st_d 16, THREAD_FPR16, \thread
- st_d 17, THREAD_FPR17, \thread
- st_d 18, THREAD_FPR18, \thread
- st_d 19, THREAD_FPR19, \thread
- st_d 20, THREAD_FPR20, \thread
- st_d 21, THREAD_FPR21, \thread
- st_d 22, THREAD_FPR22, \thread
- st_d 23, THREAD_FPR23, \thread
- st_d 24, THREAD_FPR24, \thread
- st_d 25, THREAD_FPR25, \thread
- st_d 26, THREAD_FPR26, \thread
- st_d 27, THREAD_FPR27, \thread
- st_d 28, THREAD_FPR28, \thread
- st_d 29, THREAD_FPR29, \thread
- st_d 30, THREAD_FPR30, \thread
- st_d 31, THREAD_FPR31, \thread
.set push
.set noat
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+ PTR_ADDU FPR_BASE, \thread, FPR_BASE_OFFS
+#endif
+ st_d 0, THREAD_FPR0 - FPR_BASE_OFFS, FPR_BASE
+ st_d 1, THREAD_FPR1 - FPR_BASE_OFFS, FPR_BASE
+ st_d 2, THREAD_FPR2 - FPR_BASE_OFFS, FPR_BASE
+ st_d 3, THREAD_FPR3 - FPR_BASE_OFFS, FPR_BASE
+ st_d 4, THREAD_FPR4 - FPR_BASE_OFFS, FPR_BASE
+ st_d 5, THREAD_FPR5 - FPR_BASE_OFFS, FPR_BASE
+ st_d 6, THREAD_FPR6 - FPR_BASE_OFFS, FPR_BASE
+ st_d 7, THREAD_FPR7 - FPR_BASE_OFFS, FPR_BASE
+ st_d 8, THREAD_FPR8 - FPR_BASE_OFFS, FPR_BASE
+ st_d 9, THREAD_FPR9 - FPR_BASE_OFFS, FPR_BASE
+ st_d 10, THREAD_FPR10 - FPR_BASE_OFFS, FPR_BASE
+ st_d 11, THREAD_FPR11 - FPR_BASE_OFFS, FPR_BASE
+ st_d 12, THREAD_FPR12 - FPR_BASE_OFFS, FPR_BASE
+ st_d 13, THREAD_FPR13 - FPR_BASE_OFFS, FPR_BASE
+ st_d 14, THREAD_FPR14 - FPR_BASE_OFFS, FPR_BASE
+ st_d 15, THREAD_FPR15 - FPR_BASE_OFFS, FPR_BASE
+ st_d 16, THREAD_FPR16 - FPR_BASE_OFFS, FPR_BASE
+ st_d 17, THREAD_FPR17 - FPR_BASE_OFFS, FPR_BASE
+ st_d 18, THREAD_FPR18 - FPR_BASE_OFFS, FPR_BASE
+ st_d 19, THREAD_FPR19 - FPR_BASE_OFFS, FPR_BASE
+ st_d 20, THREAD_FPR20 - FPR_BASE_OFFS, FPR_BASE
+ st_d 21, THREAD_FPR21 - FPR_BASE_OFFS, FPR_BASE
+ st_d 22, THREAD_FPR22 - FPR_BASE_OFFS, FPR_BASE
+ st_d 23, THREAD_FPR23 - FPR_BASE_OFFS, FPR_BASE
+ st_d 24, THREAD_FPR24 - FPR_BASE_OFFS, FPR_BASE
+ st_d 25, THREAD_FPR25 - FPR_BASE_OFFS, FPR_BASE
+ st_d 26, THREAD_FPR26 - FPR_BASE_OFFS, FPR_BASE
+ st_d 27, THREAD_FPR27 - FPR_BASE_OFFS, FPR_BASE
+ st_d 28, THREAD_FPR28 - FPR_BASE_OFFS, FPR_BASE
+ st_d 29, THREAD_FPR29 - FPR_BASE_OFFS, FPR_BASE
+ st_d 30, THREAD_FPR30 - FPR_BASE_OFFS, FPR_BASE
+ st_d 31, THREAD_FPR31 - FPR_BASE_OFFS, FPR_BASE
SET_HARDFLOAT
_cfcmsa $1, MSA_CSR
sw $1, THREAD_MSA_CSR(\thread)
@@ -543,41 +554,47 @@
SET_HARDFLOAT
lw $1, THREAD_MSA_CSR(\thread)
_ctcmsa MSA_CSR, $1
- .set pop
- ld_d 0, THREAD_FPR0, \thread
- ld_d 1, THREAD_FPR1, \thread
- ld_d 2, THREAD_FPR2, \thread
- ld_d 3, THREAD_FPR3, \thread
- ld_d 4, THREAD_FPR4, \thread
- ld_d 5, THREAD_FPR5, \thread
- ld_d 6, THREAD_FPR6, \thread
- ld_d 7, THREAD_FPR7, \thread
- ld_d 8, THREAD_FPR8, \thread
- ld_d 9, THREAD_FPR9, \thread
- ld_d 10, THREAD_FPR10, \thread
- ld_d 11, THREAD_FPR11, \thread
- ld_d 12, THREAD_FPR12, \thread
- ld_d 13, THREAD_FPR13, \thread
- ld_d 14, THREAD_FPR14, \thread
- ld_d 15, THREAD_FPR15, \thread
- ld_d 16, THREAD_FPR16, \thread
- ld_d 17, THREAD_FPR17, \thread
- ld_d 18, THREAD_FPR18, \thread
- ld_d 19, THREAD_FPR19, \thread
- ld_d 20, THREAD_FPR20, \thread
- ld_d 21, THREAD_FPR21, \thread
- ld_d 22, THREAD_FPR22, \thread
- ld_d 23, THREAD_FPR23, \thread
- ld_d 24, THREAD_FPR24, \thread
- ld_d 25, THREAD_FPR25, \thread
- ld_d 26, THREAD_FPR26, \thread
- ld_d 27, THREAD_FPR27, \thread
- ld_d 28, THREAD_FPR28, \thread
- ld_d 29, THREAD_FPR29, \thread
- ld_d 30, THREAD_FPR30, \thread
- ld_d 31, THREAD_FPR31, \thread
+#ifdef TOOLCHAIN_SUPPORTS_MSA
+ PTR_ADDU FPR_BASE, \thread, FPR_BASE_OFFS
+#endif
+ ld_d 0, THREAD_FPR0 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 1, THREAD_FPR1 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 2, THREAD_FPR2 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 3, THREAD_FPR3 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 4, THREAD_FPR4 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 5, THREAD_FPR5 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 6, THREAD_FPR6 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 7, THREAD_FPR7 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 8, THREAD_FPR8 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 9, THREAD_FPR9 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 10, THREAD_FPR10 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 11, THREAD_FPR11 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 12, THREAD_FPR12 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 13, THREAD_FPR13 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 14, THREAD_FPR14 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 15, THREAD_FPR15 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 16, THREAD_FPR16 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 17, THREAD_FPR17 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 18, THREAD_FPR18 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 19, THREAD_FPR19 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 20, THREAD_FPR20 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 21, THREAD_FPR21 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 22, THREAD_FPR22 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 23, THREAD_FPR23 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 24, THREAD_FPR24 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 25, THREAD_FPR25 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 26, THREAD_FPR26 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 27, THREAD_FPR27 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 28, THREAD_FPR28 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 29, THREAD_FPR29 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 30, THREAD_FPR30 - FPR_BASE_OFFS, FPR_BASE
+ ld_d 31, THREAD_FPR31 - FPR_BASE_OFFS, FPR_BASE
+ .set pop
.endm
+#undef FPR_BASE_OFFS
+#undef FPR_BASE
+
.macro msa_init_upper wd
#ifdef CONFIG_64BIT
insert_d \wd, 1
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/4] MIPS: Fix MSA assembly warnings
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan
Building an MSA capable kernel with a toolchain that supports MSA
produces warnings such as this:
arch/mips/kernel/r4k_fpu.S:229: Warning: the `msa' extension requires 64-bit FPRs
This is due to ".set msa" without ".set fp=64" in the non doubleword MSA
load/store macros, since MSA requires the 64-bit FPU registers (FR=1).
Add the missing fp=64 in these macros to silence the warnings.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
---
arch/mips/include/asm/asmmacro.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index 637fccab5604..6741673c92ca 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -235,6 +235,7 @@
.macro ld_b wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
ld.b $w\wd, \off(\base)
.set pop
@@ -243,6 +244,7 @@
.macro ld_h wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
ld.h $w\wd, \off(\base)
.set pop
@@ -251,6 +253,7 @@
.macro ld_w wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
ld.w $w\wd, \off(\base)
.set pop
@@ -268,6 +271,7 @@
.macro st_b wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
st.b $w\wd, \off(\base)
.set pop
@@ -276,6 +280,7 @@
.macro st_h wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
st.h $w\wd, \off(\base)
.set pop
@@ -284,6 +289,7 @@
.macro st_w wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
st.w $w\wd, \off(\base)
.set pop
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 4/4] MIPS: Fix MSA assembly warnings
@ 2016-04-15 9:07 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 9:07 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, James Hogan
Building an MSA capable kernel with a toolchain that supports MSA
produces warnings such as this:
arch/mips/kernel/r4k_fpu.S:229: Warning: the `msa' extension requires 64-bit FPRs
This is due to ".set msa" without ".set fp=64" in the non doubleword MSA
load/store macros, since MSA requires the 64-bit FPU registers (FR=1).
Add the missing fp=64 in these macros to silence the warnings.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
---
arch/mips/include/asm/asmmacro.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/mips/include/asm/asmmacro.h b/arch/mips/include/asm/asmmacro.h
index 637fccab5604..6741673c92ca 100644
--- a/arch/mips/include/asm/asmmacro.h
+++ b/arch/mips/include/asm/asmmacro.h
@@ -235,6 +235,7 @@
.macro ld_b wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
ld.b $w\wd, \off(\base)
.set pop
@@ -243,6 +244,7 @@
.macro ld_h wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
ld.h $w\wd, \off(\base)
.set pop
@@ -251,6 +253,7 @@
.macro ld_w wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
ld.w $w\wd, \off(\base)
.set pop
@@ -268,6 +271,7 @@
.macro st_b wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
st.b $w\wd, \off(\base)
.set pop
@@ -276,6 +280,7 @@
.macro st_h wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
st.h $w\wd, \off(\base)
.set pop
@@ -284,6 +289,7 @@
.macro st_w wd, off, base
.set push
.set mips32r2
+ .set fp=64
.set msa
st.w $w\wd, \off(\base)
.set pop
--
2.4.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
2016-04-15 9:07 ` James Hogan
(?)
@ 2016-04-15 9:59 ` Ralf Baechle
2016-04-15 10:15 ` James Hogan
-1 siblings, 1 reply; 15+ messages in thread
From: Ralf Baechle @ 2016-04-15 9:59 UTC (permalink / raw)
To: James Hogan; +Cc: linux-mips, Paul Burton, stable
On Fri, Apr 15, 2016 at 10:07:23AM +0100, James Hogan wrote:
> From: Paul Burton <paul.burton@imgtec.com>
>
> In revision 1.12 of the MSA specification, the copy_u.w instruction has
> been removed for MIPS32 & the copy_u.d instruction has been removed for
> MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
> complain about this like so:
>
> arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
> processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
>
> Since we always copy to the width of a GPR, simply use copy_s instead of
> copy_u to fix this.
>
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: linux-mips@linux-mips.org
> Cc: <stable@vger.kernel.org> # 4.1.x-
Looking good but seems to apply only to 4.3+
Ralf
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/4] MIPS: MSA fixes
2016-04-15 9:07 ` James Hogan
` (4 preceding siblings ...)
(?)
@ 2016-04-15 10:11 ` Ralf Baechle
-1 siblings, 0 replies; 15+ messages in thread
From: Ralf Baechle @ 2016-04-15 10:11 UTC (permalink / raw)
To: James Hogan; +Cc: linux-mips, Paul Burton, stable
On Fri, Apr 15, 2016 at 10:07:22AM +0100, James Hogan wrote:
> Here are some miscellaneous fixes for MSA (MIPS SIMD Architecture)
> support:
> 1) Fix MSA build with recent toolchains
> 2) Fix 32-bit pointer additions on 64-bit with non-MSA capable
> toolchain.
> 3) Fix MSA + 64-bit + lockdep build due to large immediate offsets
> 4) Fix some MSA assembler warnings due to missing .set fp=64
>
> James Hogan (3):
> MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU
> MIPS: Fix MSA assembly with big thread offsets
> MIPS: Fix MSA assembly warnings
>
> Paul Burton (1):
> MIPS: Use copy_s.fmt rather than copy_u.fmt
Thanks, whole series applied.
Ralf
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
@ 2016-04-15 10:15 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 10:15 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, stable
[-- Attachment #1: Type: text/plain, Size: 1284 bytes --]
On Fri, Apr 15, 2016 at 11:59:41AM +0200, Ralf Baechle wrote:
> On Fri, Apr 15, 2016 at 10:07:23AM +0100, James Hogan wrote:
>
> > From: Paul Burton <paul.burton@imgtec.com>
> >
> > In revision 1.12 of the MSA specification, the copy_u.w instruction has
> > been removed for MIPS32 & the copy_u.d instruction has been removed for
> > MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
> > complain about this like so:
> >
> > arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
> > processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
> >
> > Since we always copy to the width of a GPR, simply use copy_s instead of
> > copy_u to fix this.
> >
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Signed-off-by: James Hogan <james.hogan@imgtec.com>
> > Cc: Ralf Baechle <ralf@linux-mips.org>
> > Cc: linux-mips@linux-mips.org
> > Cc: <stable@vger.kernel.org> # 4.1.x-
>
> Looking good but seems to apply only to 4.3+
>
> Ralf
Yes, sorry. Without bf82cb30c7e58b3a9742f0a45962ebdf51befac7 I figured
the changes in r4k_fpu.S can be easily skipped, but actually I should
have looked deeper as the macros aren't even used until that commit.
Could you change the stable tag to 4.3 please.
Thanks!
James
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
@ 2016-04-15 10:15 ` James Hogan
0 siblings, 0 replies; 15+ messages in thread
From: James Hogan @ 2016-04-15 10:15 UTC (permalink / raw)
To: Ralf Baechle; +Cc: linux-mips, Paul Burton, stable
[-- Attachment #1: Type: text/plain, Size: 1284 bytes --]
On Fri, Apr 15, 2016 at 11:59:41AM +0200, Ralf Baechle wrote:
> On Fri, Apr 15, 2016 at 10:07:23AM +0100, James Hogan wrote:
>
> > From: Paul Burton <paul.burton@imgtec.com>
> >
> > In revision 1.12 of the MSA specification, the copy_u.w instruction has
> > been removed for MIPS32 & the copy_u.d instruction has been removed for
> > MIPS64. Newer toolchains (eg. Codescape SDK essentials 2015.10) will
> > complain about this like so:
> >
> > arch/mips/kernel/r4k_fpu.S:290: Error: opcode not supported on this
> > processor: mips32r2 (mips32r2) `copy_u.w $1,$w26[3]'
> >
> > Since we always copy to the width of a GPR, simply use copy_s instead of
> > copy_u to fix this.
> >
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Signed-off-by: James Hogan <james.hogan@imgtec.com>
> > Cc: Ralf Baechle <ralf@linux-mips.org>
> > Cc: linux-mips@linux-mips.org
> > Cc: <stable@vger.kernel.org> # 4.1.x-
>
> Looking good but seems to apply only to 4.3+
>
> Ralf
Yes, sorry. Without bf82cb30c7e58b3a9742f0a45962ebdf51befac7 I figured
the changes in r4k_fpu.S can be easily skipped, but actually I should
have looked deeper as the macros aren't even used until that commit.
Could you change the stable tag to 4.3 please.
Thanks!
James
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt
2016-04-15 10:15 ` James Hogan
(?)
@ 2016-04-15 10:28 ` Ralf Baechle
-1 siblings, 0 replies; 15+ messages in thread
From: Ralf Baechle @ 2016-04-15 10:28 UTC (permalink / raw)
To: James Hogan; +Cc: linux-mips, Paul Burton, stable
On Fri, Apr 15, 2016 at 11:15:36AM +0100, James Hogan wrote:
> Could you change the stable tag to 4.3 please.
Done but will push the new tree in an hour or two not to disturb buildbot
which doesn't like rapid pushed.
Ralf
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-04-15 10:28 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-15 9:07 [PATCH 0/4] MIPS: MSA fixes James Hogan
2016-04-15 9:07 ` James Hogan
2016-04-15 9:07 ` [PATCH 1/4] MIPS: Use copy_s.fmt rather than copy_u.fmt James Hogan
2016-04-15 9:07 ` James Hogan
2016-04-15 9:59 ` Ralf Baechle
2016-04-15 10:15 ` James Hogan
2016-04-15 10:15 ` James Hogan
2016-04-15 10:28 ` Ralf Baechle
2016-04-15 9:07 ` [PATCH 2/4] MIPS: Fix MSA ld_*/st_* asm macros to use PTR_ADDU James Hogan
2016-04-15 9:07 ` James Hogan
2016-04-15 9:07 ` [PATCH 3/4] MIPS: Fix MSA assembly with big thread offsets James Hogan
2016-04-15 9:07 ` James Hogan
2016-04-15 9:07 ` [PATCH 4/4] MIPS: Fix MSA assembly warnings James Hogan
2016-04-15 9:07 ` James Hogan
2016-04-15 10:11 ` [PATCH 0/4] MIPS: MSA fixes Ralf Baechle
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.