All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/11] FP emulation fixes
@ 2016-04-21 13:04 ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Adam Buchbinder, Maciej W. Rozycki, Leonid Yegoshin,
	linux-kernel, James Hogan, Markos Chandras

This series fixes up some issues with FPU emulation, ranging from
missing instructions to outright backwards branches to a non-zero zero
register. Some cleanups are made along the way, reducing unnecessary
duplication of code.

There are still issues around the R6 maddf & msubf instructions, which
currently appear to have been formed by duplicating the existing
multiply & addition code then gluing it together & thus don't implement
true fused multiply-addition, but those fixes need more testing so will
come later.

This series applies atop v4.6-rc4.

Paul Burton (11):
  MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
  MIPS: Fix BC1{EQ,NE}Z return offset calculation
  MIPS: inst: Declare fsel_op for sel.fmt instruction
  MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction
  MIPS: math-emu: Unify ieee754sp_m{add,sub}f
  MIPS: math-emu: Unify ieee754dp_m{add,sub}f
  MIPS: math-emu: Add z argument macros
  MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments
  MIPS: math-emu: Fix code indentation
  MIPS: math-emu: Fix m{add,sub}.s shifts
  MIPS: math-emu: Fix jalr emulation when rd == $0

 arch/mips/include/uapi/asm/inst.h |   1 +
 arch/mips/kernel/branch.c         |  18 +--
 arch/mips/math-emu/Makefile       |   4 +-
 arch/mips/math-emu/cp1emu.c       |  45 +++++--
 arch/mips/math-emu/dp_maddf.c     |  35 +++--
 arch/mips/math-emu/dp_msubf.c     | 269 --------------------------------------
 arch/mips/math-emu/dp_mul.c       |   4 +-
 arch/mips/math-emu/ieee754dp.h    |   1 +
 arch/mips/math-emu/ieee754int.h   |  10 ++
 arch/mips/math-emu/ieee754sp.c    |   3 +-
 arch/mips/math-emu/ieee754sp.h    |  17 ++-
 arch/mips/math-emu/sp_add.c       |   6 +-
 arch/mips/math-emu/sp_maddf.c     |  43 ++++--
 arch/mips/math-emu/sp_msubf.c     | 258 ------------------------------------
 arch/mips/math-emu/sp_sub.c       |   6 +-
 15 files changed, 130 insertions(+), 590 deletions(-)
 delete mode 100644 arch/mips/math-emu/dp_msubf.c
 delete mode 100644 arch/mips/math-emu/sp_msubf.c

-- 
2.8.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 00/11] FP emulation fixes
@ 2016-04-21 13:04 ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Adam Buchbinder, Maciej W. Rozycki, Leonid Yegoshin,
	linux-kernel, James Hogan, Markos Chandras

This series fixes up some issues with FPU emulation, ranging from
missing instructions to outright backwards branches to a non-zero zero
register. Some cleanups are made along the way, reducing unnecessary
duplication of code.

There are still issues around the R6 maddf & msubf instructions, which
currently appear to have been formed by duplicating the existing
multiply & addition code then gluing it together & thus don't implement
true fused multiply-addition, but those fixes need more testing so will
come later.

This series applies atop v4.6-rc4.

Paul Burton (11):
  MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
  MIPS: Fix BC1{EQ,NE}Z return offset calculation
  MIPS: inst: Declare fsel_op for sel.fmt instruction
  MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction
  MIPS: math-emu: Unify ieee754sp_m{add,sub}f
  MIPS: math-emu: Unify ieee754dp_m{add,sub}f
  MIPS: math-emu: Add z argument macros
  MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments
  MIPS: math-emu: Fix code indentation
  MIPS: math-emu: Fix m{add,sub}.s shifts
  MIPS: math-emu: Fix jalr emulation when rd == $0

 arch/mips/include/uapi/asm/inst.h |   1 +
 arch/mips/kernel/branch.c         |  18 +--
 arch/mips/math-emu/Makefile       |   4 +-
 arch/mips/math-emu/cp1emu.c       |  45 +++++--
 arch/mips/math-emu/dp_maddf.c     |  35 +++--
 arch/mips/math-emu/dp_msubf.c     | 269 --------------------------------------
 arch/mips/math-emu/dp_mul.c       |   4 +-
 arch/mips/math-emu/ieee754dp.h    |   1 +
 arch/mips/math-emu/ieee754int.h   |  10 ++
 arch/mips/math-emu/ieee754sp.c    |   3 +-
 arch/mips/math-emu/ieee754sp.h    |  17 ++-
 arch/mips/math-emu/sp_add.c       |   6 +-
 arch/mips/math-emu/sp_maddf.c     |  43 ++++--
 arch/mips/math-emu/sp_msubf.c     | 258 ------------------------------------
 arch/mips/math-emu/sp_sub.c       |   6 +-
 15 files changed, 130 insertions(+), 590 deletions(-)
 delete mode 100644 arch/mips/math-emu/dp_msubf.c
 delete mode 100644 arch/mips/math-emu/sp_msubf.c

-- 
2.8.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, linux-kernel, James Hogan,
	Markos Chandras

The conditions for branching when emulating the BC1EQZ & BC1NEZ
instructions were backwards, leading to each of those instructions being
treated as the other. Fix this by reversing the conditions, and clear up
the code a little for readability & checkpatch.

Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
---

 arch/mips/math-emu/cp1emu.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index cdfd44f..99977c3 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -973,9 +973,10 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx,
 		struct mm_decoded_insn dec_insn, void *__user *fault_addr)
 {
 	unsigned long contpc = xcp->cp0_epc + dec_insn.pc_inc;
-	unsigned int cond, cbit;
+	unsigned int cond, cbit, bit0;
 	mips_instruction ir;
 	int likely, pc_inc;
+	union fpureg *fpr;
 	u32 __user *wva;
 	u64 __user *dva;
 	u32 wval;
@@ -1187,14 +1188,14 @@ emul:
 				return SIGILL;
 
 			cond = likely = 0;
+			fpr = &current->thread.fpu.fpr[MIPSInst_RT(ir)];
+			bit0 = get_fpr32(fpr, 0) & 0x1;
 			switch (MIPSInst_RS(ir)) {
 			case bc1eqz_op:
-				if (get_fpr32(&current->thread.fpu.fpr[MIPSInst_RT(ir)], 0) & 0x1)
-				    cond = 1;
+				cond = bit0 == 0;
 				break;
 			case bc1nez_op:
-				if (!(get_fpr32(&current->thread.fpu.fpr[MIPSInst_RT(ir)], 0) & 0x1))
-				    cond = 1;
+				cond = bit0 != 0;
 				break;
 			}
 			goto branch_common;
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, linux-kernel, James Hogan,
	Markos Chandras

The conditions for branching when emulating the BC1EQZ & BC1NEZ
instructions were backwards, leading to each of those instructions being
treated as the other. Fix this by reversing the conditions, and clear up
the code a little for readability & checkpatch.

Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
---

 arch/mips/math-emu/cp1emu.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index cdfd44f..99977c3 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -973,9 +973,10 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx,
 		struct mm_decoded_insn dec_insn, void *__user *fault_addr)
 {
 	unsigned long contpc = xcp->cp0_epc + dec_insn.pc_inc;
-	unsigned int cond, cbit;
+	unsigned int cond, cbit, bit0;
 	mips_instruction ir;
 	int likely, pc_inc;
+	union fpureg *fpr;
 	u32 __user *wva;
 	u64 __user *dva;
 	u32 wval;
@@ -1187,14 +1188,14 @@ emul:
 				return SIGILL;
 
 			cond = likely = 0;
+			fpr = &current->thread.fpu.fpr[MIPSInst_RT(ir)];
+			bit0 = get_fpr32(fpr, 0) & 0x1;
 			switch (MIPSInst_RS(ir)) {
 			case bc1eqz_op:
-				if (get_fpr32(&current->thread.fpu.fpr[MIPSInst_RT(ir)], 0) & 0x1)
-				    cond = 1;
+				cond = bit0 == 0;
 				break;
 			case bc1nez_op:
-				if (!(get_fpr32(&current->thread.fpu.fpr[MIPSInst_RT(ir)], 0) & 0x1))
-				    cond = 1;
+				cond = bit0 != 0;
 				break;
 			}
 			goto branch_common;
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 02/11] MIPS: Fix BC1{EQ,NE}Z return offset calculation
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, linux-kernel, James Hogan, Markos Chandras

The conditions for branching when emulating the BC1EQZ & BC1NEZ
instructions were backwards, leading to each of those instructions being
treated as the other. Fix this by reversing the conditions, and clear up
the code a little for readability & checkpatch.

Fixes: c8a34581ec09 ("MIPS: Emulate the BC1{EQ,NE}Z FPU instructions")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
---

 arch/mips/kernel/branch.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/mips/kernel/branch.c b/arch/mips/kernel/branch.c
index d8f9b35..ceca6cc 100644
--- a/arch/mips/kernel/branch.c
+++ b/arch/mips/kernel/branch.c
@@ -688,21 +688,9 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
 			}
 			lose_fpu(1);    /* Save FPU state for the emulator. */
 			reg = insn.i_format.rt;
-			bit = 0;
-			switch (insn.i_format.rs) {
-			case bc1eqz_op:
-				/* Test bit 0 */
-				if (get_fpr32(&current->thread.fpu.fpr[reg], 0)
-				    & 0x1)
-					bit = 1;
-				break;
-			case bc1nez_op:
-				/* Test bit 0 */
-				if (!(get_fpr32(&current->thread.fpu.fpr[reg], 0)
-				      & 0x1))
-					bit = 1;
-				break;
-			}
+			bit = get_fpr32(&current->thread.fpu.fpr[reg], 0) & 0x1;
+			if (insn.i_format.rs == bc1eqz_op)
+				bit = !bit;
 			own_fpu(1);
 			if (bit)
 				epc = epc + 4 +
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 02/11] MIPS: Fix BC1{EQ,NE}Z return offset calculation
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, linux-kernel, James Hogan, Markos Chandras

The conditions for branching when emulating the BC1EQZ & BC1NEZ
instructions were backwards, leading to each of those instructions being
treated as the other. Fix this by reversing the conditions, and clear up
the code a little for readability & checkpatch.

Fixes: c8a34581ec09 ("MIPS: Emulate the BC1{EQ,NE}Z FPU instructions")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
---

 arch/mips/kernel/branch.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/mips/kernel/branch.c b/arch/mips/kernel/branch.c
index d8f9b35..ceca6cc 100644
--- a/arch/mips/kernel/branch.c
+++ b/arch/mips/kernel/branch.c
@@ -688,21 +688,9 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
 			}
 			lose_fpu(1);    /* Save FPU state for the emulator. */
 			reg = insn.i_format.rt;
-			bit = 0;
-			switch (insn.i_format.rs) {
-			case bc1eqz_op:
-				/* Test bit 0 */
-				if (get_fpr32(&current->thread.fpu.fpr[reg], 0)
-				    & 0x1)
-					bit = 1;
-				break;
-			case bc1nez_op:
-				/* Test bit 0 */
-				if (!(get_fpr32(&current->thread.fpu.fpr[reg], 0)
-				      & 0x1))
-					bit = 1;
-				break;
-			}
+			bit = get_fpr32(&current->thread.fpu.fpr[reg], 0) & 0x1;
+			if (insn.i_format.rs == bc1eqz_op)
+				bit = !bit;
 			own_fpu(1);
 			if (bit)
 				epc = epc + 4 +
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 03/11] MIPS: inst: Declare fsel_op for sel.fmt instruction
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, Leonid Yegoshin, linux-kernel,
	James Hogan, Markos Chandras

Declare the opcode for the MIPSr6 sel.fmt instruction, as fsel_op in
order to match other FP op names.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/include/uapi/asm/inst.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
index ddea53e..28f4151 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -167,6 +167,7 @@ enum cop1_sdw_func {
 	fceill_op    =	0x0a, ffloorl_op   =  0x0b,
 	fround_op    =	0x0c, ftrunc_op	   =  0x0d,
 	fceil_op     =	0x0e, ffloor_op	   =  0x0f,
+	fsel_op      =  0x10,
 	fmovc_op     =	0x11, fmovz_op	   =  0x12,
 	fmovn_op     =	0x13, fseleqz_op   =  0x14,
 	frecip_op    =  0x15, frsqrt_op    =  0x16,
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 03/11] MIPS: inst: Declare fsel_op for sel.fmt instruction
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, Leonid Yegoshin, linux-kernel,
	James Hogan, Markos Chandras

Declare the opcode for the MIPSr6 sel.fmt instruction, as fsel_op in
order to match other FP op names.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/include/uapi/asm/inst.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
index ddea53e..28f4151 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -167,6 +167,7 @@ enum cop1_sdw_func {
 	fceill_op    =	0x0a, ffloorl_op   =  0x0b,
 	fround_op    =	0x0c, ftrunc_op	   =  0x0d,
 	fceil_op     =	0x0e, ffloor_op	   =  0x0f,
+	fsel_op      =  0x10,
 	fmovc_op     =	0x11, fmovz_op	   =  0x12,
 	fmovn_op     =	0x13, fseleqz_op   =  0x14,
 	frecip_op    =  0x15, frsqrt_op    =  0x16,
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 04/11] MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, linux-kernel, James Hogan,
	Markos Chandras

Add support for emulating the MIPSr6 sel.fmt instruction, which was
previously missing from the FPU emulation code. This instruction selects
its result from 2 possible source registers, based upon bit 0 of the
destination register, and is valid only for S (single) & D (double) data
types.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/cp1emu.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index 99977c3..85dd174 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -1675,7 +1675,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,
 			union ieee754sp(*b) (union ieee754sp, union ieee754sp);
 			union ieee754sp(*u) (union ieee754sp);
 		} handler;
-		union ieee754sp fs, ft;
+		union ieee754sp fd, fs, ft;
 
 		switch (MIPSInst_FUNC(ir)) {
 			/* binary ops */
@@ -1946,6 +1946,17 @@ copcsr:
 			rfmt = w_fmt;
 			goto copcsr;
 
+		case fsel_op:
+			if (!cpu_has_mips_r6)
+				return SIGILL;
+
+			SPFROMREG(fd, MIPSInst_FD(ir));
+			if (fd.bits & 0x1)
+				SPFROMREG(rv.s, MIPSInst_FT(ir));
+			else
+				SPFROMREG(rv.s, MIPSInst_FS(ir));
+			break;
+
 		case fcvtl_op:
 			if (!cpu_has_mips_3_4_5_64_r2_r6)
 				return SIGILL;
@@ -1994,7 +2005,7 @@ copcsr:
 	}
 
 	case d_fmt: {
-		union ieee754dp fs, ft;
+		union ieee754dp fd, fs, ft;
 		union {
 			union ieee754dp(*b) (union ieee754dp, union ieee754dp);
 			union ieee754dp(*u) (union ieee754dp);
@@ -2244,6 +2255,17 @@ dcopuop:
 			rfmt = w_fmt;
 			goto copcsr;
 
+		case fsel_op:
+			if (!cpu_has_mips_r6)
+				return SIGILL;
+
+			DPFROMREG(fd, MIPSInst_FD(ir));
+			if (fd.bits & 0x1)
+				DPFROMREG(rv.d, MIPSInst_FT(ir));
+			else
+				DPFROMREG(rv.d, MIPSInst_FS(ir));
+			break;
+
 		case fcvtl_op:
 			if (!cpu_has_mips_3_4_5_64_r2_r6)
 				return SIGILL;
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 04/11] MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, linux-kernel, James Hogan,
	Markos Chandras

Add support for emulating the MIPSr6 sel.fmt instruction, which was
previously missing from the FPU emulation code. This instruction selects
its result from 2 possible source registers, based upon bit 0 of the
destination register, and is valid only for S (single) & D (double) data
types.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/cp1emu.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index 99977c3..85dd174 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -1675,7 +1675,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,
 			union ieee754sp(*b) (union ieee754sp, union ieee754sp);
 			union ieee754sp(*u) (union ieee754sp);
 		} handler;
-		union ieee754sp fs, ft;
+		union ieee754sp fd, fs, ft;
 
 		switch (MIPSInst_FUNC(ir)) {
 			/* binary ops */
@@ -1946,6 +1946,17 @@ copcsr:
 			rfmt = w_fmt;
 			goto copcsr;
 
+		case fsel_op:
+			if (!cpu_has_mips_r6)
+				return SIGILL;
+
+			SPFROMREG(fd, MIPSInst_FD(ir));
+			if (fd.bits & 0x1)
+				SPFROMREG(rv.s, MIPSInst_FT(ir));
+			else
+				SPFROMREG(rv.s, MIPSInst_FS(ir));
+			break;
+
 		case fcvtl_op:
 			if (!cpu_has_mips_3_4_5_64_r2_r6)
 				return SIGILL;
@@ -1994,7 +2005,7 @@ copcsr:
 	}
 
 	case d_fmt: {
-		union ieee754dp fs, ft;
+		union ieee754dp fd, fs, ft;
 		union {
 			union ieee754dp(*b) (union ieee754dp, union ieee754dp);
 			union ieee754dp(*u) (union ieee754dp);
@@ -2244,6 +2255,17 @@ dcopuop:
 			rfmt = w_fmt;
 			goto copcsr;
 
+		case fsel_op:
+			if (!cpu_has_mips_r6)
+				return SIGILL;
+
+			DPFROMREG(fd, MIPSInst_FD(ir));
+			if (fd.bits & 0x1)
+				DPFROMREG(rv.d, MIPSInst_FT(ir));
+			else
+				DPFROMREG(rv.d, MIPSInst_FS(ir));
+			break;
+
 		case fcvtl_op:
 			if (!cpu_has_mips_3_4_5_64_r2_r6)
 				return SIGILL;
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 05/11] MIPS: math-emu: Unify ieee754sp_m{add,sub}f
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

The code for emulating MIPSr6 madd.s & msub.s instructions has
previously been implemented as 2 different functions, namely
ieee754sp_maddf & ieee754sp_msubf. The difference in behaviour of these
2 instructions is merely the sign of the product, so we can easily share
the code implementing them. Do this for the single precision variant,
removing the original ieee754sp_msubf in favor of reusing the code from
ieee754sp_maddf.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/Makefile   |   2 +-
 arch/mips/math-emu/sp_maddf.c |  22 +++-
 arch/mips/math-emu/sp_msubf.c | 258 ------------------------------------------
 3 files changed, 21 insertions(+), 261 deletions(-)
 delete mode 100644 arch/mips/math-emu/sp_msubf.c

diff --git a/arch/mips/math-emu/Makefile b/arch/mips/math-emu/Makefile
index a19641d..3389aff 100644
--- a/arch/mips/math-emu/Makefile
+++ b/arch/mips/math-emu/Makefile
@@ -6,7 +6,7 @@ obj-y	+= cp1emu.o ieee754dp.o ieee754sp.o ieee754.o \
 	   dp_div.o dp_mul.o dp_sub.o dp_add.o dp_fsp.o dp_cmp.o dp_simple.o \
 	   dp_tint.o dp_fint.o dp_maddf.o dp_msubf.o dp_2008class.o dp_fmin.o dp_fmax.o \
 	   sp_div.o sp_mul.o sp_sub.o sp_add.o sp_fdp.o sp_cmp.o sp_simple.o \
-	   sp_tint.o sp_fint.o sp_maddf.o sp_msubf.o sp_2008class.o sp_fmin.o sp_fmax.o \
+	   sp_tint.o sp_fint.o sp_maddf.o sp_2008class.o sp_fmin.o sp_fmax.o \
 	   dsemul.o
 
 lib-y	+= ieee754d.o \
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index dd1dd83..93b7132 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -14,8 +14,12 @@
 
 #include "ieee754sp.h"
 
-union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
-				union ieee754sp y)
+enum maddf_flags {
+	maddf_negate_product	= 1 << 0,
+};
+
+static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
+				 union ieee754sp y, enum maddf_flags flags)
 {
 	int re;
 	int rs;
@@ -154,6 +158,8 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
 
 	re = xe + ye;
 	rs = xs ^ ys;
+	if (flags & maddf_negate_product)
+		rs ^= 1;
 
 	/* shunt to top of word */
 	xm <<= 32 - (SP_FBITS + 1);
@@ -253,3 +259,15 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
 	}
 	return ieee754sp_format(zs, ze, zm);
 }
+
+union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
+				union ieee754sp y)
+{
+	return _sp_maddf(z, x, y, 0);
+}
+
+union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
+				union ieee754sp y)
+{
+	return _sp_maddf(z, x, y, maddf_negate_product);
+}
diff --git a/arch/mips/math-emu/sp_msubf.c b/arch/mips/math-emu/sp_msubf.c
deleted file mode 100644
index 81c38b980..0000000
--- a/arch/mips/math-emu/sp_msubf.c
+++ /dev/null
@@ -1,258 +0,0 @@
-/*
- * IEEE754 floating point arithmetic
- * single precision: MSUB.f (Fused Multiply Subtract)
- * MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] x FPR[ft])
- *
- * MIPS floating point support
- * Copyright (C) 2015 Imagination Technologies, Ltd.
- * Author: Markos Chandras <markos.chandras@imgtec.com>
- *
- *  This program is free software; you can distribute it and/or modify it
- *  under the terms of the GNU General Public License as published by the
- *  Free Software Foundation; version 2 of the License.
- */
-
-#include "ieee754sp.h"
-
-union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
-				union ieee754sp y)
-{
-	int re;
-	int rs;
-	unsigned rm;
-	unsigned short lxm;
-	unsigned short hxm;
-	unsigned short lym;
-	unsigned short hym;
-	unsigned lrm;
-	unsigned hrm;
-	unsigned t;
-	unsigned at;
-	int s;
-
-	COMPXSP;
-	COMPYSP;
-	u32 zm; int ze; int zs __maybe_unused; int zc;
-
-	EXPLODEXSP;
-	EXPLODEYSP;
-	EXPLODESP(z, zc, zs, ze, zm)
-
-	FLUSHXSP;
-	FLUSHYSP;
-	FLUSHSP(z, zc, zs, ze, zm);
-
-	ieee754_clearcx();
-
-	switch (zc) {
-	case IEEE754_CLASS_SNAN:
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754sp_nanxcpt(z);
-	case IEEE754_CLASS_DNORM:
-		SPDNORMx(zm, ze);
-	/* QNAN is handled separately below */
-	}
-
-	switch (CLPAIR(xc, yc)) {
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
-		return ieee754sp_nanxcpt(y);
-
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
-		return ieee754sp_nanxcpt(x);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
-		return y;
-
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
-		return x;
-
-	/*
-	 * Infinity handling
-	 */
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754sp_indef();
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		return ieee754sp_inf(xs ^ ys);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
-		if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		/* Multiplication is 0 so just return z */
-		return z;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
-		SPDNORMX;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		SPDNORMY;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		SPDNORMX;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		/* fall through to real compuation */
-	}
-
-	/* Finally get to do some computation */
-
-	/*
-	 * Do the multiplication bit first
-	 *
-	 * rm = xm * ym, re = xe + ye basically
-	 *
-	 * At this point xm and ym should have been normalized.
-	 */
-
-	/* rm = xm * ym, re = xe+ye basically */
-	assert(xm & SP_HIDDEN_BIT);
-	assert(ym & SP_HIDDEN_BIT);
-
-	re = xe + ye;
-	rs = xs ^ ys;
-
-	/* shunt to top of word */
-	xm <<= 32 - (SP_FBITS + 1);
-	ym <<= 32 - (SP_FBITS + 1);
-
-	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
-	 */
-	lxm = xm & 0xffff;
-	hxm = xm >> 16;
-	lym = ym & 0xffff;
-	hym = ym >> 16;
-
-	lrm = lxm * lym;	/* 16 * 16 => 32 */
-	hrm = hxm * hym;	/* 16 * 16 => 32 */
-
-	t = lxm * hym; /* 16 * 16 => 32 */
-	at = lrm + (t << 16);
-	hrm += at < lrm;
-	lrm = at;
-	hrm = hrm + (t >> 16);
-
-	t = hxm * lym; /* 16 * 16 => 32 */
-	at = lrm + (t << 16);
-	hrm += at < lrm;
-	lrm = at;
-	hrm = hrm + (t >> 16);
-
-	rm = hrm | (lrm != 0);
-
-	/*
-	 * Sticky shift down to normal rounding precision.
-	 */
-	if ((int) rm < 0) {
-		rm = (rm >> (32 - (SP_FBITS + 1 + 3))) |
-		    ((rm << (SP_FBITS + 1 + 3)) != 0);
-		re++;
-	} else {
-		rm = (rm >> (32 - (SP_FBITS + 1 + 3 + 1))) |
-		     ((rm << (SP_FBITS + 1 + 3 + 1)) != 0);
-	}
-	assert(rm & (SP_HIDDEN_BIT << 3));
-
-	/* And now the subtraction */
-
-	/* Flip sign of r and handle as add */
-	rs ^= 1;
-
-	assert(zm & SP_HIDDEN_BIT);
-
-	/*
-	 * Provide guard,round and stick bit space.
-	 */
-	zm <<= 3;
-
-	if (ze > re) {
-		/*
-		 * Have to shift y fraction right to align.
-		 */
-		s = ze - re;
-		SPXSRSYn(s);
-	} else if (re > ze) {
-		/*
-		 * Have to shift x fraction right to align.
-		 */
-		s = re - ze;
-		SPXSRSYn(s);
-	}
-	assert(ze == re);
-	assert(ze <= SP_EMAX);
-
-	if (zs == rs) {
-		/*
-		 * Generate 28 bit result of adding two 27 bit numbers
-		 * leaving result in zm, zs and ze.
-		 */
-		zm = zm + rm;
-
-		if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
-			SPXSRSX1(); /* shift preserving sticky */
-		}
-	} else {
-		if (zm >= rm) {
-			zm = zm - rm;
-		} else {
-			zm = rm - zm;
-			zs = rs;
-		}
-		if (zm == 0)
-			return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
-
-		/*
-		 * Normalize in extended single precision
-		 */
-		while ((zm >> (SP_MBITS + 3)) == 0) {
-			zm <<= 1;
-			ze--;
-		}
-
-	}
-	return ieee754sp_format(zs, ze, zm);
-}
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 05/11] MIPS: math-emu: Unify ieee754sp_m{add,sub}f
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

The code for emulating MIPSr6 madd.s & msub.s instructions has
previously been implemented as 2 different functions, namely
ieee754sp_maddf & ieee754sp_msubf. The difference in behaviour of these
2 instructions is merely the sign of the product, so we can easily share
the code implementing them. Do this for the single precision variant,
removing the original ieee754sp_msubf in favor of reusing the code from
ieee754sp_maddf.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/Makefile   |   2 +-
 arch/mips/math-emu/sp_maddf.c |  22 +++-
 arch/mips/math-emu/sp_msubf.c | 258 ------------------------------------------
 3 files changed, 21 insertions(+), 261 deletions(-)
 delete mode 100644 arch/mips/math-emu/sp_msubf.c

diff --git a/arch/mips/math-emu/Makefile b/arch/mips/math-emu/Makefile
index a19641d..3389aff 100644
--- a/arch/mips/math-emu/Makefile
+++ b/arch/mips/math-emu/Makefile
@@ -6,7 +6,7 @@ obj-y	+= cp1emu.o ieee754dp.o ieee754sp.o ieee754.o \
 	   dp_div.o dp_mul.o dp_sub.o dp_add.o dp_fsp.o dp_cmp.o dp_simple.o \
 	   dp_tint.o dp_fint.o dp_maddf.o dp_msubf.o dp_2008class.o dp_fmin.o dp_fmax.o \
 	   sp_div.o sp_mul.o sp_sub.o sp_add.o sp_fdp.o sp_cmp.o sp_simple.o \
-	   sp_tint.o sp_fint.o sp_maddf.o sp_msubf.o sp_2008class.o sp_fmin.o sp_fmax.o \
+	   sp_tint.o sp_fint.o sp_maddf.o sp_2008class.o sp_fmin.o sp_fmax.o \
 	   dsemul.o
 
 lib-y	+= ieee754d.o \
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index dd1dd83..93b7132 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -14,8 +14,12 @@
 
 #include "ieee754sp.h"
 
-union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
-				union ieee754sp y)
+enum maddf_flags {
+	maddf_negate_product	= 1 << 0,
+};
+
+static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
+				 union ieee754sp y, enum maddf_flags flags)
 {
 	int re;
 	int rs;
@@ -154,6 +158,8 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
 
 	re = xe + ye;
 	rs = xs ^ ys;
+	if (flags & maddf_negate_product)
+		rs ^= 1;
 
 	/* shunt to top of word */
 	xm <<= 32 - (SP_FBITS + 1);
@@ -253,3 +259,15 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
 	}
 	return ieee754sp_format(zs, ze, zm);
 }
+
+union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
+				union ieee754sp y)
+{
+	return _sp_maddf(z, x, y, 0);
+}
+
+union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
+				union ieee754sp y)
+{
+	return _sp_maddf(z, x, y, maddf_negate_product);
+}
diff --git a/arch/mips/math-emu/sp_msubf.c b/arch/mips/math-emu/sp_msubf.c
deleted file mode 100644
index 81c38b980..0000000
--- a/arch/mips/math-emu/sp_msubf.c
+++ /dev/null
@@ -1,258 +0,0 @@
-/*
- * IEEE754 floating point arithmetic
- * single precision: MSUB.f (Fused Multiply Subtract)
- * MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] x FPR[ft])
- *
- * MIPS floating point support
- * Copyright (C) 2015 Imagination Technologies, Ltd.
- * Author: Markos Chandras <markos.chandras@imgtec.com>
- *
- *  This program is free software; you can distribute it and/or modify it
- *  under the terms of the GNU General Public License as published by the
- *  Free Software Foundation; version 2 of the License.
- */
-
-#include "ieee754sp.h"
-
-union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
-				union ieee754sp y)
-{
-	int re;
-	int rs;
-	unsigned rm;
-	unsigned short lxm;
-	unsigned short hxm;
-	unsigned short lym;
-	unsigned short hym;
-	unsigned lrm;
-	unsigned hrm;
-	unsigned t;
-	unsigned at;
-	int s;
-
-	COMPXSP;
-	COMPYSP;
-	u32 zm; int ze; int zs __maybe_unused; int zc;
-
-	EXPLODEXSP;
-	EXPLODEYSP;
-	EXPLODESP(z, zc, zs, ze, zm)
-
-	FLUSHXSP;
-	FLUSHYSP;
-	FLUSHSP(z, zc, zs, ze, zm);
-
-	ieee754_clearcx();
-
-	switch (zc) {
-	case IEEE754_CLASS_SNAN:
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754sp_nanxcpt(z);
-	case IEEE754_CLASS_DNORM:
-		SPDNORMx(zm, ze);
-	/* QNAN is handled separately below */
-	}
-
-	switch (CLPAIR(xc, yc)) {
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
-		return ieee754sp_nanxcpt(y);
-
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
-		return ieee754sp_nanxcpt(x);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
-		return y;
-
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
-		return x;
-
-	/*
-	 * Infinity handling
-	 */
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754sp_indef();
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		return ieee754sp_inf(xs ^ ys);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
-		if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		/* Multiplication is 0 so just return z */
-		return z;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
-		SPDNORMX;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		SPDNORMY;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		SPDNORMX;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754sp_inf(zs);
-		/* fall through to real compuation */
-	}
-
-	/* Finally get to do some computation */
-
-	/*
-	 * Do the multiplication bit first
-	 *
-	 * rm = xm * ym, re = xe + ye basically
-	 *
-	 * At this point xm and ym should have been normalized.
-	 */
-
-	/* rm = xm * ym, re = xe+ye basically */
-	assert(xm & SP_HIDDEN_BIT);
-	assert(ym & SP_HIDDEN_BIT);
-
-	re = xe + ye;
-	rs = xs ^ ys;
-
-	/* shunt to top of word */
-	xm <<= 32 - (SP_FBITS + 1);
-	ym <<= 32 - (SP_FBITS + 1);
-
-	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
-	 */
-	lxm = xm & 0xffff;
-	hxm = xm >> 16;
-	lym = ym & 0xffff;
-	hym = ym >> 16;
-
-	lrm = lxm * lym;	/* 16 * 16 => 32 */
-	hrm = hxm * hym;	/* 16 * 16 => 32 */
-
-	t = lxm * hym; /* 16 * 16 => 32 */
-	at = lrm + (t << 16);
-	hrm += at < lrm;
-	lrm = at;
-	hrm = hrm + (t >> 16);
-
-	t = hxm * lym; /* 16 * 16 => 32 */
-	at = lrm + (t << 16);
-	hrm += at < lrm;
-	lrm = at;
-	hrm = hrm + (t >> 16);
-
-	rm = hrm | (lrm != 0);
-
-	/*
-	 * Sticky shift down to normal rounding precision.
-	 */
-	if ((int) rm < 0) {
-		rm = (rm >> (32 - (SP_FBITS + 1 + 3))) |
-		    ((rm << (SP_FBITS + 1 + 3)) != 0);
-		re++;
-	} else {
-		rm = (rm >> (32 - (SP_FBITS + 1 + 3 + 1))) |
-		     ((rm << (SP_FBITS + 1 + 3 + 1)) != 0);
-	}
-	assert(rm & (SP_HIDDEN_BIT << 3));
-
-	/* And now the subtraction */
-
-	/* Flip sign of r and handle as add */
-	rs ^= 1;
-
-	assert(zm & SP_HIDDEN_BIT);
-
-	/*
-	 * Provide guard,round and stick bit space.
-	 */
-	zm <<= 3;
-
-	if (ze > re) {
-		/*
-		 * Have to shift y fraction right to align.
-		 */
-		s = ze - re;
-		SPXSRSYn(s);
-	} else if (re > ze) {
-		/*
-		 * Have to shift x fraction right to align.
-		 */
-		s = re - ze;
-		SPXSRSYn(s);
-	}
-	assert(ze == re);
-	assert(ze <= SP_EMAX);
-
-	if (zs == rs) {
-		/*
-		 * Generate 28 bit result of adding two 27 bit numbers
-		 * leaving result in zm, zs and ze.
-		 */
-		zm = zm + rm;
-
-		if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
-			SPXSRSX1(); /* shift preserving sticky */
-		}
-	} else {
-		if (zm >= rm) {
-			zm = zm - rm;
-		} else {
-			zm = rm - zm;
-			zs = rs;
-		}
-		if (zm == 0)
-			return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
-
-		/*
-		 * Normalize in extended single precision
-		 */
-		while ((zm >> (SP_MBITS + 3)) == 0) {
-			zm <<= 1;
-			ze--;
-		}
-
-	}
-	return ieee754sp_format(zs, ze, zm);
-}
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 06/11] MIPS: math-emu: Unify ieee754dp_m{add,sub}f
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

The code for emulating MIPSr6 madd.d & msub.d instructions has
previously been implemented as 2 different functions, namely
ieee754dp_maddf & ieee754dp_msubf. The difference in behaviour of these
2 instructions is merely the sign of the product, so we can easily share
the code implementing them. Do this for the double precision variant,
removing the original ieee754dp_msubf in favor of reusing the code from
ieee754dp_maddf.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/Makefile   |   2 +-
 arch/mips/math-emu/dp_maddf.c |  22 +++-
 arch/mips/math-emu/dp_msubf.c | 269 ------------------------------------------
 3 files changed, 21 insertions(+), 272 deletions(-)
 delete mode 100644 arch/mips/math-emu/dp_msubf.c

diff --git a/arch/mips/math-emu/Makefile b/arch/mips/math-emu/Makefile
index 3389aff..e9bbc2a 100644
--- a/arch/mips/math-emu/Makefile
+++ b/arch/mips/math-emu/Makefile
@@ -4,7 +4,7 @@
 
 obj-y	+= cp1emu.o ieee754dp.o ieee754sp.o ieee754.o \
 	   dp_div.o dp_mul.o dp_sub.o dp_add.o dp_fsp.o dp_cmp.o dp_simple.o \
-	   dp_tint.o dp_fint.o dp_maddf.o dp_msubf.o dp_2008class.o dp_fmin.o dp_fmax.o \
+	   dp_tint.o dp_fint.o dp_maddf.o dp_2008class.o dp_fmin.o dp_fmax.o \
 	   sp_div.o sp_mul.o sp_sub.o sp_add.o sp_fdp.o sp_cmp.o sp_simple.o \
 	   sp_tint.o sp_fint.o sp_maddf.o sp_2008class.o sp_fmin.o sp_fmax.o \
 	   dsemul.o
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 119eda9..d5e0fb1 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -14,8 +14,12 @@
 
 #include "ieee754dp.h"
 
-union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
-				union ieee754dp y)
+enum maddf_flags {
+	maddf_negate_product	= 1 << 0,
+};
+
+static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
+				 union ieee754dp y, enum maddf_flags flags)
 {
 	int re;
 	int rs;
@@ -154,6 +158,8 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
 
 	re = xe + ye;
 	rs = xs ^ ys;
+	if (flags & maddf_negate_product)
+		rs ^= 1;
 
 	/* shunt to top of word */
 	xm <<= 64 - (DP_FBITS + 1);
@@ -263,3 +269,15 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
 
 	return ieee754dp_format(zs, ze, zm);
 }
+
+union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
+				union ieee754dp y)
+{
+	return _dp_maddf(z, x, y, 0);
+}
+
+union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
+				union ieee754dp y)
+{
+	return _dp_maddf(z, x, y, maddf_negate_product);
+}
diff --git a/arch/mips/math-emu/dp_msubf.c b/arch/mips/math-emu/dp_msubf.c
deleted file mode 100644
index 1224126..0000000
--- a/arch/mips/math-emu/dp_msubf.c
+++ /dev/null
@@ -1,269 +0,0 @@
-/*
- * IEEE754 floating point arithmetic
- * double precision: MSUB.f (Fused Multiply Subtract)
- * MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] x FPR[ft])
- *
- * MIPS floating point support
- * Copyright (C) 2015 Imagination Technologies, Ltd.
- * Author: Markos Chandras <markos.chandras@imgtec.com>
- *
- *  This program is free software; you can distribute it and/or modify it
- *  under the terms of the GNU General Public License as published by the
- *  Free Software Foundation; version 2 of the License.
- */
-
-#include "ieee754dp.h"
-
-union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
-				union ieee754dp y)
-{
-	int re;
-	int rs;
-	u64 rm;
-	unsigned lxm;
-	unsigned hxm;
-	unsigned lym;
-	unsigned hym;
-	u64 lrm;
-	u64 hrm;
-	u64 t;
-	u64 at;
-	int s;
-
-	COMPXDP;
-	COMPYDP;
-
-	u64 zm; int ze; int zs __maybe_unused; int zc;
-
-	EXPLODEXDP;
-	EXPLODEYDP;
-	EXPLODEDP(z, zc, zs, ze, zm)
-
-	FLUSHXDP;
-	FLUSHYDP;
-	FLUSHDP(z, zc, zs, ze, zm);
-
-	ieee754_clearcx();
-
-	switch (zc) {
-	case IEEE754_CLASS_SNAN:
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754dp_nanxcpt(z);
-	case IEEE754_CLASS_DNORM:
-		DPDNORMx(zm, ze);
-	/* QNAN is handled separately below */
-	}
-
-	switch (CLPAIR(xc, yc)) {
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
-		return ieee754dp_nanxcpt(y);
-
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
-		return ieee754dp_nanxcpt(x);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
-		return y;
-
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
-		return x;
-
-
-	/*
-	 * Infinity handling
-	 */
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754dp_indef();
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		return ieee754dp_inf(xs ^ ys);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
-		if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		/* Multiplication is 0 so just return z */
-		return z;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
-		DPDNORMX;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		DPDNORMY;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		DPDNORMX;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		/* fall through to real computations */
-	}
-
-	/* Finally get to do some computation */
-
-	/*
-	 * Do the multiplication bit first
-	 *
-	 * rm = xm * ym, re = xe + ye basically
-	 *
-	 * At this point xm and ym should have been normalized.
-	 */
-	assert(xm & DP_HIDDEN_BIT);
-	assert(ym & DP_HIDDEN_BIT);
-
-	re = xe + ye;
-	rs = xs ^ ys;
-
-	/* shunt to top of word */
-	xm <<= 64 - (DP_FBITS + 1);
-	ym <<= 64 - (DP_FBITS + 1);
-
-	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
-	 */
-
-	/* 32 * 32 => 64 */
-#define DPXMULT(x, y)	((u64)(x) * (u64)y)
-
-	lxm = xm;
-	hxm = xm >> 32;
-	lym = ym;
-	hym = ym >> 32;
-
-	lrm = DPXMULT(lxm, lym);
-	hrm = DPXMULT(hxm, hym);
-
-	t = DPXMULT(lxm, hym);
-
-	at = lrm + (t << 32);
-	hrm += at < lrm;
-	lrm = at;
-
-	hrm = hrm + (t >> 32);
-
-	t = DPXMULT(hxm, lym);
-
-	at = lrm + (t << 32);
-	hrm += at < lrm;
-	lrm = at;
-
-	hrm = hrm + (t >> 32);
-
-	rm = hrm | (lrm != 0);
-
-	/*
-	 * Sticky shift down to normal rounding precision.
-	 */
-	if ((s64) rm < 0) {
-		rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
-		     ((rm << (DP_FBITS + 1 + 3)) != 0);
-			re++;
-	} else {
-		rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
-		     ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
-	}
-	assert(rm & (DP_HIDDEN_BIT << 3));
-
-	/* And now the subtraction */
-
-	/* flip sign of r and handle as add */
-	rs ^= 1;
-
-	assert(zm & DP_HIDDEN_BIT);
-
-	/*
-	 * Provide guard,round and stick bit space.
-	 */
-	zm <<= 3;
-
-	if (ze > re) {
-		/*
-		 * Have to shift y fraction right to align.
-		 */
-		s = ze - re;
-		rm = XDPSRS(rm, s);
-		re += s;
-	} else if (re > ze) {
-		/*
-		 * Have to shift x fraction right to align.
-		 */
-		s = re - ze;
-		zm = XDPSRS(zm, s);
-		ze += s;
-	}
-	assert(ze == re);
-	assert(ze <= DP_EMAX);
-
-	if (zs == rs) {
-		/*
-		 * Generate 28 bit result of adding two 27 bit numbers
-		 * leaving result in xm, xs and xe.
-		 */
-		zm = zm + rm;
-
-		if (zm >> (DP_FBITS + 1 + 3)) { /* carry out */
-			zm = XDPSRS1(zm);
-			ze++;
-		}
-	} else {
-		if (zm >= rm) {
-			zm = zm - rm;
-		} else {
-			zm = rm - zm;
-			zs = rs;
-		}
-		if (zm == 0)
-			return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
-
-		/*
-		 * Normalize to rounding precision.
-		 */
-		while ((zm >> (DP_FBITS + 3)) == 0) {
-			zm <<= 1;
-			ze--;
-		}
-	}
-
-	return ieee754dp_format(zs, ze, zm);
-}
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 06/11] MIPS: math-emu: Unify ieee754dp_m{add,sub}f
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

The code for emulating MIPSr6 madd.d & msub.d instructions has
previously been implemented as 2 different functions, namely
ieee754dp_maddf & ieee754dp_msubf. The difference in behaviour of these
2 instructions is merely the sign of the product, so we can easily share
the code implementing them. Do this for the double precision variant,
removing the original ieee754dp_msubf in favor of reusing the code from
ieee754dp_maddf.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/Makefile   |   2 +-
 arch/mips/math-emu/dp_maddf.c |  22 +++-
 arch/mips/math-emu/dp_msubf.c | 269 ------------------------------------------
 3 files changed, 21 insertions(+), 272 deletions(-)
 delete mode 100644 arch/mips/math-emu/dp_msubf.c

diff --git a/arch/mips/math-emu/Makefile b/arch/mips/math-emu/Makefile
index 3389aff..e9bbc2a 100644
--- a/arch/mips/math-emu/Makefile
+++ b/arch/mips/math-emu/Makefile
@@ -4,7 +4,7 @@
 
 obj-y	+= cp1emu.o ieee754dp.o ieee754sp.o ieee754.o \
 	   dp_div.o dp_mul.o dp_sub.o dp_add.o dp_fsp.o dp_cmp.o dp_simple.o \
-	   dp_tint.o dp_fint.o dp_maddf.o dp_msubf.o dp_2008class.o dp_fmin.o dp_fmax.o \
+	   dp_tint.o dp_fint.o dp_maddf.o dp_2008class.o dp_fmin.o dp_fmax.o \
 	   sp_div.o sp_mul.o sp_sub.o sp_add.o sp_fdp.o sp_cmp.o sp_simple.o \
 	   sp_tint.o sp_fint.o sp_maddf.o sp_2008class.o sp_fmin.o sp_fmax.o \
 	   dsemul.o
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 119eda9..d5e0fb1 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -14,8 +14,12 @@
 
 #include "ieee754dp.h"
 
-union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
-				union ieee754dp y)
+enum maddf_flags {
+	maddf_negate_product	= 1 << 0,
+};
+
+static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
+				 union ieee754dp y, enum maddf_flags flags)
 {
 	int re;
 	int rs;
@@ -154,6 +158,8 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
 
 	re = xe + ye;
 	rs = xs ^ ys;
+	if (flags & maddf_negate_product)
+		rs ^= 1;
 
 	/* shunt to top of word */
 	xm <<= 64 - (DP_FBITS + 1);
@@ -263,3 +269,15 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
 
 	return ieee754dp_format(zs, ze, zm);
 }
+
+union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
+				union ieee754dp y)
+{
+	return _dp_maddf(z, x, y, 0);
+}
+
+union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
+				union ieee754dp y)
+{
+	return _dp_maddf(z, x, y, maddf_negate_product);
+}
diff --git a/arch/mips/math-emu/dp_msubf.c b/arch/mips/math-emu/dp_msubf.c
deleted file mode 100644
index 1224126..0000000
--- a/arch/mips/math-emu/dp_msubf.c
+++ /dev/null
@@ -1,269 +0,0 @@
-/*
- * IEEE754 floating point arithmetic
- * double precision: MSUB.f (Fused Multiply Subtract)
- * MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] x FPR[ft])
- *
- * MIPS floating point support
- * Copyright (C) 2015 Imagination Technologies, Ltd.
- * Author: Markos Chandras <markos.chandras@imgtec.com>
- *
- *  This program is free software; you can distribute it and/or modify it
- *  under the terms of the GNU General Public License as published by the
- *  Free Software Foundation; version 2 of the License.
- */
-
-#include "ieee754dp.h"
-
-union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
-				union ieee754dp y)
-{
-	int re;
-	int rs;
-	u64 rm;
-	unsigned lxm;
-	unsigned hxm;
-	unsigned lym;
-	unsigned hym;
-	u64 lrm;
-	u64 hrm;
-	u64 t;
-	u64 at;
-	int s;
-
-	COMPXDP;
-	COMPYDP;
-
-	u64 zm; int ze; int zs __maybe_unused; int zc;
-
-	EXPLODEXDP;
-	EXPLODEYDP;
-	EXPLODEDP(z, zc, zs, ze, zm)
-
-	FLUSHXDP;
-	FLUSHYDP;
-	FLUSHDP(z, zc, zs, ze, zm);
-
-	ieee754_clearcx();
-
-	switch (zc) {
-	case IEEE754_CLASS_SNAN:
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754dp_nanxcpt(z);
-	case IEEE754_CLASS_DNORM:
-		DPDNORMx(zm, ze);
-	/* QNAN is handled separately below */
-	}
-
-	switch (CLPAIR(xc, yc)) {
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
-		return ieee754dp_nanxcpt(y);
-
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
-		return ieee754dp_nanxcpt(x);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
-		return y;
-
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
-		return x;
-
-
-	/*
-	 * Infinity handling
-	 */
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		ieee754_setcx(IEEE754_INVALID_OPERATION);
-		return ieee754dp_indef();
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		return ieee754dp_inf(xs ^ ys);
-
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
-	case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
-		if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		/* Multiplication is 0 so just return z */
-		return z;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
-		DPDNORMX;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		DPDNORMY;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		DPDNORMX;
-		break;
-
-	case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
-		if (zc == IEEE754_CLASS_QNAN)
-			return z;
-		else if (zc == IEEE754_CLASS_INF)
-			return ieee754dp_inf(zs);
-		/* fall through to real computations */
-	}
-
-	/* Finally get to do some computation */
-
-	/*
-	 * Do the multiplication bit first
-	 *
-	 * rm = xm * ym, re = xe + ye basically
-	 *
-	 * At this point xm and ym should have been normalized.
-	 */
-	assert(xm & DP_HIDDEN_BIT);
-	assert(ym & DP_HIDDEN_BIT);
-
-	re = xe + ye;
-	rs = xs ^ ys;
-
-	/* shunt to top of word */
-	xm <<= 64 - (DP_FBITS + 1);
-	ym <<= 64 - (DP_FBITS + 1);
-
-	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
-	 */
-
-	/* 32 * 32 => 64 */
-#define DPXMULT(x, y)	((u64)(x) * (u64)y)
-
-	lxm = xm;
-	hxm = xm >> 32;
-	lym = ym;
-	hym = ym >> 32;
-
-	lrm = DPXMULT(lxm, lym);
-	hrm = DPXMULT(hxm, hym);
-
-	t = DPXMULT(lxm, hym);
-
-	at = lrm + (t << 32);
-	hrm += at < lrm;
-	lrm = at;
-
-	hrm = hrm + (t >> 32);
-
-	t = DPXMULT(hxm, lym);
-
-	at = lrm + (t << 32);
-	hrm += at < lrm;
-	lrm = at;
-
-	hrm = hrm + (t >> 32);
-
-	rm = hrm | (lrm != 0);
-
-	/*
-	 * Sticky shift down to normal rounding precision.
-	 */
-	if ((s64) rm < 0) {
-		rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
-		     ((rm << (DP_FBITS + 1 + 3)) != 0);
-			re++;
-	} else {
-		rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
-		     ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
-	}
-	assert(rm & (DP_HIDDEN_BIT << 3));
-
-	/* And now the subtraction */
-
-	/* flip sign of r and handle as add */
-	rs ^= 1;
-
-	assert(zm & DP_HIDDEN_BIT);
-
-	/*
-	 * Provide guard,round and stick bit space.
-	 */
-	zm <<= 3;
-
-	if (ze > re) {
-		/*
-		 * Have to shift y fraction right to align.
-		 */
-		s = ze - re;
-		rm = XDPSRS(rm, s);
-		re += s;
-	} else if (re > ze) {
-		/*
-		 * Have to shift x fraction right to align.
-		 */
-		s = re - ze;
-		zm = XDPSRS(zm, s);
-		ze += s;
-	}
-	assert(ze == re);
-	assert(ze <= DP_EMAX);
-
-	if (zs == rs) {
-		/*
-		 * Generate 28 bit result of adding two 27 bit numbers
-		 * leaving result in xm, xs and xe.
-		 */
-		zm = zm + rm;
-
-		if (zm >> (DP_FBITS + 1 + 3)) { /* carry out */
-			zm = XDPSRS1(zm);
-			ze++;
-		}
-	} else {
-		if (zm >= rm) {
-			zm = zm - rm;
-		} else {
-			zm = rm - zm;
-			zs = rs;
-		}
-		if (zm == 0)
-			return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
-
-		/*
-		 * Normalize to rounding precision.
-		 */
-		while ((zm >> (DP_FBITS + 3)) == 0) {
-			zm <<= 1;
-			ze--;
-		}
-	}
-
-	return ieee754dp_format(zs, ze, zm);
-}
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 07/11] MIPS: math-emu: Add z argument macros
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, linux-kernel, Markos Chandras

Introduce macros for handling the "z" argument to maddf & msubf, making
its handling consistent with that of the "x" & "y" arguments rather than
open-coding equivalents.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/dp_maddf.c   |  9 ++++-----
 arch/mips/math-emu/ieee754dp.h  |  1 +
 arch/mips/math-emu/ieee754int.h | 10 ++++++++++
 arch/mips/math-emu/ieee754sp.h  |  1 +
 arch/mips/math-emu/sp_maddf.c   |  8 ++++----
 5 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index d5e0fb1..0e1d4d8 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -36,16 +36,15 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 
 	COMPXDP;
 	COMPYDP;
-
-	u64 zm; int ze; int zs __maybe_unused; int zc;
+	COMPZDP;
 
 	EXPLODEXDP;
 	EXPLODEYDP;
-	EXPLODEDP(z, zc, zs, ze, zm)
+	EXPLODEZDP;
 
 	FLUSHXDP;
 	FLUSHYDP;
-	FLUSHDP(z, zc, zs, ze, zm);
+	FLUSHZDP;
 
 	ieee754_clearcx();
 
@@ -54,7 +53,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 		ieee754_setcx(IEEE754_INVALID_OPERATION);
 		return ieee754dp_nanxcpt(z);
 	case IEEE754_CLASS_DNORM:
-		DPDNORMx(zm, ze);
+		DPDNORMZ;
 	/* QNAN is handled separately below */
 	}
 
diff --git a/arch/mips/math-emu/ieee754dp.h b/arch/mips/math-emu/ieee754dp.h
index e2babd9..9ba0230 100644
--- a/arch/mips/math-emu/ieee754dp.h
+++ b/arch/mips/math-emu/ieee754dp.h
@@ -60,6 +60,7 @@ static inline int ieee754dp_finite(union ieee754dp x)
 	while ((m >> DP_FBITS) == 0) { m <<= 1; e--; }
 #define DPDNORMX	DPDNORMx(xm, xe)
 #define DPDNORMY	DPDNORMx(ym, ye)
+#define DPDNORMZ	DPDNORMx(zm, ze)
 
 static inline union ieee754dp builddp(int s, int bx, u64 m)
 {
diff --git a/arch/mips/math-emu/ieee754int.h b/arch/mips/math-emu/ieee754int.h
index ed7bb27..8bc2f69 100644
--- a/arch/mips/math-emu/ieee754int.h
+++ b/arch/mips/math-emu/ieee754int.h
@@ -55,6 +55,9 @@ static inline int ieee754_class_nan(int xc)
 #define COMPYSP \
 	unsigned ym; int ye; int ys; int yc
 
+#define COMPZSP \
+	unsigned zm; int ze; int zs; int zc
+
 #define EXPLODESP(v, vc, vs, ve, vm)					\
 {									\
 	vs = SPSIGN(v);							\
@@ -81,6 +84,7 @@ static inline int ieee754_class_nan(int xc)
 }
 #define EXPLODEXSP EXPLODESP(x, xc, xs, xe, xm)
 #define EXPLODEYSP EXPLODESP(y, yc, ys, ye, ym)
+#define EXPLODEZSP EXPLODESP(z, zc, zs, ze, zm)
 
 
 #define COMPXDP \
@@ -89,6 +93,9 @@ static inline int ieee754_class_nan(int xc)
 #define COMPYDP \
 	u64 ym; int ye; int ys; int yc
 
+#define COMPZDP \
+	u64 zm; int ze; int zs; int zc
+
 #define EXPLODEDP(v, vc, vs, ve, vm)					\
 {									\
 	vm = DPMANT(v);							\
@@ -115,6 +122,7 @@ static inline int ieee754_class_nan(int xc)
 }
 #define EXPLODEXDP EXPLODEDP(x, xc, xs, xe, xm)
 #define EXPLODEYDP EXPLODEDP(y, yc, ys, ye, ym)
+#define EXPLODEZDP EXPLODEDP(z, zc, zs, ze, zm)
 
 #define FLUSHDP(v, vc, vs, ve, vm)					\
 	if (vc==IEEE754_CLASS_DNORM) {					\
@@ -140,7 +148,9 @@ static inline int ieee754_class_nan(int xc)
 
 #define FLUSHXDP FLUSHDP(x, xc, xs, xe, xm)
 #define FLUSHYDP FLUSHDP(y, yc, ys, ye, ym)
+#define FLUSHZDP FLUSHDP(z, zc, zs, ze, zm)
 #define FLUSHXSP FLUSHSP(x, xc, xs, xe, xm)
 #define FLUSHYSP FLUSHSP(y, yc, ys, ye, ym)
+#define FLUSHZSP FLUSHSP(z, zc, zs, ze, zm)
 
 #endif /* __IEEE754INT_H  */
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index 374a3f0..b24fdff 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -65,6 +65,7 @@ static inline int ieee754sp_finite(union ieee754sp x)
 	while ((m >> SP_FBITS) == 0) { m <<= 1; e--; }
 #define SPDNORMX	SPDNORMx(xm, xe)
 #define SPDNORMY	SPDNORMx(ym, ye)
+#define SPDNORMZ	SPDNORMx(zm, ze)
 
 static inline union ieee754sp buildsp(int s, int bx, unsigned m)
 {
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 93b7132..86e1d0b 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -36,15 +36,15 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 
 	COMPXSP;
 	COMPYSP;
-	u32 zm; int ze; int zs __maybe_unused; int zc;
+	COMPZSP;
 
 	EXPLODEXSP;
 	EXPLODEYSP;
-	EXPLODESP(z, zc, zs, ze, zm)
+	EXPLODEZSP;
 
 	FLUSHXSP;
 	FLUSHYSP;
-	FLUSHSP(z, zc, zs, ze, zm);
+	FLUSHZSP;
 
 	ieee754_clearcx();
 
@@ -53,7 +53,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 		ieee754_setcx(IEEE754_INVALID_OPERATION);
 		return ieee754sp_nanxcpt(z);
 	case IEEE754_CLASS_DNORM:
-		SPDNORMx(zm, ze);
+		SPDNORMZ;
 	/* QNAN is handled separately below */
 	}
 
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 07/11] MIPS: math-emu: Add z argument macros
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Maciej W. Rozycki, linux-kernel, Markos Chandras

Introduce macros for handling the "z" argument to maddf & msubf, making
its handling consistent with that of the "x" & "y" arguments rather than
open-coding equivalents.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/dp_maddf.c   |  9 ++++-----
 arch/mips/math-emu/ieee754dp.h  |  1 +
 arch/mips/math-emu/ieee754int.h | 10 ++++++++++
 arch/mips/math-emu/ieee754sp.h  |  1 +
 arch/mips/math-emu/sp_maddf.c   |  8 ++++----
 5 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index d5e0fb1..0e1d4d8 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -36,16 +36,15 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 
 	COMPXDP;
 	COMPYDP;
-
-	u64 zm; int ze; int zs __maybe_unused; int zc;
+	COMPZDP;
 
 	EXPLODEXDP;
 	EXPLODEYDP;
-	EXPLODEDP(z, zc, zs, ze, zm)
+	EXPLODEZDP;
 
 	FLUSHXDP;
 	FLUSHYDP;
-	FLUSHDP(z, zc, zs, ze, zm);
+	FLUSHZDP;
 
 	ieee754_clearcx();
 
@@ -54,7 +53,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 		ieee754_setcx(IEEE754_INVALID_OPERATION);
 		return ieee754dp_nanxcpt(z);
 	case IEEE754_CLASS_DNORM:
-		DPDNORMx(zm, ze);
+		DPDNORMZ;
 	/* QNAN is handled separately below */
 	}
 
diff --git a/arch/mips/math-emu/ieee754dp.h b/arch/mips/math-emu/ieee754dp.h
index e2babd9..9ba0230 100644
--- a/arch/mips/math-emu/ieee754dp.h
+++ b/arch/mips/math-emu/ieee754dp.h
@@ -60,6 +60,7 @@ static inline int ieee754dp_finite(union ieee754dp x)
 	while ((m >> DP_FBITS) == 0) { m <<= 1; e--; }
 #define DPDNORMX	DPDNORMx(xm, xe)
 #define DPDNORMY	DPDNORMx(ym, ye)
+#define DPDNORMZ	DPDNORMx(zm, ze)
 
 static inline union ieee754dp builddp(int s, int bx, u64 m)
 {
diff --git a/arch/mips/math-emu/ieee754int.h b/arch/mips/math-emu/ieee754int.h
index ed7bb27..8bc2f69 100644
--- a/arch/mips/math-emu/ieee754int.h
+++ b/arch/mips/math-emu/ieee754int.h
@@ -55,6 +55,9 @@ static inline int ieee754_class_nan(int xc)
 #define COMPYSP \
 	unsigned ym; int ye; int ys; int yc
 
+#define COMPZSP \
+	unsigned zm; int ze; int zs; int zc
+
 #define EXPLODESP(v, vc, vs, ve, vm)					\
 {									\
 	vs = SPSIGN(v);							\
@@ -81,6 +84,7 @@ static inline int ieee754_class_nan(int xc)
 }
 #define EXPLODEXSP EXPLODESP(x, xc, xs, xe, xm)
 #define EXPLODEYSP EXPLODESP(y, yc, ys, ye, ym)
+#define EXPLODEZSP EXPLODESP(z, zc, zs, ze, zm)
 
 
 #define COMPXDP \
@@ -89,6 +93,9 @@ static inline int ieee754_class_nan(int xc)
 #define COMPYDP \
 	u64 ym; int ye; int ys; int yc
 
+#define COMPZDP \
+	u64 zm; int ze; int zs; int zc
+
 #define EXPLODEDP(v, vc, vs, ve, vm)					\
 {									\
 	vm = DPMANT(v);							\
@@ -115,6 +122,7 @@ static inline int ieee754_class_nan(int xc)
 }
 #define EXPLODEXDP EXPLODEDP(x, xc, xs, xe, xm)
 #define EXPLODEYDP EXPLODEDP(y, yc, ys, ye, ym)
+#define EXPLODEZDP EXPLODEDP(z, zc, zs, ze, zm)
 
 #define FLUSHDP(v, vc, vs, ve, vm)					\
 	if (vc==IEEE754_CLASS_DNORM) {					\
@@ -140,7 +148,9 @@ static inline int ieee754_class_nan(int xc)
 
 #define FLUSHXDP FLUSHDP(x, xc, xs, xe, xm)
 #define FLUSHYDP FLUSHDP(y, yc, ys, ye, ym)
+#define FLUSHZDP FLUSHDP(z, zc, zs, ze, zm)
 #define FLUSHXSP FLUSHSP(x, xc, xs, xe, xm)
 #define FLUSHYSP FLUSHSP(y, yc, ys, ye, ym)
+#define FLUSHZSP FLUSHSP(z, zc, zs, ze, zm)
 
 #endif /* __IEEE754INT_H  */
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index 374a3f0..b24fdff 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -65,6 +65,7 @@ static inline int ieee754sp_finite(union ieee754sp x)
 	while ((m >> SP_FBITS) == 0) { m <<= 1; e--; }
 #define SPDNORMX	SPDNORMx(xm, xe)
 #define SPDNORMY	SPDNORMx(ym, ye)
+#define SPDNORMZ	SPDNORMx(zm, ze)
 
 static inline union ieee754sp buildsp(int s, int bx, unsigned m)
 {
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 93b7132..86e1d0b 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -36,15 +36,15 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 
 	COMPXSP;
 	COMPYSP;
-	u32 zm; int ze; int zs __maybe_unused; int zc;
+	COMPZSP;
 
 	EXPLODEXSP;
 	EXPLODEYSP;
-	EXPLODESP(z, zc, zs, ze, zm)
+	EXPLODEZSP;
 
 	FLUSHXSP;
 	FLUSHYSP;
-	FLUSHSP(z, zc, zs, ze, zm);
+	FLUSHZSP;
 
 	ieee754_clearcx();
 
@@ -53,7 +53,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 		ieee754_setcx(IEEE754_INVALID_OPERATION);
 		return ieee754sp_nanxcpt(z);
 	case IEEE754_CLASS_DNORM:
-		SPDNORMx(zm, ze);
+		SPDNORMZ;
 	/* QNAN is handled separately below */
 	}
 
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 08/11] MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

A comment in ieee754dp_mul indicates that the code is about to perform a
32b x 32b multiplication & keep the high 32b of the result. It appears
this was copied from the single-precision multiplication code, since the
code actually goes on to perform a 64b x 64b multiplication & keep the
high 64b of the result. Fix the comment to indicate 64b.

It appears also that this comment was copied verbatim along with the
rest of the multiplication code into ieee754dp_maddf, which has since
been renamed _dp_maddf. Fix the same issue there.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/dp_maddf.c | 2 +-
 arch/mips/math-emu/dp_mul.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 0e1d4d8..dc83fdd 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -165,7 +165,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 	ym <<= 64 - (DP_FBITS + 1);
 
 	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
+	 * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
 	 */
 
 	/* 32 * 32 => 64 */
diff --git a/arch/mips/math-emu/dp_mul.c b/arch/mips/math-emu/dp_mul.c
index d0901f0..d6a7573 100644
--- a/arch/mips/math-emu/dp_mul.c
+++ b/arch/mips/math-emu/dp_mul.c
@@ -125,7 +125,7 @@ union ieee754dp ieee754dp_mul(union ieee754dp x, union ieee754dp y)
 	ym <<= 64 - (DP_FBITS + 1);
 
 	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
+	 * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
 	 */
 
 	/* 32 * 32 => 64 */
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 08/11] MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

A comment in ieee754dp_mul indicates that the code is about to perform a
32b x 32b multiplication & keep the high 32b of the result. It appears
this was copied from the single-precision multiplication code, since the
code actually goes on to perform a 64b x 64b multiplication & keep the
high 64b of the result. Fix the comment to indicate 64b.

It appears also that this comment was copied verbatim along with the
rest of the multiplication code into ieee754dp_maddf, which has since
been renamed _dp_maddf. Fix the same issue there.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/dp_maddf.c | 2 +-
 arch/mips/math-emu/dp_mul.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 0e1d4d8..dc83fdd 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -165,7 +165,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 	ym <<= 64 - (DP_FBITS + 1);
 
 	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
+	 * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
 	 */
 
 	/* 32 * 32 => 64 */
diff --git a/arch/mips/math-emu/dp_mul.c b/arch/mips/math-emu/dp_mul.c
index d0901f0..d6a7573 100644
--- a/arch/mips/math-emu/dp_mul.c
+++ b/arch/mips/math-emu/dp_mul.c
@@ -125,7 +125,7 @@ union ieee754dp ieee754dp_mul(union ieee754dp x, union ieee754dp y)
 	ym <<= 64 - (DP_FBITS + 1);
 
 	/*
-	 * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
+	 * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
 	 */
 
 	/* 32 * 32 => 64 */
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 09/11] MIPS: math-emu: Fix code indentation
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

A line incrementing the re variable was indented a level too deep in
ieee754dp_mul, making the code unclear to read. Fix the indentation.
This appears to have been copied verbatim along with the rest of the
multiplication code to ieee754dp_maddf, now _dp_maddf, too so fix the
indentation there too.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/dp_maddf.c | 2 +-
 arch/mips/math-emu/dp_mul.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index dc83fdd..4a2d03c 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -203,7 +203,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 	if ((s64) rm < 0) {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
 		     ((rm << (DP_FBITS + 1 + 3)) != 0);
-			re++;
+		re++;
 	} else {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
 		     ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
diff --git a/arch/mips/math-emu/dp_mul.c b/arch/mips/math-emu/dp_mul.c
index d6a7573..87d0b44 100644
--- a/arch/mips/math-emu/dp_mul.c
+++ b/arch/mips/math-emu/dp_mul.c
@@ -163,7 +163,7 @@ union ieee754dp ieee754dp_mul(union ieee754dp x, union ieee754dp y)
 	if ((s64) rm < 0) {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
 		     ((rm << (DP_FBITS + 1 + 3)) != 0);
-			re++;
+		re++;
 	} else {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
 		     ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 09/11] MIPS: math-emu: Fix code indentation
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle; +Cc: Paul Burton, Markos Chandras, linux-kernel

A line incrementing the re variable was indented a level too deep in
ieee754dp_mul, making the code unclear to read. Fix the indentation.
This appears to have been copied verbatim along with the rest of the
multiplication code to ieee754dp_maddf, now _dp_maddf, too so fix the
indentation there too.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/math-emu/dp_maddf.c | 2 +-
 arch/mips/math-emu/dp_mul.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index dc83fdd..4a2d03c 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -203,7 +203,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
 	if ((s64) rm < 0) {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
 		     ((rm << (DP_FBITS + 1 + 3)) != 0);
-			re++;
+		re++;
 	} else {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
 		     ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
diff --git a/arch/mips/math-emu/dp_mul.c b/arch/mips/math-emu/dp_mul.c
index d6a7573..87d0b44 100644
--- a/arch/mips/math-emu/dp_mul.c
+++ b/arch/mips/math-emu/dp_mul.c
@@ -163,7 +163,7 @@ union ieee754dp ieee754dp_mul(union ieee754dp x, union ieee754dp y)
 	if ((s64) rm < 0) {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
 		     ((rm << (DP_FBITS + 1 + 3)) != 0);
-			re++;
+		re++;
 	} else {
 		rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
 		     ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 10/11] MIPS: math-emu: Fix m{add,sub}.s shifts
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Adam Buchbinder, Maciej W. Rozycki, linux-kernel,
	Markos Chandras

The code in _sp_maddf (formerly ieee754sp_madd) appears to have been
copied verbatim from ieee754sp_add, and although it's adding the
unpacked "r" & "z" floats it kept using macros that operate on "x" &
"y". This led to the addition being carried out incorrectly on some
mismash of the product, accumulator & multiplicand fields. Typically
this would lead to the assertions "ze == re" & "ze <= SP_EMAX" failing
since ze & re hadn't been operated upon.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
---

 arch/mips/math-emu/ieee754sp.c |  3 ++-
 arch/mips/math-emu/ieee754sp.h | 16 +++++++---------
 arch/mips/math-emu/sp_add.c    |  6 ++++--
 arch/mips/math-emu/sp_maddf.c  | 13 ++++++++-----
 arch/mips/math-emu/sp_sub.c    |  6 ++++--
 5 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/mips/math-emu/ieee754sp.c b/arch/mips/math-emu/ieee754sp.c
index e0b2c45..2de0c09 100644
--- a/arch/mips/math-emu/ieee754sp.c
+++ b/arch/mips/math-emu/ieee754sp.c
@@ -138,7 +138,8 @@ union ieee754sp ieee754sp_format(int sn, int xe, unsigned xm)
 		} else {
 			/* sticky right shift es bits
 			 */
-			SPXSRSXn(es);
+			xm = XSPSRS(xm, es);
+			xe += es;
 			assert((xm & (SP_HIDDEN_BIT << 3)) == 0);
 			assert(xe == SP_EMIN);
 		}
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index b24fdff..8476067 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -46,19 +46,17 @@ static inline int ieee754sp_finite(union ieee754sp x)
 }
 
 /* 3bit extended single precision sticky right shift */
-#define SPXSRSXn(rs)							\
-	(xe += rs,							\
-	 xm = (rs > (SP_FBITS+3))?1:((xm) >> (rs)) | ((xm) << (32-(rs)) != 0))
+#define XSPSRS(v, rs)						\
+	((rs > (SP_FBITS+3))?1:((v) >> (rs)) | ((v) << (32-(rs)) != 0))
 
-#define SPXSRSX1() \
-	(xe++, (xm = (xm >> 1) | (xm & 1)))
+#define XSPSRS1(m) \
+	((m >> 1) | (m & 1))
 
-#define SPXSRSYn(rs)								\
-	(ye+=rs,								\
-	 ym = (rs > (SP_FBITS+3))?1:((ym) >> (rs)) | ((ym) << (32-(rs)) != 0))
+#define SPXSRSX1() \
+	(xe++, (xm = XSPSRS1(xm)))
 
 #define SPXSRSY1() \
-	(ye++, (ym = (ym >> 1) | (ym & 1)))
+	(ye++, (ym = XSPSRS1(ym)))
 
 /* convert denormal to normalized with extended exponent */
 #define SPDNORMx(m,e) \
diff --git a/arch/mips/math-emu/sp_add.c b/arch/mips/math-emu/sp_add.c
index f1c87b0..c55c0c0 100644
--- a/arch/mips/math-emu/sp_add.c
+++ b/arch/mips/math-emu/sp_add.c
@@ -132,13 +132,15 @@ union ieee754sp ieee754sp_add(union ieee754sp x, union ieee754sp y)
 		 * Have to shift y fraction right to align.
 		 */
 		s = xe - ye;
-		SPXSRSYn(s);
+		ym = XSPSRS(ym, s);
+		ye += s;
 	} else if (ye > xe) {
 		/*
 		 * Have to shift x fraction right to align.
 		 */
 		s = ye - xe;
-		SPXSRSXn(s);
+		xm = XSPSRS(xm, s);
+		xe += s;
 	}
 	assert(xe == ye);
 	assert(xe <= SP_EMAX);
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 86e1d0b..a8cd8b4 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -214,16 +214,18 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 
 	if (ze > re) {
 		/*
-		 * Have to shift y fraction right to align.
+		 * Have to shift r fraction right to align.
 		 */
 		s = ze - re;
-		SPXSRSYn(s);
+		rm = XSPSRS(rm, s);
+		re += s;
 	} else if (re > ze) {
 		/*
-		 * Have to shift x fraction right to align.
+		 * Have to shift z fraction right to align.
 		 */
 		s = re - ze;
-		SPXSRSYn(s);
+		zm = XSPSRS(zm, s);
+		ze += s;
 	}
 	assert(ze == re);
 	assert(ze <= SP_EMAX);
@@ -236,7 +238,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 		zm = zm + rm;
 
 		if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
-			SPXSRSX1();
+			zm = XSPSRS1(zm);
+			ze++;
 		}
 	} else {
 		if (zm >= rm) {
diff --git a/arch/mips/math-emu/sp_sub.c b/arch/mips/math-emu/sp_sub.c
index ec5f937..dc998ed 100644
--- a/arch/mips/math-emu/sp_sub.c
+++ b/arch/mips/math-emu/sp_sub.c
@@ -134,13 +134,15 @@ union ieee754sp ieee754sp_sub(union ieee754sp x, union ieee754sp y)
 		 * have to shift y fraction right to align
 		 */
 		s = xe - ye;
-		SPXSRSYn(s);
+		ym = XSPSRS(ym, s);
+		ye += s;
 	} else if (ye > xe) {
 		/*
 		 * have to shift x fraction right to align
 		 */
 		s = ye - xe;
-		SPXSRSXn(s);
+		xm = XSPSRS(xm, s);
+		xe += s;
 	}
 	assert(xe == ye);
 	assert(xe <= SP_EMAX);
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 10/11] MIPS: math-emu: Fix m{add,sub}.s shifts
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, Adam Buchbinder, Maciej W. Rozycki, linux-kernel,
	Markos Chandras

The code in _sp_maddf (formerly ieee754sp_madd) appears to have been
copied verbatim from ieee754sp_add, and although it's adding the
unpacked "r" & "z" floats it kept using macros that operate on "x" &
"y". This led to the addition being carried out incorrectly on some
mismash of the product, accumulator & multiplicand fields. Typically
this would lead to the assertions "ze == re" & "ze <= SP_EMAX" failing
since ze & re hadn't been operated upon.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
---

 arch/mips/math-emu/ieee754sp.c |  3 ++-
 arch/mips/math-emu/ieee754sp.h | 16 +++++++---------
 arch/mips/math-emu/sp_add.c    |  6 ++++--
 arch/mips/math-emu/sp_maddf.c  | 13 ++++++++-----
 arch/mips/math-emu/sp_sub.c    |  6 ++++--
 5 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/mips/math-emu/ieee754sp.c b/arch/mips/math-emu/ieee754sp.c
index e0b2c45..2de0c09 100644
--- a/arch/mips/math-emu/ieee754sp.c
+++ b/arch/mips/math-emu/ieee754sp.c
@@ -138,7 +138,8 @@ union ieee754sp ieee754sp_format(int sn, int xe, unsigned xm)
 		} else {
 			/* sticky right shift es bits
 			 */
-			SPXSRSXn(es);
+			xm = XSPSRS(xm, es);
+			xe += es;
 			assert((xm & (SP_HIDDEN_BIT << 3)) == 0);
 			assert(xe == SP_EMIN);
 		}
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index b24fdff..8476067 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -46,19 +46,17 @@ static inline int ieee754sp_finite(union ieee754sp x)
 }
 
 /* 3bit extended single precision sticky right shift */
-#define SPXSRSXn(rs)							\
-	(xe += rs,							\
-	 xm = (rs > (SP_FBITS+3))?1:((xm) >> (rs)) | ((xm) << (32-(rs)) != 0))
+#define XSPSRS(v, rs)						\
+	((rs > (SP_FBITS+3))?1:((v) >> (rs)) | ((v) << (32-(rs)) != 0))
 
-#define SPXSRSX1() \
-	(xe++, (xm = (xm >> 1) | (xm & 1)))
+#define XSPSRS1(m) \
+	((m >> 1) | (m & 1))
 
-#define SPXSRSYn(rs)								\
-	(ye+=rs,								\
-	 ym = (rs > (SP_FBITS+3))?1:((ym) >> (rs)) | ((ym) << (32-(rs)) != 0))
+#define SPXSRSX1() \
+	(xe++, (xm = XSPSRS1(xm)))
 
 #define SPXSRSY1() \
-	(ye++, (ym = (ym >> 1) | (ym & 1)))
+	(ye++, (ym = XSPSRS1(ym)))
 
 /* convert denormal to normalized with extended exponent */
 #define SPDNORMx(m,e) \
diff --git a/arch/mips/math-emu/sp_add.c b/arch/mips/math-emu/sp_add.c
index f1c87b0..c55c0c0 100644
--- a/arch/mips/math-emu/sp_add.c
+++ b/arch/mips/math-emu/sp_add.c
@@ -132,13 +132,15 @@ union ieee754sp ieee754sp_add(union ieee754sp x, union ieee754sp y)
 		 * Have to shift y fraction right to align.
 		 */
 		s = xe - ye;
-		SPXSRSYn(s);
+		ym = XSPSRS(ym, s);
+		ye += s;
 	} else if (ye > xe) {
 		/*
 		 * Have to shift x fraction right to align.
 		 */
 		s = ye - xe;
-		SPXSRSXn(s);
+		xm = XSPSRS(xm, s);
+		xe += s;
 	}
 	assert(xe == ye);
 	assert(xe <= SP_EMAX);
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 86e1d0b..a8cd8b4 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -214,16 +214,18 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 
 	if (ze > re) {
 		/*
-		 * Have to shift y fraction right to align.
+		 * Have to shift r fraction right to align.
 		 */
 		s = ze - re;
-		SPXSRSYn(s);
+		rm = XSPSRS(rm, s);
+		re += s;
 	} else if (re > ze) {
 		/*
-		 * Have to shift x fraction right to align.
+		 * Have to shift z fraction right to align.
 		 */
 		s = re - ze;
-		SPXSRSYn(s);
+		zm = XSPSRS(zm, s);
+		ze += s;
 	}
 	assert(ze == re);
 	assert(ze <= SP_EMAX);
@@ -236,7 +238,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
 		zm = zm + rm;
 
 		if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
-			SPXSRSX1();
+			zm = XSPSRS1(zm);
+			ze++;
 		}
 	} else {
 		if (zm >= rm) {
diff --git a/arch/mips/math-emu/sp_sub.c b/arch/mips/math-emu/sp_sub.c
index ec5f937..dc998ed 100644
--- a/arch/mips/math-emu/sp_sub.c
+++ b/arch/mips/math-emu/sp_sub.c
@@ -134,13 +134,15 @@ union ieee754sp ieee754sp_sub(union ieee754sp x, union ieee754sp y)
 		 * have to shift y fraction right to align
 		 */
 		s = xe - ye;
-		SPXSRSYn(s);
+		ym = XSPSRS(ym, s);
+		ye += s;
 	} else if (ye > xe) {
 		/*
 		 * have to shift x fraction right to align
 		 */
 		s = ye - xe;
-		SPXSRSXn(s);
+		xm = XSPSRS(xm, s);
+		xe += s;
 	}
 	assert(xe == ye);
 	assert(xe <= SP_EMAX);
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 11/11] MIPS: math-emu: Fix jalr emulation when rd == $0
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, stable # v3 . 10, Maciej W. Rozycki, linux-kernel,
	James Hogan, Markos Chandras

When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.

Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.")
Cc: stable <stable@vger.kernel.org> # v3.10
Signed-off-by: Paul Burton <paul.burton@imgtec.com>

---

 arch/mips/math-emu/cp1emu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index 85dd174..d96e912 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -445,9 +445,11 @@ static int isBranchInstr(struct pt_regs *regs, struct mm_decoded_insn dec_insn,
 	case spec_op:
 		switch (insn.r_format.func) {
 		case jalr_op:
-			regs->regs[insn.r_format.rd] =
-				regs->cp0_epc + dec_insn.pc_inc +
-				dec_insn.next_pc_inc;
+			if (insn.r_format.rd != 0) {
+				regs->regs[insn.r_format.rd] =
+					regs->cp0_epc + dec_insn.pc_inc +
+					dec_insn.next_pc_inc;
+			}
 			/* Fall through */
 		case jr_op:
 			/* For R6, JR already emulated in jalr_op */
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 11/11] MIPS: math-emu: Fix jalr emulation when rd == $0
@ 2016-04-21 13:04   ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-04-21 13:04 UTC (permalink / raw)
  To: linux-mips, Ralf Baechle
  Cc: Paul Burton, stable # v3 . 10, Maciej W. Rozycki, linux-kernel,
	James Hogan, Markos Chandras

When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.

Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.")
Cc: stable <stable@vger.kernel.org> # v3.10
Signed-off-by: Paul Burton <paul.burton@imgtec.com>

---

 arch/mips/math-emu/cp1emu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index 85dd174..d96e912 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -445,9 +445,11 @@ static int isBranchInstr(struct pt_regs *regs, struct mm_decoded_insn dec_insn,
 	case spec_op:
 		switch (insn.r_format.func) {
 		case jalr_op:
-			regs->regs[insn.r_format.rd] =
-				regs->cp0_epc + dec_insn.pc_inc +
-				dec_insn.next_pc_inc;
+			if (insn.r_format.rd != 0) {
+				regs->regs[insn.r_format.rd] =
+					regs->cp0_epc + dec_insn.pc_inc +
+					dec_insn.next_pc_inc;
+			}
 			/* Fall through */
 		case jr_op:
 			/* For R6, JR already emulated in jalr_op */
-- 
2.8.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
  2016-04-21 13:04   ` Paul Burton
  (?)
@ 2016-05-10  8:09   ` Ralf Baechle
  2016-05-10  8:43       ` Paul Burton
  -1 siblings, 1 reply; 28+ messages in thread
From: Ralf Baechle @ 2016-05-10  8:09 UTC (permalink / raw)
  To: Paul Burton
  Cc: linux-mips, Maciej W. Rozycki, linux-kernel, James Hogan,
	Markos Chandras

On Thu, Apr 21, 2016 at 02:04:45PM +0100, Paul Burton wrote:

> The conditions for branching when emulating the BC1EQZ & BC1NEZ
> instructions were backwards, leading to each of those instructions being
> treated as the other. Fix this by reversing the conditions, and clear up
> the code a little for readability & checkpatch.
> 
> Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> Reviewed-by: James Hogan <james.hogan@imgtec.com>

Patch ok - but you may want to take Markos off the cc list.  Or iff he's
still interested, is there any new address for him to use instead?

  Ralf

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
@ 2016-05-10  8:43       ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-05-10  8:43 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips, Maciej W. Rozycki, linux-kernel, James Hogan

On Tue, May 10, 2016 at 10:09:40AM +0200, Ralf Baechle wrote:
> On Thu, Apr 21, 2016 at 02:04:45PM +0100, Paul Burton wrote:
> 
> > The conditions for branching when emulating the BC1EQZ & BC1NEZ
> > instructions were backwards, leading to each of those instructions being
> > treated as the other. Fix this by reversing the conditions, and clear up
> > the code a little for readability & checkpatch.
> > 
> > Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Reviewed-by: James Hogan <james.hogan@imgtec.com>
> 
> Patch ok - but you may want to take Markos off the cc list.  Or iff he's
> still interested, is there any new address for him to use instead?
> 
>   Ralf

Hi Ralf,

It's a case of automated tooling cc'ing people from relevant git
history, without the knowledge that the email address is no longer
valid. I'll see if I can get it to filter him out next time.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
@ 2016-05-10  8:43       ` Paul Burton
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Burton @ 2016-05-10  8:43 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: linux-mips, Maciej W. Rozycki, linux-kernel, James Hogan

On Tue, May 10, 2016 at 10:09:40AM +0200, Ralf Baechle wrote:
> On Thu, Apr 21, 2016 at 02:04:45PM +0100, Paul Burton wrote:
> 
> > The conditions for branching when emulating the BC1EQZ & BC1NEZ
> > instructions were backwards, leading to each of those instructions being
> > treated as the other. Fix this by reversing the conditions, and clear up
> > the code a little for readability & checkpatch.
> > 
> > Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
> > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > Reviewed-by: James Hogan <james.hogan@imgtec.com>
> 
> Patch ok - but you may want to take Markos off the cc list.  Or iff he's
> still interested, is there any new address for him to use instead?
> 
>   Ralf

Hi Ralf,

It's a case of automated tooling cc'ing people from relevant git
history, without the knowledge that the email address is no longer
valid. I'll see if I can get it to filter him out next time.

Thanks,
    Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
  2016-05-10  8:43       ` Paul Burton
  (?)
@ 2016-05-10  8:54       ` Ralf Baechle
  -1 siblings, 0 replies; 28+ messages in thread
From: Ralf Baechle @ 2016-05-10  8:54 UTC (permalink / raw)
  To: Paul Burton; +Cc: linux-mips, Maciej W. Rozycki, linux-kernel, James Hogan

On Tue, May 10, 2016 at 09:43:15AM +0100, Paul Burton wrote:

> > > The conditions for branching when emulating the BC1EQZ & BC1NEZ
> > > instructions were backwards, leading to each of those instructions being
> > > treated as the other. Fix this by reversing the conditions, and clear up
> > > the code a little for readability & checkpatch.
> > > 
> > > Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
> > > Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> > > Reviewed-by: James Hogan <james.hogan@imgtec.com>
> > 
> > Patch ok - but you may want to take Markos off the cc list.  Or iff he's
> > still interested, is there any new address for him to use instead?
> > 
> >   Ralf
> 
> Hi Ralf,
> 
> It's a case of automated tooling cc'ing people from relevant git
> history, without the knowledge that the email address is no longer
> valid. I'll see if I can get it to filter him out next time.

I was guessing so - but hoping there was an updated address for him.
Changing the email address in the commit messages of all pending patches
is one of the easier tricks to do :)

  Ralf

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2016-05-10  8:54 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-21 13:04 [PATCH 00/11] FP emulation fixes Paul Burton
2016-04-21 13:04 ` Paul Burton
2016-04-21 13:04 ` [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-05-10  8:09   ` Ralf Baechle
2016-05-10  8:43     ` Paul Burton
2016-05-10  8:43       ` Paul Burton
2016-05-10  8:54       ` Ralf Baechle
2016-04-21 13:04 ` [PATCH 02/11] MIPS: Fix BC1{EQ,NE}Z return offset calculation Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 03/11] MIPS: inst: Declare fsel_op for sel.fmt instruction Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 04/11] MIPS: math-emu: Emulate MIPSr6 " Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 05/11] MIPS: math-emu: Unify ieee754sp_m{add,sub}f Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 06/11] MIPS: math-emu: Unify ieee754dp_m{add,sub}f Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 07/11] MIPS: math-emu: Add z argument macros Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 08/11] MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 09/11] MIPS: math-emu: Fix code indentation Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 10/11] MIPS: math-emu: Fix m{add,sub}.s shifts Paul Burton
2016-04-21 13:04   ` Paul Burton
2016-04-21 13:04 ` [PATCH 11/11] MIPS: math-emu: Fix jalr emulation when rd == $0 Paul Burton
2016-04-21 13:04   ` Paul Burton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.