linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions
@ 2020-10-11  5:09 Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-11  5:09 UTC (permalink / raw)
  To: mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

VSX vector paired instructions operate on an octword (32-byte)
operand, loading or storing between storage and a pair of two
sequential Vector-Scalar Registers (VSRs). There are 4 word
instructions and 2 prefixed instructions that provide these
32-byte storage access operations: lxvp, lxvpx, stxvp, stxvpx,
plxvp and pstxvp.

The emulation infrastructure does not support these instructions:
it can neither perform a 32-byte storage access nor operate on two
VSX registers at once. This patch series adds that emulation
support, along with test cases for the new instructions.
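
As a rough illustration of what "32-byte access into two VSX registers"
means, the sketch below splits a 32-byte memory image across a register
pair. It is a simplified user-space model, not the kernel code: struct
vsr128 and load_vsr_pair() are hypothetical names, the endianness is
passed in as a parameter, and the cross-endian byte reversal is left
out. Only the index selection is taken from patch 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 16-byte VSR; not the kernel's union vsx_reg. */
struct vsr128 {
	uint8_t bytes[16];
};

/*
 * Copy a 32-byte memory image into a pair of registers.
 * pair[0] models VSR[XTp], pair[1] models VSR[XTp + 1].
 * On little-endian the two halves are swapped, mirroring the
 * "j = IS_LE ? nr_vsx_regs - i - 1 : i" selection in patch 3.
 */
static void load_vsr_pair(struct vsr128 pair[2], const uint8_t mem[32],
			  bool is_le)
{
	const int nr_vsx_regs = 2;
	int i, j;

	assert(pair != NULL && mem != NULL);
	for (i = 0; i < nr_vsx_regs; i++) {
		j = is_le ? nr_vsx_regs - i - 1 : i;
		memcpy(pair[i].bytes, mem + 16 * j, 16);
	}
}
```

With is_le set, the first register of the pair receives the upper 16
bytes of the memory image and the second register the lower 16 bytes.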

v4: https://lore.kernel.org/r/20201008072726.233086-1-ravi.bangoria@linux.ibm.com

Changes in v5:
-------------
* Fix build breakage reported by the kernel test robot.
* Patch #2 is new. CONFIG_VSX check was missing for some VSX
  instructions. Patch #2 adds that check.

Changes in v4:
-------------
* Patch #1 is (kind of) new.
* Patch #2 now enables both analyse_instr() and emulate_step(),
  unlike the previous series where the two were in separate patches.
* Patch #2 also has an important fix for emulation on LE.
* Patch #3 and #4. Added XSP/XTP, D0/D1 instruction operands,
  removed *_EX_OP, __PPC_T[P][X] macros which are incorrect,
  and adhered to PPC_RAW_* convention.
* Added `CPU_FTR_ARCH_31` check in testcases to avoid failing
  on p8/p9.
* Some cosmetic changes.
* Rebased to powerpc/next

Changes in v3:
-------------
Worked on review comments and suggestions from Ravi and Naveen:

* Fix do_vsx_load() to handle VSX instructions when MSR_FP/MSR_VEC
  is cleared in exception conditions, in which case it reads/writes
  the thread_struct members fp_state/vr_state respectively.
* Fix wrongly used `__vector128 v[2]` in struct vsx_reg, as it should
  hold only a single VSX register.
* Remove unnecessary `VSX_CHECK_VEC` flag set and condition to check
  `VSX_LDLEFT` that is not applicable for these vsx instructions.
* Fix comments in emulate_vsx_load() that were misleading.
* Rebased on latest powerpc next branch.

Changes in v2:
-------------
* Apply Sandipan's suggestion: wrap ISA 3.1 instructions with a
  cpu_has_feature(CPU_FTR_ARCH_31) check.


Balamuruhan S (4):
  powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31
    is set
  powerpc/sstep: Support VSX vector paired storage access instructions
  powerpc/ppc-opcode: Add encoding macros for VSX vector paired
    instructions
  powerpc/sstep: Add testcases for VSX vector paired load/store
    instructions

Ravi Bangoria (1):
  powerpc/sstep: Cover new VSX instructions under CONFIG_VSX

 arch/powerpc/include/asm/ppc-opcode.h |  13 ++
 arch/powerpc/lib/sstep.c              | 160 ++++++++++++---
 arch/powerpc/lib/test_emulate_step.c  | 270 ++++++++++++++++++++++++++
 3 files changed, 421 insertions(+), 22 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
@ 2020-10-11  5:09 ` Ravi Bangoria
  2020-10-11 15:06   ` Sandipan Das
                     ` (2 more replies)
  2020-10-11  5:09 ` [PATCH v5 2/5] powerpc/sstep: Cover new VSX instructions under CONFIG_VSX Ravi Bangoria
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-11  5:09 UTC (permalink / raw)
  To: mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

Emulating prefixed instructions unconditionally allows them to be
emulated on Power10 predecessors, which might cause issues.
Restrict emulation to CPUs that have CPU_FTR_ARCH_31 set.

Fixes: 3920742b92f5 ("powerpc sstep: Add support for prefixed fixed-point arithmetic")
Fixes: 50b80a12e4cc ("powerpc sstep: Add support for prefixed load/stores")
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/lib/sstep.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e9dcaba9a4f8..e6242744c71b 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1346,6 +1346,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 	switch (opcode) {
 #ifdef __powerpc64__
 	case 1:
+		if (!cpu_has_feature(CPU_FTR_ARCH_31))
+			return -1;
+
 		prefix_r = GET_PREFIX_R(word);
 		ra = GET_PREFIX_RA(suffix);
 		rd = (suffix >> 21) & 0x1f;
@@ -2733,6 +2736,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		}
 		break;
 	case 1: /* Prefixed instructions */
+		if (!cpu_has_feature(CPU_FTR_ARCH_31))
+			return -1;
+
 		prefix_r = GET_PREFIX_R(word);
 		ra = GET_PREFIX_RA(suffix);
 		op->update_reg = ra;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 2/5] powerpc/sstep: Cover new VSX instructions under CONFIG_VSX
  2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
@ 2020-10-11  5:09 ` Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 3/5] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-11  5:09 UTC (permalink / raw)
  To: mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

Recently added Power10 prefixed VSX instructions are included
unconditionally in the kernel. If they are executed on a
machine without VSX support, it might create issues. Fix that.
Also fix one mnemonic spelling mistake in a comment.

Fixes: 50b80a12e4cc ("powerpc sstep: Add support for prefixed load/stores")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/lib/sstep.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e6242744c71b..faf0bbf3efb7 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2757,6 +2757,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			case 41:	/* plwa */
 				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
 				break;
+#ifdef CONFIG_VSX
 			case 42:        /* plxsd */
 				op->reg = rd + 32;
 				op->type = MKOP(LOAD_VSX, PREFIXED, 8);
@@ -2797,13 +2798,14 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 				op->element_size = 16;
 				op->vsx_flags = VSX_CHECK_VEC;
 				break;
+#endif /* CONFIG_VSX */
 			case 56:        /* plq */
 				op->type = MKOP(LOAD, PREFIXED, 16);
 				break;
 			case 57:	/* pld */
 				op->type = MKOP(LOAD, PREFIXED, 8);
 				break;
-			case 60:        /* stq */
+			case 60:        /* pstq */
 				op->type = MKOP(STORE, PREFIXED, 16);
 				break;
 			case 61:	/* pstd */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 3/5] powerpc/sstep: Support VSX vector paired storage access instructions
  2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 2/5] powerpc/sstep: Cover new VSX instructions under CONFIG_VSX Ravi Bangoria
@ 2020-10-11  5:09 ` Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 4/5] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions Ravi Bangoria
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-11  5:09 UTC (permalink / raw)
  To: mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

VSX Vector Paired instructions load/store an octword (32 bytes)
from/to storage into/from two sequential VSRs. Add emulation support
for these new instructions:
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)

Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
[kernel test robot reported a build failure]
Reported-by: kernel test robot <lkp@intel.com>
---
 arch/powerpc/lib/sstep.c | 150 +++++++++++++++++++++++++++++++++------
 1 file changed, 129 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index faf0bbf3efb7..96ca813a65e7 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -32,6 +32,10 @@ extern char system_call_vectored_emulate[];
 #define XER_OV32	0x00080000U
 #define XER_CA32	0x00040000U
 
+#ifdef CONFIG_VSX
+#define VSX_REGISTER_XTP(rd)   ((((rd) & 1) << 5) | ((rd) & 0xfe))
+#endif
+
 #ifdef CONFIG_PPC_FPU
 /*
  * Functions in ldstfp.S
@@ -279,6 +283,19 @@ static nokprobe_inline void do_byte_reverse(void *ptr, int nb)
 		up[1] = tmp;
 		break;
 	}
+	case 32: {
+		unsigned long *up = (unsigned long *)ptr;
+		unsigned long tmp;
+
+		tmp = byterev_8(up[0]);
+		up[0] = byterev_8(up[3]);
+		up[3] = tmp;
+		tmp = byterev_8(up[2]);
+		up[2] = byterev_8(up[1]);
+		up[1] = tmp;
+		break;
+	}
+
 #endif
 	default:
 		WARN_ON_ONCE(1);
@@ -709,6 +726,8 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
 	reg->d[0] = reg->d[1] = 0;
 
 	switch (op->element_size) {
+	case 32:
+		/* [p]lxvp[x] */
 	case 16:
 		/* whole vector; lxv[x] or lxvl[l] */
 		if (size == 0)
@@ -717,7 +736,7 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
 		if (IS_LE && (op->vsx_flags & VSX_LDLEFT))
 			rev = !rev;
 		if (rev)
-			do_byte_reverse(reg, 16);
+			do_byte_reverse(reg, size);
 		break;
 	case 8:
 		/* scalar loads, lxvd2x, lxvdsx */
@@ -793,6 +812,20 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
 	size = GETSIZE(op->type);
 
 	switch (op->element_size) {
+	case 32:
+		/* [p]stxvp[x] */
+		if (size == 0)
+			break;
+		if (rev) {
+			/* reverse 32 bytes */
+			buf.d[0] = byterev_8(reg->d[3]);
+			buf.d[1] = byterev_8(reg->d[2]);
+			buf.d[2] = byterev_8(reg->d[1]);
+			buf.d[3] = byterev_8(reg->d[0]);
+			reg = &buf;
+		}
+		memcpy(mem, reg, size);
+		break;
 	case 16:
 		/* stxv, stxvx, stxvl, stxvll */
 		if (size == 0)
@@ -861,28 +894,43 @@ static nokprobe_inline int do_vsx_load(struct instruction_op *op,
 				       bool cross_endian)
 {
 	int reg = op->reg;
-	u8 mem[16];
-	union vsx_reg buf;
+	int i, j, nr_vsx_regs;
+	u8 mem[32];
+	union vsx_reg buf[2];
 	int size = GETSIZE(op->type);
 
 	if (!address_ok(regs, ea, size) || copy_mem_in(mem, ea, size, regs))
 		return -EFAULT;
 
-	emulate_vsx_load(op, &buf, mem, cross_endian);
+	nr_vsx_regs = size / sizeof(__vector128);
+	emulate_vsx_load(op, buf, mem, cross_endian);
 	preempt_disable();
 	if (reg < 32) {
 		/* FP regs + extensions */
 		if (regs->msr & MSR_FP) {
-			load_vsrn(reg, &buf);
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				load_vsrn(reg + i, &buf[j].v);
+			}
 		} else {
-			current->thread.fp_state.fpr[reg][0] = buf.d[0];
-			current->thread.fp_state.fpr[reg][1] = buf.d[1];
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				current->thread.fp_state.fpr[reg + i][0] = buf[j].d[0];
+				current->thread.fp_state.fpr[reg + i][1] = buf[j].d[1];
+			}
 		}
 	} else {
-		if (regs->msr & MSR_VEC)
-			load_vsrn(reg, &buf);
-		else
-			current->thread.vr_state.vr[reg - 32] = buf.v;
+		if (regs->msr & MSR_VEC) {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				load_vsrn(reg + i, &buf[j].v);
+			}
+		} else {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				current->thread.vr_state.vr[reg - 32 + i] = buf[j].v;
+			}
+		}
 	}
 	preempt_enable();
 	return 0;
@@ -893,30 +941,45 @@ static nokprobe_inline int do_vsx_store(struct instruction_op *op,
 					bool cross_endian)
 {
 	int reg = op->reg;
-	u8 mem[16];
-	union vsx_reg buf;
+	int i, j, nr_vsx_regs;
+	u8 mem[32];
+	union vsx_reg buf[2];
 	int size = GETSIZE(op->type);
 
 	if (!address_ok(regs, ea, size))
 		return -EFAULT;
 
+	nr_vsx_regs = size / sizeof(__vector128);
 	preempt_disable();
 	if (reg < 32) {
 		/* FP regs + extensions */
 		if (regs->msr & MSR_FP) {
-			store_vsrn(reg, &buf);
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				store_vsrn(reg + i, &buf[j].v);
+			}
 		} else {
-			buf.d[0] = current->thread.fp_state.fpr[reg][0];
-			buf.d[1] = current->thread.fp_state.fpr[reg][1];
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				buf[j].d[0] = current->thread.fp_state.fpr[reg + i][0];
+				buf[j].d[1] = current->thread.fp_state.fpr[reg + i][1];
+			}
 		}
 	} else {
-		if (regs->msr & MSR_VEC)
-			store_vsrn(reg, &buf);
-		else
-			buf.v = current->thread.vr_state.vr[reg - 32];
+		if (regs->msr & MSR_VEC) {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				store_vsrn(reg + i, &buf[j].v);
+			}
+		} else {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				buf[j].v = current->thread.vr_state.vr[reg - 32 + i];
+			}
+		}
 	}
 	preempt_enable();
-	emulate_vsx_store(op, &buf, mem, cross_endian);
+	emulate_vsx_store(op, buf, mem, cross_endian);
 	return  copy_mem_out(mem, ea, size, regs);
 }
 #endif /* CONFIG_VSX */
@@ -2403,6 +2466,14 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->vsx_flags = VSX_SPLAT;
 			break;
 
+		case 333:       /* lxvpx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_31))
+				return -1;
+			op->reg = VSX_REGISTER_XTP(rd);
+			op->type = MKOP(LOAD_VSX, 0, 32);
+			op->element_size = 32;
+			break;
+
 		case 364:	/* lxvwsx */
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
@@ -2431,6 +2502,13 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 				VSX_CHECK_VEC;
 			break;
 		}
+		case 461:       /* stxvpx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_31))
+				return -1;
+			op->reg = VSX_REGISTER_XTP(rd);
+			op->type = MKOP(STORE_VSX, 0, 32);
+			op->element_size = 32;
+			break;
 		case 524:	/* lxsspx */
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
@@ -2672,6 +2750,22 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 #endif
 
 #ifdef CONFIG_VSX
+	case 6:
+		if (!cpu_has_feature(CPU_FTR_ARCH_31))
+			return -1;
+		op->ea = dqform_ea(word, regs);
+		op->reg = VSX_REGISTER_XTP(rd);
+		op->element_size = 32;
+		switch (word & 0xf) {
+		case 0:         /* lxvp */
+			op->type = MKOP(LOAD_VSX, 0, 32);
+			break;
+		case 1:         /* stxvp */
+			op->type = MKOP(STORE_VSX, 0, 32);
+			break;
+		}
+		break;
+
 	case 61:	/* stfdp, lxv, stxsd, stxssp, stxv */
 		switch (word & 7) {
 		case 0:		/* stfdp with LSB of DS field = 0 */
@@ -2805,12 +2899,26 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			case 57:	/* pld */
 				op->type = MKOP(LOAD, PREFIXED, 8);
 				break;
+#ifdef CONFIG_VSX
+			case 58:        /* plxvp */
+				op->reg = VSX_REGISTER_XTP(rd);
+				op->type = MKOP(LOAD_VSX, PREFIXED, 32);
+				op->element_size = 32;
+				break;
+#endif /* CONFIG_VSX */
 			case 60:        /* pstq */
 				op->type = MKOP(STORE, PREFIXED, 16);
 				break;
 			case 61:	/* pstd */
 				op->type = MKOP(STORE, PREFIXED, 8);
 				break;
+#ifdef CONFIG_VSX
+			case 62:        /* pstxvp */
+				op->reg = VSX_REGISTER_XTP(rd);
+				op->type = MKOP(STORE_VSX, PREFIXED, 32);
+				op->element_size = 32;
+				break;
+#endif /* CONFIG_VSX */
 			}
 			break;
 		case 1: /* Type 01 Eight-Byte Register-to-Register */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 4/5] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions
  2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
                   ` (2 preceding siblings ...)
  2020-10-11  5:09 ` [PATCH v5 3/5] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
@ 2020-10-11  5:09 ` Ravi Bangoria
  2020-10-11  5:09 ` [PATCH v5 5/5] powerpc/sstep: Add testcases for VSX vector paired load/store instructions Ravi Bangoria
  2020-12-15 10:49 ` [PATCH v5 0/5] powerpc/sstep: VSX 32-byte " Michael Ellerman
  5 siblings, 0 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-11  5:09 UTC (permalink / raw)
  To: mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

Add instruction encodings, along with the DQ, D0, D1 immediate and
XTP, XSP operand macros, for the new VSX vector paired instructions:
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)
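
As a worked example of the word-instruction encodings, the sketch below
rebuilds lxvp and stxvp in user space. XTP_FIELD() and DQ_FIELD() copy
the expressions of __PPC_XSP() and IMM_DQ() from this patch under
non-reserved names; RA_FIELD() is assumed to match the kernel's
___PPC_RA() (RA in bits 16..20). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors __PPC_XSP()/__PPC_XTP(): Tp into the upper 4 bits, TX into the LSB. */
#define XTP_FIELD(s)   ((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
/* Assumed to match ___PPC_RA(): RA occupies bits 16..20. */
#define RA_FIELD(a)    (((a) & 0x1f) << 16)
/* Mirrors IMM_DQ(): the low 4 bits of the displacement are dropped. */
#define DQ_FIELD(i)    ((i) & 0xfff0u)

/* lxvp XTp,DQ(RA): primary opcode 6, low bits 0 */
static uint32_t encode_lxvp(unsigned int xtp, unsigned int ra, unsigned int dq)
{
	assert(xtp < 64 && xtp % 2 == 0);
	return 0x18000000u | XTP_FIELD(xtp) | RA_FIELD(ra) | DQ_FIELD(dq);
}

/* stxvp XSp,DQ(RA): primary opcode 6, low bits 1 */
static uint32_t encode_stxvp(unsigned int xsp, unsigned int ra, unsigned int dq)
{
	assert(xsp < 64 && xsp % 2 == 0);
	return 0x18000001u | XTP_FIELD(xsp) | RA_FIELD(ra) | DQ_FIELD(dq);
}
```

With XTp=34, RA=4 and DQ=0, encode_lxvp() yields 0x18640000 and
encode_stxvp() yields 0x18640001.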

Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/include/asm/ppc-opcode.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..5e7918ca4fb7 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -78,6 +78,9 @@
 
 #define IMM_L(i)               ((uintptr_t)(i) & 0xffff)
 #define IMM_DS(i)              ((uintptr_t)(i) & 0xfffc)
+#define IMM_DQ(i)              ((uintptr_t)(i) & 0xfff0)
+#define IMM_D0(i)              (((uintptr_t)(i) >> 16) & 0x3ffff)
+#define IMM_D1(i)              IMM_L(i)
 
 /*
  * 16-bit immediate helper macros: HA() is for use with sign-extending instrs
@@ -295,6 +298,8 @@
 #define __PPC_XB(b)	((((b) & 0x1f) << 11) | (((b) & 0x20) >> 4))
 #define __PPC_XS(s)	((((s) & 0x1f) << 21) | (((s) & 0x20) >> 5))
 #define __PPC_XT(s)	__PPC_XS(s)
+#define __PPC_XSP(s)	((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
+#define __PPC_XTP(s)	__PPC_XSP(s)
 #define __PPC_T_TLB(t)	(((t) & 0x3) << 21)
 #define __PPC_WC(w)	(((w) & 0x3) << 21)
 #define __PPC_WS(w)	(((w) & 0x1f) << 11)
@@ -395,6 +400,14 @@
 #define PPC_RAW_XVCPSGNDP(t, a, b)	((0xf0000780 | VSX_XX3((t), (a), (b))))
 #define PPC_RAW_VPERMXOR(vrt, vra, vrb, vrc) \
 	((0x1000002d | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | (((vrc) & 0x1f) << 6)))
+#define PPC_RAW_LXVP(xtp, a, i)		(0x18000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_STXVP(xsp, a, i)	(0x18000001 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_LXVPX(xtp, a, b)	(0x7c00029a | __PPC_XTP(xtp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_STXVPX(xsp, a, b)	(0x7c00039a | __PPC_XSP(xsp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_PLXVP(xtp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xe8000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_D1(i)))
+#define PPC_RAW_PSTXVP(xsp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xf8000000 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_D1(i)))
 #define PPC_RAW_NAP			(0x4c000364)
 #define PPC_RAW_SLEEP			(0x4c0003a4)
 #define PPC_RAW_WINKLE			(0x4c0003e4)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 5/5] powerpc/sstep: Add testcases for VSX vector paired load/store instructions
  2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
                   ` (3 preceding siblings ...)
  2020-10-11  5:09 ` [PATCH v5 4/5] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions Ravi Bangoria
@ 2020-10-11  5:09 ` Ravi Bangoria
  2020-12-15 10:49 ` [PATCH v5 0/5] powerpc/sstep: VSX 32-byte " Michael Ellerman
  5 siblings, 0 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-11  5:09 UTC (permalink / raw)
  To: mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

Add testcases for VSX vector paired load/store instructions.
Sample output:

  emulate_step_test: lxvp           : PASS
  emulate_step_test: stxvp          : PASS
  emulate_step_test: lxvpx          : PASS
  emulate_step_test: stxvpx         : PASS
  emulate_step_test: plxvp          : PASS
  emulate_step_test: pstxvp         : PASS

Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/lib/test_emulate_step.c | 270 +++++++++++++++++++++++++++
 1 file changed, 270 insertions(+)

diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
index 0a201b771477..783d1b85ecfe 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -612,6 +612,273 @@ static void __init test_lxvd2x_stxvd2x(void)
 }
 #endif /* CONFIG_VSX */
 
+#ifdef CONFIG_VSX
+static void __init test_lxvp_stxvp(void)
+{
+	struct pt_regs regs;
+	union {
+		vector128 a;
+		u32 b[4];
+	} c[2];
+	u32 cached_b[8];
+	int stepped = -1;
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+		show_result("lxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		show_result("stxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		return;
+	}
+
+	init_pt_regs(&regs);
+
+	/*** lxvp ***/
+
+	cached_b[0] = c[0].b[0] = 18233;
+	cached_b[1] = c[0].b[1] = 34863571;
+	cached_b[2] = c[0].b[2] = 834;
+	cached_b[3] = c[0].b[3] = 6138911;
+	cached_b[4] = c[1].b[0] = 1234;
+	cached_b[5] = c[1].b[1] = 5678;
+	cached_b[6] = c[1].b[2] = 91011;
+	cached_b[7] = c[1].b[3] = 121314;
+
+	regs.gpr[4] = (unsigned long)&c[0].a;
+
+	/*
+	 * lxvp XTp,DQ(RA)
+	 * XTp = 32xTX + 2xTp
+	 * let TX=1 Tp=1 RA=4 DQ=0
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_LXVP(34, 4, 0)));
+
+	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("lxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("lxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("lxvp", "FAIL");
+	}
+
+	/*** stxvp ***/
+
+	c[0].b[0] = 21379463;
+	c[0].b[1] = 87;
+	c[0].b[2] = 374234;
+	c[0].b[3] = 4;
+	c[1].b[0] = 90;
+	c[1].b[1] = 122;
+	c[1].b[2] = 555;
+	c[1].b[3] = 32144;
+
+	/*
+	 * stxvp XSp,DQ(RA)
+	 * XSp = 32xSX + 2xSp
+	 * let SX=1 Sp=1 RA=4 DQ=0
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_STXVP(34, 4, 0)));
+
+	if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+	    cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+	    cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+	    cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+	    cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("stxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("stxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("stxvp", "FAIL");
+	}
+}
+#else
+static void __init test_lxvp_stxvp(void)
+{
+	show_result("lxvp", "SKIP (CONFIG_VSX is not set)");
+	show_result("stxvp", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_VSX
+static void __init test_lxvpx_stxvpx(void)
+{
+	struct pt_regs regs;
+	union {
+		vector128 a;
+		u32 b[4];
+	} c[2];
+	u32 cached_b[8];
+	int stepped = -1;
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+		show_result("lxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+		show_result("stxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+		return;
+	}
+
+	init_pt_regs(&regs);
+
+	/*** lxvpx ***/
+
+	cached_b[0] = c[0].b[0] = 18233;
+	cached_b[1] = c[0].b[1] = 34863571;
+	cached_b[2] = c[0].b[2] = 834;
+	cached_b[3] = c[0].b[3] = 6138911;
+	cached_b[4] = c[1].b[0] = 1234;
+	cached_b[5] = c[1].b[1] = 5678;
+	cached_b[6] = c[1].b[2] = 91011;
+	cached_b[7] = c[1].b[3] = 121314;
+
+	regs.gpr[3] = (unsigned long)&c[0].a;
+	regs.gpr[4] = 0;
+
+	/*
+	 * lxvpx XTp,RA,RB
+	 * XTp = 32xTX + 2xTp
+	 * let TX=1 Tp=1 RA=3 RB=4
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_LXVPX(34, 3, 4)));
+
+	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("lxvpx", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("lxvpx", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("lxvpx", "FAIL");
+	}
+
+	/*** stxvpx ***/
+
+	c[0].b[0] = 21379463;
+	c[0].b[1] = 87;
+	c[0].b[2] = 374234;
+	c[0].b[3] = 4;
+	c[1].b[0] = 90;
+	c[1].b[1] = 122;
+	c[1].b[2] = 555;
+	c[1].b[3] = 32144;
+
+	/*
+	 * stxvpx XSp,RA,RB
+	 * XSp = 32xSX + 2xSp
+	 * let SX=1 Sp=1 RA=3 RB=4
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_STXVPX(34, 3, 4)));
+
+	if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+	    cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+	    cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+	    cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+	    cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("stxvpx", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("stxvpx", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("stxvpx", "FAIL");
+	}
+}
+#else
+static void __init test_lxvpx_stxvpx(void)
+{
+	show_result("lxvpx", "SKIP (CONFIG_VSX is not set)");
+	show_result("stxvpx", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_VSX
+static void __init test_plxvp_pstxvp(void)
+{
+	struct ppc_inst instr;
+	struct pt_regs regs;
+	union {
+		vector128 a;
+		u32 b[4];
+	} c[2];
+	u32 cached_b[8];
+	int stepped = -1;
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+		show_result("plxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		show_result("pstxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		return;
+	}
+
+	/*** plxvp ***/
+
+	cached_b[0] = c[0].b[0] = 18233;
+	cached_b[1] = c[0].b[1] = 34863571;
+	cached_b[2] = c[0].b[2] = 834;
+	cached_b[3] = c[0].b[3] = 6138911;
+	cached_b[4] = c[1].b[0] = 1234;
+	cached_b[5] = c[1].b[1] = 5678;
+	cached_b[6] = c[1].b[2] = 91011;
+	cached_b[7] = c[1].b[3] = 121314;
+
+	init_pt_regs(&regs);
+	regs.gpr[3] = (unsigned long)&c[0].a;
+
+	/*
+	 * plxvp XTp,D(RA),R
+	 * XTp = 32xTX + 2xTp
+	 * let RA=3 R=0 D=d0||d1=0 R=0 Tp=1 TX=1
+	 */
+	instr = ppc_inst_prefix(PPC_RAW_PLXVP(34, 0, 3, 0) >> 32,
+			PPC_RAW_PLXVP(34, 0, 3, 0) & 0xffffffff);
+
+	stepped = emulate_step(&regs, instr);
+	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("plxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("plxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("plxvp", "FAIL");
+	}
+
+	/*** pstxvp ***/
+
+	c[0].b[0] = 21379463;
+	c[0].b[1] = 87;
+	c[0].b[2] = 374234;
+	c[0].b[3] = 4;
+	c[1].b[0] = 90;
+	c[1].b[1] = 122;
+	c[1].b[2] = 555;
+	c[1].b[3] = 32144;
+
+	/*
+	 * pstxvp XSp,D(RA),R
+	 * XSp = 32xSX + 2xSp
+	 * let RA=3 D=d0||d1=0 R=0 Sp=1 SX=1
+	 */
+	instr = ppc_inst_prefix(PPC_RAW_PSTXVP(34, 0, 3, 0) >> 32,
+			PPC_RAW_PSTXVP(34, 0, 3, 0) & 0xffffffff);
+
+	stepped = emulate_step(&regs, instr);
+
+	if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+	    cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+	    cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+	    cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+	    cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("pstxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("pstxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("pstxvp", "FAIL");
+	}
+}
+#else
+static void __init test_plxvp_pstxvp(void)
+{
+	show_result("plxvp", "SKIP (CONFIG_VSX is not set)");
+	show_result("pstxvp", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
 static void __init run_tests_load_store(void)
 {
 	test_ld();
@@ -628,6 +895,9 @@ static void __init run_tests_load_store(void)
 	test_plfd_pstfd();
 	test_lvx_stvx();
 	test_lxvd2x_stxvd2x();
+	test_lxvp_stxvp();
+	test_lxvpx_stxvpx();
+	test_plxvp_pstxvp();
 }
 
 struct compute_test {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
@ 2020-10-11 15:06   ` Sandipan Das
  2020-10-12  1:51   ` Daniel Axtens
  2020-10-12 13:44   ` Daniel Axtens
  2 siblings, 0 replies; 13+ messages in thread
From: Sandipan Das @ 2020-10-11 15:06 UTC (permalink / raw)
  To: Ravi Bangoria, bala24; +Cc: naveen.n.rao, linuxppc-dev, paulus, jniethe5


On 11/10/20 10:39 am, Ravi Bangoria wrote:
> From: Balamuruhan S <bala24@linux.ibm.com>
> 
> Unconditional emulation of prefixed instructions will allow
> emulation of them on Power10 predecessors which might cause
> issues. Restrict that.
> 
> Fixes: 3920742b92f5 ("powerpc sstep: Add support for prefixed fixed-point arithmetic")
> Fixes: 50b80a12e4cc ("powerpc sstep: Add support for prefixed load/stores")
> Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
>  arch/powerpc/lib/sstep.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index e9dcaba9a4f8..e6242744c71b 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1346,6 +1346,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  	switch (opcode) {
>  #ifdef __powerpc64__
>  	case 1:
> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +			return -1;
> +
>  		prefix_r = GET_PREFIX_R(word);
>  		ra = GET_PREFIX_RA(suffix);
>  		rd = (suffix >> 21) & 0x1f;
> @@ -2733,6 +2736,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  		}
>  		break;
>  	case 1: /* Prefixed instructions */
> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +			return -1;
> +
>  		prefix_r = GET_PREFIX_R(word);
>  		ra = GET_PREFIX_RA(suffix);
>  		op->update_reg = ra;
> 

LGTM

Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
  2020-10-11 15:06   ` Sandipan Das
@ 2020-10-12  1:51   ` Daniel Axtens
  2020-10-12 11:07     ` Ravi Bangoria
  2020-10-12 13:44   ` Daniel Axtens
  2 siblings, 1 reply; 13+ messages in thread
From: Daniel Axtens @ 2020-10-12  1:51 UTC (permalink / raw)
  To: Ravi Bangoria, mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

Hi,

Apologies if this has come up in a previous revision.


>  	case 1:
> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +			return -1;
> +
>  		prefix_r = GET_PREFIX_R(word);
>  		ra = GET_PREFIX_RA(suffix);

The comment above analyse_instr reads in part:

 * Return value is 1 if the instruction can be emulated just by
 * updating *regs with the information in *op, -1 if we need the
 * GPRs but *regs doesn't contain the full register set, or 0
 * otherwise.

I was wondering why returning -1 when the instruction isn't supported is
the right thing to do - it seemed to me that it should return 0?

I did look and see that there are other cases where the code returns -1
for an unsupported operation, e.g.:

#ifdef __powerpc64__
	case 4:
		if (!cpu_has_feature(CPU_FTR_ARCH_300))
			return -1;

		switch (word & 0x3f) {
		case 48:	/* maddhd */

That's from commit 930d6288a267 ("powerpc: sstep: Add support for
maddhd, maddhdu, maddld instructions"), but it's not explained there either.

I see the same pattern in a number of commits: commit 6324320de609
("powerpc sstep: Add support for modsd, modud instructions"), commit
6c180071509a ("powerpc sstep: Add support for modsw, moduw
instructions"), commit a23987ef267a ("powerpc: sstep: Add support for
darn instruction") and a few others, all of which seem to have come
through Sandipan in February 2019. I haven't spotted any explanation for
why -1 was chosen, but I haven't checked the mailing list archives.

If -1 is OK, would it be possible to update the comment to explain why?
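
To make the question concrete, here is a minimal standalone sketch of the
reading of the comment that I have in mind. It is not the kernel's
analyse_instr(): the function name, the enum names and the two boolean
parameters are hypothetical stand-ins (cpu_has_arch_31 replaces
cpu_has_feature(CPU_FTR_ARCH_31), regs_full replaces the check for a full
register set), and it assumes an unsupported instruction falls under the
"otherwise" case and so returns 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three documented return values. */
enum {
	MODEL_NEED_FULL_REGS = -1,	/* GPRs needed, *regs incomplete */
	MODEL_NOT_EMULATED   = 0,	/* "otherwise", incl. unsupported */
	MODEL_EMULATED       = 1,	/* done by updating *regs from *op */
};

/*
 * Toy model of the prefixed-instruction guard only. The parameters
 * are assumptions made for this sketch, not kernel interfaces.
 */
static int model_analyse_prefixed(uint32_t word, bool cpu_has_arch_31,
				  bool regs_full)
{
	uint32_t opcode = word >> 26;	/* primary opcode, top 6 bits */

	if (opcode != 1)
		return MODEL_NOT_EMULATED;	/* not a prefix word */
	if (!cpu_has_arch_31)
		return MODEL_NOT_EMULATED;	/* unsupported: 0, not -1 */
	if (!regs_full)
		return MODEL_NEED_FULL_REGS;
	return MODEL_EMULATED;
}
```

With this reading, -1 stays reserved for the "need the GPRs" case, so a
caller can still tell "retry with full registers" apart from "cannot
emulate at all".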

Kind regards,
Daniel


* Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-12  1:51   ` Daniel Axtens
@ 2020-10-12 11:07     ` Ravi Bangoria
  2020-10-12 12:55       ` Daniel Axtens
  0 siblings, 1 reply; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-12 11:07 UTC (permalink / raw)
  To: Daniel Axtens
  Cc: Ravi Bangoria, bala24, paulus, sandipan, jniethe5, naveen.n.rao,
	linuxppc-dev

Hi Daniel,

On 10/12/20 7:21 AM, Daniel Axtens wrote:
> Hi,
> 
> Apologies if this has come up in a previous revision.
> 
> 
>>   	case 1:
>> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
>> +			return -1;
>> +
>>   		prefix_r = GET_PREFIX_R(word);
>>   		ra = GET_PREFIX_RA(suffix);
> 
> The comment above analyse_instr reads in part:
> 
>   * Return value is 1 if the instruction can be emulated just by
>   * updating *regs with the information in *op, -1 if we need the
>   * GPRs but *regs doesn't contain the full register set, or 0
>   * otherwise.
> 
> I was wondering why returning -1 when the instruction isn't supported is
> the right thing to do - it seemed to me that it should return 0?
> 
> I did look and see that there are other cases where the code returns -1
> for an unsupported operation, e.g.:
> 
> #ifdef __powerpc64__
> 	case 4:
> 		if (!cpu_has_feature(CPU_FTR_ARCH_300))
> 			return -1;
> 
> 		switch (word & 0x3f) {
> 		case 48:	/* maddhd */
> 
> That's from commit 930d6288a267 ("powerpc: sstep: Add support for
> maddhd, maddhdu, maddld instructions"), but it's not explained there either
> 
> I see the same pattern in a number of commits: commit 6324320de609
> ("powerpc sstep: Add support for modsd, modud instructions"), commit
> 6c180071509a ("powerpc sstep: Add support for modsw, moduw
> instructions"), commit a23987ef267a ("powerpc: sstep: Add support for
> darn instruction") and a few others, all of which seem to have come
> through Sandipan in February 2019. I haven't spotted any explanation for
> why -1 was chosen, but I haven't checked the mailing list archives.
> 
> If -1 is OK, would it be possible to update the comment to explain why?

Agreed. As you rightly pointed out, there are many more such cases and, yes,
we are aware of this issue and it's being worked upon as a separate patch.
(Sandipan did send a fix patch internally some time back).

Thanks for pointing it out!
Ravi


* Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-12 11:07     ` Ravi Bangoria
@ 2020-10-12 12:55       ` Daniel Axtens
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Axtens @ 2020-10-12 12:55 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: Ravi Bangoria, bala24, paulus, sandipan, jniethe5, naveen.n.rao,
	linuxppc-dev

Ravi Bangoria <ravi.bangoria@linux.ibm.com> writes:

> Hi Daniel,
>
> On 10/12/20 7:21 AM, Daniel Axtens wrote:
>> Hi,
>> 
>> Apologies if this has come up in a previous revision.
>> 
>> 
>>>   	case 1:
>>> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
>>> +			return -1;
>>> +
>>>   		prefix_r = GET_PREFIX_R(word);
>>>   		ra = GET_PREFIX_RA(suffix);
>> 
>> The comment above analyse_instr reads in part:
>> 
>>   * Return value is 1 if the instruction can be emulated just by
>>   * updating *regs with the information in *op, -1 if we need the
>>   * GPRs but *regs doesn't contain the full register set, or 0
>>   * otherwise.
>> 
>> I was wondering why returning -1 when the instruction isn't supported is
>> the right thing to do - it seemed to me that it should return 0?
>> 
>> I did look and see that there are other cases where the code returns -1
>> for an unsupported operation, e.g.:
>> 
>> #ifdef __powerpc64__
>> 	case 4:
>> 		if (!cpu_has_feature(CPU_FTR_ARCH_300))
>> 			return -1;
>> 
>> 		switch (word & 0x3f) {
>> 		case 48:	/* maddhd */
>> 
>> That's from commit 930d6288a267 ("powerpc: sstep: Add support for
>> maddhd, maddhdu, maddld instructions"), but it's not explained there either
>> 
>> I see the same pattern in a number of commits: commit 6324320de609
>> ("powerpc sstep: Add support for modsd, modud instructions"), commit
>> 6c180071509a ("powerpc sstep: Add support for modsw, moduw
>> instructions"), commit a23987ef267a ("powerpc: sstep: Add support for
>> darn instruction") and a few others, all of which seem to have come
>> through Sandipan in February 2019. I haven't spotted any explanation for
>> why -1 was chosen, but I haven't checked the mailing list archives.
>> 
>> If -1 is OK, would it be possible to update the comment to explain why?
>
> Agreed. As you rightly pointed out, there are many more such cases and, yes,
> we are aware of this issue and it's being worked upon as a separate patch.
> (Sandipan did send a fix patch internally some time back).

That sounds like a perfectly reasonable approach.

I'd love to review the patch when it's sent - if you or Sandipan could
please cc: me so I don't miss it I'd really appreciate that.

I will proceed to review the rest of the patch and series.

Kind regards,
Daniel

>
> Thanks for pointing it out!
> Ravi


* Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
  2020-10-11 15:06   ` Sandipan Das
  2020-10-12  1:51   ` Daniel Axtens
@ 2020-10-12 13:44   ` Daniel Axtens
  2020-10-14  7:34     ` Ravi Bangoria
  2 siblings, 1 reply; 13+ messages in thread
From: Daniel Axtens @ 2020-10-12 13:44 UTC (permalink / raw)
  To: Ravi Bangoria, mpe
  Cc: ravi.bangoria, jniethe5, bala24, paulus, sandipan, naveen.n.rao,
	linuxppc-dev

Hi,

To review this, I looked through the supported instructions to see if
there were any that I thought might have been missed.

I didn't find any other v3.1 ones, although I don't have a v3.1 ISA to
hand so I was basically looking for instructions I didn't recognise.

I did, however, find a number of instructions that are new in ISA 3.0
that aren't guarded:

 - addpcis
 - lxvl/stxvl
 - lxvll/stxvll
 - lxvwsx
 - stxvx
 - lxsibzx
 - lxvh8x
 - lxsihzx
 - lxvb16x/stxvb16x
 - stxsibx
 - stxsihx
 - lxvb16x
 - lxsd/stxsd
 - lxssp/stxssp
 - lxv/stxv
 
Also, I don't know how bothered we are about P7, but if I'm reading the
ISA correctly, lqarx/stqcx. are not supported before ISA 2.07. Likewise
a number of the vector instructions like lxsiwzx and lxsiwax (and the
companion stores).

I realise it's not really the point of this particular patch, so I don't
think this should block acceptance. What I would like to know - and
maybe this is something where we need mpe to weigh in - is whether we
need consistent guards for 2.07 and 3.0. We have some 3.0 guards already
but clearly not everywhere.
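
As a rough illustration of what I mean by consistent guards, here is a
standalone sketch that keeps the minimum ISA level next to each
instruction instead of scattering ad-hoc checks. Everything in it is
hypothetical: the enum, the table and insn_supported() are made up for
this mail and are not existing sstep.c code. The table holds only a
handful of the instructions mentioned above, and the levels are the ones
I read out of the ISA, so they would need double-checking.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ISA levels, ordered so a plain comparison works. */
enum isa_level { ISA_V206, ISA_V207, ISA_V300, ISA_V31 };

struct insn_min_isa {
	const char *mnemonic;
	enum isa_level min_isa;	/* first ISA level with the insn */
};

/* Small excerpt only; a real table would cover every emulated insn. */
static const struct insn_min_isa insn_table[] = {
	{ "lqarx",   ISA_V207 },
	{ "stqcx.",  ISA_V207 },
	{ "lxsiwzx", ISA_V207 },
	{ "addpcis", ISA_V300 },
	{ "lxvl",    ISA_V300 },
	{ "stxvl",   ISA_V300 },
	{ "lxv",     ISA_V300 },
	{ "stxv",    ISA_V300 },
	{ "plxvp",   ISA_V31 },
	{ "pstxvp",  ISA_V31 },
};

/*
 * Returns 1 if the CPU level is new enough for the instruction,
 * 0 if it predates it or the mnemonic is not in the table.
 */
static int insn_supported(const char *mnemonic, enum isa_level cpu)
{
	size_t i;

	for (i = 0; i < sizeof(insn_table) / sizeof(insn_table[0]); i++) {
		if (strcmp(insn_table[i].mnemonic, mnemonic) == 0)
			return cpu >= insn_table[i].min_isa;
	}
	return 0;
}
```

In the real code the lookup would key on the decoded opcode rather than
a string, but the idea is the same: one place that says which ISA level
each emulated instruction needs.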

With all that said - the patch does what it says it does, and looks good
to me:

Reviewed-by: Daniel Axtens <dja@axtens.net>


Kind regards,
Daniel

> From: Balamuruhan S <bala24@linux.ibm.com>
>
> Unconditional emulation of prefixed instructions will allow
> emulation of them on Power10 predecessors which might cause
> issues. Restrict that.
>
> Fixes: 3920742b92f5 ("powerpc sstep: Add support for prefixed fixed-point arithmetic")
> Fixes: 50b80a12e4cc ("powerpc sstep: Add support for prefixed load/stores")
> Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
>  arch/powerpc/lib/sstep.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index e9dcaba9a4f8..e6242744c71b 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1346,6 +1346,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  	switch (opcode) {
>  #ifdef __powerpc64__
>  	case 1:
> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +			return -1;
> +
>  		prefix_r = GET_PREFIX_R(word);
>  		ra = GET_PREFIX_RA(suffix);
>  		rd = (suffix >> 21) & 0x1f;
> @@ -2733,6 +2736,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  		}
>  		break;
>  	case 1: /* Prefixed instructions */
> +		if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +			return -1;
> +
>  		prefix_r = GET_PREFIX_R(word);
>  		ra = GET_PREFIX_RA(suffix);
>  		op->update_reg = ra;
> -- 
> 2.26.2


* Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-12 13:44   ` Daniel Axtens
@ 2020-10-14  7:34     ` Ravi Bangoria
  0 siblings, 0 replies; 13+ messages in thread
From: Ravi Bangoria @ 2020-10-14  7:34 UTC (permalink / raw)
  To: Daniel Axtens
  Cc: Ravi Bangoria, bala24, paulus, sandipan, jniethe5, naveen.n.rao,
	linuxppc-dev

Hi Daniel,

On 10/12/20 7:14 PM, Daniel Axtens wrote:
> Hi,
> 
> To review this, I looked through the supported instructions to see if
> there were any that I thought might have been missed.
> 
> I didn't find any other v3.1 ones, although I don't have a v3.1 ISA to
> hand so I was basically looking for instructions I didn't recognise.
> 
> I did, however, find a number of instructions that are new in ISA 3.0
> that aren't guarded:
> 
>   - addpcis
>   - lxvl/stxvl
>   - lxvll/stxvll
>   - lxvwsx
>   - stxvx
>   - lxsibzx
>   - lxvh8x
>   - lxsihzx
>   - lxvb16x/stxvb16x
>   - stxsibx
>   - stxsihx
>   - lxvb16x
>   - lxsd/stxsd
>   - lxssp/stxssp
>   - lxv/stxv
>   
> Also, I don't know how bothered we are about P7, but if I'm reading the
> ISA correctly, lqarx/stqcx. are not supported before ISA 2.07. Likewise
> a number of the vector instructions like lxsiwzx and lxsiwax (and the
> companion stores).
> 
> I realise it's not really the point of this particular patch, so I don't
> think this should block acceptance. What I would like to know - and
> maybe this is something where we need mpe to weigh in - is whether we
> need consistent guards for 2.07 and 3.0. We have some 3.0 guards already
> but clearly not everywhere.

Yes, those need to be handled properly. Otherwise they can be harmful
when emulated on a non-supporting platform. I will work on it when I get
some time. Thanks for reporting it.

> 
> With all that said - the patch does what it says it does, and looks good
> to me:
> 
> Reviewed-by: Daniel Axtens <dja@axtens.net>

Thanks!
Ravi


* Re: [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions
  2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
                   ` (4 preceding siblings ...)
  2020-10-11  5:09 ` [PATCH v5 5/5] powerpc/sstep: Add testcases for VSX vector paired load/store instructions Ravi Bangoria
@ 2020-12-15 10:49 ` Michael Ellerman
  5 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2020-12-15 10:49 UTC (permalink / raw)
  To: mpe, Ravi Bangoria
  Cc: jniethe5, bala24, paulus, sandipan, naveen.n.rao, linuxppc-dev

On Sun, 11 Oct 2020 10:39:03 +0530, Ravi Bangoria wrote:
> VSX vector paired instructions operate on an octword (32-byte)
> operand for loads and stores between storage and a pair of
> sequential Vector-Scalar Registers (VSRs). There are 4 word
> instructions and 2 prefixed instructions that provide these
> 32-byte storage access operations - lxvp, lxvpx, stxvp, stxvpx,
> plxvp, pstxvp.
> 
> [...]

Applied to powerpc/next.

[1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
      https://git.kernel.org/powerpc/c/ef6879f8c8053cc3b493f400a06d452d7fb13650
[2/5] powerpc/sstep: Cover new VSX instructions under CONFIG_VSX
      https://git.kernel.org/powerpc/c/1817de2f141c718f1a0ae59927ec003e9b144349
[3/5] powerpc/sstep: Support VSX vector paired storage access instructions
      https://git.kernel.org/powerpc/c/af99da74333b06d97ea6a9a2cd8906244da4bb26
[4/5] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions
      https://git.kernel.org/powerpc/c/6ce73ba7691555fd182bc68529dc03cbd4378f72
[5/5] powerpc/sstep: Add testcases for VSX vector paired load/store instructions
      https://git.kernel.org/powerpc/c/35785b293da0070a8df19b0193f0e7de6c9eaecb

cheers


end of thread, other threads:[~2020-12-15 11:47 UTC | newest]

Thread overview: 13+ messages
2020-10-11  5:09 [PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
2020-10-11  5:09 ` [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
2020-10-11 15:06   ` Sandipan Das
2020-10-12  1:51   ` Daniel Axtens
2020-10-12 11:07     ` Ravi Bangoria
2020-10-12 12:55       ` Daniel Axtens
2020-10-12 13:44   ` Daniel Axtens
2020-10-14  7:34     ` Ravi Bangoria
2020-10-11  5:09 ` [PATCH v5 2/5] powerpc/sstep: Cover new VSX instructions under CONFIG_VSX Ravi Bangoria
2020-10-11  5:09 ` [PATCH v5 3/5] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
2020-10-11  5:09 ` [PATCH v5 4/5] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions Ravi Bangoria
2020-10-11  5:09 ` [PATCH v5 5/5] powerpc/sstep: Add testcases for VSX vector paired load/store instructions Ravi Bangoria
2020-12-15 10:49 ` [PATCH v5 0/5] powerpc/sstep: VSX 32-byte " Michael Ellerman
