LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions
@ 2020-10-08  7:27 Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 1/4] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Ravi Bangoria @ 2020-10-08  7:27 UTC (permalink / raw)
  To: mpe; +Cc: ravi.bangoria, bala24, paulus, sandipan, naveen.n.rao, linuxppc-dev

VSX vector paired instructions operates with octword (32-byte)
operand for loads and stores between storage and a pair of two
sequential Vector-Scalar Registers (VSRs). There are 4 word
instructions and 2 prefixed instructions that provides this
32-byte storage access operations - lxvp, lxvpx, stxvp, stxvpx,
plxvp, pstxvp.

Emulation infrastructure doesn't have support for these instructions,
to operate with 32-byte storage access and to operate with 2 VSX
registers. This patch series enables the instruction emulation
support and adds test cases for them respectively.

v3: https://lore.kernel.org/linuxppc-dev/20200731081637.1837559-1-bala24@linux.ibm.com/ 

Changes in v4:
-------------
* Patch #1 is (kind of) new.
* Patch #2 now enables both analyse_instr() and emulate_step()
  unlike prev series where both were in separate patches.
* Patch #2 also has important fix for emulation on LE.
* Patch #3 and #4. Added XSP/XTP, D0/D1 instruction operands,
  removed *_EX_OP, __PPC_T[P][X] macros which are incorrect,
  and adhered to PPC_RAW_* convention.
* Added `CPU_FTR_ARCH_31` check in testcases to avoid failing
  in p8/p9.
* Some consmetic changes.
* Rebased to powerpc/next

Changes in v3:
-------------
Worked on review comments and suggestions from Ravi and Naveen,

* Fix the do_vsx_load() to handle vsx instructions if MSR_FP/MSR_VEC
  cleared in exception conditions and it reaches to read/write to
  thread_struct member fp_state/vr_state respectively.
* Fix wrongly used `__vector128 v[2]` in struct vsx_reg as it should
  hold a single vsx register size.
* Remove unnecessary `VSX_CHECK_VEC` flag set and condition to check
  `VSX_LDLEFT` that is not applicable for these vsx instructions.
* Fix comments in emulate_vsx_load() that were misleading.
* Rebased on latest powerpc next branch.

Changes in v2:
-------------
* Fix suggestion from Sandipan, wrap ISA 3.1 instructions with
  cpu_has_feature(CPU_FTR_ARCH_31) check.
* Rebase on latest powerpc next branch.


Balamuruhan S (4):
  powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31
    is set
  powerpc/sstep: Support VSX vector paired storage access instructions
  powerpc/ppc-opcode: Add encoding macros for VSX vector paired
    instructions
  powerpc/sstep: Add testcases for VSX vector paired load/store
    instructions

 arch/powerpc/include/asm/ppc-opcode.h |  13 ++
 arch/powerpc/lib/sstep.c              | 152 +++++++++++++--
 arch/powerpc/lib/test_emulate_step.c  | 270 ++++++++++++++++++++++++++
 3 files changed, 414 insertions(+), 21 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v4 1/4] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set
  2020-10-08  7:27 [PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
@ 2020-10-08  7:27 ` Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Ravi Bangoria @ 2020-10-08  7:27 UTC (permalink / raw)
  To: mpe; +Cc: ravi.bangoria, bala24, paulus, sandipan, naveen.n.rao, linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

Unconditional emulation of prefixed instructions will allow
emulation of them on Power10 predecessors which might cause
issues. Restrict that.

Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/lib/sstep.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e9dcaba9a4f8..e6242744c71b 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1346,6 +1346,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 	switch (opcode) {
 #ifdef __powerpc64__
 	case 1:
+		if (!cpu_has_feature(CPU_FTR_ARCH_31))
+			return -1;
+
 		prefix_r = GET_PREFIX_R(word);
 		ra = GET_PREFIX_RA(suffix);
 		rd = (suffix >> 21) & 0x1f;
@@ -2733,6 +2736,9 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		}
 		break;
 	case 1: /* Prefixed instructions */
+		if (!cpu_has_feature(CPU_FTR_ARCH_31))
+			return -1;
+
 		prefix_r = GET_PREFIX_R(word);
 		ra = GET_PREFIX_RA(suffix);
 		op->update_reg = ra;
-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions
  2020-10-08  7:27 [PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 1/4] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
@ 2020-10-08  7:27 ` Ravi Bangoria
  2020-10-08 11:24   ` kernel test robot
  2020-10-08  7:27 ` [PATCH v4 3/4] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 4/4] powerpc/sstep: Add testcases for VSX vector paired load/store instructions Ravi Bangoria
  3 siblings, 1 reply; 6+ messages in thread
From: Ravi Bangoria @ 2020-10-08  7:27 UTC (permalink / raw)
  To: mpe; +Cc: ravi.bangoria, bala24, paulus, sandipan, naveen.n.rao, linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

VSX Vector Paired instructions loads/stores an octword (32 bytes)
from/to storage into two sequential VSRs. Add emulation support
for these new instructions:
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)

Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/lib/sstep.c | 146 +++++++++++++++++++++++++++++++++------
 1 file changed, 125 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e6242744c71b..e39ee1651636 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -32,6 +32,10 @@ extern char system_call_vectored_emulate[];
 #define XER_OV32	0x00080000U
 #define XER_CA32	0x00040000U
 
+#ifdef CONFIG_VSX
+#define VSX_REGISTER_XTP(rd)   ((((rd) & 1) << 5) | ((rd) & 0xfe))
+#endif
+
 #ifdef CONFIG_PPC_FPU
 /*
  * Functions in ldstfp.S
@@ -279,6 +283,19 @@ static nokprobe_inline void do_byte_reverse(void *ptr, int nb)
 		up[1] = tmp;
 		break;
 	}
+	case 32: {
+		unsigned long *up = (unsigned long *)ptr;
+		unsigned long tmp;
+
+		tmp = byterev_8(up[0]);
+		up[0] = byterev_8(up[3]);
+		up[3] = tmp;
+		tmp = byterev_8(up[2]);
+		up[2] = byterev_8(up[1]);
+		up[1] = tmp;
+		break;
+	}
+
 #endif
 	default:
 		WARN_ON_ONCE(1);
@@ -709,6 +726,8 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
 	reg->d[0] = reg->d[1] = 0;
 
 	switch (op->element_size) {
+	case 32:
+		/* [p]lxvp[x] */
 	case 16:
 		/* whole vector; lxv[x] or lxvl[l] */
 		if (size == 0)
@@ -717,7 +736,7 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
 		if (IS_LE && (op->vsx_flags & VSX_LDLEFT))
 			rev = !rev;
 		if (rev)
-			do_byte_reverse(reg, 16);
+			do_byte_reverse(reg, size);
 		break;
 	case 8:
 		/* scalar loads, lxvd2x, lxvdsx */
@@ -793,6 +812,20 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
 	size = GETSIZE(op->type);
 
 	switch (op->element_size) {
+	case 32:
+		/* [p]stxvp[x] */
+		if (size == 0)
+			break;
+		if (rev) {
+			/* reverse 32 bytes */
+			buf.d[0] = byterev_8(reg->d[3]);
+			buf.d[1] = byterev_8(reg->d[2]);
+			buf.d[2] = byterev_8(reg->d[1]);
+			buf.d[3] = byterev_8(reg->d[0]);
+			reg = &buf;
+		}
+		memcpy(mem, reg, size);
+		break;
 	case 16:
 		/* stxv, stxvx, stxvl, stxvll */
 		if (size == 0)
@@ -861,28 +894,43 @@ static nokprobe_inline int do_vsx_load(struct instruction_op *op,
 				       bool cross_endian)
 {
 	int reg = op->reg;
-	u8 mem[16];
-	union vsx_reg buf;
+	int i, j, nr_vsx_regs;
+	u8 mem[32];
+	union vsx_reg buf[2];
 	int size = GETSIZE(op->type);
 
 	if (!address_ok(regs, ea, size) || copy_mem_in(mem, ea, size, regs))
 		return -EFAULT;
 
-	emulate_vsx_load(op, &buf, mem, cross_endian);
+	nr_vsx_regs = size / sizeof(__vector128);
+	emulate_vsx_load(op, buf, mem, cross_endian);
 	preempt_disable();
 	if (reg < 32) {
 		/* FP regs + extensions */
 		if (regs->msr & MSR_FP) {
-			load_vsrn(reg, &buf);
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				load_vsrn(reg + i, &buf[j].v);
+			}
 		} else {
-			current->thread.fp_state.fpr[reg][0] = buf.d[0];
-			current->thread.fp_state.fpr[reg][1] = buf.d[1];
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				current->thread.fp_state.fpr[reg + i][0] = buf[j].d[0];
+				current->thread.fp_state.fpr[reg + i][1] = buf[j].d[1];
+			}
 		}
 	} else {
-		if (regs->msr & MSR_VEC)
-			load_vsrn(reg, &buf);
-		else
-			current->thread.vr_state.vr[reg - 32] = buf.v;
+		if (regs->msr & MSR_VEC) {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				load_vsrn(reg + i, &buf[j].v);
+			}
+		} else {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				current->thread.vr_state.vr[reg - 32 + i] = buf[j].v;
+			}
+		}
 	}
 	preempt_enable();
 	return 0;
@@ -893,30 +941,45 @@ static nokprobe_inline int do_vsx_store(struct instruction_op *op,
 					bool cross_endian)
 {
 	int reg = op->reg;
-	u8 mem[16];
-	union vsx_reg buf;
+	int i, j, nr_vsx_regs;
+	u8 mem[32];
+	union vsx_reg buf[2];
 	int size = GETSIZE(op->type);
 
 	if (!address_ok(regs, ea, size))
 		return -EFAULT;
 
+	nr_vsx_regs = size / sizeof(__vector128);
 	preempt_disable();
 	if (reg < 32) {
 		/* FP regs + extensions */
 		if (regs->msr & MSR_FP) {
-			store_vsrn(reg, &buf);
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				store_vsrn(reg + i, &buf[j].v);
+			}
 		} else {
-			buf.d[0] = current->thread.fp_state.fpr[reg][0];
-			buf.d[1] = current->thread.fp_state.fpr[reg][1];
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				buf[j].d[0] = current->thread.fp_state.fpr[reg + i][0];
+				buf[j].d[1] = current->thread.fp_state.fpr[reg + i][1];
+			}
 		}
 	} else {
-		if (regs->msr & MSR_VEC)
-			store_vsrn(reg, &buf);
-		else
-			buf.v = current->thread.vr_state.vr[reg - 32];
+		if (regs->msr & MSR_VEC) {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				store_vsrn(reg + i, &buf[j].v);
+			}
+		} else {
+			for (i = 0; i < nr_vsx_regs; i++) {
+				j = IS_LE ? nr_vsx_regs - i - 1 : i;
+				buf[j].v = current->thread.vr_state.vr[reg - 32 + i];
+			}
+		}
 	}
 	preempt_enable();
-	emulate_vsx_store(op, &buf, mem, cross_endian);
+	emulate_vsx_store(op, buf, mem, cross_endian);
 	return  copy_mem_out(mem, ea, size, regs);
 }
 #endif /* CONFIG_VSX */
@@ -2403,6 +2466,14 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->vsx_flags = VSX_SPLAT;
 			break;
 
+		case 333:       /* lxvpx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_31))
+				return -1;
+			op->reg = VSX_REGISTER_XTP(rd);
+			op->type = MKOP(LOAD_VSX, 0, 32);
+			op->element_size = 32;
+			break;
+
 		case 364:	/* lxvwsx */
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
@@ -2431,6 +2502,13 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 				VSX_CHECK_VEC;
 			break;
 		}
+		case 461:       /* stxvpx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_31))
+				return -1;
+			op->reg = VSX_REGISTER_XTP(rd);
+			op->type = MKOP(STORE_VSX, 0, 32);
+			op->element_size = 32;
+			break;
 		case 524:	/* lxsspx */
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
@@ -2672,6 +2750,22 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 #endif
 
 #ifdef CONFIG_VSX
+	case 6:
+		if (!cpu_has_feature(CPU_FTR_ARCH_31))
+			return -1;
+		op->ea = dqform_ea(word, regs);
+		op->reg = VSX_REGISTER_XTP(rd);
+		op->element_size = 32;
+		switch (word & 0xf) {
+		case 0:         /* lxvp */
+			op->type = MKOP(LOAD_VSX, 0, 32);
+			break;
+		case 1:         /* stxvp */
+			op->type = MKOP(STORE_VSX, 0, 32);
+			break;
+		}
+		break;
+
 	case 61:	/* stfdp, lxv, stxsd, stxssp, stxv */
 		switch (word & 7) {
 		case 0:		/* stfdp with LSB of DS field = 0 */
@@ -2803,12 +2897,22 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			case 57:	/* pld */
 				op->type = MKOP(LOAD, PREFIXED, 8);
 				break;
+			case 58:        /* plxvp */
+				op->reg = VSX_REGISTER_XTP(rd);
+				op->type = MKOP(LOAD_VSX, PREFIXED, 32);
+				op->element_size = 32;
+				break;
 			case 60:        /* stq */
 				op->type = MKOP(STORE, PREFIXED, 16);
 				break;
 			case 61:	/* pstd */
 				op->type = MKOP(STORE, PREFIXED, 8);
 				break;
+			case 62:        /* pstxvp */
+				op->reg = VSX_REGISTER_XTP(rd);
+				op->type = MKOP(STORE_VSX, PREFIXED, 32);
+				op->element_size = 32;
+				break;
 			}
 			break;
 		case 1: /* Type 01 Eight-Byte Register-to-Register */
-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v4 3/4] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions
  2020-10-08  7:27 [PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 1/4] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
@ 2020-10-08  7:27 ` Ravi Bangoria
  2020-10-08  7:27 ` [PATCH v4 4/4] powerpc/sstep: Add testcases for VSX vector paired load/store instructions Ravi Bangoria
  3 siblings, 0 replies; 6+ messages in thread
From: Ravi Bangoria @ 2020-10-08  7:27 UTC (permalink / raw)
  To: mpe; +Cc: ravi.bangoria, bala24, paulus, sandipan, naveen.n.rao, linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

Add instruction encodings, DQ, D0, D1 immediate, XTP, XSP operands as
macros for new VSX vector paired instructions,
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)

Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/include/asm/ppc-opcode.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..5e7918ca4fb7 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -78,6 +78,9 @@
 
 #define IMM_L(i)               ((uintptr_t)(i) & 0xffff)
 #define IMM_DS(i)              ((uintptr_t)(i) & 0xfffc)
+#define IMM_DQ(i)              ((uintptr_t)(i) & 0xfff0)
+#define IMM_D0(i)              (((uintptr_t)(i) >> 16) & 0x3ffff)
+#define IMM_D1(i)              IMM_L(i)
 
 /*
  * 16-bit immediate helper macros: HA() is for use with sign-extending instrs
@@ -295,6 +298,8 @@
 #define __PPC_XB(b)	((((b) & 0x1f) << 11) | (((b) & 0x20) >> 4))
 #define __PPC_XS(s)	((((s) & 0x1f) << 21) | (((s) & 0x20) >> 5))
 #define __PPC_XT(s)	__PPC_XS(s)
+#define __PPC_XSP(s)	((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
+#define __PPC_XTP(s)	__PPC_XSP(s)
 #define __PPC_T_TLB(t)	(((t) & 0x3) << 21)
 #define __PPC_WC(w)	(((w) & 0x3) << 21)
 #define __PPC_WS(w)	(((w) & 0x1f) << 11)
@@ -395,6 +400,14 @@
 #define PPC_RAW_XVCPSGNDP(t, a, b)	((0xf0000780 | VSX_XX3((t), (a), (b))))
 #define PPC_RAW_VPERMXOR(vrt, vra, vrb, vrc) \
 	((0x1000002d | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | (((vrc) & 0x1f) << 6)))
+#define PPC_RAW_LXVP(xtp, a, i)		(0x18000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_STXVP(xsp, a, i)	(0x18000001 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_LXVPX(xtp, a, b)	(0x7c00029a | __PPC_XTP(xtp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_STXVPX(xsp, a, b)	(0x7c00039a | __PPC_XSP(xsp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_PLXVP(xtp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xe8000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_D1(i)))
+#define PPC_RAW_PSTXVP(xsp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xf8000000 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_D1(i)))
 #define PPC_RAW_NAP			(0x4c000364)
 #define PPC_RAW_SLEEP			(0x4c0003a4)
 #define PPC_RAW_WINKLE			(0x4c0003e4)
-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v4 4/4] powerpc/sstep: Add testcases for VSX vector paired load/store instructions
  2020-10-08  7:27 [PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
                   ` (2 preceding siblings ...)
  2020-10-08  7:27 ` [PATCH v4 3/4] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions Ravi Bangoria
@ 2020-10-08  7:27 ` Ravi Bangoria
  3 siblings, 0 replies; 6+ messages in thread
From: Ravi Bangoria @ 2020-10-08  7:27 UTC (permalink / raw)
  To: mpe; +Cc: ravi.bangoria, bala24, paulus, sandipan, naveen.n.rao, linuxppc-dev

From: Balamuruhan S <bala24@linux.ibm.com>

Add testcases for VSX vector paired load/store instructions.
Sample o/p:

  emulate_step_test: lxvp           : PASS
  emulate_step_test: stxvp          : PASS
  emulate_step_test: lxvpx          : PASS
  emulate_step_test: stxvpx         : PASS
  emulate_step_test: plxvp          : PASS
  emulate_step_test: pstxvp         : PASS

Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/lib/test_emulate_step.c | 270 +++++++++++++++++++++++++++
 1 file changed, 270 insertions(+)

diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
index 0a201b771477..783d1b85ecfe 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -612,6 +612,273 @@ static void __init test_lxvd2x_stxvd2x(void)
 }
 #endif /* CONFIG_VSX */
 
+#ifdef CONFIG_VSX
+static void __init test_lxvp_stxvp(void)
+{
+	struct pt_regs regs;
+	union {
+		vector128 a;
+		u32 b[4];
+	} c[2];
+	u32 cached_b[8];
+	int stepped = -1;
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+		show_result("lxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		show_result("stxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		return;
+	}
+
+	init_pt_regs(&regs);
+
+	/*** lxvp ***/
+
+	cached_b[0] = c[0].b[0] = 18233;
+	cached_b[1] = c[0].b[1] = 34863571;
+	cached_b[2] = c[0].b[2] = 834;
+	cached_b[3] = c[0].b[3] = 6138911;
+	cached_b[4] = c[1].b[0] = 1234;
+	cached_b[5] = c[1].b[1] = 5678;
+	cached_b[6] = c[1].b[2] = 91011;
+	cached_b[7] = c[1].b[3] = 121314;
+
+	regs.gpr[4] = (unsigned long)&c[0].a;
+
+	/*
+	 * lxvp XTp,DQ(RA)
+	 * XTp = 32xTX + 2xTp
+	 * let TX=1 Tp=1 RA=4 DQ=0
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_LXVP(34, 4, 0)));
+
+	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("lxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("lxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("lxvp", "FAIL");
+	}
+
+	/*** stxvp ***/
+
+	c[0].b[0] = 21379463;
+	c[0].b[1] = 87;
+	c[0].b[2] = 374234;
+	c[0].b[3] = 4;
+	c[1].b[0] = 90;
+	c[1].b[1] = 122;
+	c[1].b[2] = 555;
+	c[1].b[3] = 32144;
+
+	/*
+	 * stxvp XSp,DQ(RA)
+	 * XSp = 32xSX + 2xSp
+	 * let SX=1 Sp=1 RA=4 DQ=0
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_STXVP(34, 4, 0)));
+
+	if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+	    cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+	    cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+	    cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+	    cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("stxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("stxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("stxvp", "FAIL");
+	}
+}
+#else
+static void __init test_lxvp_stxvp(void)
+{
+	show_result("lxvp", "SKIP (CONFIG_VSX is not set)");
+	show_result("stxvp", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_VSX
+static void __init test_lxvpx_stxvpx(void)
+{
+	struct pt_regs regs;
+	union {
+		vector128 a;
+		u32 b[4];
+	} c[2];
+	u32 cached_b[8];
+	int stepped = -1;
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+		show_result("lxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+		show_result("stxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+		return;
+	}
+
+	init_pt_regs(&regs);
+
+	/*** lxvpx ***/
+
+	cached_b[0] = c[0].b[0] = 18233;
+	cached_b[1] = c[0].b[1] = 34863571;
+	cached_b[2] = c[0].b[2] = 834;
+	cached_b[3] = c[0].b[3] = 6138911;
+	cached_b[4] = c[1].b[0] = 1234;
+	cached_b[5] = c[1].b[1] = 5678;
+	cached_b[6] = c[1].b[2] = 91011;
+	cached_b[7] = c[1].b[3] = 121314;
+
+	regs.gpr[3] = (unsigned long)&c[0].a;
+	regs.gpr[4] = 0;
+
+	/*
+	 * lxvpx XTp,RA,RB
+	 * XTp = 32xTX + 2xTp
+	 * let TX=1 Tp=1 RA=3 RB=4
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_LXVPX(34, 3, 4)));
+
+	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("lxvpx", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("lxvpx", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("lxvpx", "FAIL");
+	}
+
+	/*** stxvpx ***/
+
+	c[0].b[0] = 21379463;
+	c[0].b[1] = 87;
+	c[0].b[2] = 374234;
+	c[0].b[3] = 4;
+	c[1].b[0] = 90;
+	c[1].b[1] = 122;
+	c[1].b[2] = 555;
+	c[1].b[3] = 32144;
+
+	/*
+	 * stxvpx XSp,RA,RB
+	 * XSp = 32xSX + 2xSp
+	 * let SX=1 Sp=1 RA=3 RB=4
+	 */
+	stepped = emulate_step(&regs, ppc_inst(PPC_RAW_STXVPX(34, 3, 4)));
+
+	if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+	    cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+	    cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+	    cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+	    cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("stxvpx", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("stxvpx", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("stxvpx", "FAIL");
+	}
+}
+#else
+static void __init test_lxvpx_stxvpx(void)
+{
+	show_result("lxvpx", "SKIP (CONFIG_VSX is not set)");
+	show_result("stxvpx", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_VSX
+static void __init test_plxvp_pstxvp(void)
+{
+	struct ppc_inst instr;
+	struct pt_regs regs;
+	union {
+		vector128 a;
+		u32 b[4];
+	} c[2];
+	u32 cached_b[8];
+	int stepped = -1;
+
+	if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+		show_result("plxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		show_result("pstxvp", "SKIP (!CPU_FTR_ARCH_31)");
+		return;
+	}
+
+	/*** plxvp ***/
+
+	cached_b[0] = c[0].b[0] = 18233;
+	cached_b[1] = c[0].b[1] = 34863571;
+	cached_b[2] = c[0].b[2] = 834;
+	cached_b[3] = c[0].b[3] = 6138911;
+	cached_b[4] = c[1].b[0] = 1234;
+	cached_b[5] = c[1].b[1] = 5678;
+	cached_b[6] = c[1].b[2] = 91011;
+	cached_b[7] = c[1].b[3] = 121314;
+
+	init_pt_regs(&regs);
+	regs.gpr[3] = (unsigned long)&c[0].a;
+
+	/*
+	 * plxvp XTp,D(RA),R
+	 * XTp = 32xTX + 2xTp
+	 * let RA=3 R=0 D=d0||d1=0 R=0 Tp=1 TX=1
+	 */
+	instr = ppc_inst_prefix(PPC_RAW_PLXVP(34, 0, 3, 0) >> 32,
+			PPC_RAW_PLXVP(34, 0, 3, 0) & 0xffffffff);
+
+	stepped = emulate_step(&regs, instr);
+	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("plxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("plxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("plxvp", "FAIL");
+	}
+
+	/*** pstxvp ***/
+
+	c[0].b[0] = 21379463;
+	c[0].b[1] = 87;
+	c[0].b[2] = 374234;
+	c[0].b[3] = 4;
+	c[1].b[0] = 90;
+	c[1].b[1] = 122;
+	c[1].b[2] = 555;
+	c[1].b[3] = 32144;
+
+	/*
+	 * pstxvp XSp,D(RA),R
+	 * XSp = 32xSX + 2xSp
+	 * let RA=3 D=d0||d1=0 R=0 Sp=1 SX=1
+	 */
+	instr = ppc_inst_prefix(PPC_RAW_PSTXVP(34, 0, 3, 0) >> 32,
+			PPC_RAW_PSTXVP(34, 0, 3, 0) & 0xffffffff);
+
+	stepped = emulate_step(&regs, instr);
+
+	if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+	    cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+	    cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+	    cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+	    cpu_has_feature(CPU_FTR_VSX)) {
+		show_result("pstxvp", "PASS");
+	} else {
+		if (!cpu_has_feature(CPU_FTR_VSX))
+			show_result("pstxvp", "PASS (!CPU_FTR_VSX)");
+		else
+			show_result("pstxvp", "FAIL");
+	}
+}
+#else
+static void __init test_plxvp_pstxvp(void)
+{
+	show_result("plxvp", "SKIP (CONFIG_VSX is not set)");
+	show_result("pstxvp", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
 static void __init run_tests_load_store(void)
 {
 	test_ld();
@@ -628,6 +895,9 @@ static void __init run_tests_load_store(void)
 	test_plfd_pstfd();
 	test_lvx_stvx();
 	test_lxvd2x_stxvd2x();
+	test_lxvp_stxvp();
+	test_lxvpx_stxvpx();
+	test_plxvp_pstxvp();
 }
 
 struct compute_test {
-- 
2.26.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions
  2020-10-08  7:27 ` [PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
@ 2020-10-08 11:24   ` kernel test robot
  0 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2020-10-08 11:24 UTC (permalink / raw)
  To: Ravi Bangoria, mpe
  Cc: ravi.bangoria, kbuild-all, bala24, paulus, sandipan,
	naveen.n.rao, linuxppc-dev


[-- Attachment #1: Type: text/plain, Size: 6998 bytes --]

Hi Ravi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.9-rc8 next-20201007]
[cannot apply to mpe/next scottwood/next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Ravi-Bangoria/powerpc-sstep-VSX-32-byte-vector-paired-load-store-instructions/20201008-153614
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-g5_defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/55def6779849f9aec057f405abf1cd98a8674b4f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Ravi-Bangoria/powerpc-sstep-VSX-32-byte-vector-paired-load-store-instructions/20201008-153614
        git checkout 55def6779849f9aec057f405abf1cd98a8674b4f
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/lib/sstep.c: In function 'analyse_instr':
>> arch/powerpc/lib/sstep.c:2901:15: error: implicit declaration of function 'VSX_REGISTER_XTP'; did you mean 'H_REGISTER_SMR'? [-Werror=implicit-function-declaration]
    2901 |     op->reg = VSX_REGISTER_XTP(rd);
         |               ^~~~~~~~~~~~~~~~
         |               H_REGISTER_SMR
   cc1: all warnings being treated as errors

vim +2901 arch/powerpc/lib/sstep.c

  2815	
  2816	#ifdef __powerpc64__
  2817		case 62:	/* std[u] */
  2818			op->ea = dsform_ea(word, regs);
  2819			switch (word & 3) {
  2820			case 0:		/* std */
  2821				op->type = MKOP(STORE, 0, 8);
  2822				break;
  2823			case 1:		/* stdu */
  2824				op->type = MKOP(STORE, UPDATE, 8);
  2825				break;
  2826			case 2:		/* stq */
  2827				if (!(rd & 1))
  2828					op->type = MKOP(STORE, 0, 16);
  2829				break;
  2830			}
  2831			break;
  2832		case 1: /* Prefixed instructions */
  2833			if (!cpu_has_feature(CPU_FTR_ARCH_31))
  2834				return -1;
  2835	
  2836			prefix_r = GET_PREFIX_R(word);
  2837			ra = GET_PREFIX_RA(suffix);
  2838			op->update_reg = ra;
  2839			rd = (suffix >> 21) & 0x1f;
  2840			op->reg = rd;
  2841			op->val = regs->gpr[rd];
  2842	
  2843			suffixopcode = get_op(suffix);
  2844			prefixtype = (word >> 24) & 0x3;
  2845			switch (prefixtype) {
  2846			case 0: /* Type 00  Eight-Byte Load/Store */
  2847				if (prefix_r && ra)
  2848					break;
  2849				op->ea = mlsd_8lsd_ea(word, suffix, regs);
  2850				switch (suffixopcode) {
  2851				case 41:	/* plwa */
  2852					op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
  2853					break;
  2854				case 42:        /* plxsd */
  2855					op->reg = rd + 32;
  2856					op->type = MKOP(LOAD_VSX, PREFIXED, 8);
  2857					op->element_size = 8;
  2858					op->vsx_flags = VSX_CHECK_VEC;
  2859					break;
  2860				case 43:	/* plxssp */
  2861					op->reg = rd + 32;
  2862					op->type = MKOP(LOAD_VSX, PREFIXED, 4);
  2863					op->element_size = 8;
  2864					op->vsx_flags = VSX_FPCONV | VSX_CHECK_VEC;
  2865					break;
  2866				case 46:	/* pstxsd */
  2867					op->reg = rd + 32;
  2868					op->type = MKOP(STORE_VSX, PREFIXED, 8);
  2869					op->element_size = 8;
  2870					op->vsx_flags = VSX_CHECK_VEC;
  2871					break;
  2872				case 47:	/* pstxssp */
  2873					op->reg = rd + 32;
  2874					op->type = MKOP(STORE_VSX, PREFIXED, 4);
  2875					op->element_size = 8;
  2876					op->vsx_flags = VSX_FPCONV | VSX_CHECK_VEC;
  2877					break;
  2878				case 51:	/* plxv1 */
  2879					op->reg += 32;
  2880					fallthrough;
  2881				case 50:	/* plxv0 */
  2882					op->type = MKOP(LOAD_VSX, PREFIXED, 16);
  2883					op->element_size = 16;
  2884					op->vsx_flags = VSX_CHECK_VEC;
  2885					break;
  2886				case 55:	/* pstxv1 */
  2887					op->reg = rd + 32;
  2888					fallthrough;
  2889				case 54:	/* pstxv0 */
  2890					op->type = MKOP(STORE_VSX, PREFIXED, 16);
  2891					op->element_size = 16;
  2892					op->vsx_flags = VSX_CHECK_VEC;
  2893					break;
  2894				case 56:        /* plq */
  2895					op->type = MKOP(LOAD, PREFIXED, 16);
  2896					break;
  2897				case 57:	/* pld */
  2898					op->type = MKOP(LOAD, PREFIXED, 8);
  2899					break;
  2900				case 58:        /* plxvp */
> 2901					op->reg = VSX_REGISTER_XTP(rd);
  2902					op->type = MKOP(LOAD_VSX, PREFIXED, 32);
  2903					op->element_size = 32;
  2904					break;
  2905				case 60:        /* stq */
  2906					op->type = MKOP(STORE, PREFIXED, 16);
  2907					break;
  2908				case 61:	/* pstd */
  2909					op->type = MKOP(STORE, PREFIXED, 8);
  2910					break;
  2911				case 62:        /* pstxvp */
  2912					op->reg = VSX_REGISTER_XTP(rd);
  2913					op->type = MKOP(STORE_VSX, PREFIXED, 32);
  2914					op->element_size = 32;
  2915					break;
  2916				}
  2917				break;
  2918			case 1: /* Type 01 Eight-Byte Register-to-Register */
  2919				break;
  2920			case 2: /* Type 10 Modified Load/Store */
  2921				if (prefix_r && ra)
  2922					break;
  2923				op->ea = mlsd_8lsd_ea(word, suffix, regs);
  2924				switch (suffixopcode) {
  2925				case 32:	/* plwz */
  2926					op->type = MKOP(LOAD, PREFIXED, 4);
  2927					break;
  2928				case 34:	/* plbz */
  2929					op->type = MKOP(LOAD, PREFIXED, 1);
  2930					break;
  2931				case 36:	/* pstw */
  2932					op->type = MKOP(STORE, PREFIXED, 4);
  2933					break;
  2934				case 38:	/* pstb */
  2935					op->type = MKOP(STORE, PREFIXED, 1);
  2936					break;
  2937				case 40:	/* plhz */
  2938					op->type = MKOP(LOAD, PREFIXED, 2);
  2939					break;
  2940				case 42:	/* plha */
  2941					op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 2);
  2942					break;
  2943				case 44:	/* psth */
  2944					op->type = MKOP(STORE, PREFIXED, 2);
  2945					break;
  2946				case 48:        /* plfs */
  2947					op->type = MKOP(LOAD_FP, PREFIXED | FPCONV, 4);
  2948					break;
  2949				case 50:        /* plfd */
  2950					op->type = MKOP(LOAD_FP, PREFIXED, 8);
  2951					break;
  2952				case 52:        /* pstfs */
  2953					op->type = MKOP(STORE_FP, PREFIXED | FPCONV, 4);
  2954					break;
  2955				case 54:        /* pstfd */
  2956					op->type = MKOP(STORE_FP, PREFIXED, 8);
  2957					break;
  2958				}
  2959				break;
  2960			case 3: /* Type 11 Modified Register-to-Register */
  2961				break;
  2962			}
  2963	#endif /* __powerpc64__ */
  2964	
  2965		}
  2966	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 22196 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, back to index

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-08  7:27 [PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions Ravi Bangoria
2020-10-08  7:27 ` [PATCH v4 1/4] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set Ravi Bangoria
2020-10-08  7:27 ` [PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions Ravi Bangoria
2020-10-08 11:24   ` kernel test robot
2020-10-08  7:27 ` [PATCH v4 3/4] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions Ravi Bangoria
2020-10-08  7:27 ` [PATCH v4 4/4] powerpc/sstep: Add testcases for VSX vector paired load/store instructions Ravi Bangoria

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git