linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 00/18] Initial Prefixed Instruction support
@ 2019-11-26  5:21 Jordan Niethe
  2019-11-26  5:21 ` [PATCH 01/18] powerpc: Enable Prefixed Instructions Jordan Niethe
                   ` (18 more replies)
  0 siblings, 19 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

A future revision of the ISA will introduce prefixed instructions. A
prefixed instruction is composed of a 4-byte prefix followed by a
4-byte suffix.

All prefixes have the major opcode 1. A prefix will never be a valid
word instruction. A suffix may be an existing word instruction or a new
instruction.
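
As a rough illustration of the rule above, the length of an instruction
can be decided from its first word alone. This is a minimal user-space
sketch, not code from the series; the helper names major_opcode(),
word_is_prefix() and instr_len() are made up for the example.

```c
#include <stdint.h>

/* Primary (major) opcode: the top 6 bits of a 32-bit instruction word. */
static inline unsigned int major_opcode(uint32_t word)
{
	return word >> 26;
}

/* Per the cover letter, every prefix has major opcode 1. */
static inline int word_is_prefix(uint32_t word)
{
	return major_opcode(word) == 1;
}

/* Length in bytes of the instruction that starts with 'first_word'. */
static inline unsigned int instr_len(uint32_t first_word)
{
	return word_is_prefix(first_word) ? 8 : 4;
}
```

Because a prefix is never a valid word instruction, this check is
unambiguous: a word with major opcode 1 always starts an 8-byte
instruction.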

The new instruction formats are:
    * Eight-Byte Load/Store Instructions
    * Eight-Byte Register-to-Register Instructions
    * Modified Load/Store Instructions
    * Modified Register-to-Register Instructions

This series enables prefixed instructions and extends the instruction
emulation to support them. Then the places where prefixed instructions
might need to be emulated are updated.

A future series will add prefixed instruction support to guests running
in KVM.

Alistair Popple (1):
  powerpc: Enable Prefixed Instructions

Jordan Niethe (17):
  powerpc: Add BOUNDARY SRR1 bit for future ISA version
  powerpc: Add PREFIXED SRR1 bit for future ISA version
  powerpc: Rename Bit 35 of SRR1 to indicate new purpose
  powerpc sstep: Prepare to support prefixed instructions
  powerpc sstep: Add support for prefixed integer load/stores
  powerpc sstep: Add support for prefixed floating-point load/stores
  powerpc sstep: Add support for prefixed VSX load/stores
  powerpc sstep: Add support for prefixed fixed-point arithmetic
  powerpc: Support prefixed instructions in alignment handler
  powerpc/traps: Check for prefixed instructions in
    facility_unavailable_exception()
  powerpc/xmon: Add initial support for prefixed instructions
  powerpc/xmon: Dump prefixed instructions
  powerpc/kprobes: Support kprobes on prefixed instructions
  powerpc/uprobes: Add support for prefixed instructions
  powerpc/hw_breakpoints: Initial support for prefixed instructions
  powerpc: Add prefix support to mce_find_instr_ea_and_pfn()
  powerpc/fault: Use analyse_instr() to check for store with updates to
    sp

 arch/powerpc/include/asm/kprobes.h    |   5 +-
 arch/powerpc/include/asm/ppc-opcode.h |   3 +
 arch/powerpc/include/asm/reg.h        |   7 +-
 arch/powerpc/include/asm/sstep.h      |   8 +-
 arch/powerpc/include/asm/uaccess.h    |  30 +++++
 arch/powerpc/include/asm/uprobes.h    |  18 ++-
 arch/powerpc/kernel/align.c           |   8 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c     |  23 ++++
 arch/powerpc/kernel/hw_breakpoint.c   |   8 +-
 arch/powerpc/kernel/kprobes.c         |  46 +++++--
 arch/powerpc/kernel/mce_power.c       |   6 +-
 arch/powerpc/kernel/optprobes.c       |  31 +++--
 arch/powerpc/kernel/optprobes_head.S  |   6 +
 arch/powerpc/kernel/traps.c           |  18 ++-
 arch/powerpc/kernel/uprobes.c         |   4 +-
 arch/powerpc/kvm/book3s_hv_nested.c   |   2 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   2 +-
 arch/powerpc/kvm/emulate_loadstore.c  |   2 +-
 arch/powerpc/lib/sstep.c              | 180 +++++++++++++++++++++++++-
 arch/powerpc/lib/test_emulate_step.c  |  30 ++---
 arch/powerpc/mm/fault.c               |  39 ++----
 arch/powerpc/xmon/xmon.c              | 132 +++++++++++++++----
 22 files changed, 490 insertions(+), 118 deletions(-)

-- 
2.20.1



* [PATCH 01/18] powerpc: Enable Prefixed Instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 02/18] powerpc: Add BOUNDARY SRR1 bit for future ISA version Jordan Niethe
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair

From: Alistair Popple <alistair@popple.id.au>

Prefixed instructions have their own FSCR bit, which needs to be enabled
via a CPU feature. The kernel will save the FSCR for problem state, but
the bit needs to be enabled initially.
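
For readers unfamiliar with the _LG convention used in reg.h, here is a
small stand-alone sketch of how the bit number turns into a mask and how
the read-modify-write looks. It operates on a plain integer instead of
the real SPR, and fscr_enable_prefix() and fscr_prefix_enabled() are
hypothetical names used only for this example.

```c
#include <stdint.h>

#define FSCR_PREFIX_LG	13				/* bit number, from the patch */
#define FSCR_PREFIX	(1ULL << FSCR_PREFIX_LG)	/* mask form of the same bit */

/* Hypothetical helper: return 'fscr' with the prefix facility enabled. */
static inline uint64_t fscr_enable_prefix(uint64_t fscr)
{
	return fscr | FSCR_PREFIX;
}

/* Hypothetical helper: test whether the prefix facility bit is set. */
static inline int fscr_prefix_enabled(uint64_t fscr)
{
	return (fscr & FSCR_PREFIX) != 0;
}
```

In the patch itself the value is read with mfspr(), modified and written
back with mtspr(); the sketch only models the bit arithmetic.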

Signed-off-by: Alistair Popple <alistair@popple.id.au>
---
 arch/powerpc/include/asm/reg.h    |  3 +++
 arch/powerpc/kernel/dt_cpu_ftrs.c | 23 +++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 0b7900f194c8..521ecbe35507 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -397,6 +397,7 @@
 #define SPRN_RWMR	0x375	/* Region-Weighting Mode Register */
 
 /* HFSCR and FSCR bit numbers are the same */
+#define FSCR_PREFIX_LG	13	/* Enable Prefix Instructions */
 #define FSCR_SCV_LG	12	/* Enable System Call Vectored */
 #define FSCR_MSGP_LG	10	/* Enable MSGP */
 #define FSCR_TAR_LG	8	/* Enable Target Address Register */
@@ -408,11 +409,13 @@
 #define FSCR_VECVSX_LG	1	/* Enable VMX/VSX  */
 #define FSCR_FP_LG	0	/* Enable Floating Point */
 #define SPRN_FSCR	0x099	/* Facility Status & Control Register */
+#define   FSCR_PREFIX	__MASK(FSCR_PREFIX_LG)
 #define   FSCR_SCV	__MASK(FSCR_SCV_LG)
 #define   FSCR_TAR	__MASK(FSCR_TAR_LG)
 #define   FSCR_EBB	__MASK(FSCR_EBB_LG)
 #define   FSCR_DSCR	__MASK(FSCR_DSCR_LG)
 #define SPRN_HFSCR	0xbe	/* HV=1 Facility Status & Control Register */
+#define   HFSCR_PREFIX	__MASK(FSCR_PREFIX_LG)
 #define   HFSCR_MSGP	__MASK(FSCR_MSGP_LG)
 #define   HFSCR_TAR	__MASK(FSCR_TAR_LG)
 #define   HFSCR_EBB	__MASK(FSCR_EBB_LG)
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 180b3a5d1001..f5ca7dd8fbaf 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -553,6 +553,28 @@ static int __init feat_enable_large_ci(struct dt_cpu_feature *f)
 	return 1;
 }
 
+static int __init feat_enable_prefix(struct dt_cpu_feature *f)
+{
+	u64 fscr, hfscr;
+
+	if (f->usable_privilege & USABLE_HV) {
+		hfscr = mfspr(SPRN_HFSCR);
+		hfscr |= HFSCR_PREFIX;
+		mtspr(SPRN_HFSCR, hfscr);
+	}
+
+	if (f->usable_privilege & USABLE_OS) {
+		fscr = mfspr(SPRN_FSCR);
+		fscr |= FSCR_PREFIX;
+		mtspr(SPRN_FSCR, fscr);
+
+		if (f->usable_privilege & USABLE_PR)
+			current->thread.fscr |= FSCR_PREFIX;
+	}
+
+	return 1;
+}
+
 struct dt_cpu_feature_match {
 	const char *name;
 	int (*enable)(struct dt_cpu_feature *f);
@@ -626,6 +648,7 @@ static struct dt_cpu_feature_match __initdata
 	{"vector-binary128", feat_enable, 0},
 	{"vector-binary16", feat_enable, 0},
 	{"wait-v3", feat_enable, 0},
+	{"prefix-instructions", feat_enable_prefix, 0},
 };
 
 static bool __initdata using_dt_cpu_ftrs;
-- 
2.20.1



* [PATCH 02/18] powerpc: Add BOUNDARY SRR1 bit for future ISA version
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
  2019-11-26  5:21 ` [PATCH 01/18] powerpc: Enable Prefixed Instructions Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 03/18] powerpc: Add PREFIXED " Jordan Niethe
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Add the bit definition for when the cause of an alignment exception is a
prefixed instruction that crosses a 64-byte boundary.
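
To make the boundary condition concrete, the following sketch tests
whether an 8-byte instruction starting at a given address spans two
64-byte blocks. It is illustrative only; prefixed_crosses_64b() is a
name invented here and does not appear in the series.

```c
#include <stdint.h>

/*
 * An 8-byte instruction occupies addr .. addr + 7. It crosses a 64-byte
 * boundary when its first and last bytes fall in different 64-byte blocks.
 */
static inline int prefixed_crosses_64b(uint64_t addr)
{
	return (addr >> 6) != ((addr + 7) >> 6);
}
```

Since instructions are word aligned, the only offset within a 64-byte
block that triggers this is 60: the prefix is the last word of one block
and the suffix is the first word of the next.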

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/reg.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 521ecbe35507..6f9fcc3d4c82 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -777,6 +777,7 @@
 #define   SRR1_PROGADDR		0x00010000 /* SRR0 contains subsequent addr */
 
 #define   SRR1_MCE_MCP		0x00080000 /* Machine check signal caused interrupt */
+#define   SRR1_BOUNDARY		0x10000000 /* Prefixed instruction crosses 64-byte boundary */
 
 #define SPRN_HSRR0	0x13A	/* Save/Restore Register 0 */
 #define SPRN_HSRR1	0x13B	/* Save/Restore Register 1 */
-- 
2.20.1



* [PATCH 03/18] powerpc: Add PREFIXED SRR1 bit for future ISA version
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
  2019-11-26  5:21 ` [PATCH 01/18] powerpc: Enable Prefixed Instructions Jordan Niethe
  2019-11-26  5:21 ` [PATCH 02/18] powerpc: Add BOUNDARY SRR1 bit for future ISA version Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-12-18  8:23   ` Daniel Axtens
  2019-11-26  5:21 ` [PATCH 04/18] powerpc: Rename Bit 35 of SRR1 to indicate new purpose Jordan Niethe
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Add the bit definition for exceptions caused by prefixed instructions.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/reg.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 6f9fcc3d4c82..0a6d39fb4769 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -778,6 +778,7 @@
 
 #define   SRR1_MCE_MCP		0x00080000 /* Machine check signal caused interrupt */
 #define   SRR1_BOUNDARY		0x10000000 /* Prefixed instruction crosses 64-byte boundary */
+#define   SRR1_PREFIXED		0x20000000 /* Exception caused by prefixed instruction */
 
 #define SPRN_HSRR0	0x13A	/* Save/Restore Register 0 */
 #define SPRN_HSRR1	0x13B	/* Save/Restore Register 1 */
-- 
2.20.1



* [PATCH 04/18] powerpc: Rename Bit 35 of SRR1 to indicate new purpose
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (2 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 03/18] powerpc: Add PREFIXED " Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions Jordan Niethe
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Bit 35 of SRR1 is called SRR1_ISI_N_OR_G. This name comes from it being
used to indicate that an ISI was due to the access being no-exec or
guarded. A future ISA version adds another purpose: the bit is also set
if there is an access to a cache-inhibited location for a prefixed
instruction. Rename SRR1_ISI_N_OR_G to SRR1_ISI_N_G_OR_CIP to reflect
this new role.
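
"Bit 35" uses the ISA's numbering, where bit 0 is the most significant
bit of the 64-bit register. The sketch below shows how that maps to the
0x10000000 mask in reg.h; ppc_bit() is a hypothetical helper written for
this note, not something the patch adds.

```c
#include <stdint.h>

/*
 * Convert an ISA (MSB-0) bit number of a 64-bit register into a mask.
 * Bit 0 is the most significant bit, bit 63 the least significant.
 */
static inline uint64_t ppc_bit(unsigned int n)
{
	return 1ULL << (63 - n);
}
```

With this convention ppc_bit(35) is 1 << 28, which is the 0x10000000
value given to SRR1_ISI_N_G_OR_CIP.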

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/reg.h      | 2 +-
 arch/powerpc/kvm/book3s_hv_nested.c | 2 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 0a6d39fb4769..d3d8212603cb 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -750,7 +750,7 @@
 #define SPRN_SRR0	0x01A	/* Save/Restore Register 0 */
 #define SPRN_SRR1	0x01B	/* Save/Restore Register 1 */
 #define   SRR1_ISI_NOPT		0x40000000 /* ISI: Not found in hash */
-#define   SRR1_ISI_N_OR_G	0x10000000 /* ISI: Access is no-exec or G */
+#define   SRR1_ISI_N_G_OR_CIP	0x10000000 /* ISI: Access is no-exec or G or CI for a prefixed instruction */
 #define   SRR1_ISI_PROT		0x08000000 /* ISI: Other protection fault */
 #define   SRR1_WAKEMASK		0x00380000 /* reason for wakeup */
 #define   SRR1_WAKEMASK_P8	0x003c0000 /* reason for wakeup on POWER8 and 9 */
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index cdf30c6eaf54..32798ee76f27 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -1169,7 +1169,7 @@ static int kvmhv_translate_addr_nested(struct kvm_vcpu *vcpu,
 		} else if (vcpu->arch.trap == BOOK3S_INTERRUPT_H_INST_STORAGE) {
 			/* Can we execute? */
 			if (!gpte_p->may_execute) {
-				flags |= SRR1_ISI_N_OR_G;
+				flags |= SRR1_ISI_N_G_OR_CIP;
 				goto forward_to_l1;
 			}
 		} else {
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 220305454c23..b53a9f1c1a46 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -1260,7 +1260,7 @@ long kvmppc_hpte_hv_fault(struct kvm_vcpu *vcpu, unsigned long addr,
 	status &= ~DSISR_NOHPTE;	/* DSISR_NOHPTE == SRR1_ISI_NOPT */
 	if (!data) {
 		if (gr & (HPTE_R_N | HPTE_R_G))
-			return status | SRR1_ISI_N_OR_G;
+			return status | SRR1_ISI_N_G_OR_CIP;
 		if (!hpte_read_permission(pp, slb_v & key))
 			return status | SRR1_ISI_PROT;
 	} else if (status & DSISR_ISSTORE) {
-- 
2.20.1



* [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (3 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 04/18] powerpc: Rename Bit 35 of SRR1 to indicate new purpose Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-12-18  8:35   ` Daniel Axtens
                     ` (2 more replies)
  2019-11-26  5:21 ` [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores Jordan Niethe
                   ` (13 subsequent siblings)
  18 siblings, 3 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Currently all instructions are a single word long. A future ISA version
will include prefixed instructions which have a double word length. The
functions used for analysing and emulating instructions need to be
modified so that they can handle these new instruction types.

A prefixed instruction is a word prefix followed by a word suffix. All
prefixes uniquely have the primary op-code 1. Suffixes may be valid word
instructions or instructions that only exist as suffixes.

In handling prefixed instructions it will be convenient to treat the
suffix and prefix as separate words. To facilitate this, modify
analyse_instr() and emulate_step() to take a suffix as a parameter.
For word instructions it does not matter what is passed in
here - it will be ignored.
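
A caller that wants to supply a real suffix has to read the second word
only when the first one is a prefix. The sketch below models that on an
ordinary array instead of user memory; fetch_instr() and its bounds
handling are assumptions made for the example and are not part of the
series (the patch's own helpers are the __get_user_instr() macros).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read the instruction at word index 'idx' of 'mem' (length 'nwords').
 * '*sufx' is set to 0 for a word instruction and to the following word
 * for a prefixed one. Returns 0 on success, -1 if a read is out of range.
 */
static int fetch_instr(const uint32_t *mem, size_t nwords, size_t idx,
		       uint32_t *instr, uint32_t *sufx)
{
	*sufx = 0;
	if (idx >= nwords)
		return -1;
	*instr = mem[idx];

	/* Major opcode 1 marks a prefix, so a suffix word must follow. */
	if ((*instr >> 26) == 1) {
		if (idx + 1 >= nwords)
			return -1;
		*sufx = mem[idx + 1];
	}
	return 0;
}
```

The resulting pair can then be handed to a two-word interface of the
shape analyse_instr(op, regs, instr, sufx).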

We also define a new flag, PREFIXED, to be used in instruction_op:type.
This flag will indicate when emulating an analysed instruction if the
NIP should be advanced by word length or double word length.
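
The flag and the NIP update can be sketched outside the kernel as
follows. PREFIXED, SIZE() and GETSIZE() use the values from sstep.h;
DEMO_TYPE, make_type() and next_nip() are hypothetical and exist only
for this illustration, and the 32-bit truncation done by the real code
is left out.

```c
#include <stdint.h>

#define PREFIXED	0x800		/* flag value from the patch */
#define SIZE(n)		((n) << 12)	/* size field, as in sstep.h */
#define GETSIZE(w)	((w) >> 12)

/* Hypothetical stand-in for an instruction_type value such as LOAD. */
#define DEMO_TYPE	1

/* Combine a type, flag bits and an access size into one type word. */
static inline unsigned int make_type(unsigned int type, unsigned int flags,
				     unsigned int size)
{
	return type | flags | SIZE(size);
}

/* Advance past the instruction: 8 bytes if PREFIXED is set, else 4. */
static inline uint64_t next_nip(uint64_t nip, unsigned int type)
{
	return nip + ((type & PREFIXED) ? 8 : 4);
}
```

Bit 11 sits just below the size field that starts at bit 12, so ORing in
PREFIXED does not disturb what GETSIZE() returns.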

The callers of analyse_instr() and emulate_step() will need their own
changes to be able to support prefixed instructions. For now modify them
to pass in 0 as a suffix.

Note that at this point no prefixed instructions are emulated or
analysed - this is just making it possible to do so.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/ppc-opcode.h |  3 +++
 arch/powerpc/include/asm/sstep.h      |  8 +++++--
 arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
 arch/powerpc/kernel/align.c           |  2 +-
 arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
 arch/powerpc/kernel/kprobes.c         |  2 +-
 arch/powerpc/kernel/mce_power.c       |  2 +-
 arch/powerpc/kernel/optprobes.c       |  2 +-
 arch/powerpc/kernel/uprobes.c         |  2 +-
 arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
 arch/powerpc/lib/sstep.c              | 12 ++++++-----
 arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
 arch/powerpc/xmon/xmon.c              |  4 ++--
 13 files changed, 71 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index c1df75edde44..a1dfa4bdd22f 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -377,6 +377,9 @@
 #define PPC_INST_VCMPEQUD		0x100000c7
 #define PPC_INST_VCMPEQUB		0x10000006
 
+/* macro to check if a word is a prefix */
+#define IS_PREFIX(x) (((x) >> 26) == 1)
+
 /* macros to insert fields into opcodes */
 #define ___PPC_RA(a)	(((a) & 0x1f) << 16)
 #define ___PPC_RB(b)	(((b) & 0x1f) << 11)
diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
index 769f055509c9..6d4cb602e231 100644
--- a/arch/powerpc/include/asm/sstep.h
+++ b/arch/powerpc/include/asm/sstep.h
@@ -89,6 +89,9 @@ enum instruction_type {
 #define VSX_LDLEFT	4	/* load VSX register from left */
 #define VSX_CHECK_VEC	8	/* check MSR_VEC not MSR_VSX for reg >= 32 */
 
+/* Prefixed flag, ORed in with type */
+#define PREFIXED	0x800
+
 /* Size field in type word */
 #define SIZE(n)		((n) << 12)
 #define GETSIZE(w)	((w) >> 12)
@@ -132,7 +135,7 @@ union vsx_reg {
  * otherwise.
  */
 extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
-			 unsigned int instr);
+			 unsigned int instr, unsigned int sufx);
 
 /*
  * Emulate an instruction that can be executed just by updating
@@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
  * 0 if it could not be emulated, or -1 for an instruction that
  * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
  */
-extern int emulate_step(struct pt_regs *regs, unsigned int instr);
+extern int emulate_step(struct pt_regs *regs, unsigned int instr,
+			unsigned int sufx);
 
 /*
  * Emulate a load or store instruction by reading/writing the
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 15002b51ff18..bc585399e0c7 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
 extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
 			   size_t len);
 
+/*
+ * When reading an instruction, the suffix also needs to be loaded if (and
+ * only if) the first word is a prefix.
+ */
+#define __get_user_instr(x, y, ptr)			\
+({							\
+	long __gui_ret = 0;				\
+	y = 0;						\
+	__gui_ret = __get_user(x, ptr);			\
+	if (!__gui_ret) {				\
+		if (IS_PREFIX(x))			\
+			__gui_ret = __get_user(y, ptr + 1);	\
+	}						\
+							\
+	__gui_ret;					\
+})
+
+#define __get_user_instr_inatomic(x, y, ptr)		\
+({							\
+	long __gui_ret = 0;				\
+	y = 0;						\
+	__gui_ret = __get_user_inatomic(x, ptr);	\
+	if (!__gui_ret) {				\
+		if (IS_PREFIX(x))			\
+			__gui_ret = __get_user_inatomic(y, ptr + 1);	\
+	}						\
+							\
+	__gui_ret;					\
+})
+
 #endif	/* _ARCH_POWERPC_UACCESS_H */
diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
index 92045ed64976..245e79792a01 100644
--- a/arch/powerpc/kernel/align.c
+++ b/arch/powerpc/kernel/align.c
@@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
 	if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
 		return -EIO;
 
-	r = analyse_instr(&op, regs, instr);
+	r = analyse_instr(&op, regs, instr, 0);
 	if (r < 0)
 		return -EINVAL;
 
diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 58ce3d37c2a3..f4530961998c 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
 	if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
 		goto fail;
 
-	ret = analyse_instr(&op, regs, instr);
+	ret = analyse_instr(&op, regs, instr, 0);
 	type = GETTYPE(op.type);
 	size = GETSIZE(op.type);
 
@@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
 		return false;
 	}
 
-	if (!emulate_step(regs, instr))
+	if (!emulate_step(regs, instr, 0))
 		goto fail;
 
 	return true;
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 2d27ec4feee4..7303fe3856cc 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
 	unsigned int insn = *p->ainsn.insn;
 
 	/* regs->nip is also adjusted if emulate_step returns 1 */
-	ret = emulate_step(regs, insn);
+	ret = emulate_step(regs, insn, 0);
 	if (ret > 0) {
 		/*
 		 * Once this instruction has been boosted
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 1cbf7f1a4e3d..d862bb549158 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
 	if (pfn != ULONG_MAX) {
 		instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
 		instr = *(unsigned int *)(instr_addr);
-		if (!analyse_instr(&op, &tmp, instr)) {
+		if (!analyse_instr(&op, &tmp, instr, 0)) {
 			pfn = addr_to_pfn(regs, op.ea);
 			*addr = op.ea;
 			*phys_addr = (pfn << PAGE_SHIFT);
diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index 024f7aad1952..82dc8a589c87 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
 	 * and that can be emulated.
 	 */
 	if (!is_conditional_branch(*p->ainsn.insn) &&
-			analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
+			analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
 		emulate_update_regs(&regs, &op);
 		nip = regs.nip;
 	}
diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index 1cfef0e5fec5..ab1077dc6148 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
 	 * emulate_step() returns 1 if the insn was successfully emulated.
 	 * For all other cases, we need to single-step in hardware.
 	 */
-	ret = emulate_step(regs, auprobe->insn);
+	ret = emulate_step(regs, auprobe->insn, 0);
 	if (ret > 0)
 		return true;
 
diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
index 2e496eb86e94..fcab1f31b48d 100644
--- a/arch/powerpc/kvm/emulate_loadstore.c
+++ b/arch/powerpc/kvm/emulate_loadstore.c
@@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
 
 	emulated = EMULATE_FAIL;
 	vcpu->arch.regs.msr = vcpu->arch.shared->msr;
-	if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
+	if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
 		int type = op.type & INSTR_TYPE_MASK;
 		int size = GETSIZE(op.type);
 
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index c077acb983a1..ade3f5eba2e5 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
  * otherwise.
  */
 int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
-		  unsigned int instr)
+		  unsigned int instr, unsigned int sufx)
 {
 	unsigned int opcode, ra, rb, rc, rd, spr, u;
 	unsigned long int imm;
@@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
 {
 	unsigned long next_pc;
 
-	next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
+	next_pc = truncate_if_32bit(regs->msr,
+				    regs->nip + ((op->type & PREFIXED) ? 8 : 4));
 	switch (GETTYPE(op->type)) {
 	case COMPUTE:
 		if (op->type & SETREG)
@@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
  * or -1 if the instruction is one that should not be stepped,
  * such as an rfid, or a mtmsrd that would clear MSR_RI.
  */
-int emulate_step(struct pt_regs *regs, unsigned int instr)
+int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
 {
 	struct instruction_op op;
 	int r, err, type;
 	unsigned long val;
 	unsigned long ea;
 
-	r = analyse_instr(&op, regs, instr);
+	r = analyse_instr(&op, regs, instr, sufx);
 	if (r < 0)
 		return r;
 	if (r > 0) {
@@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
 	return 0;
 
  instr_done:
-	regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
+	regs->nip = truncate_if_32bit(regs->msr,
+				      regs->nip + ((op.type & PREFIXED) ? 8 : 4));
 	return 1;
 }
 NOKPROBE_SYMBOL(emulate_step);
diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
index 42347067739c..9288dc6fc715 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -103,7 +103,7 @@ static void __init test_ld(void)
 	regs.gpr[3] = (unsigned long) &a;
 
 	/* ld r5, 0(r3) */
-	stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
+	stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
 
 	if (stepped == 1 && regs.gpr[5] == a)
 		show_result("ld", "PASS");
@@ -121,7 +121,7 @@ static void __init test_lwz(void)
 	regs.gpr[3] = (unsigned long) &a;
 
 	/* lwz r5, 0(r3) */
-	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
+	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
 
 	if (stepped == 1 && regs.gpr[5] == a)
 		show_result("lwz", "PASS");
@@ -141,7 +141,7 @@ static void __init test_lwzx(void)
 	regs.gpr[5] = 0x8765;
 
 	/* lwzx r5, r3, r4 */
-	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
+	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
 	if (stepped == 1 && regs.gpr[5] == a[2])
 		show_result("lwzx", "PASS");
 	else
@@ -159,7 +159,7 @@ static void __init test_std(void)
 	regs.gpr[5] = 0x5678;
 
 	/* std r5, 0(r3) */
-	stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
+	stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
 	if (stepped == 1 || regs.gpr[5] == a)
 		show_result("std", "PASS");
 	else
@@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
 	regs.gpr[5] = 0x5678;
 
 	/* ldarx r5, r3, r4, 0 */
-	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
+	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
 
 	/*
 	 * Don't touch 'a' here. Touching 'a' can do Load/store
@@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
 	regs.gpr[5] = 0x9ABC;
 
 	/* stdcx. r5, r3, r4 */
-	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
+	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
 
 	/*
 	 * Two possible scenarios that indicates successful emulation
@@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
 	regs.gpr[4] = 0;
 
 	/* lfsx frt10, r3, r4 */
-	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
+	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
 
 	if (stepped == 1)
 		show_result("lfsx", "PASS");
@@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
 	c.a = 678.91;
 
 	/* stfsx frs10, r3, r4 */
-	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
+	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
 
 	if (stepped == 1 && c.b == cached_b)
 		show_result("stfsx", "PASS");
@@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
 	regs.gpr[4] = 0;
 
 	/* lfdx frt10, r3, r4 */
-	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
+	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
 
 	if (stepped == 1)
 		show_result("lfdx", "PASS");
@@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
 	c.a = 987654.32;
 
 	/* stfdx frs10, r3, r4 */
-	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
+	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
 
 	if (stepped == 1 && c.b == cached_b)
 		show_result("stfdx", "PASS");
@@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
 	regs.gpr[4] = 0;
 
 	/* lvx vrt10, r3, r4 */
-	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
+	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
 
 	if (stepped == 1)
 		show_result("lvx", "PASS");
@@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
 	c.b[3] = 498532;
 
 	/* stvx vrs10, r3, r4 */
-	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
+	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
 
 	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
 	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
@@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
 	regs.gpr[4] = 0;
 
 	/* lxvd2x vsr39, r3, r4 */
-	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
+	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
 
 	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
 		show_result("lxvd2x", "PASS");
@@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
 	c.b[3] = 4;
 
 	/* stxvd2x vsr39, r3, r4 */
-	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
+	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
 
 	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
 	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
@@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
 	if (!regs || !instr)
 		return -EINVAL;
 
-	if (analyse_instr(&op, regs, instr) != 1 ||
+	if (analyse_instr(&op, regs, instr, 0) != 1 ||
 	    GETTYPE(op.type) != COMPUTE) {
 		pr_info("emulation failed, instruction = 0x%08x\n", instr);
 		return -EFAULT;
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index a7056049709e..f47bd843dc52 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
 	if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
 		bp = at_breakpoint(regs->nip);
 		if (bp != NULL) {
-			int stepped = emulate_step(regs, bp->instr[0]);
+			int stepped = emulate_step(regs, bp->instr[0], 0);
 			if (stepped == 0) {
 				regs->nip = (unsigned long) &bp->instr[0];
 				atomic_inc(&bp->ref_count);
@@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
 	/* check we are in 64-bit kernel mode, translation enabled */
 	if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
 		if (mread(regs->nip, &instr, 4) == 4) {
-			stepped = emulate_step(regs, instr);
+			stepped = emulate_step(regs, instr, 0);
 			if (stepped < 0) {
 				printf("Couldn't single-step %s instruction\n",
 				       (IS_RFID(instr)? "rfid": "mtmsrd"));
-- 
2.20.1



* [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (4 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2020-01-10 10:38   ` Balamuruhan S
  2020-01-10 15:13   ` Balamuruhan S
  2019-11-26  5:21 ` [PATCH 07/18] powerpc sstep: Add support for prefixed floating-point load/stores Jordan Niethe
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

This adds emulation support for the following prefixed integer
load/stores:
  * Prefixed Load Byte and Zero (plbz)
  * Prefixed Load Halfword and Zero (plhz)
  * Prefixed Load Halfword Algebraic (plha)
  * Prefixed Load Word and Zero (plwz)
  * Prefixed Load Word Algebraic (plwa)
  * Prefixed Load Doubleword (pld)
  * Prefixed Store Byte (pstb)
  * Prefixed Store Halfword (psth)
  * Prefixed Store Word (pstw)
  * Prefixed Store Doubleword (pstd)
  * Prefixed Load Quadword (plq)
  * Prefixed Store Quadword (pstq)
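
All of these share one effective address calculation. The following is a
stand-alone model of it, following the field layout used by
mlsd_8lsd_ea() in the diff below: the low 18 bits of the prefix and the
low 16 bits of the suffix form a 34-bit signed displacement, and bit 20
of the prefix (R) selects NIP-relative addressing. The function names
and the plain gpr[] array are assumptions for this sketch, and the sign
extension is written differently from the patch.

```c
#include <stdint.h>

/* Build the 34-bit displacement and sign-extend it to 64 bits. */
static inline int64_t prefixed_disp(uint32_t prefix, uint32_t sufx)
{
	uint64_t d = ((uint64_t)(prefix & 0x3ffff) << 16) | (sufx & 0xffff);

	/* Sign-extend from bit 33: flip the sign bit, then subtract it. */
	return (int64_t)(d ^ (1ULL << 33)) - (int64_t)(1ULL << 33);
}

/*
 * Effective address for an MLS:D-form / 8LS:D-form instruction.
 * R = 0, RA != 0: base is gpr[RA].   R = 0, RA = 0: base is 0.
 * R = 1, RA = 0:  base is the NIP.   R = 1, RA != 0: invalid form, which
 * the caller is expected to reject; only the displacement is returned.
 */
static inline uint64_t prefixed_ea(uint32_t prefix, uint32_t sufx,
				   const uint64_t gpr[32], uint64_t nip)
{
	unsigned int r = (prefix >> 20) & 1;
	unsigned int ra = (sufx >> 16) & 0x1f;
	uint64_t ea = (uint64_t)prefixed_disp(prefix, sufx);

	if (!r && ra)
		ea += gpr[ra];
	else if (r && !ra)
		ea += nip;

	return ea;
}
```

For example, a prefix displacement field of 0x3ffff combined with a
suffix field of 0xfffc encodes -4, so with RA = 3 the address is
gpr[3] - 4.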

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/lib/sstep.c | 110 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index ade3f5eba2e5..4f5ad1f602d8 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned int instr,
 	return ea;
 }
 
+/*
+ * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
+ */
+static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
+						  unsigned int sufx,
+						  const struct pt_regs *regs)
+{
+	int ra, prefix_r;
+	unsigned int  dd;
+	unsigned long ea, d0, d1, d;
+
+	prefix_r = instr & (1ul << 20);
+	ra = (sufx >> 16) & 0x1f;
+
+	d0 = instr & 0x3ffff;
+	d1 = sufx & 0xffff;
+	d = (d0 << 16) | d1;
+
+	/*
+	 * sign extend a 34 bit number
+	 */
+	dd = (unsigned int) (d >> 2);
+	ea = (signed int) dd;
+	ea = (ea << 2) | (d & 0x3);
+
+	if (!prefix_r && ra)
+		ea += regs->gpr[ra];
+	else if (!prefix_r && !ra)
+		; /* Leave ea as is */
+	else if (prefix_r && !ra)
+		ea += regs->nip;
+	else if (prefix_r && ra)
+		; /* Invalid form. Should already be checked for by caller! */
+
+	return ea;
+}
+
 /*
  * Return the largest power of 2, not greater than sizeof(unsigned long),
  * such that x is a multiple of it.
@@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		  unsigned int instr, unsigned int sufx)
 {
 	unsigned int opcode, ra, rb, rc, rd, spr, u;
+	unsigned int sufxopcode, prefixtype, prefix_r;
 	unsigned long int imm;
 	unsigned long int val, val2;
 	unsigned int mb, me, sh;
@@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 	}
 
+/*
+ * Prefixed instructions
+ */
+	switch (opcode) {
+	case 1:
+		prefix_r = instr & (1ul << 20);
+		ra = (sufx >> 16) & 0x1f;
+		op->update_reg = ra;
+		rd = (sufx >> 21) & 0x1f;
+		op->reg = rd;
+		op->val = regs->gpr[rd];
+
+		sufxopcode = sufx >> 26;
+		prefixtype = (instr >> 24) & 0x3;
+		switch (prefixtype) {
+		case 0: /* Type 00  Eight-Byte Load/Store */
+			if (prefix_r && ra)
+				break;
+			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
+			switch (sufxopcode) {
+			case 41:	/* plwa */
+				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
+				break;
+			case 56:        /* plq */
+				op->type = MKOP(LOAD, PREFIXED, 16);
+				break;
+			case 57:	/* pld */
+				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
+				break;
+			case 60:        /* pstq */
+				op->type = MKOP(STORE, PREFIXED, 16);
+				break;
+			case 61:	/* pstd */
+				op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);
+				break;
+			}
+			break;
+		case 1: /* Type 01 Eight-Byte Register-to-Register */
+			break;
+		case 2: /* Type 10 Modified Load/Store */
+			if (prefix_r && ra)
+				break;
+			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
+			switch (sufxopcode) {
+			case 32:	/* plwz */
+				op->type = MKOP(LOAD, PREFIXED, 4);
+				break;
+			case 34:	/* plbz */
+				op->type = MKOP(LOAD, PREFIXED, 1);
+				break;
+			case 36:	/* pstw */
+				op->type = MKOP(STORE, PREFIXED, 4);
+				break;
+			case 38:	/* pstb */
+				op->type = MKOP(STORE, PREFIXED, 1);
+				break;
+			case 40:	/* plhz */
+				op->type = MKOP(LOAD, PREFIXED, 2);
+				break;
+			case 42:	/* plha */
+				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 2);
+				break;
+			case 44:	/* psth */
+				op->type = MKOP(STORE, PREFIXED, 2);
+				break;
+			}
+			break;
+		case 3: /* Type 11 Modified Register-to-Register */
+			break;
+		}
+	}
+
 #ifdef CONFIG_VSX
 	if ((GETTYPE(op->type) == LOAD_VSX ||
 	     GETTYPE(op->type) == STORE_VSX) &&
-- 
2.20.1



* [PATCH 07/18] powerpc sstep: Add support for prefixed floating-point load/stores
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (5 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 08/18] powerpc sstep: Add support for prefixed VSX load/stores Jordan Niethe
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

This adds emulation support for the following prefixed floating-point
load/stores:
  * Prefixed Load Floating-Point Single (plfs)
  * Prefixed Load Floating-Point Double (plfd)
  * Prefixed Store Floating-Point Single (pstfs)
  * Prefixed Store Floating-Point Double (pstfd)
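
The decode step for these instructions can be sketched as below. This is a
standalone illustration, not the kernel code: it assumes the field positions
used in analyse_instr() in this series (prefix type in bits 24-25 of the
prefix, major opcode in the top six bits of each word), and the helper names
are hypothetical.

```c
#include <stdint.h>

/* Major opcode: the top six bits of an instruction word. */
static unsigned int major_opcode(uint32_t word)
{
	return word >> 26;
}

/* Prefix type field, extracted as "(instr >> 24) & 0x3" in the series. */
static unsigned int prefix_type(uint32_t prefix)
{
	return (prefix >> 24) & 0x3;
}

/*
 * Access size in bytes for the type 2 (Modified Load/Store) floating-point
 * suffix opcodes listed above, or 0 if the pair is not one of them.
 */
static unsigned int prefixed_fp_size(uint32_t prefix, uint32_t suffix)
{
	if (major_opcode(prefix) != 1 || prefix_type(prefix) != 2)
		return 0;
	switch (major_opcode(suffix)) {
	case 48:	/* plfs */
	case 52:	/* pstfs */
		return 4;
	case 50:	/* plfd */
	case 54:	/* pstfd */
		return 8;
	default:
		return 0;
	}
}
```

The prefix type has to be checked as well as the suffix opcode, because the
same suffix opcode can name a different instruction under another prefix type.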

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/lib/sstep.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 4f5ad1f602d8..9113b9a21ae9 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2755,6 +2755,18 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			case 44:	/* psth */
 				op->type = MKOP(STORE, PREFIXED, 2);
 				break;
+			case 48:        /* plfs */
+				op->type = MKOP(LOAD_FP, PREFIXED | FPCONV, 4);
+				break;
+			case 50:        /* plfd */
+				op->type = MKOP(LOAD_FP, PREFIXED, 8);
+				break;
+			case 52:        /* pstfs */
+				op->type = MKOP(STORE_FP, PREFIXED | FPCONV, 4);
+				break;
+			case 54:        /* pstfd */
+				op->type = MKOP(STORE_FP, PREFIXED, 8);
+				break;
 			}
 			break;
 		case 3: /* Type 11 Modified Register-to-Register */
-- 
2.20.1



* [PATCH 08/18] powerpc sstep: Add support for prefixed VSX load/stores
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (6 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 07/18] powerpc sstep: Add support for prefixed floating-point load/stores Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-12-18 14:05   ` Daniel Axtens
  2019-11-26  5:21 ` [PATCH 09/18] powerpc sstep: Add support for prefixed fixed-point arithmetic Jordan Niethe
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

This adds emulation support for the following prefixed VSX load/stores:
  * Prefixed Load VSX Scalar Doubleword (plxsd)
  * Prefixed Load VSX Scalar Single-Precision (plxssp)
  * Prefixed Load VSX Vector [0|1]  (plxv, plxv0, plxv1)
  * Prefixed Store VSX Scalar Doubleword (pstxsd)
  * Prefixed Store VSX Scalar Single-Precision (pstxssp)
  * Prefixed Store VSX Vector [0|1] (pstxv, pstxv0, pstxv1)

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/lib/sstep.c | 42 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 9113b9a21ae9..9ae8d177b67f 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2713,6 +2713,48 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			case 41:	/* plwa */
 				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
 				break;
+			case 42:        /* plxsd */
+				op->reg = rd + 32;
+				op->type = MKOP(LOAD_VSX, PREFIXED, 8);
+				op->element_size = 8;
+				op->vsx_flags = VSX_CHECK_VEC;
+				break;
+			case 43:	/* plxssp */
+				op->reg = rd + 32;
+				op->type = MKOP(LOAD_VSX, PREFIXED, 4);
+				op->element_size = 8;
+				op->vsx_flags = VSX_FPCONV | VSX_CHECK_VEC;
+				break;
+			case 46:	/* pstxsd */
+				op->reg = rd + 32;
+				op->type = MKOP(STORE_VSX, PREFIXED, 8);
+				op->element_size = 8;
+				op->vsx_flags = VSX_CHECK_VEC;
+				break;
+			case 47:	/* pstxssp */
+				op->reg = rd + 32;
+				op->type = MKOP(STORE_VSX, PREFIXED, 4);
+				op->element_size = 8;
+				op->vsx_flags = VSX_FPCONV | VSX_CHECK_VEC;
+				break;
+			case 51:	/* plxv1 */
+				op->reg += 32;
+
+				/* fallthru */
+			case 50:	/* plxv0 */
+				op->type = MKOP(LOAD_VSX, PREFIXED, 16);
+				op->element_size = 16;
+				op->vsx_flags = VSX_CHECK_VEC;
+				break;
+			case 55:	/* pstxv1 */
+				op->reg = rd + 32;
+
+				/* fallthru */
+			case 54:	/* pstxv0 */
+				op->type = MKOP(STORE_VSX, PREFIXED, 16);
+				op->element_size = 16;
+				op->vsx_flags = VSX_CHECK_VEC;
+				break;
 			case 56:        /* plq */
 				op->type = MKOP(LOAD, PREFIXED, 16);
 				break;
-- 
2.20.1



* [PATCH 09/18] powerpc sstep: Add support for prefixed fixed-point arithmetic
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (7 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 08/18] powerpc sstep: Add support for prefixed VSX load/stores Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 10/18] powerpc: Support prefixed instructions in alignment handler Jordan Niethe
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

This adds emulation support for the following prefixed Fixed-Point
Arithmetic instructions:
  * Prefixed Add Immediate (paddi)

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/lib/sstep.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 9ae8d177b67f..1bb0c79cb774 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2776,6 +2776,10 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 				break;
 			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
 			switch (sufxopcode) {
+			case 14:	/* paddi */
+				op->type = COMPUTE | PREFIXED;
+				op->val = op->ea;
+				goto compute_done;
 			case 32:	/* plwz */
 				op->type = MKOP(LOAD, PREFIXED, 4);
 				break;
-- 
2.20.1



* [PATCH 10/18] powerpc: Support prefixed instructions in alignment handler
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (8 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 09/18] powerpc sstep: Add support for prefixed fixed-point arithmetic Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 11/18] powerpc/traps: Check for prefixed instructions in facility_unavailable_exception() Jordan Niethe
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Alignment interrupts can be caused by prefixed instructions accessing
memory. In the alignment handler the instruction that caused the
exception is loaded and emulation is attempted. If the instruction is a
prefixed instruction, load both the prefix and the suffix for emulation.
After emulating a prefixed instruction, increment the NIP by 8.

Prefixed instructions are not permitted to cross 64-byte boundaries. If
they do, the alignment interrupt is invoked with the SRR1 BOUNDARY bit
set. If this occurs in user mode, send a SIGBUS to the offending process.
If it occurs in kernel mode, call bad_page_fault().
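
The two rules above (advance by 4 or 8, and no crossing of 64-byte
boundaries) can be sketched in standalone C as follows. This is an
illustration only: the kernel relies on the SRR1 PREFIXED and BOUNDARY bits
set by hardware rather than recomputing these conditions, and the helper
names here are hypothetical. It assumes a prefix is identified by major
opcode 1, as stated in the cover letter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a prefix is recognised by major opcode 1. */
static bool is_prefix(uint32_t word)
{
	return (word >> 26) == 1;
}

/* Number of bytes to advance the NIP after emulating the instruction. */
static unsigned int insn_length(uint32_t first_word)
{
	return is_prefix(first_word) ? 8 : 4;
}

/*
 * True if an 8-byte prefixed instruction starting at the word-aligned
 * address addr would straddle a 64-byte boundary.
 */
static bool crosses_64_byte_boundary(uint64_t addr)
{
	return (addr & 63) + 8 > 64;
}
```

For word-aligned addresses the only offending offset within a 64-byte block
is 60, where the prefix sits in one block and the suffix in the next.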

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/kernel/align.c |  8 +++++---
 arch/powerpc/kernel/traps.c | 17 ++++++++++++++++-
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
index 245e79792a01..53493404c25c 100644
--- a/arch/powerpc/kernel/align.c
+++ b/arch/powerpc/kernel/align.c
@@ -293,7 +293,7 @@ static int emulate_spe(struct pt_regs *regs, unsigned int reg,
 
 int fix_alignment(struct pt_regs *regs)
 {
-	unsigned int instr;
+	unsigned int instr, sufx;
 	struct instruction_op op;
 	int r, type;
 
@@ -303,13 +303,15 @@ int fix_alignment(struct pt_regs *regs)
 	 */
 	CHECK_FULL_REGS(regs);
 
-	if (unlikely(__get_user(instr, (unsigned int __user *)regs->nip)))
+	if (unlikely(__get_user_instr(instr, sufx,
+				 (unsigned int __user *)regs->nip)))
 		return -EFAULT;
 	if ((regs->msr & MSR_LE) != (MSR_KERNEL & MSR_LE)) {
 		/* We don't handle PPC little-endian any more... */
 		if (cpu_has_feature(CPU_FTR_PPC_LE))
 			return -EIO;
 		instr = swab32(instr);
+		sufx = swab32(sufx);
 	}
 
 #ifdef CONFIG_SPE
@@ -334,7 +336,7 @@ int fix_alignment(struct pt_regs *regs)
 	if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
 		return -EIO;
 
-	r = analyse_instr(&op, regs, instr, 0);
+	r = analyse_instr(&op, regs, instr, sufx);
 	if (r < 0)
 		return -EINVAL;
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 014ff0701f24..8e262222f464 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -583,6 +583,8 @@ static inline int check_io_access(struct pt_regs *regs)
 #define REASON_ILLEGAL		(ESR_PIL | ESR_PUO)
 #define REASON_PRIVILEGED	ESR_PPR
 #define REASON_TRAP		ESR_PTR
+#define REASON_PREFIXED		0
+#define REASON_BOUNDARY		0
 
 /* single-step stuff */
 #define single_stepping(regs)	(current->thread.debug.dbcr0 & DBCR0_IC)
@@ -597,6 +599,8 @@ static inline int check_io_access(struct pt_regs *regs)
 #define REASON_ILLEGAL		SRR1_PROGILL
 #define REASON_PRIVILEGED	SRR1_PROGPRIV
 #define REASON_TRAP		SRR1_PROGTRAP
+#define REASON_PREFIXED		SRR1_PREFIXED
+#define REASON_BOUNDARY		SRR1_BOUNDARY
 
 #define single_stepping(regs)	((regs)->msr & MSR_SE)
 #define clear_single_step(regs)	((regs)->msr &= ~MSR_SE)
@@ -1593,11 +1597,20 @@ void alignment_exception(struct pt_regs *regs)
 {
 	enum ctx_state prev_state = exception_enter();
 	int sig, code, fixed = 0;
+	unsigned long  reason;
 
 	/* We restore the interrupt state now */
 	if (!arch_irq_disabled_regs(regs))
 		local_irq_enable();
 
+	reason = get_reason(regs);
+
+	if (reason & REASON_BOUNDARY) {
+		sig = SIGBUS;
+		code = BUS_ADRALN;
+		goto bad;
+	}
+
 	if (tm_abort_check(regs, TM_CAUSE_ALIGNMENT | TM_CAUSE_PERSISTENT))
 		goto bail;
 
@@ -1606,7 +1619,8 @@ void alignment_exception(struct pt_regs *regs)
 		fixed = fix_alignment(regs);
 
 	if (fixed == 1) {
-		regs->nip += 4;	/* skip over emulated instruction */
+		/* skip over emulated instruction */
+		regs->nip += (reason & REASON_PREFIXED) ? 8 : 4;
 		emulate_single_step(regs);
 		goto bail;
 	}
@@ -1619,6 +1633,7 @@ void alignment_exception(struct pt_regs *regs)
 		sig = SIGBUS;
 		code = BUS_ADRALN;
 	}
+bad:
 	if (user_mode(regs))
 		_exception(sig, regs, code, regs->dar);
 	else
-- 
2.20.1



* [PATCH 11/18] powerpc/traps: Check for prefixed instructions in facility_unavailable_exception()
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (9 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 10/18] powerpc: Support prefixed instructions in alignment handler Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 12/18] powerpc/xmon: Add initial support for prefixed instructions Jordan Niethe
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

If prefixed instructions are made unavailable by the [H]FSCR, attempting
to use them will cause a facility unavailable exception. Add "PREFIX" to
the facility_strings[].

Currently there are no prefixed instructions that are actually emulated
by emulate_instruction() within facility_unavailable_exception().
However, when caused by a prefixed instruction the SRR1 PREFIXED bit is
set. Prepare for dealing with emulated prefixed instructions by checking
for this bit.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/kernel/traps.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 8e262222f464..92057830b9b6 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -1726,6 +1726,7 @@ void facility_unavailable_exception(struct pt_regs *regs)
 		[FSCR_TAR_LG] = "TAR",
 		[FSCR_MSGP_LG] = "MSGP",
 		[FSCR_SCV_LG] = "SCV",
+		[FSCR_PREFIX_LG] = "PREFIX",
 	};
 	char *facility = "unknown";
 	u64 value;
-- 
2.20.1



* [PATCH 12/18] powerpc/xmon: Add initial support for prefixed instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (10 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 11/18] powerpc/traps: Check for prefixed instructions in facility_unavailable_exception() Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 13/18] powerpc/xmon: Dump " Jordan Niethe
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

A prefixed instruction is composed of a word prefix and a word suffix.
It does not make sense to be able to have a breakpoint on the suffix of
a prefixed instruction, so make this impossible.

When leaving xmon_core() we check to see if we are currently at a
breakpoint. If this is the case, we need to step past the breakpoint.
First emulate_step() is tried, but if this fails then we need
to execute the saved instruction out of line. The NIP is set to the
address of bpt::instr[] for the current breakpoint.  bpt::instr[]
contains the instruction replaced by the breakpoint, followed by a trap
instruction.  After bpt::instr[0] is executed and we hit the trap we
enter back into xmon_bpt(). We know that if we got here and the offset
indicates we are at bpt::instr[1] then we have just executed out of line
so we can put the NIP back to the instruction after the breakpoint
location and continue on.

Adding prefixed instructions complicates this as the bpt::instr[1] needs
to be used to hold the suffix. To deal with this make bpt::instr[] big
enough for three word instructions.  bpt::instr[2] contains the trap,
and in the case of word instructions pad bpt::instr[1] with a noop.
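
The resulting three-word layout can be sketched as below. This is a
standalone illustration rather than the xmon code: struct bpt_slot and
fill_bpt_slot() are hypothetical names, the trap and nop encodings are the
bpinstr and nopinstr values from the diff, and a prefix is assumed to be
identified by major opcode 1.

```c
#include <stdint.h>

#define TRAP_INSTR 0x7fe00008u	/* same value as bpinstr in xmon.c */
#define NOP_INSTR  0x60000000u	/* same value as nopinstr in the patch */

/* Hypothetical stand-in for bpt::instr[]; not the kernel structure. */
struct bpt_slot {
	uint32_t instr[3];
};

/*
 * Copy the instruction at text[] into the slot. A word instruction is
 * padded with a nop so the trap always sits in instr[2].
 */
static void fill_bpt_slot(struct bpt_slot *slot, const uint32_t *text)
{
	slot->instr[0] = text[0];
	/* Only read the second word when the first one is a prefix. */
	slot->instr[1] = ((text[0] >> 26) == 1) ? text[1] : NOP_INSTR;
	slot->instr[2] = TRAP_INSTR;
}
```

Keeping the trap at a fixed index means the code that recognises "we have
just executed out of line" only has to look for one offset within the slot.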

Disassembling prefixed instructions is not supported yet.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/xmon/xmon.c | 82 ++++++++++++++++++++++++++++++++++------
 1 file changed, 71 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index f47bd843dc52..93259a06eadc 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -97,7 +97,8 @@ static long *xmon_fault_jmp[NR_CPUS];
 /* Breakpoint stuff */
 struct bpt {
 	unsigned long	address;
-	unsigned int	instr[2];
+	/* Prefixed instructions can not cross 64-byte boundaries */
+	unsigned int	instr[3] __aligned(64);
 	atomic_t	ref_count;
 	int		enabled;
 	unsigned long	pad;
@@ -113,6 +114,7 @@ static struct bpt bpts[NBPTS];
 static struct bpt dabr;
 static struct bpt *iabr;
 static unsigned bpinstr = 0x7fe00008;	/* trap */
+static unsigned nopinstr = 0x60000000;	/* nop */
 
 #define BP_NUM(bp)	((bp) - bpts + 1)
 
@@ -120,6 +122,7 @@ static unsigned bpinstr = 0x7fe00008;	/* trap */
 static int cmds(struct pt_regs *);
 static int mread(unsigned long, void *, int);
 static int mwrite(unsigned long, void *, int);
+static int read_instr(unsigned long, unsigned int *, unsigned int *);
 static int handle_fault(struct pt_regs *);
 static void byterev(unsigned char *, int);
 static void memex(void);
@@ -705,7 +708,8 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
 	if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
 		bp = at_breakpoint(regs->nip);
 		if (bp != NULL) {
-			int stepped = emulate_step(regs, bp->instr[0], 0);
+			int stepped = emulate_step(regs, bp->instr[0],
+						   bp->instr[1]);
 			if (stepped == 0) {
 				regs->nip = (unsigned long) &bp->instr[0];
 				atomic_inc(&bp->ref_count);
@@ -760,8 +764,8 @@ static int xmon_bpt(struct pt_regs *regs)
 
 	/* Are we at the trap at bp->instr[1] for some bp? */
 	bp = in_breakpoint_table(regs->nip, &offset);
-	if (bp != NULL && offset == 4) {
-		regs->nip = bp->address + 4;
+	if (bp != NULL && (offset == 4 || offset == 8)) {
+		regs->nip = bp->address + offset;
 		atomic_dec(&bp->ref_count);
 		return 1;
 	}
@@ -863,7 +867,8 @@ static struct bpt *in_breakpoint_table(unsigned long nip, unsigned long *offp)
 		return NULL;
 	off %= sizeof(struct bpt);
 	if (off != offsetof(struct bpt, instr[0])
-	    && off != offsetof(struct bpt, instr[1]))
+	    && off != offsetof(struct bpt, instr[1])
+	    && off != offsetof(struct bpt, instr[2]))
 		return NULL;
 	*offp = off - offsetof(struct bpt, instr[0]);
 	return (struct bpt *) (nip - off);
@@ -880,9 +885,18 @@ static struct bpt *new_breakpoint(unsigned long a)
 
 	for (bp = bpts; bp < &bpts[NBPTS]; ++bp) {
 		if (!bp->enabled && atomic_read(&bp->ref_count) == 0) {
+			/*
+			 * Prefixed instructions are two words, but regular
+			 * instructions are only one. Use a nop to pad out the
+			 * regular instructions so that we can place the trap
+			 * at the same place. For prefixed instructions the nop
+			 * will get overwritten during insert_bpts().
+			 */
 			bp->address = a;
-			bp->instr[1] = bpinstr;
+			bp->instr[1] = nopinstr;
 			store_inst(&bp->instr[1]);
+			bp->instr[2] = bpinstr;
+			store_inst(&bp->instr[2]);
 			return bp;
 		}
 	}
@@ -894,13 +908,15 @@ static struct bpt *new_breakpoint(unsigned long a)
 static void insert_bpts(void)
 {
 	int i;
-	struct bpt *bp;
+	unsigned int prefix;
+	struct bpt *bp, *bp2;
 
 	bp = bpts;
 	for (i = 0; i < NBPTS; ++i, ++bp) {
 		if ((bp->enabled & (BP_TRAP|BP_CIABR)) == 0)
 			continue;
-		if (mread(bp->address, &bp->instr[0], 4) != 4) {
+		if (!read_instr(bp->address, &bp->instr[0],
+			       &bp->instr[1])) {
 			printf("Couldn't read instruction at %lx, "
 			       "disabling breakpoint there\n", bp->address);
 			bp->enabled = 0;
@@ -912,7 +928,33 @@ static void insert_bpts(void)
 			bp->enabled = 0;
 			continue;
 		}
+		/*
+		 * Check the address is not a suffix by looking for a prefix in
+		 * front of it.
+		 */
+		if ((mread(bp->address - 4, &prefix, 4) == 4) && IS_PREFIX(prefix)) {
+			printf("Breakpoint at %lx is on the second word of a "
+			       "prefixed instruction, disabling it\n",
+			       bp->address);
+			bp->enabled = 0;
+			continue;
+		}
+		/*
+		 * We might still be a suffix - if the prefix has already been
+		 * replaced by a breakpoint we won't catch it with the above
+		 * test.
+		 */
+		bp2 = at_breakpoint(bp->address - 4);
+		if (bp2 && IS_PREFIX(bp2->instr[0])) {
+			printf("Breakpoint at %lx is on the second word of a "
+			       "prefixed instruction, disabling it\n",
+			       bp->address);
+			bp->enabled = 0;
+			continue;
+		}
 		store_inst(&bp->instr[0]);
+		if (IS_PREFIX(bp->instr[0]))
+			store_inst(&bp->instr[1]);
 		if (bp->enabled & BP_CIABR)
 			continue;
 		if (patch_instruction((unsigned int *)bp->address,
@@ -1163,14 +1205,14 @@ static int do_step(struct pt_regs *regs)
  */
 static int do_step(struct pt_regs *regs)
 {
-	unsigned int instr;
+	unsigned int instr, sufx;
 	int stepped;
 
 	force_enable_xmon();
 	/* check we are in 64-bit kernel mode, translation enabled */
 	if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
-		if (mread(regs->nip, &instr, 4) == 4) {
-			stepped = emulate_step(regs, instr, 0);
+		if (read_instr(regs->nip, &instr, &sufx)) {
+			stepped = emulate_step(regs, instr, sufx);
 			if (stepped < 0) {
 				printf("Couldn't single-step %s instruction\n",
 				       (IS_RFID(instr)? "rfid": "mtmsrd"));
@@ -2127,6 +2169,24 @@ mwrite(unsigned long adrs, void *buf, int size)
 	return n;
 }
 
+static int read_instr(unsigned long addr, unsigned int *instr,
+		      unsigned int *sufx)
+{
+	int r;
+
+	r = mread(addr, instr, 4);
+	if (r != 4)
+		return 0;
+	if (!IS_PREFIX(*instr))
+		return 4;
+	r = mread(addr + 4, sufx, 4);
+	if (r != 4)
+		return 0;
+
+	return 8;
+}
+
+
 static int fault_type;
 static int fault_except;
 static char *fault_chars[] = { "--", "**", "##" };
-- 
2.20.1



* [PATCH 13/18] powerpc/xmon: Dump prefixed instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (11 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 12/18] powerpc/xmon: Add initial support for prefixed instructions Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 14/18] powerpc/kprobes: Support kprobes on " Jordan Niethe
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Currently when xmon is dumping instructions it reads a word at a time
and then prints that instruction (either as a hex number or by
disassembling it). For prefixed instructions it would be nice to show
the prefix and suffix together. Use read_instr() so that if a prefix
is encountered its suffix is loaded too. Then print these in the form:
    prefix:suffix
Xmon uses the disassembly routines from GNU binutils. These currently do
not support prefixed instructions so we will not disassemble the
prefixed instructions yet.
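
A minimal standalone sketch of the output format described above follows. It
is not the xmon code: format_insn() is a hypothetical helper that works on an
in-memory array instead of calling mread(), and it assumes a prefix is
identified by major opcode 1.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format the instruction starting at words[0] into buf. nwords is the
 * number of valid words available. Returns the number of bytes the
 * instruction occupies (4 or 8), or 0 if nothing could be read or a
 * prefix has no suffix left to read.
 */
static int format_insn(char *buf, size_t len, const uint32_t *words,
		       size_t nwords)
{
	if (nwords == 0)
		return 0;
	if ((words[0] >> 26) != 1) {
		snprintf(buf, len, "%.8x", (unsigned int)words[0]);
		return 4;
	}
	if (nwords < 2)
		return 0;
	snprintf(buf, len, "%.8x:%.8x", (unsigned int)words[0],
		 (unsigned int)words[1]);
	return 8;
}
```

The return value plays the same role as the read_instr() result in the diff:
the caller advances the dump address by it instead of by a fixed 4 bytes.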

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/xmon/xmon.c | 50 +++++++++++++++++++++++++++++++---------
 1 file changed, 39 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 93259a06eadc..dc8b1c7b3e1b 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2900,6 +2900,21 @@ prdump(unsigned long adrs, long ndump)
 	}
 }
 
+static bool instrs_are_equal(unsigned long insta, unsigned long sufxa,
+			     unsigned long instb, unsigned long sufxb)
+{
+	if (insta != instb)
+		return false;
+
+	if (!IS_PREFIX(insta) && !IS_PREFIX(instb))
+		return true;
+
+	if (IS_PREFIX(insta) && IS_PREFIX(instb))
+		return sufxa == sufxb;
+
+	return false;
+}
+
 typedef int (*instruction_dump_func)(unsigned long inst, unsigned long addr);
 
 static int
@@ -2908,12 +2923,11 @@ generic_inst_dump(unsigned long adr, long count, int praddr,
 {
 	int nr, dotted;
 	unsigned long first_adr;
-	unsigned int inst, last_inst = 0;
-	unsigned char val[4];
+	unsigned int inst, sufx, last_inst = 0, last_sufx = 0;
 
 	dotted = 0;
-	for (first_adr = adr; count > 0; --count, adr += 4) {
-		nr = mread(adr, val, 4);
+	for (first_adr = adr; count > 0; --count, adr += nr) {
+		nr = read_instr(adr, &inst, &sufx);
 		if (nr == 0) {
 			if (praddr) {
 				const char *x = fault_chars[fault_type];
@@ -2921,8 +2935,9 @@ generic_inst_dump(unsigned long adr, long count, int praddr,
 			}
 			break;
 		}
-		inst = GETWORD(val);
-		if (adr > first_adr && inst == last_inst) {
+		if (adr > first_adr && instrs_are_equal(inst, sufx,
+							last_inst,
+							last_sufx)) {
 			if (!dotted) {
 				printf(" ...\n");
 				dotted = 1;
@@ -2931,11 +2946,24 @@ generic_inst_dump(unsigned long adr, long count, int praddr,
 		}
 		dotted = 0;
 		last_inst = inst;
-		if (praddr)
-			printf(REG"  %.8x", adr, inst);
-		printf("\t");
-		dump_func(inst, adr);
-		printf("\n");
+		last_sufx = sufx;
+		if (IS_PREFIX(inst)) {
+			if (praddr)
+				printf(REG"  %.8x:%.8x", adr, inst, sufx);
+			printf("\t");
+			/*
+			 * Just use this until binutils ppc disassembly
+			 * prints prefixed instructions.
+			 */
+			printf("%.8x:%.8x", inst, sufx);
+			printf("\n");
+		} else {
+			if (praddr)
+				printf(REG"  %.8x", adr, inst);
+			printf("\t");
+			dump_func(inst, adr);
+			printf("\n");
+		}
 	}
 	return adr - first_adr;
 }
-- 
2.20.1



* [PATCH 14/18] powerpc/kprobes: Support kprobes on prefixed instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (12 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 13/18] powerpc/xmon: Dump " Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2020-01-14  7:19   ` Balamuruhan S
  2019-11-26  5:21 ` [PATCH 15/18] powerpc/uprobes: Add support for " Jordan Niethe
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

A prefixed instruction is composed of a word prefix followed by a word
suffix. It does not make sense to be able to have a kprobe on the suffix
of a prefixed instruction, so make this impossible.

Kprobes work by replacing an instruction with a trap and saving that
instruction to be single stepped out of place later. Currently there is
not enough space allocated to keep a prefixed instruction for single
stepping. Increase the amount of space allocated for holding the
instruction copy.

kprobe_post_handler() expects all instructions to be 4 bytes long which
means that it does not function correctly for prefixed instructions.
Add checks for prefixed instructions which will use a length of 8 bytes
instead.
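
The two checks described so far can be sketched in standalone C as below.
This is an illustration, not the kprobes code: the helper names are
hypothetical, the text is modelled as a plain array of words, and a prefix is
assumed to be identified by major opcode 1. The patch additionally consults
an already registered kprobe at the previous word, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a prefix is recognised by major opcode 1. */
static bool is_prefix(uint32_t word)
{
	return (word >> 26) == 1;
}

/* A probe at text[idx] is rejected when the preceding word is a prefix. */
static bool probe_on_suffix(const uint32_t *text, size_t idx)
{
	return idx > 0 && is_prefix(text[idx - 1]);
}

/* Bytes to copy into the out-of-line slot: 8 for prefixed, else 4. */
static size_t probe_copy_len(uint32_t first_word)
{
	return is_prefix(first_word) ? 2 * sizeof(uint32_t) : sizeof(uint32_t);
}
```

The same length also determines how far the NIP moves once the copied
instruction has been single stepped.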

For optprobes we normally patch in code that loads the probed
instruction into r4 before calling emulate_step(). We now make space
for, and patch in, code that loads the suffix into r5 as well.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/kprobes.h   |  5 +--
 arch/powerpc/kernel/kprobes.c        | 46 +++++++++++++++++++++-------
 arch/powerpc/kernel/optprobes.c      | 31 +++++++++++--------
 arch/powerpc/kernel/optprobes_head.S |  6 ++++
 4 files changed, 62 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 66b3f2983b22..1f03a1cacb1e 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -38,12 +38,13 @@ extern kprobe_opcode_t optprobe_template_entry[];
 extern kprobe_opcode_t optprobe_template_op_address[];
 extern kprobe_opcode_t optprobe_template_call_handler[];
 extern kprobe_opcode_t optprobe_template_insn[];
+extern kprobe_opcode_t optprobe_template_sufx[];
 extern kprobe_opcode_t optprobe_template_call_emulate[];
 extern kprobe_opcode_t optprobe_template_ret[];
 extern kprobe_opcode_t optprobe_template_end[];
 
-/* Fixed instruction size for powerpc */
-#define MAX_INSN_SIZE		1
+/* Prefixed instructions are two words */
+#define MAX_INSN_SIZE		2
 #define MAX_OPTIMIZED_LENGTH	sizeof(kprobe_opcode_t)	/* 4 bytes */
 #define MAX_OPTINSN_SIZE	(optprobe_template_end - optprobe_template_entry)
 #define RELATIVEJUMP_SIZE	sizeof(kprobe_opcode_t)	/* 4 bytes */
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 7303fe3856cc..aa15b3480385 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -104,17 +104,30 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
 
 int arch_prepare_kprobe(struct kprobe *p)
 {
+	int len;
 	int ret = 0;
+	struct kprobe *prev;
 	kprobe_opcode_t insn = *p->addr;
+	kprobe_opcode_t prfx = *(p->addr - 1);
 
+	preempt_disable();
 	if ((unsigned long)p->addr & 0x03) {
 		printk("Attempt to register kprobe at an unaligned address\n");
 		ret = -EINVAL;
 	} else if (IS_MTMSRD(insn) || IS_RFID(insn) || IS_RFI(insn)) {
 		printk("Cannot register a kprobe on rfi/rfid or mtmsr[d]\n");
 		ret = -EINVAL;
+	} else if (IS_PREFIX(prfx)) {
+		printk("Cannot register a kprobe on the second word of prefixed instruction\n");
+		ret = -EINVAL;
+	}
+	prev = get_kprobe(p->addr - 1);
+	if (prev && IS_PREFIX(*prev->ainsn.insn)) {
+		printk("Cannot register a kprobe on the second word of prefixed instruction\n");
+		ret = -EINVAL;
 	}
 
+
 	/* insn must be on a special executable page on ppc64.  This is
 	 * not explicitly required on ppc32 (right now), but it doesn't hurt */
 	if (!ret) {
@@ -124,14 +137,18 @@ int arch_prepare_kprobe(struct kprobe *p)
 	}
 
 	if (!ret) {
-		memcpy(p->ainsn.insn, p->addr,
-				MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+		if (IS_PREFIX(insn))
+			len = MAX_INSN_SIZE * sizeof(kprobe_opcode_t);
+		else
+			len = sizeof(kprobe_opcode_t);
+		memcpy(p->ainsn.insn, p->addr, len);
 		p->opcode = *p->addr;
 		flush_icache_range((unsigned long)p->ainsn.insn,
 			(unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
 	}
 
 	p->ainsn.boostable = 0;
+	preempt_enable_no_resched();
 	return ret;
 }
 NOKPROBE_SYMBOL(arch_prepare_kprobe);
@@ -216,10 +233,11 @@ NOKPROBE_SYMBOL(arch_prepare_kretprobe);
 static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
 {
 	int ret;
-	unsigned int insn = *p->ainsn.insn;
+	unsigned int insn = p->ainsn.insn[0];
+	unsigned int sufx = p->ainsn.insn[1];
 
 	/* regs->nip is also adjusted if emulate_step returns 1 */
-	ret = emulate_step(regs, insn, 0);
+	ret = emulate_step(regs, insn, sufx);
 	if (ret > 0) {
 		/*
 		 * Once this instruction has been boosted
@@ -233,7 +251,10 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
 		 * So, we should never get here... but, its still
 		 * good to catch them, just in case...
 		 */
-		printk("Can't step on instruction %x\n", insn);
+		if (!IS_PREFIX(insn))
+			printk("Can't step on instruction %x\n", insn);
+		else
+			printk("Can't step on instruction %x %x\n", insn, sufx);
 		BUG();
 	} else {
 		/*
@@ -275,7 +296,7 @@ int kprobe_handler(struct pt_regs *regs)
 	if (kprobe_running()) {
 		p = get_kprobe(addr);
 		if (p) {
-			kprobe_opcode_t insn = *p->ainsn.insn;
+			kprobe_opcode_t insn = p->ainsn.insn[0];
 			if (kcb->kprobe_status == KPROBE_HIT_SS &&
 					is_trap(insn)) {
 				/* Turn off 'trace' bits */
@@ -448,9 +469,10 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 	 * the link register properly so that the subsequent 'blr' in
 	 * kretprobe_trampoline jumps back to the right instruction.
 	 *
-	 * For nip, we should set the address to the previous instruction since
-	 * we end up emulating it in kprobe_handler(), which increments the nip
-	 * again.
+	 * To keep the nip at the correct address we need to counter the
+	 * increment that happens when we emulate the kretprobe_trampoline noop
+	 * in kprobe_handler(). We do this by decrementing the address by the
+	 * length of the noop which is always 4 bytes.
 	 */
 	regs->nip = orig_ret_address - 4;
 	regs->link = orig_ret_address;
@@ -478,12 +500,14 @@ int kprobe_post_handler(struct pt_regs *regs)
 {
 	struct kprobe *cur = kprobe_running();
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+	kprobe_opcode_t insn;
 
 	if (!cur || user_mode(regs))
 		return 0;
 
+	insn = *cur->ainsn.insn;
 	/* make sure we got here for instruction we have a kprobe on */
-	if (((unsigned long)cur->ainsn.insn + 4) != regs->nip)
+	if (((unsigned long)cur->ainsn.insn + (IS_PREFIX(insn) ? 8 : 4)) != regs->nip)
 		return 0;
 
 	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
@@ -492,7 +516,7 @@ int kprobe_post_handler(struct pt_regs *regs)
 	}
 
 	/* Adjust nip to after the single-stepped instruction */
-	regs->nip = (unsigned long)cur->addr + 4;
+	regs->nip = (unsigned long)cur->addr + (IS_PREFIX(insn) ? 8 : 4);
 	regs->msr |= kcb->kprobe_saved_msr;
 
 	/*Restore back the original saved kprobes variables and continue. */
diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index 82dc8a589c87..b2aef27bac27 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -27,6 +27,8 @@
 	(optprobe_template_op_address - optprobe_template_entry)
 #define TMPL_INSN_IDX		\
 	(optprobe_template_insn - optprobe_template_entry)
+#define TMPL_SUFX_IDX		\
+	(optprobe_template_sufx - optprobe_template_entry)
 #define TMPL_END_IDX		\
 	(optprobe_template_end - optprobe_template_entry)
 
@@ -100,7 +102,8 @@ static unsigned long can_optimize(struct kprobe *p)
 	 * and that can be emulated.
 	 */
 	if (!is_conditional_branch(*p->ainsn.insn) &&
-			analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
+			analyse_instr(&op, &regs, p->ainsn.insn[0],
+				      p->ainsn.insn[1]) == 1) {
 		emulate_update_regs(&regs, &op);
 		nip = regs.nip;
 	}
@@ -140,27 +143,27 @@ void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
 }
 
 /*
- * emulate_step() requires insn to be emulated as
- * second parameter. Load register 'r4' with the
- * instruction.
+ * emulate_step() requires insn to be emulated as second parameter, and the
+ * suffix as the third parameter. Load these into registers.
  */
-void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
+static void patch_imm32_load_insns(int reg, unsigned int val,
+				   kprobe_opcode_t *addr)
 {
-	/* addis r4,0,(insn)@h */
-	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(4) |
+	/* addis reg,0,(insn)@h */
+	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(reg) |
 			  ((val >> 16) & 0xffff));
 	addr++;
 
-	/* ori r4,r4,(insn)@l */
-	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(4) |
-			  ___PPC_RS(4) | (val & 0xffff));
+	/* ori reg,reg,(insn)@l */
+	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(reg) |
+			  ___PPC_RS(reg) | (val & 0xffff));
 }
 
 /*
  * Generate instructions to load provided immediate 64-bit value
  * to register 'r3' and patch these instructions at 'addr'.
  */
-void patch_imm64_load_insns(unsigned long val, kprobe_opcode_t *addr)
+static void patch_imm64_load_insns(unsigned long val, kprobe_opcode_t *addr)
 {
 	/* lis r3,(op)@highest */
 	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(3) |
@@ -266,9 +269,11 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
 	patch_instruction(buff + TMPL_EMULATE_IDX, branch_emulate_step);
 
 	/*
-	 * 3. load instruction to be emulated into relevant register, and
+	 * 3. load instruction and suffix to be emulated into the relevant
+	 * registers, and
 	 */
-	patch_imm32_load_insns(*p->ainsn.insn, buff + TMPL_INSN_IDX);
+	patch_imm32_load_insns(4, p->ainsn.insn[0], buff + TMPL_INSN_IDX);
+	patch_imm32_load_insns(5, p->ainsn.insn[1], buff + TMPL_SUFX_IDX);
 
 	/*
 	 * 4. branch back from trampoline
diff --git a/arch/powerpc/kernel/optprobes_head.S b/arch/powerpc/kernel/optprobes_head.S
index cf383520843f..998359ae44ec 100644
--- a/arch/powerpc/kernel/optprobes_head.S
+++ b/arch/powerpc/kernel/optprobes_head.S
@@ -95,6 +95,12 @@ optprobe_template_insn:
 	nop
 	nop
 
+	.global optprobe_template_sufx
+optprobe_template_sufx:
+	/* Pass suffix to be emulated in r5 */
+	nop
+	nop
+
 	.global optprobe_template_call_emulate
 optprobe_template_call_emulate:
 	/* Branch to emulate_step()  */
-- 
2.20.1



* [PATCH 15/18] powerpc/uprobes: Add support for prefixed instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (13 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 14/18] powerpc/kprobes: Support kprobes on " Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2020-01-13 11:30   ` Balamuruhan S
  2019-11-26  5:21 ` [PATCH 16/18] powerpc/hw_breakpoints: Initial " Jordan Niethe
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Uprobes can execute instructions out of line. Increase the size of the
buffer used for this so that this works for prefixed instructions. Take
into account the length of prefixed instructions when fixing up the nip.
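
As a rough user-space illustration of the nip fix-up described above (not
part of the patch): the helper names is_prefix_word() and next_nip() are
made up for this sketch. Only the major-opcode-1 test mirrors the
IS_PREFIX() macro added earlier in the series.

```c
#include <stdint.h>

/* Mirrors IS_PREFIX() from the series: a prefix word has major opcode 1. */
static int is_prefix_word(uint32_t word)
{
	return (word >> 26) == 1;
}

/*
 * Hypothetical helper: address of the instruction that follows the one
 * starting at vaddr, given the first word of that instruction.
 */
static uint64_t next_nip(uint64_t vaddr, uint32_t first_word)
{
	return vaddr + (is_prefix_word(first_word) ? 8 : 4);
}
```

A word instruction advances the nip by 4 bytes and a prefixed instruction
by 8, which is the same choice the arch_uprobe_post_xol() hunk below makes.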

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/include/asm/uprobes.h | 18 ++++++++++++++----
 arch/powerpc/kernel/uprobes.c      |  4 ++--
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/uprobes.h b/arch/powerpc/include/asm/uprobes.h
index 2bbdf27d09b5..5b5e8a3d2f55 100644
--- a/arch/powerpc/include/asm/uprobes.h
+++ b/arch/powerpc/include/asm/uprobes.h
@@ -14,18 +14,28 @@
 
 typedef ppc_opcode_t uprobe_opcode_t;
 
+/*
+ * We have to ensure we have enough space for prefixed instructions, which
+ * are double the size of a word instruction, i.e. 8 bytes. However,
+ * sometimes it is simpler to treat a prefixed instruction like 2 word
+ * instructions.
+ */
 #define MAX_UINSN_BYTES		4
-#define UPROBE_XOL_SLOT_BYTES	(MAX_UINSN_BYTES)
+#define UPROBE_XOL_SLOT_BYTES	(2 * MAX_UINSN_BYTES)
 
 /* The following alias is needed for reference from arch-agnostic code */
 #define UPROBE_SWBP_INSN	BREAKPOINT_INSTRUCTION
 #define UPROBE_SWBP_INSN_SIZE	4 /* swbp insn size in bytes */
 
 struct arch_uprobe {
+	 /*
+	  * Ensure there is enough space for prefixed instructions. Prefixed
+	  * instructions must not cross 64-byte boundaries.
+	  */
 	union {
-		u32	insn;
-		u32	ixol;
-	};
+		uprobe_opcode_t	insn[2];
+		uprobe_opcode_t	ixol[2];
+	} __aligned(64);
 };
 
 struct arch_uprobe_task {
diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index ab1077dc6148..cfcea6946f8b 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -111,7 +111,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
 	 * support doesn't exist and have to fix-up the next instruction
 	 * to be executed.
 	 */
-	regs->nip = utask->vaddr + MAX_UINSN_BYTES;
+	regs->nip = utask->vaddr + ((IS_PREFIX(auprobe->insn[0])) ? 8 : 4);
 
 	user_disable_single_step(current);
 	return 0;
@@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
 	 * emulate_step() returns 1 if the insn was successfully emulated.
 	 * For all other cases, we need to single-step in hardware.
 	 */
-	ret = emulate_step(regs, auprobe->insn, 0);
+	ret = emulate_step(regs, auprobe->insn[0], auprobe->insn[1]);
 	if (ret > 0)
 		return true;
 
-- 
2.20.1



* [PATCH 16/18] powerpc/hw_breakpoints: Initial support for prefixed instructions
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (14 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 15/18] powerpc/uprobes: Add support for " Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 17/18] powerpc: Add prefix support to mce_find_instr_ea_and_pfn() Jordan Niethe
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

Currently when getting an instruction to emulate in
hw_breakpoint_handler() we do not load the suffix of a prefixed
instruction. Ensure we load the suffix if the instruction we need to
emulate is a prefixed instruction.
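
The loading rule can be sketched in user space roughly as follows. This is
only an analogue of what __get_user_instr_inatomic() is expected to do:
fetch_instr() and its buffer-based interface are assumptions made for the
example, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical analogue of __get_user_instr_inatomic(): read the word at
 * buf[idx] and, only if it is a prefix (major opcode 1), also read the
 * following word as the suffix. The suffix is 0 for word instructions.
 * Returns 0 on success, -1 if a required word lies outside the buffer.
 */
static int fetch_instr(const uint32_t *buf, size_t nwords, size_t idx,
		       uint32_t *instr, uint32_t *sufx)
{
	*sufx = 0;
	if (idx >= nwords)
		return -1;
	*instr = buf[idx];
	if ((*instr >> 26) == 1) {
		if (idx + 1 >= nwords)
			return -1;
		*sufx = buf[idx + 1];
	}
	return 0;
}
```

The point of the sketch is that a failure to read the suffix is reported
the same way as a failure to read the first word, so the caller can take
its existing failure path in either case.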

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/kernel/hw_breakpoint.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index f4530961998c..f7e1af8b9eae 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -240,15 +240,15 @@ dar_range_overlaps(unsigned long dar, int size, struct arch_hw_breakpoint *info)
 static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
 			     struct arch_hw_breakpoint *info)
 {
-	unsigned int instr = 0;
+	unsigned int instr = 0, sufx = 0;
 	int ret, type, size;
 	struct instruction_op op;
 	unsigned long addr = info->address;
 
-	if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
+	if (__get_user_instr_inatomic(instr, sufx, (unsigned int *)regs->nip))
 		goto fail;
 
-	ret = analyse_instr(&op, regs, instr, 0);
+	ret = analyse_instr(&op, regs, instr, sufx);
 	type = GETTYPE(op.type);
 	size = GETSIZE(op.type);
 
@@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
 		return false;
 	}
 
-	if (!emulate_step(regs, instr, 0))
+	if (!emulate_step(regs, instr, sufx))
 		goto fail;
 
 	return true;
-- 
2.20.1



* [PATCH 17/18] powerpc: Add prefix support to mce_find_instr_ea_and_pfn()
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (15 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 16/18] powerpc/hw_breakpoints: Initial " Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-11-26  5:21 ` [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp Jordan Niethe
  2019-12-03  4:31 ` [PATCH 00/18] Initial Prefixed Instruction support Andrew Donnellan
  18 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

mce_find_instr_ea_and_phys() analyses an instruction to determine the
effective address that caused the machine check. Update this to load and
pass the suffix to analyse_instr for prefixed instructions.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/kernel/mce_power.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index d862bb549158..68e81fcbdf07 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -365,7 +365,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
 	 * in real-mode is tricky and can lead to recursive
 	 * faults
 	 */
-	int instr;
+	int instr, sufx = 0;
 	unsigned long pfn, instr_addr;
 	struct instruction_op op;
 	struct pt_regs tmp = *regs;
@@ -374,7 +374,9 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
 	if (pfn != ULONG_MAX) {
 		instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
 		instr = *(unsigned int *)(instr_addr);
-		if (!analyse_instr(&op, &tmp, instr, 0)) {
+		if (IS_PREFIX(instr))
+			sufx = *(unsigned int *)(instr_addr + 4);
+		if (!analyse_instr(&op, &tmp, instr, sufx)) {
 			pfn = addr_to_pfn(regs, op.ea);
 			*addr = op.ea;
 			*phys_addr = (pfn << PAGE_SHIFT);
-- 
2.20.1



* [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (16 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 17/18] powerpc: Add prefix support to mce_find_instr_ea_and_pfn() Jordan Niethe
@ 2019-11-26  5:21 ` Jordan Niethe
  2019-12-18 14:11   ` Daniel Axtens
  2019-12-03  4:31 ` [PATCH 00/18] Initial Prefixed Instruction support Andrew Donnellan
  18 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-11-26  5:21 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: alistair, Jordan Niethe

A user-mode access to an address a long way below the stack pointer is
only valid if the instruction is one that would update the stack pointer
to the address accessed. This is checked by directly looking at the
instruction's opcode. As a result it does not take into account prefixed
instructions. Instead of looking at the instruction ourselves, use
analyse_instr() to determine if this is a store instruction that will
update the stack pointer.

Note that there are currently no prefixed store-with-update instructions,
and there is no plan for prefixed update-form loads and stores. So this
patch is probably not needed, but it might be preferable to use
analyse_instr() rather than open coding the test anyway.
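
For illustration only, the check on the decoded operation can be modelled
outside the kernel like this. The MOCK_* constants and struct mock_op are
placeholders invented for the sketch; they stand in for the sstep.h
definitions and do not claim to match their real values.

```c
#include <stdbool.h>

/* Placeholder stand-ins for the sstep.h definitions (illustrative values). */
enum { MOCK_LOAD = 1, MOCK_STORE = 2 };
#define MOCK_TYPE_MASK	0x1f	/* low bits hold the operation type */
#define MOCK_UPDATE	0x40	/* update-form flag ORed into type */

struct mock_op {
	int type;
	int update_reg;
};

/* True only for an update-form store whose updated register is r1. */
static bool mock_store_updates_sp(const struct mock_op *op)
{
	return (op->type & MOCK_TYPE_MASK) == MOCK_STORE &&
	       (op->type & MOCK_UPDATE) && op->update_reg == 1;
}
```

The sketch shows why the decoded form is convenient: the test no longer
depends on which opcode encoding produced the store.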

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 arch/powerpc/mm/fault.c | 39 +++++++++++----------------------------
 1 file changed, 11 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index b5047f9b5dec..cb78b3ca1800 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -41,37 +41,17 @@
 #include <asm/siginfo.h>
 #include <asm/debug.h>
 #include <asm/kup.h>
+#include <asm/sstep.h>
 
 /*
  * Check whether the instruction inst is a store using
  * an update addressing form which will update r1.
  */
-static bool store_updates_sp(unsigned int inst)
+static bool store_updates_sp(struct instruction_op *op)
 {
-	/* check for 1 in the rA field */
-	if (((inst >> 16) & 0x1f) != 1)
-		return false;
-	/* check major opcode */
-	switch (inst >> 26) {
-	case OP_STWU:
-	case OP_STBU:
-	case OP_STHU:
-	case OP_STFSU:
-	case OP_STFDU:
-		return true;
-	case OP_STD:	/* std or stdu */
-		return (inst & 3) == 1;
-	case OP_31:
-		/* check minor opcode */
-		switch ((inst >> 1) & 0x3ff) {
-		case OP_31_XOP_STDUX:
-		case OP_31_XOP_STWUX:
-		case OP_31_XOP_STBUX:
-		case OP_31_XOP_STHUX:
-		case OP_31_XOP_STFSUX:
-		case OP_31_XOP_STFDUX:
+	if (GETTYPE(op->type) == STORE) {
+		if ((op->type & UPDATE) && (op->update_reg == 1))
 			return true;
-		}
 	}
 	return false;
 }
@@ -278,14 +258,17 @@ static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
 
 		if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
 		    access_ok(nip, sizeof(*nip))) {
-			unsigned int inst;
+			unsigned int inst, sufx;
+			struct instruction_op op;
 			int res;
 
 			pagefault_disable();
-			res = __get_user_inatomic(inst, nip);
+			res = __get_user_instr_inatomic(inst, sufx, nip);
 			pagefault_enable();
-			if (!res)
-				return !store_updates_sp(inst);
+			if (!res) {
+				analyse_instr(&op, uregs, inst, sufx);
+				return !store_updates_sp(&op);
+			}
 			*must_retry = true;
 		}
 		return true;
-- 
2.20.1



* Re: [PATCH 00/18] Initial Prefixed Instruction support
  2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
                   ` (17 preceding siblings ...)
  2019-11-26  5:21 ` [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp Jordan Niethe
@ 2019-12-03  4:31 ` Andrew Donnellan
  18 siblings, 0 replies; 42+ messages in thread
From: Andrew Donnellan @ 2019-12-03  4:31 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: alistair

On 26/11/19 4:21 pm, Jordan Niethe wrote:
> A future revision of the ISA will introduce prefixed instructions. A
> prefixed instruction is composed of a 4-byte prefix followed by a
> 4-byte suffix.
> 
> All prefixes have the major opcode 1. A prefix will never be a valid
> word instruction. A suffix may be an existing word instruction or a new
> instruction.
> 
> The new instruction formats are:
>      * Eight-Byte Load/Store Instructions
>      * Eight-Byte Register-to-Register Instructions
>      * Modified Load/Store Instructions
>      * Modified Register-to-Register Instructions
> 
> This series enables prefixed instructions and extends the instruction
> emulation to support them. Then the places where prefixed instructions
> might need to be emulated are updated.
> 
> A future series will add prefixed instruction support to guests running
> in KVM.

Snowpatch found sparse warnings:

https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/14381//artifact/linux/report.txt

(And a few minor checkpatch things too)
-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited



* Re: [PATCH 03/18] powerpc: Add PREFIXED SRR1 bit for future ISA version
  2019-11-26  5:21 ` [PATCH 03/18] powerpc: Add PREFIXED " Jordan Niethe
@ 2019-12-18  8:23   ` Daniel Axtens
  2019-12-20  5:09     ` Jordan Niethe
  0 siblings, 1 reply; 42+ messages in thread
From: Daniel Axtens @ 2019-12-18  8:23 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: alistair, Jordan Niethe

Jordan Niethe <jniethe5@gmail.com> writes:

> Add the bit definition for exceptions caused by prefixed instructions.
>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/include/asm/reg.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 6f9fcc3d4c82..0a6d39fb4769 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -778,6 +778,7 @@
>  
>  #define   SRR1_MCE_MCP		0x00080000 /* Machine check signal caused interrupt */
>  #define   SRR1_BOUNDARY		0x10000000 /* Prefixed instruction crosses 64-byte boundary */
> +#define   SRR1_PREFIXED		0x20000000 /* Exception caused by prefixed instruction */

You could probably squash this with the previous patch, and maybe the
next patch too.

Regards,
Daniel

>  
>  #define SPRN_HSRR0	0x13A	/* Save/Restore Register 0 */
>  #define SPRN_HSRR1	0x13B	/* Save/Restore Register 1 */
> -- 
> 2.20.1


* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-11-26  5:21 ` [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions Jordan Niethe
@ 2019-12-18  8:35   ` Daniel Axtens
  2019-12-20  5:11     ` Jordan Niethe
  2019-12-18 14:15   ` Daniel Axtens
  2020-01-13  6:18   ` Balamuruhan S
  2 siblings, 1 reply; 42+ messages in thread
From: Daniel Axtens @ 2019-12-18  8:35 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: alistair, Jordan Niethe

Jordan Niethe <jniethe5@gmail.com> writes:

> Currently all instructions are a single word long. A future ISA version
> will include prefixed instructions which have a double word length. The
> functions used for analysing and emulating instructions need to be
> modified so that they can handle these new instruction types.
>
> A prefixed instruction is a word prefix followed by a word suffix. All
> prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> instructions or instructions that only exist as suffixes.
>
> In handling prefixed instructions it will be convenient to treat the
> suffix and prefix as separate words. To facilitate this modify
> analyse_instr() and emulate_step() to take a suffix as a
> parameter. For word instructions it does not matter what is passed in
> here - it will be ignored.
>
> We also define a new flag, PREFIXED, to be used in instruction_op:type.
> This flag will indicate when emulating an analysed instruction if the
> NIP should be advanced by word length or double word length.
>
> The callers of analyse_instr() and emulate_step() will need their own
> changes to be able to support prefixed instructions. For now modify them
> to pass in 0 as a suffix.
>
> Note that at this point no prefixed instructions are emulated or
> analysed - this is just making it possible to do so.
>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
>  arch/powerpc/include/asm/sstep.h      |  8 +++++--
>  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
>  arch/powerpc/kernel/align.c           |  2 +-
>  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
>  arch/powerpc/kernel/kprobes.c         |  2 +-
>  arch/powerpc/kernel/mce_power.c       |  2 +-
>  arch/powerpc/kernel/optprobes.c       |  2 +-
>  arch/powerpc/kernel/uprobes.c         |  2 +-
>  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
>  arch/powerpc/lib/sstep.c              | 12 ++++++-----
>  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
>  arch/powerpc/xmon/xmon.c              |  4 ++--
>  13 files changed, 71 insertions(+), 32 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> index c1df75edde44..a1dfa4bdd22f 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -377,6 +377,9 @@
>  #define PPC_INST_VCMPEQUD		0x100000c7
>  #define PPC_INST_VCMPEQUB		0x10000006
>  
> +/* macro to check if a word is a prefix */
> +#define IS_PREFIX(x) (((x) >> 26) == 1)
> +
>  /* macros to insert fields into opcodes */
>  #define ___PPC_RA(a)	(((a) & 0x1f) << 16)
>  #define ___PPC_RB(b)	(((b) & 0x1f) << 11)
> diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> index 769f055509c9..6d4cb602e231 100644
> --- a/arch/powerpc/include/asm/sstep.h
> +++ b/arch/powerpc/include/asm/sstep.h
> @@ -89,6 +89,9 @@ enum instruction_type {
>  #define VSX_LDLEFT	4	/* load VSX register from left */
>  #define VSX_CHECK_VEC	8	/* check MSR_VEC not MSR_VSX for reg >= 32 */
>  
> +/* Prefixed flag, ORed in with type */
> +#define PREFIXED	0x800
> +
>  /* Size field in type word */
>  #define SIZE(n)		((n) << 12)
>  #define GETSIZE(w)	((w) >> 12)
> @@ -132,7 +135,7 @@ union vsx_reg {
>   * otherwise.
>   */
>  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> -			 unsigned int instr);
> +			 unsigned int instr, unsigned int sufx);
>  
>  /*
>   * Emulate an instruction that can be executed just by updating
> @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
>   * 0 if it could not be emulated, or -1 for an instruction that
>   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
>   */
> -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> +			unsigned int sufx);
>  
>  /*
>   * Emulate a load or store instruction by reading/writing the
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 15002b51ff18..bc585399e0c7 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
>  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>  			   size_t len);
>  
> +/*
> + * When reading an instruction iff it is a prefix, the suffix needs to be also
> + * loaded.
> + */
> +#define __get_user_instr(x, y, ptr)			\
> +({							\
> +	long __gui_ret = 0;				\
> +	y = 0;						\
> +	__gui_ret = __get_user(x, ptr);			\
> +	if (!__gui_ret) {				\
> +		if (IS_PREFIX(x))			\
> +			__gui_ret = __get_user(y, ptr + 1);	\
> +	}						\
> +							\
> +	__gui_ret;					\
> +})
> +
> +#define __get_user_instr_inatomic(x, y, ptr)		\
> +({							\
> +	long __gui_ret = 0;				\
> +	y = 0;						\
> +	__gui_ret = __get_user_inatomic(x, ptr);	\
> +	if (!__gui_ret) {				\
> +		if (IS_PREFIX(x))			\
> +			__gui_ret = __get_user_inatomic(y, ptr + 1);	\
> +	}						\
> +							\
> +	__gui_ret;					\
> +})
> +
>  #endif	/* _ARCH_POWERPC_UACCESS_H */
> diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> index 92045ed64976..245e79792a01 100644
> --- a/arch/powerpc/kernel/align.c
> +++ b/arch/powerpc/kernel/align.c
> @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
>  	if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
>  		return -EIO;
>  
> -	r = analyse_instr(&op, regs, instr);
> +	r = analyse_instr(&op, regs, instr, 0);
>  	if (r < 0)
>  		return -EINVAL;
>  
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index 58ce3d37c2a3..f4530961998c 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>  	if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
>  		goto fail;
>  
> -	ret = analyse_instr(&op, regs, instr);
> +	ret = analyse_instr(&op, regs, instr, 0);
>  	type = GETTYPE(op.type);
>  	size = GETSIZE(op.type);
>  
> @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>  		return false;
>  	}
>  
> -	if (!emulate_step(regs, instr))
> +	if (!emulate_step(regs, instr, 0))
>  		goto fail;
>  
>  	return true;
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 2d27ec4feee4..7303fe3856cc 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
>  	unsigned int insn = *p->ainsn.insn;
>  
>  	/* regs->nip is also adjusted if emulate_step returns 1 */
> -	ret = emulate_step(regs, insn);
> +	ret = emulate_step(regs, insn, 0);
>  	if (ret > 0) {
>  		/*
>  		 * Once this instruction has been boosted
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index 1cbf7f1a4e3d..d862bb549158 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
>  	if (pfn != ULONG_MAX) {
>  		instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
>  		instr = *(unsigned int *)(instr_addr);
> -		if (!analyse_instr(&op, &tmp, instr)) {
> +		if (!analyse_instr(&op, &tmp, instr, 0)) {
>  			pfn = addr_to_pfn(regs, op.ea);
>  			*addr = op.ea;
>  			*phys_addr = (pfn << PAGE_SHIFT);
> diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> index 024f7aad1952..82dc8a589c87 100644
> --- a/arch/powerpc/kernel/optprobes.c
> +++ b/arch/powerpc/kernel/optprobes.c
> @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
>  	 * and that can be emulated.
>  	 */
>  	if (!is_conditional_branch(*p->ainsn.insn) &&
> -			analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> +			analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
>  		emulate_update_regs(&regs, &op);
>  		nip = regs.nip;
>  	}
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index 1cfef0e5fec5..ab1077dc6148 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>  	 * emulate_step() returns 1 if the insn was successfully emulated.
>  	 * For all other cases, we need to single-step in hardware.
>  	 */
> -	ret = emulate_step(regs, auprobe->insn);
> +	ret = emulate_step(regs, auprobe->insn, 0);
>  	if (ret > 0)
>  		return true;
>  
> diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> index 2e496eb86e94..fcab1f31b48d 100644
> --- a/arch/powerpc/kvm/emulate_loadstore.c
> +++ b/arch/powerpc/kvm/emulate_loadstore.c
> @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>  
>  	emulated = EMULATE_FAIL;
>  	vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> -	if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> +	if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
>  		int type = op.type & INSTR_TYPE_MASK;
>  		int size = GETSIZE(op.type);
>  
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index c077acb983a1..ade3f5eba2e5 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
>   * otherwise.
>   */
>  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> -		  unsigned int instr)
> +		  unsigned int instr, unsigned int sufx)

I know we really like shortenings in arch/powerpc, but I think we can
afford the two extra characters to spell 'suffix' in full :)

>  {
>  	unsigned int opcode, ra, rb, rc, rd, spr, u;
>  	unsigned long int imm;
> @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
>  {
>  	unsigned long next_pc;
>  
> -	next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> +	next_pc = truncate_if_32bit(regs->msr,
> +				    regs->nip + ((op->type & PREFIXED) ? 8 : 4));

I know you only use it twice, but this is super ugly - could you add a
macro like IS_PREFIXED_TYPE? or even something like
#define OP_TYPE_LENGTH(op) (((op)->type & PREFIXED) ? 8 : 4)
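
A self-contained sketch of how such a helper might be used. The struct
instruction_op_sketch type and advance_nip() are invented for the example,
and PREFIXED is copied from the sstep.h hunk quoted above:

```c
/* Flag value as defined in the sstep.h hunk of this patch. */
#define PREFIXED	0x800

/* Stand-in for struct instruction_op; only the field used here. */
struct instruction_op_sketch {
	int type;
};

/* Length in bytes of the analysed instruction: 8 if prefixed, else 4. */
#define OP_TYPE_LENGTH(op)	(((op)->type & PREFIXED) ? 8 : 4)

/* Hypothetical caller: compute the next instruction address. */
static unsigned long advance_nip(unsigned long nip,
				 const struct instruction_op_sketch *op)
{
	return nip + OP_TYPE_LENGTH(op);
}
```

Both places that currently open-code the ternary could then share the one
definition.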

>  	switch (GETTYPE(op->type)) {
>  	case COMPUTE:
>  		if (op->type & SETREG)
> @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
>   * or -1 if the instruction is one that should not be stepped,
>   * such as an rfid, or a mtmsrd that would clear MSR_RI.
>   */
> -int emulate_step(struct pt_regs *regs, unsigned int instr)
> +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
>  {
>  	struct instruction_op op;
>  	int r, err, type;
>  	unsigned long val;
>  	unsigned long ea;
>  
> -	r = analyse_instr(&op, regs, instr);
> +	r = analyse_instr(&op, regs, instr, sufx);
>  	if (r < 0)
>  		return r;
>  	if (r > 0) {
> @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
>  	return 0;
>  
>   instr_done:
> -	regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> +	regs->nip = truncate_if_32bit(regs->msr,
> +				      regs->nip + ((op.type & PREFIXED) ? 8 : 4));
>  	return 1;
>  }
>  NOKPROBE_SYMBOL(emulate_step);
> diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> index 42347067739c..9288dc6fc715 100644
> --- a/arch/powerpc/lib/test_emulate_step.c
> +++ b/arch/powerpc/lib/test_emulate_step.c
> @@ -103,7 +103,7 @@ static void __init test_ld(void)
>  	regs.gpr[3] = (unsigned long) &a;
>  
>  	/* ld r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);

I wonder if it would be clearer if we had something like
#define NO_SUFFIX 0

>  
>  	if (stepped == 1 && regs.gpr[5] == a)
>  		show_result("ld", "PASS");
> @@ -121,7 +121,7 @@ static void __init test_lwz(void)
>  	regs.gpr[3] = (unsigned long) &a;
>  
>  	/* lwz r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
>  
>  	if (stepped == 1 && regs.gpr[5] == a)
>  		show_result("lwz", "PASS");
> @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
>  	regs.gpr[5] = 0x8765;
>  
>  	/* lwzx r5, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
>  	if (stepped == 1 && regs.gpr[5] == a[2])
>  		show_result("lwzx", "PASS");
>  	else
> @@ -159,7 +159,7 @@ static void __init test_std(void)
>  	regs.gpr[5] = 0x5678;
>  
>  	/* std r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
>  	if (stepped == 1 || regs.gpr[5] == a)
>  		show_result("std", "PASS");
>  	else
> @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
>  	regs.gpr[5] = 0x5678;
>  
>  	/* ldarx r5, r3, r4, 0 */
> -	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> +	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
>  
>  	/*
>  	 * Don't touch 'a' here. Touching 'a' can do Load/store
> @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
>  	regs.gpr[5] = 0x9ABC;
>  
>  	/* stdcx. r5, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
>  
>  	/*
>  	 * Two possible scenarios that indicates successful emulation
> @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lfsx frt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lfsx", "PASS");
> @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
>  	c.a = 678.91;
>  
>  	/* stfsx frs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && c.b == cached_b)
>  		show_result("stfsx", "PASS");
> @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lfdx frt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lfdx", "PASS");
> @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
>  	c.a = 987654.32;
>  
>  	/* stfdx frs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && c.b == cached_b)
>  		show_result("stfdx", "PASS");
> @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lvx vrt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lvx", "PASS");
> @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
>  	c.b[3] = 498532;
>  
>  	/* stvx vrs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
>  	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lxvd2x vsr39, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
>  
>  	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
>  		show_result("lxvd2x", "PASS");
> @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
>  	c.b[3] = 4;
>  
>  	/* stxvd2x vsr39, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
>  
>  	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
>  	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
>  	if (!regs || !instr)
>  		return -EINVAL;
>  
> -	if (analyse_instr(&op, regs, instr) != 1 ||
> +	if (analyse_instr(&op, regs, instr, 0) != 1 ||
>  	    GETTYPE(op.type) != COMPUTE) {
>  		pr_info("emulation failed, instruction = 0x%08x\n", instr);
>  		return -EFAULT;
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index a7056049709e..f47bd843dc52 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
>  	if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
>  		bp = at_breakpoint(regs->nip);
>  		if (bp != NULL) {
> -			int stepped = emulate_step(regs, bp->instr[0]);
> +			int stepped = emulate_step(regs, bp->instr[0], 0);
>  			if (stepped == 0) {
>  				regs->nip = (unsigned long) &bp->instr[0];
>  				atomic_inc(&bp->ref_count);
> @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
>  	/* check we are in 64-bit kernel mode, translation enabled */
>  	if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
>  		if (mread(regs->nip, &instr, 4) == 4) {
> -			stepped = emulate_step(regs, instr);
> +			stepped = emulate_step(regs, instr, 0);
>  			if (stepped < 0) {
>  				printf("Couldn't single-step %s instruction\n",
>  				       (IS_RFID(instr)? "rfid": "mtmsrd"));
> -- 
> 2.20.1


* Re: [PATCH 08/18] powerpc sstep: Add support for prefixed VSX load/stores
  2019-11-26  5:21 ` [PATCH 08/18] powerpc sstep: Add support for prefixed VSX load/stores Jordan Niethe
@ 2019-12-18 14:05   ` Daniel Axtens
  0 siblings, 0 replies; 42+ messages in thread
From: Daniel Axtens @ 2019-12-18 14:05 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: alistair, Jordan Niethe

Jordan Niethe <jniethe5@gmail.com> writes:

> This adds emulation support for the following prefixed VSX load/stores:
>   * Prefixed Load VSX Scalar Doubleword (plxsd)
>   * Prefixed Load VSX Scalar Single-Precision (plxssp)
>   * Prefixed Load VSX Vector [0|1]  (plxv, plxv0, plxv1)
>   * Prefixed Store VSX Scalar Doubleword (pstxsd)
>   * Prefixed Store VSX Scalar Single-Precision (pstxssp)
>   * Prefixed Store VSX Vector [0|1] (pstxv, pstxv0, pstxv1)
>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>

Take this with a grain of salt, but I would proooobably squish the 3
load/store patches into one.

Part of my hesitation is that I think you also need some sstep tests for
these new instructions - if they massively bloat the patches I might
keep them as separate patches.

I'd also like to see a test for your next patch.

Regards,
Daniel

> ---
>  arch/powerpc/lib/sstep.c | 42 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 9113b9a21ae9..9ae8d177b67f 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -2713,6 +2713,48 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  			case 41:	/* plwa */
>  				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
>  				break;
> +			case 42:        /* plxsd */
> +				op->reg = rd + 32;
> +				op->type = MKOP(LOAD_VSX, PREFIXED, 8);
> +				op->element_size = 8;
> +				op->vsx_flags = VSX_CHECK_VEC;
> +				break;
> +			case 43:	/* plxssp */
> +				op->reg = rd + 32;
> +				op->type = MKOP(LOAD_VSX, PREFIXED, 4);
> +				op->element_size = 8;
> +				op->vsx_flags = VSX_FPCONV | VSX_CHECK_VEC;
> +				break;
> +			case 46:	/* pstxsd */
> +				op->reg = rd + 32;
> +				op->type = MKOP(STORE_VSX, PREFIXED, 8);
> +				op->element_size = 8;
> +				op->vsx_flags = VSX_CHECK_VEC;
> +				break;
> +			case 47:	/* pstxssp */
> +				op->reg = rd + 32;
> +				op->type = MKOP(STORE_VSX, PREFIXED, 4);
> +				op->element_size = 8;
> +				op->vsx_flags = VSX_FPCONV | VSX_CHECK_VEC;
> +				break;
> +			case 51:	/* plxv1 */
> +				op->reg += 32;
> +
> +				/* fallthru */
> +			case 50:	/* plxv0 */
> +				op->type = MKOP(LOAD_VSX, PREFIXED, 16);
> +				op->element_size = 16;
> +				op->vsx_flags = VSX_CHECK_VEC;
> +				break;
> +			case 55:	/* pstxv1 */
> +				op->reg = rd + 32;
> +
> +				/* fallthru */
> +			case 54:	/* pstxv0 */
> +				op->type = MKOP(STORE_VSX, PREFIXED, 16);
> +				op->element_size = 16;
> +				op->vsx_flags = VSX_CHECK_VEC;
> +				break;
>  			case 56:        /* plq */
>  				op->type = MKOP(LOAD, PREFIXED, 16);
>  				break;
> -- 
> 2.20.1


* Re: [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp
  2019-11-26  5:21 ` [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp Jordan Niethe
@ 2019-12-18 14:11   ` Daniel Axtens
  2020-02-07  8:15     ` Greg Kurz
  0 siblings, 1 reply; 42+ messages in thread
From: Daniel Axtens @ 2019-12-18 14:11 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: alistair, Jordan Niethe

Jordan Niethe <jniethe5@gmail.com> writes:

> A user-mode access to an address a long way below the stack pointer is
> only valid if the instruction is one that would update the stack pointer
> to the address accessed. This is checked by directly looking at the
> instruction's op-code. As a result it does not take into account prefixed
> instructions. Instead of looking at the instruction ourselves, use
> analyse_instr() to determine if this is a store instruction that will
> update the stack pointer.
>
> Something to note is that there currently are not any store with update
> prefixed instructions. Actually there is no plan for prefixed
> update-form loads and stores. So this patch is probably not needed but
> it might be preferable to use analyse_instr() rather than open coding
> the test anyway.

Yes please. I was looking through this code recently and was
horrified. This improves things a lot and I think is justification
enough as-is.
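
To illustrate why the decoded form is easier to follow, here is a
minimal standalone sketch of the check against an already-analysed
instruction. The macro and enum values below are illustrative stand-ins
chosen for this sketch, not copied from sstep.h, and the struct only
carries the two fields the check needs:

```c
#include <stdbool.h>

/*
 * Illustrative stand-ins for the sstep.h definitions; the numeric
 * values are assumptions made for this sketch.
 */
#define INSTR_TYPE_MASK	0x1f
#define GETTYPE(t)	((t) & INSTR_TYPE_MASK)
#define UPDATE		0x40
enum { LOAD = 1, STORE = 2 };

/* Cut-down stand-in for struct instruction_op. */
struct instruction_op {
	int type;
	int update_reg;
};

/* True only for a store in update form whose updated register is r1. */
static bool store_updates_sp(const struct instruction_op *op)
{
	return GETTYPE(op->type) == STORE &&
	       (op->type & UPDATE) &&
	       op->update_reg == 1;
}
```

The whole opcode table from the old version collapses into three
conditions, and any new update-form store that analyse_instr() learns
about is covered without touching fault.c.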

Regards,
Daniel
>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/mm/fault.c | 39 +++++++++++----------------------------
>  1 file changed, 11 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index b5047f9b5dec..cb78b3ca1800 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -41,37 +41,17 @@
>  #include <asm/siginfo.h>
>  #include <asm/debug.h>
>  #include <asm/kup.h>
> +#include <asm/sstep.h>
>  
>  /*
>   * Check whether the instruction inst is a store using
>   * an update addressing form which will update r1.
>   */
> -static bool store_updates_sp(unsigned int inst)
> +static bool store_updates_sp(struct instruction_op *op)
>  {
> -	/* check for 1 in the rA field */
> -	if (((inst >> 16) & 0x1f) != 1)
> -		return false;
> -	/* check major opcode */
> -	switch (inst >> 26) {
> -	case OP_STWU:
> -	case OP_STBU:
> -	case OP_STHU:
> -	case OP_STFSU:
> -	case OP_STFDU:
> -		return true;
> -	case OP_STD:	/* std or stdu */
> -		return (inst & 3) == 1;
> -	case OP_31:
> -		/* check minor opcode */
> -		switch ((inst >> 1) & 0x3ff) {
> -		case OP_31_XOP_STDUX:
> -		case OP_31_XOP_STWUX:
> -		case OP_31_XOP_STBUX:
> -		case OP_31_XOP_STHUX:
> -		case OP_31_XOP_STFSUX:
> -		case OP_31_XOP_STFDUX:
> +	if (GETTYPE(op->type) == STORE) {
> +		if ((op->type & UPDATE) && (op->update_reg == 1))
>  			return true;
> -		}
>  	}
>  	return false;
>  }
> @@ -278,14 +258,17 @@ static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
>  
>  		if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
>  		    access_ok(nip, sizeof(*nip))) {
> -			unsigned int inst;
> +			unsigned int inst, sufx;
> +			struct instruction_op op;
>  			int res;
>  
>  			pagefault_disable();
> -			res = __get_user_inatomic(inst, nip);
> +			res = __get_user_instr_inatomic(inst, sufx, nip);
>  			pagefault_enable();
> -			if (!res)
> -				return !store_updates_sp(inst);
> +			if (!res) {
> +				analyse_instr(&op, uregs, inst, sufx);
> +				return !store_updates_sp(&op);
> +			}
>  			*must_retry = true;
>  		}
>  		return true;
> -- 
> 2.20.1


* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-11-26  5:21 ` [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions Jordan Niethe
  2019-12-18  8:35   ` Daniel Axtens
@ 2019-12-18 14:15   ` Daniel Axtens
  2019-12-20  5:17     ` Jordan Niethe
  2020-01-13  6:18   ` Balamuruhan S
  2 siblings, 1 reply; 42+ messages in thread
From: Daniel Axtens @ 2019-12-18 14:15 UTC (permalink / raw)
  To: Jordan Niethe, linuxppc-dev; +Cc: alistair, Jordan Niethe

Jordan Niethe <jniethe5@gmail.com> writes:

> Currently all instructions are a single word long. A future ISA version
> will include prefixed instructions which have a double word length. The
> functions used for analysing and emulating instructions need to be
> modified so that they can handle these new instruction types.
>
> A prefixed instruction is a word prefix followed by a word suffix. All
> prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> instructions or instructions that only exist as suffixes.
>
> In handling prefixed instructions it will be convenient to treat the
> suffix and prefix as separate words. To facilitate this, modify
> analyse_instr() and emulate_step() to take a suffix as a
> parameter. For word instructions it does not matter what is passed in
> here - it will be ignored.
>
> We also define a new flag, PREFIXED, to be used in instruction_op:type.
> This flag will indicate when emulating an analysed instruction if the
> NIP should be advanced by word length or double word length.
>
> The callers of analyse_instr() and emulate_step() will need their own
> changes to be able to support prefixed instructions. For now modify them
> to pass in 0 as a suffix.
>
> Note that at this point no prefixed instructions are emulated or
> analysed - this is just making it possible to do so.
>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
>  arch/powerpc/include/asm/sstep.h      |  8 +++++--
>  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
>  arch/powerpc/kernel/align.c           |  2 +-
>  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
>  arch/powerpc/kernel/kprobes.c         |  2 +-
>  arch/powerpc/kernel/mce_power.c       |  2 +-
>  arch/powerpc/kernel/optprobes.c       |  2 +-
>  arch/powerpc/kernel/uprobes.c         |  2 +-
>  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
>  arch/powerpc/lib/sstep.c              | 12 ++++++-----
>  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
>  arch/powerpc/xmon/xmon.c              |  4 ++--
>  13 files changed, 71 insertions(+), 32 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> index c1df75edde44..a1dfa4bdd22f 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -377,6 +377,9 @@
>  #define PPC_INST_VCMPEQUD		0x100000c7
>  #define PPC_INST_VCMPEQUB		0x10000006
>  
> +/* macro to check if a word is a prefix */
> +#define IS_PREFIX(x) (((x) >> 26) == 1)
> +
>  /* macros to insert fields into opcodes */
>  #define ___PPC_RA(a)	(((a) & 0x1f) << 16)
>  #define ___PPC_RB(b)	(((b) & 0x1f) << 11)
> diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> index 769f055509c9..6d4cb602e231 100644
> --- a/arch/powerpc/include/asm/sstep.h
> +++ b/arch/powerpc/include/asm/sstep.h
> @@ -89,6 +89,9 @@ enum instruction_type {
>  #define VSX_LDLEFT	4	/* load VSX register from left */
>  #define VSX_CHECK_VEC	8	/* check MSR_VEC not MSR_VSX for reg >= 32 */
>  
> +/* Prefixed flag, ORed in with type */
> +#define PREFIXED	0x800
> +
>  /* Size field in type word */
>  #define SIZE(n)		((n) << 12)
>  #define GETSIZE(w)	((w) >> 12)
> @@ -132,7 +135,7 @@ union vsx_reg {
>   * otherwise.
>   */
>  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> -			 unsigned int instr);
> +			 unsigned int instr, unsigned int sufx);
>

I'm not saying this is necessarily better, but did you consider:

 - making instr 64 bits and using masking and shifting macros to get the
   prefix and suffix?

 - defining an instruction type/struct/union/whatever that contains both
   halves in one object?

I'm happy to be told that it ends up being way, way uglier/worse/etc,
but I just thought I'd ask.
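
To make the two options concrete, a rough sketch of each. All names
here (INSTR_PACK, INSTR_PREFIX, INSTR_SUFFIX, struct ppc_instr,
ppc_instr_len) are made up for illustration, and putting the first word
in the high half of the 64-bit value is an arbitrary choice for this
sketch; only IS_PREFIX() mirrors what the patch adds:

```c
#include <stdint.h>

/* Same test as IS_PREFIX() in this patch: primary opcode 1. */
#define IS_PREFIX(x)	(((x) >> 26) == 1)

/*
 * Option 1: a single 64-bit value. The first word (a word instruction,
 * or the prefix) goes in the high half, the suffix in the low half.
 */
#define INSTR_PACK(first, suffix) \
	(((uint64_t)(first) << 32) | (uint32_t)(suffix))
#define INSTR_PREFIX(i)	((uint32_t)((i) >> 32))
#define INSTR_SUFFIX(i)	((uint32_t)(i))

/* Option 2: a small struct carrying both halves in one object. */
struct ppc_instr {
	uint32_t prefix;	/* word instruction, or the prefix word */
	uint32_t suffix;	/* only meaningful if prefix is a prefix */
};

/* Length in bytes of the instruction held in the struct. */
static inline int ppc_instr_len(struct ppc_instr i)
{
	return IS_PREFIX(i.prefix) ? 8 : 4;
}
```

Either way the callers would pass one object around instead of two
loose words, at the cost of touching more of the existing signatures.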

Regards,
Daniel

>  /*
>   * Emulate an instruction that can be executed just by updating
> @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
>   * 0 if it could not be emulated, or -1 for an instruction that
>   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
>   */
> -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> +			unsigned int sufx);
>  
>  /*
>   * Emulate a load or store instruction by reading/writing the
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 15002b51ff18..bc585399e0c7 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
>  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>  			   size_t len);
>  
> +/*
> + * When reading an instruction iff it is a prefix, the suffix needs to be also
> + * loaded.
> + */
> +#define __get_user_instr(x, y, ptr)			\
> +({							\
> +	long __gui_ret = 0;				\
> +	y = 0;						\
> +	__gui_ret = __get_user(x, ptr);			\
> +	if (!__gui_ret) {				\
> +		if (IS_PREFIX(x))			\
> +			__gui_ret = __get_user(y, ptr + 1);	\
> +	}						\
> +							\
> +	__gui_ret;					\
> +})
> +
> +#define __get_user_instr_inatomic(x, y, ptr)		\
> +({							\
> +	long __gui_ret = 0;				\
> +	y = 0;						\
> +	__gui_ret = __get_user_inatomic(x, ptr);	\
> +	if (!__gui_ret) {				\
> +		if (IS_PREFIX(x))			\
> +			__gui_ret = __get_user_inatomic(y, ptr + 1);	\
> +	}						\
> +							\
> +	__gui_ret;					\
> +})
> +
>  #endif	/* _ARCH_POWERPC_UACCESS_H */
> diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> index 92045ed64976..245e79792a01 100644
> --- a/arch/powerpc/kernel/align.c
> +++ b/arch/powerpc/kernel/align.c
> @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
>  	if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
>  		return -EIO;
>  
> -	r = analyse_instr(&op, regs, instr);
> +	r = analyse_instr(&op, regs, instr, 0);
>  	if (r < 0)
>  		return -EINVAL;
>  
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index 58ce3d37c2a3..f4530961998c 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>  	if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
>  		goto fail;
>  
> -	ret = analyse_instr(&op, regs, instr);
> +	ret = analyse_instr(&op, regs, instr, 0);
>  	type = GETTYPE(op.type);
>  	size = GETSIZE(op.type);
>  
> @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>  		return false;
>  	}
>  
> -	if (!emulate_step(regs, instr))
> +	if (!emulate_step(regs, instr, 0))
>  		goto fail;
>  
>  	return true;
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 2d27ec4feee4..7303fe3856cc 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
>  	unsigned int insn = *p->ainsn.insn;
>  
>  	/* regs->nip is also adjusted if emulate_step returns 1 */
> -	ret = emulate_step(regs, insn);
> +	ret = emulate_step(regs, insn, 0);
>  	if (ret > 0) {
>  		/*
>  		 * Once this instruction has been boosted
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index 1cbf7f1a4e3d..d862bb549158 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
>  	if (pfn != ULONG_MAX) {
>  		instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
>  		instr = *(unsigned int *)(instr_addr);
> -		if (!analyse_instr(&op, &tmp, instr)) {
> +		if (!analyse_instr(&op, &tmp, instr, 0)) {
>  			pfn = addr_to_pfn(regs, op.ea);
>  			*addr = op.ea;
>  			*phys_addr = (pfn << PAGE_SHIFT);
> diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> index 024f7aad1952..82dc8a589c87 100644
> --- a/arch/powerpc/kernel/optprobes.c
> +++ b/arch/powerpc/kernel/optprobes.c
> @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
>  	 * and that can be emulated.
>  	 */
>  	if (!is_conditional_branch(*p->ainsn.insn) &&
> -			analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> +			analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
>  		emulate_update_regs(&regs, &op);
>  		nip = regs.nip;
>  	}
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index 1cfef0e5fec5..ab1077dc6148 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>  	 * emulate_step() returns 1 if the insn was successfully emulated.
>  	 * For all other cases, we need to single-step in hardware.
>  	 */
> -	ret = emulate_step(regs, auprobe->insn);
> +	ret = emulate_step(regs, auprobe->insn, 0);
>  	if (ret > 0)
>  		return true;
>  
> diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> index 2e496eb86e94..fcab1f31b48d 100644
> --- a/arch/powerpc/kvm/emulate_loadstore.c
> +++ b/arch/powerpc/kvm/emulate_loadstore.c
> @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>  
>  	emulated = EMULATE_FAIL;
>  	vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> -	if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> +	if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
>  		int type = op.type & INSTR_TYPE_MASK;
>  		int size = GETSIZE(op.type);
>  
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index c077acb983a1..ade3f5eba2e5 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
>   * otherwise.
>   */
>  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> -		  unsigned int instr)
> +		  unsigned int instr, unsigned int sufx)
>  {
>  	unsigned int opcode, ra, rb, rc, rd, spr, u;
>  	unsigned long int imm;
> @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
>  {
>  	unsigned long next_pc;
>  
> -	next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> +	next_pc = truncate_if_32bit(regs->msr,
> +				    regs->nip + ((op->type & PREFIXED) ? 8 : 4));
>  	switch (GETTYPE(op->type)) {
>  	case COMPUTE:
>  		if (op->type & SETREG)
> @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
>   * or -1 if the instruction is one that should not be stepped,
>   * such as an rfid, or a mtmsrd that would clear MSR_RI.
>   */
> -int emulate_step(struct pt_regs *regs, unsigned int instr)
> +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
>  {
>  	struct instruction_op op;
>  	int r, err, type;
>  	unsigned long val;
>  	unsigned long ea;
>  
> -	r = analyse_instr(&op, regs, instr);
> +	r = analyse_instr(&op, regs, instr, sufx);
>  	if (r < 0)
>  		return r;
>  	if (r > 0) {
> @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
>  	return 0;
>  
>   instr_done:
> -	regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> +	regs->nip = truncate_if_32bit(regs->msr,
> +				      regs->nip + ((op.type & PREFIXED) ? 8 : 4));
>  	return 1;
>  }
>  NOKPROBE_SYMBOL(emulate_step);
> diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> index 42347067739c..9288dc6fc715 100644
> --- a/arch/powerpc/lib/test_emulate_step.c
> +++ b/arch/powerpc/lib/test_emulate_step.c
> @@ -103,7 +103,7 @@ static void __init test_ld(void)
>  	regs.gpr[3] = (unsigned long) &a;
>  
>  	/* ld r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
>  
>  	if (stepped == 1 && regs.gpr[5] == a)
>  		show_result("ld", "PASS");
> @@ -121,7 +121,7 @@ static void __init test_lwz(void)
>  	regs.gpr[3] = (unsigned long) &a;
>  
>  	/* lwz r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
>  
>  	if (stepped == 1 && regs.gpr[5] == a)
>  		show_result("lwz", "PASS");
> @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
>  	regs.gpr[5] = 0x8765;
>  
>  	/* lwzx r5, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
>  	if (stepped == 1 && regs.gpr[5] == a[2])
>  		show_result("lwzx", "PASS");
>  	else
> @@ -159,7 +159,7 @@ static void __init test_std(void)
>  	regs.gpr[5] = 0x5678;
>  
>  	/* std r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
>  	if (stepped == 1 || regs.gpr[5] == a)
>  		show_result("std", "PASS");
>  	else
> @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
>  	regs.gpr[5] = 0x5678;
>  
>  	/* ldarx r5, r3, r4, 0 */
> -	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> +	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
>  
>  	/*
>  	 * Don't touch 'a' here. Touching 'a' can do Load/store
> @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
>  	regs.gpr[5] = 0x9ABC;
>  
>  	/* stdcx. r5, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
>  
>  	/*
>  	 * Two possible scenarios that indicates successful emulation
> @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lfsx frt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lfsx", "PASS");
> @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
>  	c.a = 678.91;
>  
>  	/* stfsx frs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && c.b == cached_b)
>  		show_result("stfsx", "PASS");
> @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lfdx frt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lfdx", "PASS");
> @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
>  	c.a = 987654.32;
>  
>  	/* stfdx frs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && c.b == cached_b)
>  		show_result("stfdx", "PASS");
> @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lvx vrt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lvx", "PASS");
> @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
>  	c.b[3] = 498532;
>  
>  	/* stvx vrs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
>  	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lxvd2x vsr39, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
>  
>  	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
>  		show_result("lxvd2x", "PASS");
> @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
>  	c.b[3] = 4;
>  
>  	/* stxvd2x vsr39, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
>  
>  	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
>  	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
>  	if (!regs || !instr)
>  		return -EINVAL;
>  
> -	if (analyse_instr(&op, regs, instr) != 1 ||
> +	if (analyse_instr(&op, regs, instr, 0) != 1 ||
>  	    GETTYPE(op.type) != COMPUTE) {
>  		pr_info("emulation failed, instruction = 0x%08x\n", instr);
>  		return -EFAULT;
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index a7056049709e..f47bd843dc52 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
>  	if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
>  		bp = at_breakpoint(regs->nip);
>  		if (bp != NULL) {
> -			int stepped = emulate_step(regs, bp->instr[0]);
> +			int stepped = emulate_step(regs, bp->instr[0], 0);
>  			if (stepped == 0) {
>  				regs->nip = (unsigned long) &bp->instr[0];
>  				atomic_inc(&bp->ref_count);
> @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
>  	/* check we are in 64-bit kernel mode, translation enabled */
>  	if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
>  		if (mread(regs->nip, &instr, 4) == 4) {
> -			stepped = emulate_step(regs, instr);
> +			stepped = emulate_step(regs, instr, 0);
>  			if (stepped < 0) {
>  				printf("Couldn't single-step %s instruction\n",
>  				       (IS_RFID(instr)? "rfid": "mtmsrd"));
> -- 
> 2.20.1


* Re: [PATCH 03/18] powerpc: Add PREFIXED SRR1 bit for future ISA version
  2019-12-18  8:23   ` Daniel Axtens
@ 2019-12-20  5:09     ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2019-12-20  5:09 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Alistair Popple, linuxppc-dev

On Wed, Dec 18, 2019 at 7:23 PM Daniel Axtens <dja@axtens.net> wrote:
>
> Jordan Niethe <jniethe5@gmail.com> writes:
>
> > Add the bit definition for exceptions caused by prefixed instructions.
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/include/asm/reg.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> > index 6f9fcc3d4c82..0a6d39fb4769 100644
> > --- a/arch/powerpc/include/asm/reg.h
> > +++ b/arch/powerpc/include/asm/reg.h
> > @@ -778,6 +778,7 @@
> >
> >  #define   SRR1_MCE_MCP               0x00080000 /* Machine check signal caused interrupt */
> >  #define   SRR1_BOUNDARY              0x10000000 /* Prefixed instruction crosses 64-byte boundary */
> > +#define   SRR1_PREFIXED              0x20000000 /* Exception caused by prefixed instruction */
>
> You could probably squash this with the previous patch, and maybe the
> next patch too.
>
> Regards,
> Daniel
>
> >
> >  #define SPRN_HSRR0   0x13A   /* Save/Restore Register 0 */
> >  #define SPRN_HSRR1   0x13B   /* Save/Restore Register 1 */
> > --
> > 2.20.1
Thanks, good idea.


* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-12-18  8:35   ` Daniel Axtens
@ 2019-12-20  5:11     ` Jordan Niethe
  2019-12-20  5:40       ` Christophe Leroy
  0 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-12-20  5:11 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Alistair Popple, linuxppc-dev

On Wed, Dec 18, 2019 at 7:35 PM Daniel Axtens <dja@axtens.net> wrote:
>
> Jordan Niethe <jniethe5@gmail.com> writes:
>
> > Currently all instructions are a single word long. A future ISA version
> > will include prefixed instructions which have a double word length. The
> > functions used for analysing and emulating instructions need to be
> > modified so that they can handle these new instruction types.
> >
> > A prefixed instruction is a word prefix followed by a word suffix. All
> > prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> > instructions or instructions that only exist as suffixes.
> >
> > In handling prefixed instructions it will be convenient to treat the
> > suffix and prefix as separate words. To facilitate this modify
> > > analyse_instr() and emulate_step() to take a suffix as a
> > parameter. For word instructions it does not matter what is passed in
> > here - it will be ignored.
> >
> > We also define a new flag, PREFIXED, to be used in instruction_op:type.
> > This flag will indicate when emulating an analysed instruction if the
> > NIP should be advanced by word length or double word length.
> >
> > The callers of analyse_instr() and emulate_step() will need their own
> > changes to be able to support prefixed instructions. For now modify them
> > to pass in 0 as a suffix.
> >
> > Note that at this point no prefixed instructions are emulated or
> > analysed - this is just making it possible to do so.
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
> >  arch/powerpc/include/asm/sstep.h      |  8 +++++--
> >  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
> >  arch/powerpc/kernel/align.c           |  2 +-
> >  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
> >  arch/powerpc/kernel/kprobes.c         |  2 +-
> >  arch/powerpc/kernel/mce_power.c       |  2 +-
> >  arch/powerpc/kernel/optprobes.c       |  2 +-
> >  arch/powerpc/kernel/uprobes.c         |  2 +-
> >  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
> >  arch/powerpc/lib/sstep.c              | 12 ++++++-----
> >  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
> >  arch/powerpc/xmon/xmon.c              |  4 ++--
> >  13 files changed, 71 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> > index c1df75edde44..a1dfa4bdd22f 100644
> > --- a/arch/powerpc/include/asm/ppc-opcode.h
> > +++ b/arch/powerpc/include/asm/ppc-opcode.h
> > @@ -377,6 +377,9 @@
> >  #define PPC_INST_VCMPEQUD            0x100000c7
> >  #define PPC_INST_VCMPEQUB            0x10000006
> >
> > +/* macro to check if a word is a prefix */
> > +#define IS_PREFIX(x) (((x) >> 26) == 1)
> > +
> >  /* macros to insert fields into opcodes */
> >  #define ___PPC_RA(a) (((a) & 0x1f) << 16)
> >  #define ___PPC_RB(b) (((b) & 0x1f) << 11)
> > diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> > index 769f055509c9..6d4cb602e231 100644
> > --- a/arch/powerpc/include/asm/sstep.h
> > +++ b/arch/powerpc/include/asm/sstep.h
> > @@ -89,6 +89,9 @@ enum instruction_type {
> >  #define VSX_LDLEFT   4       /* load VSX register from left */
> >  #define VSX_CHECK_VEC        8       /* check MSR_VEC not MSR_VSX for reg >= 32 */
> >
> > +/* Prefixed flag, ORed in with type */
> > +#define PREFIXED     0x800
> > +
> >  /* Size field in type word */
> >  #define SIZE(n)              ((n) << 12)
> >  #define GETSIZE(w)   ((w) >> 12)
> > @@ -132,7 +135,7 @@ union vsx_reg {
> >   * otherwise.
> >   */
> >  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > -                      unsigned int instr);
> > +                      unsigned int instr, unsigned int sufx);
> >
> >  /*
> >   * Emulate an instruction that can be executed just by updating
> > @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
> >   * 0 if it could not be emulated, or -1 for an instruction that
> >   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
> >   */
> > -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> > +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> > +                     unsigned int sufx);
> >
> >  /*
> >   * Emulate a load or store instruction by reading/writing the
> > diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> > index 15002b51ff18..bc585399e0c7 100644
> > --- a/arch/powerpc/include/asm/uaccess.h
> > +++ b/arch/powerpc/include/asm/uaccess.h
> > @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
> >  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
> >                          size_t len);
> >
> > +/*
> > + * When reading an instruction iff it is a prefix, the suffix needs to be also
> > + * loaded.
> > + */
> > +#define __get_user_instr(x, y, ptr)                  \
> > +({                                                   \
> > +     long __gui_ret = 0;                             \
> > +     y = 0;                                          \
> > +     __gui_ret = __get_user(x, ptr);                 \
> > +     if (!__gui_ret) {                               \
> > +             if (IS_PREFIX(x))                       \
> > +                     __gui_ret = __get_user(y, ptr + 1);     \
> > +     }                                               \
> > +                                                     \
> > +     __gui_ret;                                      \
> > +})
> > +
> > +#define __get_user_instr_inatomic(x, y, ptr)         \
> > +({                                                   \
> > +     long __gui_ret = 0;                             \
> > +     y = 0;                                          \
> > +     __gui_ret = __get_user_inatomic(x, ptr);        \
> > +     if (!__gui_ret) {                               \
> > +             if (IS_PREFIX(x))                       \
> > +                     __gui_ret = __get_user_inatomic(y, ptr + 1);    \
> > +     }                                               \
> > +                                                     \
> > +     __gui_ret;                                      \
> > +})
> > +
> >  #endif       /* _ARCH_POWERPC_UACCESS_H */
> > diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> > index 92045ed64976..245e79792a01 100644
> > --- a/arch/powerpc/kernel/align.c
> > +++ b/arch/powerpc/kernel/align.c
> > @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
> >       if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
> >               return -EIO;
> >
> > -     r = analyse_instr(&op, regs, instr);
> > +     r = analyse_instr(&op, regs, instr, 0);
> >       if (r < 0)
> >               return -EINVAL;
> >
> > diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> > index 58ce3d37c2a3..f4530961998c 100644
> > --- a/arch/powerpc/kernel/hw_breakpoint.c
> > +++ b/arch/powerpc/kernel/hw_breakpoint.c
> > @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> >       if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
> >               goto fail;
> >
> > -     ret = analyse_instr(&op, regs, instr);
> > +     ret = analyse_instr(&op, regs, instr, 0);
> >       type = GETTYPE(op.type);
> >       size = GETSIZE(op.type);
> >
> > @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> >               return false;
> >       }
> >
> > -     if (!emulate_step(regs, instr))
> > +     if (!emulate_step(regs, instr, 0))
> >               goto fail;
> >
> >       return true;
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index 2d27ec4feee4..7303fe3856cc 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
> >       unsigned int insn = *p->ainsn.insn;
> >
> >       /* regs->nip is also adjusted if emulate_step returns 1 */
> > -     ret = emulate_step(regs, insn);
> > +     ret = emulate_step(regs, insn, 0);
> >       if (ret > 0) {
> >               /*
> >                * Once this instruction has been boosted
> > diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> > index 1cbf7f1a4e3d..d862bb549158 100644
> > --- a/arch/powerpc/kernel/mce_power.c
> > +++ b/arch/powerpc/kernel/mce_power.c
> > @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
> >       if (pfn != ULONG_MAX) {
> >               instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
> >               instr = *(unsigned int *)(instr_addr);
> > -             if (!analyse_instr(&op, &tmp, instr)) {
> > +             if (!analyse_instr(&op, &tmp, instr, 0)) {
> >                       pfn = addr_to_pfn(regs, op.ea);
> >                       *addr = op.ea;
> >                       *phys_addr = (pfn << PAGE_SHIFT);
> > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> > index 024f7aad1952..82dc8a589c87 100644
> > --- a/arch/powerpc/kernel/optprobes.c
> > +++ b/arch/powerpc/kernel/optprobes.c
> > @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
> >        * and that can be emulated.
> >        */
> >       if (!is_conditional_branch(*p->ainsn.insn) &&
> > -                     analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> > +                     analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
> >               emulate_update_regs(&regs, &op);
> >               nip = regs.nip;
> >       }
> > diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> > index 1cfef0e5fec5..ab1077dc6148 100644
> > --- a/arch/powerpc/kernel/uprobes.c
> > +++ b/arch/powerpc/kernel/uprobes.c
> > @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> >        * emulate_step() returns 1 if the insn was successfully emulated.
> >        * For all other cases, we need to single-step in hardware.
> >        */
> > -     ret = emulate_step(regs, auprobe->insn);
> > +     ret = emulate_step(regs, auprobe->insn, 0);
> >       if (ret > 0)
> >               return true;
> >
> > diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> > index 2e496eb86e94..fcab1f31b48d 100644
> > --- a/arch/powerpc/kvm/emulate_loadstore.c
> > +++ b/arch/powerpc/kvm/emulate_loadstore.c
> > @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
> >
> >       emulated = EMULATE_FAIL;
> >       vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> > -     if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> > +     if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
> >               int type = op.type & INSTR_TYPE_MASK;
> >               int size = GETSIZE(op.type);
> >
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index c077acb983a1..ade3f5eba2e5 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
> >   * otherwise.
> >   */
> >  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > -               unsigned int instr)
> > +               unsigned int instr, unsigned int sufx)
>
> I know we really like shortenings in arch/powerpc, but I think we can
> afford the two extra characters to spell 'suffix' in full :)
>
'suffix' was what I used initially but somewhere along the line
I found it looked unbalanced to see the abbreviation inst{r,} in the same
context as the unabbreviated 'suffix'. Happy to change it if it is clearer.

> >  {
> >       unsigned int opcode, ra, rb, rc, rd, spr, u;
> >       unsigned long int imm;
> > @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
> >  {
> >       unsigned long next_pc;
> >
> > -     next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> > +     next_pc = truncate_if_32bit(regs->msr,
> > +                                 regs->nip + ((op->type & PREFIXED) ? 8 : 4));
>
> I know you only use it twice, but this is super ugly - could you add a
> macro like IS_PREFIXED_TYPE? or even something like
> > #define OP_TYPE_LENGTH(x) (((x)->type & PREFIXED) ? 8 : 4)
Yes that would be nicer.
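
A minimal sketch of what such a helper could look like, assuming the
PREFIXED flag value from this patch. OP_LENGTH, next_nip() and the
cut-down struct are hypothetical names for illustration only, and the
truncate_if_32bit() step is left out:

```c
/* Flag ORed into instruction_op.type, value as defined in this patch. */
#define PREFIXED	0x800

/* Cut-down stand-in for the kernel's struct instruction_op (assumption). */
struct instruction_op {
	int type;
};

/* Hypothetical helper: length in bytes of an analysed instruction. */
#define OP_LENGTH(op)	(((op)->type & PREFIXED) ? 8 : 4)

/*
 * Hypothetical illustration of the NIP update: advance by 4 bytes for a
 * word instruction and by 8 bytes for a prefixed one.
 */
static inline unsigned long next_nip(unsigned long nip,
				     const struct instruction_op *op)
{
	return nip + OP_LENGTH(op);
}
```

Both sites in sstep.c that currently open-code the ternary could then
share the one helper.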
>
> >       switch (GETTYPE(op->type)) {
> >       case COMPUTE:
> >               if (op->type & SETREG)
> > @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
> >   * or -1 if the instruction is one that should not be stepped,
> >   * such as an rfid, or a mtmsrd that would clear MSR_RI.
> >   */
> > -int emulate_step(struct pt_regs *regs, unsigned int instr)
> > +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
> >  {
> >       struct instruction_op op;
> >       int r, err, type;
> >       unsigned long val;
> >       unsigned long ea;
> >
> > -     r = analyse_instr(&op, regs, instr);
> > +     r = analyse_instr(&op, regs, instr, sufx);
> >       if (r < 0)
> >               return r;
> >       if (r > 0) {
> > @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
> >       return 0;
> >
> >   instr_done:
> > -     regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> > +     regs->nip = truncate_if_32bit(regs->msr,
> > +                                   regs->nip + ((op.type & PREFIXED) ? 8 : 4));
> >       return 1;
> >  }
> >  NOKPROBE_SYMBOL(emulate_step);
> > diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> > index 42347067739c..9288dc6fc715 100644
> > --- a/arch/powerpc/lib/test_emulate_step.c
> > +++ b/arch/powerpc/lib/test_emulate_step.c
> > @@ -103,7 +103,7 @@ static void __init test_ld(void)
> >       regs.gpr[3] = (unsigned long) &a;
> >
> >       /* ld r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
>
> I wonder if it would be clearer if we had something like
> #define NO_SUFFIX 0
It probably would be.
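
Roughly, and only as a sketch: NO_SUFFIX is the name suggested above,
IS_PREFIX() is taken from this patch, and effective_suffix() is a
hypothetical stand-in that just illustrates "the suffix is ignored for
word instructions":

```c
/* Named constant for callers that have no suffix to pass (suggested name). */
#define NO_SUFFIX	0u

/* From this patch: a word is a prefix iff its primary opcode is 1. */
#define IS_PREFIX(x)	(((x) >> 26) == 1)

/*
 * Hypothetical stand-in, not part of the series: return the suffix the
 * emulation code would actually look at. For a word instruction whatever
 * was passed in is ignored.
 */
static inline unsigned int effective_suffix(unsigned int instr,
					    unsigned int suffix)
{
	return IS_PREFIX(instr) ? suffix : NO_SUFFIX;
}
```

A call site would then read emulate_step(&regs, TEST_LD(5, 3, 0), NO_SUFFIX)
rather than passing a bare 0.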

>
> >
> >       if (stepped == 1 && regs.gpr[5] == a)
> >               show_result("ld", "PASS");
> > @@ -121,7 +121,7 @@ static void __init test_lwz(void)
> >       regs.gpr[3] = (unsigned long) &a;
> >
> >       /* lwz r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
> >
> >       if (stepped == 1 && regs.gpr[5] == a)
> >               show_result("lwz", "PASS");
> > @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
> >       regs.gpr[5] = 0x8765;
> >
> >       /* lwzx r5, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
> >       if (stepped == 1 && regs.gpr[5] == a[2])
> >               show_result("lwzx", "PASS");
> >       else
> > @@ -159,7 +159,7 @@ static void __init test_std(void)
> >       regs.gpr[5] = 0x5678;
> >
> >       /* std r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
> >       if (stepped == 1 || regs.gpr[5] == a)
> >               show_result("std", "PASS");
> >       else
> > @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
> >       regs.gpr[5] = 0x5678;
> >
> >       /* ldarx r5, r3, r4, 0 */
> > -     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> > +     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
> >
> >       /*
> >        * Don't touch 'a' here. Touching 'a' can do Load/store
> > @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
> >       regs.gpr[5] = 0x9ABC;
> >
> >       /* stdcx. r5, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
> >
> >       /*
> >        * Two possible scenarios that indicates successful emulation
> > @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lfsx frt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lfsx", "PASS");
> > @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
> >       c.a = 678.91;
> >
> >       /* stfsx frs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && c.b == cached_b)
> >               show_result("stfsx", "PASS");
> > @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lfdx frt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lfdx", "PASS");
> > @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
> >       c.a = 987654.32;
> >
> >       /* stfdx frs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && c.b == cached_b)
> >               show_result("stfdx", "PASS");
> > @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lvx vrt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lvx", "PASS");
> > @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
> >       c.b[3] = 498532;
> >
> >       /* stvx vrs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> > @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lxvd2x vsr39, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
> >
> >       if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
> >               show_result("lxvd2x", "PASS");
> > @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> >       c.b[3] = 4;
> >
> >       /* stxvd2x vsr39, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
> >
> >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> > @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
> >       if (!regs || !instr)
> >               return -EINVAL;
> >
> > -     if (analyse_instr(&op, regs, instr) != 1 ||
> > +     if (analyse_instr(&op, regs, instr, 0) != 1 ||
> >           GETTYPE(op.type) != COMPUTE) {
> >               pr_info("emulation failed, instruction = 0x%08x\n", instr);
> >               return -EFAULT;
> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > index a7056049709e..f47bd843dc52 100644
> > --- a/arch/powerpc/xmon/xmon.c
> > +++ b/arch/powerpc/xmon/xmon.c
> > @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
> >       if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
> >               bp = at_breakpoint(regs->nip);
> >               if (bp != NULL) {
> > -                     int stepped = emulate_step(regs, bp->instr[0]);
> > +                     int stepped = emulate_step(regs, bp->instr[0], 0);
> >                       if (stepped == 0) {
> >                               regs->nip = (unsigned long) &bp->instr[0];
> >                               atomic_inc(&bp->ref_count);
> > @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
> >       /* check we are in 64-bit kernel mode, translation enabled */
> >       if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
> >               if (mread(regs->nip, &instr, 4) == 4) {
> > -                     stepped = emulate_step(regs, instr);
> > +                     stepped = emulate_step(regs, instr, 0);
> >                       if (stepped < 0) {
> >                               printf("Couldn't single-step %s instruction\n",
> >                                      (IS_RFID(instr)? "rfid": "mtmsrd"));
> > --
> > 2.20.1


* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-12-18 14:15   ` Daniel Axtens
@ 2019-12-20  5:17     ` Jordan Niethe
  2020-01-07  3:01       ` Jordan Niethe
  0 siblings, 1 reply; 42+ messages in thread
From: Jordan Niethe @ 2019-12-20  5:17 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Alistair Popple, linuxppc-dev

On Thu, Dec 19, 2019 at 1:15 AM Daniel Axtens <dja@axtens.net> wrote:
>
> Jordan Niethe <jniethe5@gmail.com> writes:
>
> > Currently all instructions are a single word long. A future ISA version
> > will include prefixed instructions which have a double word length. The
> > functions used for analysing and emulating instructions need to be
> > modified so that they can handle these new instruction types.
> >
> > A prefixed instruction is a word prefix followed by a word suffix. All
> > prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> > instructions or instructions that only exist as suffixes.
> >
> > In handling prefixed instructions it will be convenient to treat the
> > suffix and prefix as separate words. To facilitate this modify
> > > analyse_instr() and emulate_step() to take a suffix as a
> > parameter. For word instructions it does not matter what is passed in
> > here - it will be ignored.
> >
> > We also define a new flag, PREFIXED, to be used in instruction_op:type.
> > This flag will indicate when emulating an analysed instruction if the
> > NIP should be advanced by word length or double word length.
> >
> > The callers of analyse_instr() and emulate_step() will need their own
> > changes to be able to support prefixed instructions. For now modify them
> > to pass in 0 as a suffix.
> >
> > Note that at this point no prefixed instructions are emulated or
> > analysed - this is just making it possible to do so.
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
> >  arch/powerpc/include/asm/sstep.h      |  8 +++++--
> >  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
> >  arch/powerpc/kernel/align.c           |  2 +-
> >  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
> >  arch/powerpc/kernel/kprobes.c         |  2 +-
> >  arch/powerpc/kernel/mce_power.c       |  2 +-
> >  arch/powerpc/kernel/optprobes.c       |  2 +-
> >  arch/powerpc/kernel/uprobes.c         |  2 +-
> >  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
> >  arch/powerpc/lib/sstep.c              | 12 ++++++-----
> >  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
> >  arch/powerpc/xmon/xmon.c              |  4 ++--
> >  13 files changed, 71 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> > index c1df75edde44..a1dfa4bdd22f 100644
> > --- a/arch/powerpc/include/asm/ppc-opcode.h
> > +++ b/arch/powerpc/include/asm/ppc-opcode.h
> > @@ -377,6 +377,9 @@
> >  #define PPC_INST_VCMPEQUD            0x100000c7
> >  #define PPC_INST_VCMPEQUB            0x10000006
> >
> > +/* macro to check if a word is a prefix */
> > +#define IS_PREFIX(x) (((x) >> 26) == 1)
> > +
> >  /* macros to insert fields into opcodes */
> >  #define ___PPC_RA(a) (((a) & 0x1f) << 16)
> >  #define ___PPC_RB(b) (((b) & 0x1f) << 11)
> > diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> > index 769f055509c9..6d4cb602e231 100644
> > --- a/arch/powerpc/include/asm/sstep.h
> > +++ b/arch/powerpc/include/asm/sstep.h
> > @@ -89,6 +89,9 @@ enum instruction_type {
> >  #define VSX_LDLEFT   4       /* load VSX register from left */
> >  #define VSX_CHECK_VEC        8       /* check MSR_VEC not MSR_VSX for reg >= 32 */
> >
> > +/* Prefixed flag, ORed in with type */
> > +#define PREFIXED     0x800
> > +
> >  /* Size field in type word */
> >  #define SIZE(n)              ((n) << 12)
> >  #define GETSIZE(w)   ((w) >> 12)
> > @@ -132,7 +135,7 @@ union vsx_reg {
> >   * otherwise.
> >   */
> >  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > -                      unsigned int instr);
> > +                      unsigned int instr, unsigned int sufx);
> >
>
> I'm not saying this is necessarily better, but did you consider:
>
>  - making instr 64 bits and using masking and shifting macros to get the
>    prefix and suffix?
>
>  - defining an instruction type/struct/union/whatever that contains both
>    halves in one object?
>
> I'm happy to be told that it ends up being way, way uglier/worse/etc,
> but I just thought I'd ask.
>
> Regards,
> Daniel

It is a good question and something I thought about, and I am not completely
confident that this approach is the best. Basically, what I ended up thinking
was that prefixed instructions are a bit of a special case, and that by doing
it like this the normal word instructions would just carry on the same as
before.

I can see this is a pretty flimsy reason, so I am happy to take suggestions as
to what would end up being clearer.
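
For comparison, the two alternatives could look something like the sketch
below. Every name here other than IS_PREFIX() is hypothetical, and putting
the prefix in the upper 32 bits is an assumption, not something this series
does:

```c
#include <stdint.h>

/* From this patch: a word is a prefix iff its primary opcode is 1. */
#define IS_PREFIX(x)	(((x) >> 26) == 1)

/* Option 1: one 64-bit value, prefix in the upper word (assumed layout). */
static inline uint64_t instr_pack(uint32_t prefix, uint32_t suffix)
{
	return ((uint64_t)prefix << 32) | suffix;
}

static inline uint32_t instr_prefix(uint64_t instr)
{
	return (uint32_t)(instr >> 32);
}

static inline uint32_t instr_suffix(uint64_t instr)
{
	return (uint32_t)instr;
}

/* Option 2: one object carrying both halves. */
struct ppc_instr {
	uint32_t word;		/* word instruction, or the prefix */
	uint32_t suffix;	/* only meaningful if word is a prefix */
};

static inline int ppc_instr_len(struct ppc_instr instr)
{
	return IS_PREFIX(instr.word) ? 8 : 4;
}
```

Either way the callers would pass a single argument, at the cost of touching
every place that currently treats an instruction as a plain unsigned int.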


>
> >  /*
> >   * Emulate an instruction that can be executed just by updating
> > @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
> >   * 0 if it could not be emulated, or -1 for an instruction that
> >   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
> >   */
> > -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> > +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> > +                     unsigned int sufx);
> >
> >  /*
> >   * Emulate a load or store instruction by reading/writing the
> > diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> > index 15002b51ff18..bc585399e0c7 100644
> > --- a/arch/powerpc/include/asm/uaccess.h
> > +++ b/arch/powerpc/include/asm/uaccess.h
> > @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
> >  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
> >                          size_t len);
> >
> > +/*
> > + * When reading an instruction iff it is a prefix, the suffix needs to be also
> > + * loaded.
> > + */
> > +#define __get_user_instr(x, y, ptr)                  \
> > +({                                                   \
> > +     long __gui_ret = 0;                             \
> > +     y = 0;                                          \
> > +     __gui_ret = __get_user(x, ptr);                 \
> > +     if (!__gui_ret) {                               \
> > +             if (IS_PREFIX(x))                       \
> > +                     __gui_ret = __get_user(y, ptr + 1);     \
> > +     }                                               \
> > +                                                     \
> > +     __gui_ret;                                      \
> > +})
> > +
> > +#define __get_user_instr_inatomic(x, y, ptr)         \
> > +({                                                   \
> > +     long __gui_ret = 0;                             \
> > +     y = 0;                                          \
> > +     __gui_ret = __get_user_inatomic(x, ptr);        \
> > +     if (!__gui_ret) {                               \
> > +             if (IS_PREFIX(x))                       \
> > +                     __gui_ret = __get_user_inatomic(y, ptr + 1);    \
> > +     }                                               \
> > +                                                     \
> > +     __gui_ret;                                      \
> > +})
> > +
> >  #endif       /* _ARCH_POWERPC_UACCESS_H */
> > diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> > index 92045ed64976..245e79792a01 100644
> > --- a/arch/powerpc/kernel/align.c
> > +++ b/arch/powerpc/kernel/align.c
> > @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
> >       if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
> >               return -EIO;
> >
> > -     r = analyse_instr(&op, regs, instr);
> > +     r = analyse_instr(&op, regs, instr, 0);
> >       if (r < 0)
> >               return -EINVAL;
> >
> > diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> > index 58ce3d37c2a3..f4530961998c 100644
> > --- a/arch/powerpc/kernel/hw_breakpoint.c
> > +++ b/arch/powerpc/kernel/hw_breakpoint.c
> > @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> >       if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
> >               goto fail;
> >
> > -     ret = analyse_instr(&op, regs, instr);
> > +     ret = analyse_instr(&op, regs, instr, 0);
> >       type = GETTYPE(op.type);
> >       size = GETSIZE(op.type);
> >
> > @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> >               return false;
> >       }
> >
> > -     if (!emulate_step(regs, instr))
> > +     if (!emulate_step(regs, instr, 0))
> >               goto fail;
> >
> >       return true;
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index 2d27ec4feee4..7303fe3856cc 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
> >       unsigned int insn = *p->ainsn.insn;
> >
> >       /* regs->nip is also adjusted if emulate_step returns 1 */
> > -     ret = emulate_step(regs, insn);
> > +     ret = emulate_step(regs, insn, 0);
> >       if (ret > 0) {
> >               /*
> >                * Once this instruction has been boosted
> > diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> > index 1cbf7f1a4e3d..d862bb549158 100644
> > --- a/arch/powerpc/kernel/mce_power.c
> > +++ b/arch/powerpc/kernel/mce_power.c
> > @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
> >       if (pfn != ULONG_MAX) {
> >               instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
> >               instr = *(unsigned int *)(instr_addr);
> > -             if (!analyse_instr(&op, &tmp, instr)) {
> > +             if (!analyse_instr(&op, &tmp, instr, 0)) {
> >                       pfn = addr_to_pfn(regs, op.ea);
> >                       *addr = op.ea;
> >                       *phys_addr = (pfn << PAGE_SHIFT);
> > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> > index 024f7aad1952..82dc8a589c87 100644
> > --- a/arch/powerpc/kernel/optprobes.c
> > +++ b/arch/powerpc/kernel/optprobes.c
> > @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
> >        * and that can be emulated.
> >        */
> >       if (!is_conditional_branch(*p->ainsn.insn) &&
> > -                     analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> > +                     analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
> >               emulate_update_regs(&regs, &op);
> >               nip = regs.nip;
> >       }
> > diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> > index 1cfef0e5fec5..ab1077dc6148 100644
> > --- a/arch/powerpc/kernel/uprobes.c
> > +++ b/arch/powerpc/kernel/uprobes.c
> > @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> >        * emulate_step() returns 1 if the insn was successfully emulated.
> >        * For all other cases, we need to single-step in hardware.
> >        */
> > -     ret = emulate_step(regs, auprobe->insn);
> > +     ret = emulate_step(regs, auprobe->insn, 0);
> >       if (ret > 0)
> >               return true;
> >
> > diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> > index 2e496eb86e94..fcab1f31b48d 100644
> > --- a/arch/powerpc/kvm/emulate_loadstore.c
> > +++ b/arch/powerpc/kvm/emulate_loadstore.c
> > @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
> >
> >       emulated = EMULATE_FAIL;
> >       vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> > -     if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> > +     if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
> >               int type = op.type & INSTR_TYPE_MASK;
> >               int size = GETSIZE(op.type);
> >
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index c077acb983a1..ade3f5eba2e5 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
> >   * otherwise.
> >   */
> >  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > -               unsigned int instr)
> > +               unsigned int instr, unsigned int sufx)
> >  {
> >       unsigned int opcode, ra, rb, rc, rd, spr, u;
> >       unsigned long int imm;
> > @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
> >  {
> >       unsigned long next_pc;
> >
> > -     next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> > +     next_pc = truncate_if_32bit(regs->msr,
> > +                                 regs->nip + ((op->type & PREFIXED) ? 8 : 4));
> >       switch (GETTYPE(op->type)) {
> >       case COMPUTE:
> >               if (op->type & SETREG)
> > @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
> >   * or -1 if the instruction is one that should not be stepped,
> >   * such as an rfid, or a mtmsrd that would clear MSR_RI.
> >   */
> > -int emulate_step(struct pt_regs *regs, unsigned int instr)
> > +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
> >  {
> >       struct instruction_op op;
> >       int r, err, type;
> >       unsigned long val;
> >       unsigned long ea;
> >
> > -     r = analyse_instr(&op, regs, instr);
> > +     r = analyse_instr(&op, regs, instr, sufx);
> >       if (r < 0)
> >               return r;
> >       if (r > 0) {
> > @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
> >       return 0;
> >
> >   instr_done:
> > -     regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> > +     regs->nip = truncate_if_32bit(regs->msr,
> > +                                   regs->nip + ((op.type & PREFIXED) ? 8 : 4));
> >       return 1;
> >  }
> >  NOKPROBE_SYMBOL(emulate_step);
> > diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> > index 42347067739c..9288dc6fc715 100644
> > --- a/arch/powerpc/lib/test_emulate_step.c
> > +++ b/arch/powerpc/lib/test_emulate_step.c
> > @@ -103,7 +103,7 @@ static void __init test_ld(void)
> >       regs.gpr[3] = (unsigned long) &a;
> >
> >       /* ld r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
> >
> >       if (stepped == 1 && regs.gpr[5] == a)
> >               show_result("ld", "PASS");
> > @@ -121,7 +121,7 @@ static void __init test_lwz(void)
> >       regs.gpr[3] = (unsigned long) &a;
> >
> >       /* lwz r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
> >
> >       if (stepped == 1 && regs.gpr[5] == a)
> >               show_result("lwz", "PASS");
> > @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
> >       regs.gpr[5] = 0x8765;
> >
> >       /* lwzx r5, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
> >       if (stepped == 1 && regs.gpr[5] == a[2])
> >               show_result("lwzx", "PASS");
> >       else
> > @@ -159,7 +159,7 @@ static void __init test_std(void)
> >       regs.gpr[5] = 0x5678;
> >
> >       /* std r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
> >       if (stepped == 1 || regs.gpr[5] == a)
> >               show_result("std", "PASS");
> >       else
> > @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
> >       regs.gpr[5] = 0x5678;
> >
> >       /* ldarx r5, r3, r4, 0 */
> > -     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> > +     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
> >
> >       /*
> >        * Don't touch 'a' here. Touching 'a' can do Load/store
> > @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
> >       regs.gpr[5] = 0x9ABC;
> >
> >       /* stdcx. r5, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
> >
> >       /*
> >        * Two possible scenarios that indicates successful emulation
> > @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lfsx frt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lfsx", "PASS");
> > @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
> >       c.a = 678.91;
> >
> >       /* stfsx frs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && c.b == cached_b)
> >               show_result("stfsx", "PASS");
> > @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lfdx frt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lfdx", "PASS");
> > @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
> >       c.a = 987654.32;
> >
> >       /* stfdx frs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && c.b == cached_b)
> >               show_result("stfdx", "PASS");
> > @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lvx vrt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lvx", "PASS");
> > @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
> >       c.b[3] = 498532;
> >
> >       /* stvx vrs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> > @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lxvd2x vsr39, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
> >
> >       if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
> >               show_result("lxvd2x", "PASS");
> > @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> >       c.b[3] = 4;
> >
> >       /* stxvd2x vsr39, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
> >
> >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> > @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
> >       if (!regs || !instr)
> >               return -EINVAL;
> >
> > -     if (analyse_instr(&op, regs, instr) != 1 ||
> > +     if (analyse_instr(&op, regs, instr, 0) != 1 ||
> >           GETTYPE(op.type) != COMPUTE) {
> >               pr_info("emulation failed, instruction = 0x%08x\n", instr);
> >               return -EFAULT;
> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > index a7056049709e..f47bd843dc52 100644
> > --- a/arch/powerpc/xmon/xmon.c
> > +++ b/arch/powerpc/xmon/xmon.c
> > @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
> >       if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
> >               bp = at_breakpoint(regs->nip);
> >               if (bp != NULL) {
> > -                     int stepped = emulate_step(regs, bp->instr[0]);
> > +                     int stepped = emulate_step(regs, bp->instr[0], 0);
> >                       if (stepped == 0) {
> >                               regs->nip = (unsigned long) &bp->instr[0];
> >                               atomic_inc(&bp->ref_count);
> > @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
> >       /* check we are in 64-bit kernel mode, translation enabled */
> >       if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
> >               if (mread(regs->nip, &instr, 4) == 4) {
> > -                     stepped = emulate_step(regs, instr);
> > +                     stepped = emulate_step(regs, instr, 0);
> >                       if (stepped < 0) {
> >                               printf("Couldn't single-step %s instruction\n",
> >                                      (IS_RFID(instr)? "rfid": "mtmsrd"));
> > --
> > 2.20.1
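
The NIP update in the quoted emulate_update_regs() and emulate_step()
hunks can be sketched in user space roughly as follows. This is only an
illustration under stated assumptions: next_nip_sketch() and
truncate_if_32bit_sketch() are hypothetical names, and the is_64bit
argument stands in for the kernel's MSR_64BIT test on regs->msr. Only
the PREFIXED value (0x800) is taken from the patch.

```c
#include <stdint.h>

/* Flag value from the patch: ORed into instruction_op.type. */
#define PREFIXED 0x800

/*
 * Hypothetical stand-in for truncate_if_32bit(): in 32-bit mode the
 * address wraps to 32 bits, in 64-bit mode it is left alone.
 */
uint64_t truncate_if_32bit_sketch(int is_64bit, uint64_t val)
{
	return is_64bit ? val : (val & 0xffffffffULL);
}

/*
 * Advance the next-instruction pointer by 8 bytes for a prefixed
 * instruction and by 4 bytes for a word instruction.
 */
uint64_t next_nip_sketch(int is_64bit, uint64_t nip, int op_type)
{
	uint64_t len = (op_type & PREFIXED) ? 8 : 4;

	return truncate_if_32bit_sketch(is_64bit, nip + len);
}
```

Because every caller touched by this patch still passes 0 as the suffix,
and nothing sets PREFIXED yet, the 8-byte path is presumably not taken
until later patches in the series.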

* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-12-20  5:11     ` Jordan Niethe
@ 2019-12-20  5:40       ` Christophe Leroy
  0 siblings, 0 replies; 42+ messages in thread
From: Christophe Leroy @ 2019-12-20  5:40 UTC (permalink / raw)
  To: Jordan Niethe, Daniel Axtens; +Cc: Alistair Popple, linuxppc-dev



On 20/12/2019 at 06:11, Jordan Niethe wrote:
> On Wed, Dec 18, 2019 at 7:35 PM Daniel Axtens <dja@axtens.net> wrote:
>>
>> Jordan Niethe <jniethe5@gmail.com> writes:
>>
>>> Currently all instructions are a single word long. A future ISA version
>>> will include prefixed instructions which have a double word length. The
>>> functions used for analysing and emulating instructions need to be
>>> modified so that they can handle these new instruction types.
>>>
>>> A prefixed instruction is a word prefix followed by a word suffix. All
>>> prefixes uniquely have the primary op-code 1. Suffixes may be valid word
>>> instructions or instructions that only exist as suffixes.
>>>
>>> In handling prefixed instructions it will be convenient to treat the
>>> suffix and prefix as separate words. To facilitate this modify
>>> analyse_instr() and emulate_step() to take a suffix as a
>>> parameter. For word instructions it does not matter what is passed in
>>> here - it will be ignored.
>>>
>>> We also define a new flag, PREFIXED, to be used in instruction_op:type.
>>> This flag will indicate when emulating an analysed instruction if the
>>> NIP should be advanced by word length or double word length.
>>>
>>> The callers of analyse_instr() and emulate_step() will need their own
>>> changes to be able to support prefixed instructions. For now modify them
>>> to pass in 0 as a suffix.
>>>
>>> Note that at this point no prefixed instructions are emulated or
>>> analysed - this is just making it possible to do so.
>>>
>>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
>>> ---
>>>   arch/powerpc/include/asm/ppc-opcode.h |  3 +++
>>>   arch/powerpc/include/asm/sstep.h      |  8 +++++--
>>>   arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
>>>   arch/powerpc/kernel/align.c           |  2 +-
>>>   arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
>>>   arch/powerpc/kernel/kprobes.c         |  2 +-
>>>   arch/powerpc/kernel/mce_power.c       |  2 +-
>>>   arch/powerpc/kernel/optprobes.c       |  2 +-
>>>   arch/powerpc/kernel/uprobes.c         |  2 +-
>>>   arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
>>>   arch/powerpc/lib/sstep.c              | 12 ++++++-----
>>>   arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
>>>   arch/powerpc/xmon/xmon.c              |  4 ++--
>>>   13 files changed, 71 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
>>> index c1df75edde44..a1dfa4bdd22f 100644
>>> --- a/arch/powerpc/include/asm/ppc-opcode.h
>>> +++ b/arch/powerpc/include/asm/ppc-opcode.h
>>> @@ -377,6 +377,9 @@
>>>   #define PPC_INST_VCMPEQUD            0x100000c7
>>>   #define PPC_INST_VCMPEQUB            0x10000006
>>>
>>> +/* macro to check if a word is a prefix */
>>> +#define IS_PREFIX(x) (((x) >> 26) == 1)
>>> +
>>>   /* macros to insert fields into opcodes */
>>>   #define ___PPC_RA(a) (((a) & 0x1f) << 16)
>>>   #define ___PPC_RB(b) (((b) & 0x1f) << 11)
>>> diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
>>> index 769f055509c9..6d4cb602e231 100644
>>> --- a/arch/powerpc/include/asm/sstep.h
>>> +++ b/arch/powerpc/include/asm/sstep.h
>>> @@ -89,6 +89,9 @@ enum instruction_type {
>>>   #define VSX_LDLEFT   4       /* load VSX register from left */
>>>   #define VSX_CHECK_VEC        8       /* check MSR_VEC not MSR_VSX for reg >= 32 */
>>>
>>> +/* Prefixed flag, ORed in with type */
>>> +#define PREFIXED     0x800
>>> +
>>>   /* Size field in type word */
>>>   #define SIZE(n)              ((n) << 12)
>>>   #define GETSIZE(w)   ((w) >> 12)
>>> @@ -132,7 +135,7 @@ union vsx_reg {
>>>    * otherwise.
>>>    */
>>>   extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>>> -                      unsigned int instr);
>>> +                      unsigned int instr, unsigned int sufx);
>>>
>>>   /*
>>>    * Emulate an instruction that can be executed just by updating
>>> @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
>>>    * 0 if it could not be emulated, or -1 for an instruction that
>>>    * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
>>>    */
>>> -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
>>> +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
>>> +                     unsigned int sufx);
>>>
>>>   /*
>>>    * Emulate a load or store instruction by reading/writing the
>>> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
>>> index 15002b51ff18..bc585399e0c7 100644
>>> --- a/arch/powerpc/include/asm/uaccess.h
>>> +++ b/arch/powerpc/include/asm/uaccess.h
>>> @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
>>>   extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>>>                           size_t len);
>>>
>>> +/*
>>> + * When reading an instruction iff it is a prefix, the suffix needs to be also
>>> + * loaded.
>>> + */
>>> +#define __get_user_instr(x, y, ptr)                  \
>>> +({                                                   \
>>> +     long __gui_ret = 0;                             \
>>> +     y = 0;                                          \
>>> +     __gui_ret = __get_user(x, ptr);                 \
>>> +     if (!__gui_ret) {                               \
>>> +             if (IS_PREFIX(x))                       \
>>> +                     __gui_ret = __get_user(y, ptr + 1);     \
>>> +     }                                               \
>>> +                                                     \
>>> +     __gui_ret;                                      \
>>> +})
>>> +
>>> +#define __get_user_instr_inatomic(x, y, ptr)         \
>>> +({                                                   \
>>> +     long __gui_ret = 0;                             \
>>> +     y = 0;                                          \
>>> +     __gui_ret = __get_user_inatomic(x, ptr);        \
>>> +     if (!__gui_ret) {                               \
>>> +             if (IS_PREFIX(x))                       \
>>> +                     __gui_ret = __get_user_inatomic(y, ptr + 1);    \
>>> +     }                                               \
>>> +                                                     \
>>> +     __gui_ret;                                      \
>>> +})
>>> +
>>>   #endif       /* _ARCH_POWERPC_UACCESS_H */
>>> diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
>>> index 92045ed64976..245e79792a01 100644
>>> --- a/arch/powerpc/kernel/align.c
>>> +++ b/arch/powerpc/kernel/align.c
>>> @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
>>>        if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
>>>                return -EIO;
>>>
>>> -     r = analyse_instr(&op, regs, instr);
>>> +     r = analyse_instr(&op, regs, instr, 0);
>>>        if (r < 0)
>>>                return -EINVAL;
>>>
>>> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
>>> index 58ce3d37c2a3..f4530961998c 100644
>>> --- a/arch/powerpc/kernel/hw_breakpoint.c
>>> +++ b/arch/powerpc/kernel/hw_breakpoint.c
>>> @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>>>        if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
>>>                goto fail;
>>>
>>> -     ret = analyse_instr(&op, regs, instr);
>>> +     ret = analyse_instr(&op, regs, instr, 0);
>>>        type = GETTYPE(op.type);
>>>        size = GETSIZE(op.type);
>>>
>>> @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>>>                return false;
>>>        }
>>>
>>> -     if (!emulate_step(regs, instr))
>>> +     if (!emulate_step(regs, instr, 0))
>>>                goto fail;
>>>
>>>        return true;
>>> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
>>> index 2d27ec4feee4..7303fe3856cc 100644
>>> --- a/arch/powerpc/kernel/kprobes.c
>>> +++ b/arch/powerpc/kernel/kprobes.c
>>> @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
>>>        unsigned int insn = *p->ainsn.insn;
>>>
>>>        /* regs->nip is also adjusted if emulate_step returns 1 */
>>> -     ret = emulate_step(regs, insn);
>>> +     ret = emulate_step(regs, insn, 0);
>>>        if (ret > 0) {
>>>                /*
>>>                 * Once this instruction has been boosted
>>> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
>>> index 1cbf7f1a4e3d..d862bb549158 100644
>>> --- a/arch/powerpc/kernel/mce_power.c
>>> +++ b/arch/powerpc/kernel/mce_power.c
>>> @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
>>>        if (pfn != ULONG_MAX) {
>>>                instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
>>>                instr = *(unsigned int *)(instr_addr);
>>> -             if (!analyse_instr(&op, &tmp, instr)) {
>>> +             if (!analyse_instr(&op, &tmp, instr, 0)) {
>>>                        pfn = addr_to_pfn(regs, op.ea);
>>>                        *addr = op.ea;
>>>                        *phys_addr = (pfn << PAGE_SHIFT);
>>> diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
>>> index 024f7aad1952..82dc8a589c87 100644
>>> --- a/arch/powerpc/kernel/optprobes.c
>>> +++ b/arch/powerpc/kernel/optprobes.c
>>> @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
>>>         * and that can be emulated.
>>>         */
>>>        if (!is_conditional_branch(*p->ainsn.insn) &&
>>> -                     analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
>>> +                     analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
>>>                emulate_update_regs(&regs, &op);
>>>                nip = regs.nip;
>>>        }
>>> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
>>> index 1cfef0e5fec5..ab1077dc6148 100644
>>> --- a/arch/powerpc/kernel/uprobes.c
>>> +++ b/arch/powerpc/kernel/uprobes.c
>>> @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>>>         * emulate_step() returns 1 if the insn was successfully emulated.
>>>         * For all other cases, we need to single-step in hardware.
>>>         */
>>> -     ret = emulate_step(regs, auprobe->insn);
>>> +     ret = emulate_step(regs, auprobe->insn, 0);
>>>        if (ret > 0)
>>>                return true;
>>>
>>> diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
>>> index 2e496eb86e94..fcab1f31b48d 100644
>>> --- a/arch/powerpc/kvm/emulate_loadstore.c
>>> +++ b/arch/powerpc/kvm/emulate_loadstore.c
>>> @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>>>
>>>        emulated = EMULATE_FAIL;
>>>        vcpu->arch.regs.msr = vcpu->arch.shared->msr;
>>> -     if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
>>> +     if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
>>>                int type = op.type & INSTR_TYPE_MASK;
>>>                int size = GETSIZE(op.type);
>>>
>>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>>> index c077acb983a1..ade3f5eba2e5 100644
>>> --- a/arch/powerpc/lib/sstep.c
>>> +++ b/arch/powerpc/lib/sstep.c
>>> @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
>>>    * otherwise.
>>>    */
>>>   int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>>> -               unsigned int instr)
>>> +               unsigned int instr, unsigned int sufx)
>>
>> I know we really like shortenings in arch/powerpc, but I think we can
>> afford the two extra characters to spell 'suffix' in full :)
>>
> 'suffix' was what I used initially but somewhere along the line
> I found it looked unbalanced to see the abbreviation inst{r,} in the same
> context as the unabbreviated 'suffix'. Happy to change it if it is clearer.
> 

I guess 'instruction' is pretty long, while 'suffix' is half the length. 
In addition, 'instr' is rather common while 'suffix' is quite new.
So I agree it would be clearer to keep 'suffix'.

Christophe

* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-12-20  5:17     ` Jordan Niethe
@ 2020-01-07  3:01       ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2020-01-07  3:01 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Alistair Popple, linuxppc-dev

On Fri, Dec 20, 2019 at 4:17 PM Jordan Niethe <jniethe5@gmail.com> wrote:
>
> On Thu, Dec 19, 2019 at 1:15 AM Daniel Axtens <dja@axtens.net> wrote:
> >
> > Jordan Niethe <jniethe5@gmail.com> writes:
> >
> > > Currently all instructions are a single word long. A future ISA version
> > > will include prefixed instructions which have a double word length. The
> > > functions used for analysing and emulating instructions need to be
> > > modified so that they can handle these new instruction types.
> > >
> > > A prefixed instruction is a word prefix followed by a word suffix. All
> > > prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> > > instructions or instructions that only exist as suffixes.
> > >
> > > In handling prefixed instructions it will be convenient to treat the
> > > suffix and prefix as separate words. To facilitate this modify
> > > analyse_instr() and emulate_step() to take a suffix as a
> > > parameter. For word instructions it does not matter what is passed in
> > > here - it will be ignored.
> > >
> > > We also define a new flag, PREFIXED, to be used in instruction_op:type.
> > > This flag will indicate when emulating an analysed instruction if the
> > > NIP should be advanced by word length or double word length.
> > >
> > > The callers of analyse_instr() and emulate_step() will need their own
> > > changes to be able to support prefixed instructions. For now modify them
> > > to pass in 0 as a suffix.
> > >
> > > Note that at this point no prefixed instructions are emulated or
> > > analysed - this is just making it possible to do so.
> > >
> > > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > > ---
> > >  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
> > >  arch/powerpc/include/asm/sstep.h      |  8 +++++--
> > >  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
> > >  arch/powerpc/kernel/align.c           |  2 +-
> > >  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
> > >  arch/powerpc/kernel/kprobes.c         |  2 +-
> > >  arch/powerpc/kernel/mce_power.c       |  2 +-
> > >  arch/powerpc/kernel/optprobes.c       |  2 +-
> > >  arch/powerpc/kernel/uprobes.c         |  2 +-
> > >  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
> > >  arch/powerpc/lib/sstep.c              | 12 ++++++-----
> > >  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
> > >  arch/powerpc/xmon/xmon.c              |  4 ++--
> > >  13 files changed, 71 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> > > index c1df75edde44..a1dfa4bdd22f 100644
> > > --- a/arch/powerpc/include/asm/ppc-opcode.h
> > > +++ b/arch/powerpc/include/asm/ppc-opcode.h
> > > @@ -377,6 +377,9 @@
> > >  #define PPC_INST_VCMPEQUD            0x100000c7
> > >  #define PPC_INST_VCMPEQUB            0x10000006
> > >
> > > +/* macro to check if a word is a prefix */
> > > +#define IS_PREFIX(x) (((x) >> 26) == 1)
> > > +
> > >  /* macros to insert fields into opcodes */
> > >  #define ___PPC_RA(a) (((a) & 0x1f) << 16)
> > >  #define ___PPC_RB(b) (((b) & 0x1f) << 11)
> > > diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> > > index 769f055509c9..6d4cb602e231 100644
> > > --- a/arch/powerpc/include/asm/sstep.h
> > > +++ b/arch/powerpc/include/asm/sstep.h
> > > @@ -89,6 +89,9 @@ enum instruction_type {
> > >  #define VSX_LDLEFT   4       /* load VSX register from left */
> > >  #define VSX_CHECK_VEC        8       /* check MSR_VEC not MSR_VSX for reg >= 32 */
> > >
> > > +/* Prefixed flag, ORed in with type */
> > > +#define PREFIXED     0x800
> > > +
> > >  /* Size field in type word */
> > >  #define SIZE(n)              ((n) << 12)
> > >  #define GETSIZE(w)   ((w) >> 12)
> > > @@ -132,7 +135,7 @@ union vsx_reg {
> > >   * otherwise.
> > >   */
> > >  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > > -                      unsigned int instr);
> > > +                      unsigned int instr, unsigned int sufx);
> > >
> >
> > I'm not saying this is necessarily better, but did you consider:
> >
> >  - making instr 64 bits and using masking and shifting macros to get the
> >    prefix and suffix?
> >
> >  - defining an instruction type/struct/union/whatever that contains both
> >    halves in one object?
> >
> > I'm happy to be told that it ends up being way, way uglier/worse/etc,
> > but I just thought I'd ask.
> >
> > Regards,
> > Daniel
>
> It is a good question and something I thought about, and I am not completely
> confident that this approach is the best. Basically what I ended up thinking
> was that the prefixed instructions were a bit of a special case, and by doing
> it like this the normal word instructions would just carry on the same as
> before.
>
> I can see this is a pretty flimsy reason, so I am happy for suggestions as
> to what would end up being clearer.
>
>

Sorry, I was pretty vague here. Some more thoughts:

The current representation of an instruction is an unsigned integer.
This direct representation works out pretty nicely: we can fill an
array of unsigned integers with instructions and execute them (as in
xmon), and we can dereference pointers to unsigned integers to read
instructions (as in the code patching code).

With prefixed instructions, a powerpc instruction is now either a word
instruction or a prefix/suffix pair of words. Something like:

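union ppc_inst {
    unsigned int word;          /* a word instruction */
    struct {
        unsigned int prefix;    /* first word, primary opcode 1 */
        unsigned int suffix;    /* second word */
    };
};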

Of course, we cannot just use this directly in place of our existing
unsigned integer instructions. If we dereferenced a pointer to this
type when it actually pointed to a word instruction, we would read too
much. If we wanted to execute an array of these holding word
instructions, half of each entry would be garbage (though we could pad
with nops).
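
As a rough illustration of the padding idea, here is a user-space
sketch that copies a stream of instruction words into fixed 8-byte
slots, filling the second word of each word instruction with a nop. The
names inst_slot_sketch, marshal_slots_sketch and NOP_WORD are
hypothetical and not part of the series. IS_PREFIX() is the macro from
the patch, and 0x60000000 is the usual powerpc nop encoding
(ori 0,0,0).

```c
#include <stddef.h>
#include <stdint.h>

/* Same test as IS_PREFIX() in the patch: primary opcode 1. */
#define IS_PREFIX(x)	(((x) >> 26) == 1)
/* ori 0,0,0, used here as padding. */
#define NOP_WORD	0x60000000u

/* Hypothetical fixed-size slot: one instruction per 8 bytes. */
struct inst_slot_sketch {
	uint32_t first;
	uint32_t second;
};

/*
 * Marshal a word stream into slots. A prefixed instruction fills both
 * words of a slot; a word instruction is padded with a nop.
 * Returns the number of slots written. Stops early if the output is
 * full or if the stream ends in the middle of a prefixed instruction.
 */
size_t marshal_slots_sketch(const uint32_t *words, size_t n_words,
			    struct inst_slot_sketch *slots, size_t max_slots)
{
	size_t i = 0, n = 0;

	while (i < n_words && n < max_slots) {
		slots[n].first = words[i];
		if (IS_PREFIX(words[i])) {
			if (i + 1 >= n_words)
				break;	/* prefix without its suffix */
			slots[n].second = words[i + 1];
			i += 2;
		} else {
			slots[n].second = NOP_WORD;
			i += 1;
		}
		n++;
	}
	return n;
}
```

Every reader and writer of instruction memory would need a conversion
step along these lines, which is the marshalling cost weighed in the
next paragraph.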

So if we were to use this kind of type, we would need to introduce
some form of marshalling to and from this sum type form, which would
touch a lot of places. Given the somewhat limited extent of prefixed
instructions, it seemed to me that it would be cleaner to deal with
them by treating them more or less as two word instructions in the
places they could occur.
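
To make the two-word approach concrete, here is a minimal user-space
sketch of reading one instruction from a buffer of words. It mirrors
the shape of the __get_user_instr() macro in the patch, but the
function name fetch_instr_sketch and the bounds checking are
assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Same test as IS_PREFIX() in the patch: primary opcode 1. */
#define IS_PREFIX(x)	(((x) >> 26) == 1)

/*
 * Hypothetical helper: read one instruction from buf.
 * The suffix is only read if the first word is a prefix; otherwise it
 * is left as 0, as in __get_user_instr().
 * Returns the number of bytes consumed (4 or 8), or 0 if the buffer
 * is too short.
 */
size_t fetch_instr_sketch(const uint32_t *buf, size_t n_words,
			  uint32_t *instr, uint32_t *suffix)
{
	*instr = 0;
	*suffix = 0;

	if (n_words < 1)
		return 0;
	*instr = buf[0];
	if (!IS_PREFIX(*instr))
		return 4;	/* plain word instruction */

	if (n_words < 2)
		return 0;	/* prefix without its suffix */
	*suffix = buf[1];
	return 8;
}
```

Code that only ever sees word instructions keeps working on plain
unsigned integers; only the places that can encounter a prefix need the
second read.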



> >
> > >  /*
> > >   * Emulate an instruction that can be executed just by updating
> > > @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
> > >   * 0 if it could not be emulated, or -1 for an instruction that
> > >   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
> > >   */
> > > -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> > > +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> > > +                     unsigned int sufx);
> > >
> > >  /*
> > >   * Emulate a load or store instruction by reading/writing the
> > > diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> > > index 15002b51ff18..bc585399e0c7 100644
> > > --- a/arch/powerpc/include/asm/uaccess.h
> > > +++ b/arch/powerpc/include/asm/uaccess.h
> > > @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
> > >  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
> > >                          size_t len);
> > >
> > > +/*
> > > + * When reading an instruction iff it is a prefix, the suffix needs to be also
> > > + * loaded.
> > > + */
> > > +#define __get_user_instr(x, y, ptr)                  \
> > > +({                                                   \
> > > +     long __gui_ret = 0;                             \
> > > +     y = 0;                                          \
> > > +     __gui_ret = __get_user(x, ptr);                 \
> > > +     if (!__gui_ret) {                               \
> > > +             if (IS_PREFIX(x))                       \
> > > +                     __gui_ret = __get_user(y, ptr + 1);     \
> > > +     }                                               \
> > > +                                                     \
> > > +     __gui_ret;                                      \
> > > +})
> > > +
> > > +#define __get_user_instr_inatomic(x, y, ptr)         \
> > > +({                                                   \
> > > +     long __gui_ret = 0;                             \
> > > +     y = 0;                                          \
> > > +     __gui_ret = __get_user_inatomic(x, ptr);        \
> > > +     if (!__gui_ret) {                               \
> > > +             if (IS_PREFIX(x))                       \
> > > +                     __gui_ret = __get_user_inatomic(y, ptr + 1);    \
> > > +     }                                               \
> > > +                                                     \
> > > +     __gui_ret;                                      \
> > > +})
> > > +
> > >  #endif       /* _ARCH_POWERPC_UACCESS_H */
> > > diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> > > index 92045ed64976..245e79792a01 100644
> > > --- a/arch/powerpc/kernel/align.c
> > > +++ b/arch/powerpc/kernel/align.c
> > > @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
> > >       if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
> > >               return -EIO;
> > >
> > > -     r = analyse_instr(&op, regs, instr);
> > > +     r = analyse_instr(&op, regs, instr, 0);
> > >       if (r < 0)
> > >               return -EINVAL;
> > >
> > > diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> > > index 58ce3d37c2a3..f4530961998c 100644
> > > --- a/arch/powerpc/kernel/hw_breakpoint.c
> > > +++ b/arch/powerpc/kernel/hw_breakpoint.c
> > > @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> > >       if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
> > >               goto fail;
> > >
> > > -     ret = analyse_instr(&op, regs, instr);
> > > +     ret = analyse_instr(&op, regs, instr, 0);
> > >       type = GETTYPE(op.type);
> > >       size = GETSIZE(op.type);
> > >
> > > @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> > >               return false;
> > >       }
> > >
> > > -     if (!emulate_step(regs, instr))
> > > +     if (!emulate_step(regs, instr, 0))
> > >               goto fail;
> > >
> > >       return true;
> > > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > > index 2d27ec4feee4..7303fe3856cc 100644
> > > --- a/arch/powerpc/kernel/kprobes.c
> > > +++ b/arch/powerpc/kernel/kprobes.c
> > > @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
> > >       unsigned int insn = *p->ainsn.insn;
> > >
> > >       /* regs->nip is also adjusted if emulate_step returns 1 */
> > > -     ret = emulate_step(regs, insn);
> > > +     ret = emulate_step(regs, insn, 0);
> > >       if (ret > 0) {
> > >               /*
> > >                * Once this instruction has been boosted
> > > diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> > > index 1cbf7f1a4e3d..d862bb549158 100644
> > > --- a/arch/powerpc/kernel/mce_power.c
> > > +++ b/arch/powerpc/kernel/mce_power.c
> > > @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
> > >       if (pfn != ULONG_MAX) {
> > >               instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
> > >               instr = *(unsigned int *)(instr_addr);
> > > -             if (!analyse_instr(&op, &tmp, instr)) {
> > > +             if (!analyse_instr(&op, &tmp, instr, 0)) {
> > >                       pfn = addr_to_pfn(regs, op.ea);
> > >                       *addr = op.ea;
> > >                       *phys_addr = (pfn << PAGE_SHIFT);
> > > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> > > index 024f7aad1952..82dc8a589c87 100644
> > > --- a/arch/powerpc/kernel/optprobes.c
> > > +++ b/arch/powerpc/kernel/optprobes.c
> > > @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
> > >        * and that can be emulated.
> > >        */
> > >       if (!is_conditional_branch(*p->ainsn.insn) &&
> > > -                     analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> > > +                     analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
> > >               emulate_update_regs(&regs, &op);
> > >               nip = regs.nip;
> > >       }
> > > diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> > > index 1cfef0e5fec5..ab1077dc6148 100644
> > > --- a/arch/powerpc/kernel/uprobes.c
> > > +++ b/arch/powerpc/kernel/uprobes.c
> > > @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> > >        * emulate_step() returns 1 if the insn was successfully emulated.
> > >        * For all other cases, we need to single-step in hardware.
> > >        */
> > > -     ret = emulate_step(regs, auprobe->insn);
> > > +     ret = emulate_step(regs, auprobe->insn, 0);
> > >       if (ret > 0)
> > >               return true;
> > >
> > > diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> > > index 2e496eb86e94..fcab1f31b48d 100644
> > > --- a/arch/powerpc/kvm/emulate_loadstore.c
> > > +++ b/arch/powerpc/kvm/emulate_loadstore.c
> > > @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
> > >
> > >       emulated = EMULATE_FAIL;
> > >       vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> > > -     if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> > > +     if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
> > >               int type = op.type & INSTR_TYPE_MASK;
> > >               int size = GETSIZE(op.type);
> > >
> > > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > > index c077acb983a1..ade3f5eba2e5 100644
> > > --- a/arch/powerpc/lib/sstep.c
> > > +++ b/arch/powerpc/lib/sstep.c
> > > @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
> > >   * otherwise.
> > >   */
> > >  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > > -               unsigned int instr)
> > > +               unsigned int instr, unsigned int sufx)
> > >  {
> > >       unsigned int opcode, ra, rb, rc, rd, spr, u;
> > >       unsigned long int imm;
> > > @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
> > >  {
> > >       unsigned long next_pc;
> > >
> > > -     next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> > > +     next_pc = truncate_if_32bit(regs->msr,
> > > +                                 regs->nip + ((op->type & PREFIXED) ? 8 : 4));
> > >       switch (GETTYPE(op->type)) {
> > >       case COMPUTE:
> > >               if (op->type & SETREG)
> > > @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
> > >   * or -1 if the instruction is one that should not be stepped,
> > >   * such as an rfid, or a mtmsrd that would clear MSR_RI.
> > >   */
> > > -int emulate_step(struct pt_regs *regs, unsigned int instr)
> > > +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
> > >  {
> > >       struct instruction_op op;
> > >       int r, err, type;
> > >       unsigned long val;
> > >       unsigned long ea;
> > >
> > > -     r = analyse_instr(&op, regs, instr);
> > > +     r = analyse_instr(&op, regs, instr, sufx);
> > >       if (r < 0)
> > >               return r;
> > >       if (r > 0) {
> > > @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
> > >       return 0;
> > >
> > >   instr_done:
> > > -     regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> > > +     regs->nip = truncate_if_32bit(regs->msr,
> > > +                                   regs->nip + ((op.type & PREFIXED) ? 8 : 4));
> > >       return 1;
> > >  }
> > >  NOKPROBE_SYMBOL(emulate_step);
> > > diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> > > index 42347067739c..9288dc6fc715 100644
> > > --- a/arch/powerpc/lib/test_emulate_step.c
> > > +++ b/arch/powerpc/lib/test_emulate_step.c
> > > @@ -103,7 +103,7 @@ static void __init test_ld(void)
> > >       regs.gpr[3] = (unsigned long) &a;
> > >
> > >       /* ld r5, 0(r3) */
> > > -     stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> > > +     stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
> > >
> > >       if (stepped == 1 && regs.gpr[5] == a)
> > >               show_result("ld", "PASS");
> > > @@ -121,7 +121,7 @@ static void __init test_lwz(void)
> > >       regs.gpr[3] = (unsigned long) &a;
> > >
> > >       /* lwz r5, 0(r3) */
> > > -     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> > > +     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
> > >
> > >       if (stepped == 1 && regs.gpr[5] == a)
> > >               show_result("lwz", "PASS");
> > > @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
> > >       regs.gpr[5] = 0x8765;
> > >
> > >       /* lwzx r5, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
> > >       if (stepped == 1 && regs.gpr[5] == a[2])
> > >               show_result("lwzx", "PASS");
> > >       else
> > > @@ -159,7 +159,7 @@ static void __init test_std(void)
> > >       regs.gpr[5] = 0x5678;
> > >
> > >       /* std r5, 0(r3) */
> > > -     stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> > > +     stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
> > >       if (stepped == 1 || regs.gpr[5] == a)
> > >               show_result("std", "PASS");
> > >       else
> > > @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
> > >       regs.gpr[5] = 0x5678;
> > >
> > >       /* ldarx r5, r3, r4, 0 */
> > > -     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> > > +     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
> > >
> > >       /*
> > >        * Don't touch 'a' here. Touching 'a' can do Load/store
> > > @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
> > >       regs.gpr[5] = 0x9ABC;
> > >
> > >       /* stdcx. r5, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
> > >
> > >       /*
> > >        * Two possible scenarios that indicates successful emulation
> > > @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
> > >       regs.gpr[4] = 0;
> > >
> > >       /* lfsx frt10, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
> > >
> > >       if (stepped == 1)
> > >               show_result("lfsx", "PASS");
> > > @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
> > >       c.a = 678.91;
> > >
> > >       /* stfsx frs10, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
> > >
> > >       if (stepped == 1 && c.b == cached_b)
> > >               show_result("stfsx", "PASS");
> > > @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
> > >       regs.gpr[4] = 0;
> > >
> > >       /* lfdx frt10, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
> > >
> > >       if (stepped == 1)
> > >               show_result("lfdx", "PASS");
> > > @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
> > >       c.a = 987654.32;
> > >
> > >       /* stfdx frs10, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
> > >
> > >       if (stepped == 1 && c.b == cached_b)
> > >               show_result("stfdx", "PASS");
> > > @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
> > >       regs.gpr[4] = 0;
> > >
> > >       /* lvx vrt10, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
> > >
> > >       if (stepped == 1)
> > >               show_result("lvx", "PASS");
> > > @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
> > >       c.b[3] = 498532;
> > >
> > >       /* stvx vrs10, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
> > >
> > >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> > >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> > > @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> > >       regs.gpr[4] = 0;
> > >
> > >       /* lxvd2x vsr39, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
> > >
> > >       if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
> > >               show_result("lxvd2x", "PASS");
> > > @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> > >       c.b[3] = 4;
> > >
> > >       /* stxvd2x vsr39, r3, r4 */
> > > -     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> > > +     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
> > >
> > >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> > >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> > > @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
> > >       if (!regs || !instr)
> > >               return -EINVAL;
> > >
> > > -     if (analyse_instr(&op, regs, instr) != 1 ||
> > > +     if (analyse_instr(&op, regs, instr, 0) != 1 ||
> > >           GETTYPE(op.type) != COMPUTE) {
> > >               pr_info("emulation failed, instruction = 0x%08x\n", instr);
> > >               return -EFAULT;
> > > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > > index a7056049709e..f47bd843dc52 100644
> > > --- a/arch/powerpc/xmon/xmon.c
> > > +++ b/arch/powerpc/xmon/xmon.c
> > > @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
> > >       if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
> > >               bp = at_breakpoint(regs->nip);
> > >               if (bp != NULL) {
> > > -                     int stepped = emulate_step(regs, bp->instr[0]);
> > > +                     int stepped = emulate_step(regs, bp->instr[0], 0);
> > >                       if (stepped == 0) {
> > >                               regs->nip = (unsigned long) &bp->instr[0];
> > >                               atomic_inc(&bp->ref_count);
> > > @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
> > >       /* check we are in 64-bit kernel mode, translation enabled */
> > >       if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
> > >               if (mread(regs->nip, &instr, 4) == 4) {
> > > -                     stepped = emulate_step(regs, instr);
> > > +                     stepped = emulate_step(regs, instr, 0);
> > >                       if (stepped < 0) {
> > >                               printf("Couldn't single-step %s instruction\n",
> > >                                      (IS_RFID(instr)? "rfid": "mtmsrd"));
> > > --
> > > 2.20.1

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores
  2019-11-26  5:21 ` [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores Jordan Niethe
@ 2020-01-10 10:38   ` Balamuruhan S
  2020-02-07  0:18     ` Jordan Niethe
  2020-01-10 15:13   ` Balamuruhan S
  1 sibling, 1 reply; 42+ messages in thread
From: Balamuruhan S @ 2020-01-10 10:38 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: alistair, linuxppc-dev

On Tue, Nov 26, 2019 at 04:21:29PM +1100, Jordan Niethe wrote:
> This adds emulation support for the following prefixed integer
> load/stores:
>   * Prefixed Load Byte and Zero (plbz)
>   * Prefixed Load Halfword and Zero (plhz)
>   * Prefixed Load Halfword Algebraic (plha)
>   * Prefixed Load Word and Zero (plwz)
>   * Prefixed Load Word Algebraic (plwa)
>   * Prefixed Load Doubleword (pld)
>   * Prefixed Store Byte (pstb)
>   * Prefixed Store Halfword (psth)
>   * Prefixed Store Word (pstw)
>   * Prefixed Store Doubleword (pstd)
>   * Prefixed Load Quadword (plq)
>   * Prefixed Store Quadword (pstq)
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/lib/sstep.c | 110 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 110 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index ade3f5eba2e5..4f5ad1f602d8 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned int instr,
>  	return ea;
>  }
>  
> +/*
> + * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
> + */
> +static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
> +						  unsigned int sufx,
> +						  const struct pt_regs *regs)
> +{
> +	int ra, prefix_r;
> +	unsigned int  dd;
> +	unsigned long ea, d0, d1, d;
> +
> +	prefix_r = instr & (1ul << 20);
> +	ra = (sufx >> 16) & 0x1f;
> +
> +	d0 = instr & 0x3ffff;
> +	d1 = sufx & 0xffff;
> +	d = (d0 << 16) | d1;
> +
> +	/*
> +	 * sign extend a 34 bit number
> +	 */
> +	dd = (unsigned int) (d >> 2);
> +	ea = (signed int) dd;
> +	ea = (ea << 2) | (d & 0x3);
> +
> +	if (!prefix_r && ra)
> +		ea += regs->gpr[ra];
> +	else if (!prefix_r && !ra)
> +		; /* Leave ea as is */
> +	else if (prefix_r && !ra)
> +		ea += regs->nip;
> +	else if (prefix_r && ra)
> +		; /* Invalid form. Should already be checked for by caller! */
> +
> +	return ea;
> +}
> +
>  /*
>   * Return the largest power of 2, not greater than sizeof(unsigned long),
>   * such that x is a multiple of it.
> @@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  		  unsigned int instr, unsigned int sufx)
>  {
>  	unsigned int opcode, ra, rb, rc, rd, spr, u;
> +	unsigned int sufxopcode, prefixtype, prefix_r;
>  	unsigned long int imm;
>  	unsigned long int val, val2;
>  	unsigned int mb, me, sh;
> @@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  
>  	}
>  
> +/*
> + * Prefixed instructions
> + */
> +	switch (opcode) {
> +	case 1:
> +		prefix_r = instr & (1ul << 20);
> +		ra = (sufx >> 16) & 0x1f;
> +		op->update_reg = ra;
> +		rd = (sufx >> 21) & 0x1f;
> +		op->reg = rd;
> +		op->val = regs->gpr[rd];
> +
> +		sufxopcode = sufx >> 26;
> +		prefixtype = (instr >> 24) & 0x3;
> +		switch (prefixtype) {
> +		case 0: /* Type 00  Eight-Byte Load/Store */
> +			if (prefix_r && ra)
> +				break;
> +			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> +			switch (sufxopcode) {
> +			case 41:	/* plwa */
> +				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
> +				break;
> +			case 56:        /* plq */
> +				op->type = MKOP(LOAD, PREFIXED, 16);
> +				break;
> +			case 57:	/* pld */
> +				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
> +				break;
> +			case 60:        /* stq */
> +				op->type = MKOP(STORE, PREFIXED, 16);
> +				break;
> +			case 61:	/* pstd */
> +				op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);

For the 8-byte and 1-byte sizes (the latter mentioned below for Type 10
instructions), there are no corresponding cases in `do_signext()`. I am
not sure whether this is a typo or an omission.

> +				break;
> +			}
> +			break;
> +		case 1: /* Type 01 Modified Register-to-Register */

Type 01 would be Eight-Byte Register-to-Register.

-- Bala
> +			break;
> +		case 2: /* Type 10 Modified Load/Store */
> +			if (prefix_r && ra)
> +				break;
> +			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> +			switch (sufxopcode) {
> +			case 32:	/* plwz */
> +				op->type = MKOP(LOAD, PREFIXED, 4);
> +				break;
> +			case 34:	/* plbz */
> +				op->type = MKOP(LOAD, PREFIXED, 1);
> +				break;
> +			case 36:	/* pstw */
> +				op->type = MKOP(STORE, PREFIXED, 4);
> +				break;
> +			case 38:	/* pstb */
> +				op->type = MKOP(STORE, PREFIXED, 1);
> +				break;
> +			case 40:	/* plhz */
> +				op->type = MKOP(LOAD, PREFIXED, 2);
> +				break;
> +			case 42:	/* plha */
> +				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 2);
> +				break;
> +			case 44:	/* psth */
> +				op->type = MKOP(STORE, PREFIXED, 2);
> +				break;
> +			}
> +			break;
> +		case 3: /* Type 11 Modified Register-to-Register */
> +			break;
> +		}
> +	}
> +
>  #ifdef CONFIG_VSX
>  	if ((GETTYPE(op->type) == LOAD_VSX ||
>  	     GETTYPE(op->type) == STORE_VSX) &&
> -- 
> 2.20.1
> 
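
The effective address calculation in the quoted mlsd_8lsd_ea() can be
sketched in standalone form as below. This is an illustration under
assumptions, not the kernel code: prefixed_disp() and prefixed_ea() are
hypothetical names, the register file is passed as a plain array, and
the invalid form (R=1 with RA!=0) is reported to the caller instead of
being left to an earlier check. The field positions (18 bits of d0 in
the prefix, 16 bits of d1 in the suffix, R at bit 20 of the prefix) are
taken from the patch.

```c
#include <stdint.h>

/*
 * Build the 34-bit displacement of an MLS:D-form / 8LS:D-form pair and
 * sign-extend it to 64 bits.
 */
static int64_t prefixed_disp(uint32_t prefix, uint32_t suffix)
{
	uint64_t d0 = prefix & 0x3ffff;		/* high 18 bits */
	uint64_t d1 = suffix & 0xffff;		/* low 16 bits */
	uint64_t d = (d0 << 16) | d1;		/* 34 significant bits */
	uint64_t sign = (uint64_t)1 << 33;	/* bit 33 is the sign bit */

	/* Flip the sign bit, then subtract it: extends without shifts. */
	return (int64_t)((d ^ sign) - sign);
}

/*
 * Effective address for the four R/RA combinations in the patch.
 * Returns 0 and sets *ea on success, -1 for the invalid form.
 */
static int prefixed_ea(uint32_t prefix, uint32_t suffix,
		       const uint64_t gpr[32], uint64_t nip, uint64_t *ea)
{
	int r = (prefix >> 20) & 1;
	unsigned int ra = (suffix >> 16) & 0x1f;
	uint64_t base = 0;

	if (r && ra)
		return -1;		/* invalid form */
	if (r)
		base = nip;		/* PC-relative */
	else if (ra)
		base = gpr[ra];		/* register-relative */

	*ea = base + (uint64_t)prefixed_disp(prefix, suffix);
	return 0;
}
```

The XOR-and-subtract step gives the same result as the shift-based
sequence in the patch, assuming the usual two's complement conversion
from uint64_t to int64_t.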


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores
  2019-11-26  5:21 ` [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores Jordan Niethe
  2020-01-10 10:38   ` Balamuruhan S
@ 2020-01-10 15:13   ` Balamuruhan S
  2020-02-07  0:20     ` Jordan Niethe
  1 sibling, 1 reply; 42+ messages in thread
From: Balamuruhan S @ 2020-01-10 15:13 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: alistair, linuxppc-dev

On Tue, Nov 26, 2019 at 04:21:29PM +1100, Jordan Niethe wrote:
> This adds emulation support for the following prefixed integer
> load/stores:
>   * Prefixed Load Byte and Zero (plbz)
>   * Prefixed Load Halfword and Zero (plhz)
>   * Prefixed Load Halfword Algebraic (plha)
>   * Prefixed Load Word and Zero (plwz)
>   * Prefixed Load Word Algebraic (plwa)
>   * Prefixed Load Doubleword (pld)
>   * Prefixed Store Byte (pstb)
>   * Prefixed Store Halfword (psth)
>   * Prefixed Store Word (pstw)
>   * Prefixed Store Doubleword (pstd)
>   * Prefixed Load Quadword (plq)
>   * Prefixed Store Quadword (pstq)
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/lib/sstep.c | 110 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 110 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index ade3f5eba2e5..4f5ad1f602d8 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned int instr,
>  	return ea;
>  }
>  
> +/*
> + * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
> + */
> +static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
> +						  unsigned int sufx,
> +						  const struct pt_regs *regs)
> +{
> +	int ra, prefix_r;
> +	unsigned int  dd;
> +	unsigned long ea, d0, d1, d;
> +
> +	prefix_r = instr & (1ul << 20);
> +	ra = (sufx >> 16) & 0x1f;
> +
> +	d0 = instr & 0x3ffff;
> +	d1 = sufx & 0xffff;
> +	d = (d0 << 16) | d1;
> +
> +	/*
> +	 * sign extend a 34 bit number
> +	 */
> +	dd = (unsigned int) (d >> 2);
> +	ea = (signed int) dd;
> +	ea = (ea << 2) | (d & 0x3);
> +
> +	if (!prefix_r && ra)
> +		ea += regs->gpr[ra];
> +	else if (!prefix_r && !ra)
> +		; /* Leave ea as is */
> +	else if (prefix_r && !ra)
> +		ea += regs->nip;
> +	else if (prefix_r && ra)
> +		; /* Invalid form. Should already be checked for by caller! */
> +
> +	return ea;
> +}
> +
>  /*
>   * Return the largest power of 2, not greater than sizeof(unsigned long),
>   * such that x is a multiple of it.
> @@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  		  unsigned int instr, unsigned int sufx)
>  {
>  	unsigned int opcode, ra, rb, rc, rd, spr, u;
> +	unsigned int sufxopcode, prefixtype, prefix_r;
>  	unsigned long int imm;
>  	unsigned long int val, val2;
>  	unsigned int mb, me, sh;
> @@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  
>  	}
>  
> +/*
> + * Prefixed instructions
> + */
> +	switch (opcode) {
> +	case 1:
> +		prefix_r = instr & (1ul << 20);
> +		ra = (sufx >> 16) & 0x1f;
> +		op->update_reg = ra;
> +		rd = (sufx >> 21) & 0x1f;
> +		op->reg = rd;
> +		op->val = regs->gpr[rd];
> +
> +		sufxopcode = sufx >> 26;
> +		prefixtype = (instr >> 24) & 0x3;
> +		switch (prefixtype) {
> +		case 0: /* Type 00  Eight-Byte Load/Store */
> +			if (prefix_r && ra)
> +				break;
> +			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> +			switch (sufxopcode) {
> +			case 41:	/* plwa */
> +				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
> +				break;
> +			case 56:        /* plq */
> +				op->type = MKOP(LOAD, PREFIXED, 16);
> +				break;
> +			case 57:	/* pld */
> +				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
> +				break;
> +			case 60:        /* stq */
> +				op->type = MKOP(STORE, PREFIXED, 16);
> +				break;
> +			case 61:	/* pstd */
> +				op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);

Sorry, we don't use SIGNEXT for 1 byte below in Type 10. So is SIGNEXT
used deliberately for 8 bytes, without a corresponding case in
`do_signext()`, because we don't really need to do anything there?

-- Bala

> +				break;
> +			}
> +			break;
> +		case 1: /* Type 01 Modified Register-to-Register */
> +			break;
> +		case 2: /* Type 10 Modified Load/Store */
> +			if (prefix_r && ra)
> +				break;
> +			op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> +			switch (sufxopcode) {
> +			case 32:	/* plwz */
> +				op->type = MKOP(LOAD, PREFIXED, 4);
> +				break;
> +			case 34:	/* plbz */
> +				op->type = MKOP(LOAD, PREFIXED, 1);
> +				break;
> +			case 36:	/* pstw */
> +				op->type = MKOP(STORE, PREFIXED, 4);
> +				break;
> +			case 38:	/* pstb */
> +				op->type = MKOP(STORE, PREFIXED, 1);
> +				break;
> +			case 40:	/* plhz */
> +				op->type = MKOP(LOAD, PREFIXED, 2);
> +				break;
> +			case 42:	/* plha */
> +				op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 2);
> +				break;
> +			case 44:	/* psth */
> +				op->type = MKOP(STORE, PREFIXED, 2);
> +				break;
> +			}
> +			break;
> +		case 3: /* Type 11 Modified Register-to-Register */
> +			break;
> +		}
> +	}
> +
>  #ifdef CONFIG_VSX
>  	if ((GETTYPE(op->type) == LOAD_VSX ||
>  	     GETTYPE(op->type) == STORE_VSX) &&
> -- 
> 2.20.1
> 
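
As background to the question above, here is a minimal sketch of
sign-extending a loaded value by access size. signext_loaded() is a
hypothetical helper written for this note, not the kernel's
do_signext(). It only illustrates why an 8-byte access has nothing left
to extend in a 64-bit register; whether SIGNEXT on the 8-byte ops in the
patch is intentional is for the author to confirm.

```c
#include <stdint.h>

/*
 * Hypothetical helper: sign-extend the low `size` bytes of a loaded
 * value to 64 bits.
 */
static uint64_t signext_loaded(uint64_t val, int size)
{
	switch (size) {
	case 1:
		return (uint64_t)(int64_t)(int8_t)val;
	case 2:
		return (uint64_t)(int64_t)(int16_t)val;
	case 4:
		return (uint64_t)(int64_t)(int32_t)val;
	default:
		/* 8 bytes already fills the register: nothing to extend. */
		return val;
	}
}
```

With this shape, a sign-extend request on an 8-byte load falls through
to the default case and returns the value unchanged.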


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2019-11-26  5:21 ` [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions Jordan Niethe
  2019-12-18  8:35   ` Daniel Axtens
  2019-12-18 14:15   ` Daniel Axtens
@ 2020-01-13  6:18   ` Balamuruhan S
  2020-02-06 23:12     ` Jordan Niethe
  2 siblings, 1 reply; 42+ messages in thread
From: Balamuruhan S @ 2020-01-13  6:18 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: alistair, linuxppc-dev

On Tue, Nov 26, 2019 at 04:21:28PM +1100, Jordan Niethe wrote:
> Currently all instructions are a single word long. A future ISA version
> will include prefixed instructions which have a double word length. The
> functions used for analysing and emulating instructions need to be
> modified so that they can handle these new instruction types.
> 
> A prefixed instruction is a word prefix followed by a word suffix. All
> prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> instructions or instructions that only exist as suffixes.
> 
> In handling prefixed instructions it will be convenient to treat the
> suffix and prefix as separate words. To facilitate this modify
> analyse_instr() and emulate_step() to take a take a suffix as a

typo - s/take a take a/take a/

> parameter. For word instructions it does not matter what is passed in
> here - it will be ignored.
> 
> We also define a new flag, PREFIXED, to be used in instruction_op:type.
> This flag will indicate when emulating an analysed instruction if the
> NIP should be advanced by word length or double word length.
> 
> The callers of analyse_instr() and emulate_step() will need their own
> changes to be able to support prefixed instructions. For now modify them
> to pass in 0 as a suffix.
> 
> Note that at this point no prefixed instructions are emulated or
> analysed - this is just making it possible to do so.
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
>  arch/powerpc/include/asm/sstep.h      |  8 +++++--
>  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
>  arch/powerpc/kernel/align.c           |  2 +-
>  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
>  arch/powerpc/kernel/kprobes.c         |  2 +-
>  arch/powerpc/kernel/mce_power.c       |  2 +-
>  arch/powerpc/kernel/optprobes.c       |  2 +-
>  arch/powerpc/kernel/uprobes.c         |  2 +-
>  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
>  arch/powerpc/lib/sstep.c              | 12 ++++++-----
>  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
>  arch/powerpc/xmon/xmon.c              |  4 ++--
>  13 files changed, 71 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> index c1df75edde44..a1dfa4bdd22f 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -377,6 +377,9 @@
>  #define PPC_INST_VCMPEQUD		0x100000c7
>  #define PPC_INST_VCMPEQUB		0x10000006
>  
> +/* macro to check if a word is a prefix */
> +#define IS_PREFIX(x) (((x) >> 26) == 1)
> +
>  /* macros to insert fields into opcodes */
>  #define ___PPC_RA(a)	(((a) & 0x1f) << 16)
>  #define ___PPC_RB(b)	(((b) & 0x1f) << 11)
> diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> index 769f055509c9..6d4cb602e231 100644
> --- a/arch/powerpc/include/asm/sstep.h
> +++ b/arch/powerpc/include/asm/sstep.h
> @@ -89,6 +89,9 @@ enum instruction_type {
>  #define VSX_LDLEFT	4	/* load VSX register from left */
>  #define VSX_CHECK_VEC	8	/* check MSR_VEC not MSR_VSX for reg >= 32 */
>  
> +/* Prefixed flag, ORed in with type */
> +#define PREFIXED	0x800
> +
>  /* Size field in type word */
>  #define SIZE(n)		((n) << 12)
>  #define GETSIZE(w)	((w) >> 12)
> @@ -132,7 +135,7 @@ union vsx_reg {
>   * otherwise.
>   */
>  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> -			 unsigned int instr);
> +			 unsigned int instr, unsigned int sufx);
>  
>  /*
>   * Emulate an instruction that can be executed just by updating
> @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
>   * 0 if it could not be emulated, or -1 for an instruction that
>   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
>   */
> -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> +			unsigned int sufx);
>  
>  /*
>   * Emulate a load or store instruction by reading/writing the
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 15002b51ff18..bc585399e0c7 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
>  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>  			   size_t len);
>  
> +/*
> + * When reading an instruction iff it is a prefix, the suffix needs to be also
> + * loaded.
> + */
> +#define __get_user_instr(x, y, ptr)			\

It would be better to put `__get_user_instr()` and
`__get_user_instr_inatomic()` in a separate commit, or to squash them into
[PATCH 10/18] powerpc: Support prefixed instructions in alignment handler.

-- Bala

> +({							\
> +	long __gui_ret = 0;				\
> +	y = 0;						\
> +	__gui_ret = __get_user(x, ptr);			\
> +	if (!__gui_ret) {				\
> +		if (IS_PREFIX(x))			\
> +			__gui_ret = __get_user(y, ptr + 1);	\
> +	}						\
> +							\
> +	__gui_ret;					\
> +})
> +
> +#define __get_user_instr_inatomic(x, y, ptr)		\
> +({							\
> +	long __gui_ret = 0;				\
> +	y = 0;						\
> +	__gui_ret = __get_user_inatomic(x, ptr);	\
> +	if (!__gui_ret) {				\
> +		if (IS_PREFIX(x))			\
> +			__gui_ret = __get_user_inatomic(y, ptr + 1);	\
> +	}						\
> +							\
> +	__gui_ret;					\
> +})
> +
>  #endif	/* _ARCH_POWERPC_UACCESS_H */
> diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> index 92045ed64976..245e79792a01 100644
> --- a/arch/powerpc/kernel/align.c
> +++ b/arch/powerpc/kernel/align.c
> @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
>  	if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
>  		return -EIO;
>  
> -	r = analyse_instr(&op, regs, instr);
> +	r = analyse_instr(&op, regs, instr, 0);
>  	if (r < 0)
>  		return -EINVAL;
>  
> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index 58ce3d37c2a3..f4530961998c 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>  	if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
>  		goto fail;
>  
> -	ret = analyse_instr(&op, regs, instr);
> +	ret = analyse_instr(&op, regs, instr, 0);
>  	type = GETTYPE(op.type);
>  	size = GETSIZE(op.type);
>  
> @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
>  		return false;
>  	}
>  
> -	if (!emulate_step(regs, instr))
> +	if (!emulate_step(regs, instr, 0))
>  		goto fail;
>  
>  	return true;
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 2d27ec4feee4..7303fe3856cc 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
>  	unsigned int insn = *p->ainsn.insn;
>  
>  	/* regs->nip is also adjusted if emulate_step returns 1 */
> -	ret = emulate_step(regs, insn);
> +	ret = emulate_step(regs, insn, 0);
>  	if (ret > 0) {
>  		/*
>  		 * Once this instruction has been boosted
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index 1cbf7f1a4e3d..d862bb549158 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
>  	if (pfn != ULONG_MAX) {
>  		instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
>  		instr = *(unsigned int *)(instr_addr);
> -		if (!analyse_instr(&op, &tmp, instr)) {
> +		if (!analyse_instr(&op, &tmp, instr, 0)) {
>  			pfn = addr_to_pfn(regs, op.ea);
>  			*addr = op.ea;
>  			*phys_addr = (pfn << PAGE_SHIFT);
> diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> index 024f7aad1952..82dc8a589c87 100644
> --- a/arch/powerpc/kernel/optprobes.c
> +++ b/arch/powerpc/kernel/optprobes.c
> @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
>  	 * and that can be emulated.
>  	 */
>  	if (!is_conditional_branch(*p->ainsn.insn) &&
> -			analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> +			analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
>  		emulate_update_regs(&regs, &op);
>  		nip = regs.nip;
>  	}
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index 1cfef0e5fec5..ab1077dc6148 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>  	 * emulate_step() returns 1 if the insn was successfully emulated.
>  	 * For all other cases, we need to single-step in hardware.
>  	 */
> -	ret = emulate_step(regs, auprobe->insn);
> +	ret = emulate_step(regs, auprobe->insn, 0);
>  	if (ret > 0)
>  		return true;
>  
> diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> index 2e496eb86e94..fcab1f31b48d 100644
> --- a/arch/powerpc/kvm/emulate_loadstore.c
> +++ b/arch/powerpc/kvm/emulate_loadstore.c
> @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>  
>  	emulated = EMULATE_FAIL;
>  	vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> -	if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> +	if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
>  		int type = op.type & INSTR_TYPE_MASK;
>  		int size = GETSIZE(op.type);
>  
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index c077acb983a1..ade3f5eba2e5 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
>   * otherwise.
>   */
>  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> -		  unsigned int instr)
> +		  unsigned int instr, unsigned int sufx)
>  {
>  	unsigned int opcode, ra, rb, rc, rd, spr, u;
>  	unsigned long int imm;
> @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
>  {
>  	unsigned long next_pc;
>  
> -	next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> +	next_pc = truncate_if_32bit(regs->msr,
> +				    regs->nip + ((op->type & PREFIXED) ? 8 : 4));
>  	switch (GETTYPE(op->type)) {
>  	case COMPUTE:
>  		if (op->type & SETREG)
> @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
>   * or -1 if the instruction is one that should not be stepped,
>   * such as an rfid, or a mtmsrd that would clear MSR_RI.
>   */
> -int emulate_step(struct pt_regs *regs, unsigned int instr)
> +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
>  {
>  	struct instruction_op op;
>  	int r, err, type;
>  	unsigned long val;
>  	unsigned long ea;
>  
> -	r = analyse_instr(&op, regs, instr);
> +	r = analyse_instr(&op, regs, instr, sufx);
>  	if (r < 0)
>  		return r;
>  	if (r > 0) {
> @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
>  	return 0;
>  
>   instr_done:
> -	regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> +	regs->nip = truncate_if_32bit(regs->msr,
> +				      regs->nip + ((op.type & PREFIXED) ? 8 : 4));
>  	return 1;
>  }
>  NOKPROBE_SYMBOL(emulate_step);
> diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> index 42347067739c..9288dc6fc715 100644
> --- a/arch/powerpc/lib/test_emulate_step.c
> +++ b/arch/powerpc/lib/test_emulate_step.c
> @@ -103,7 +103,7 @@ static void __init test_ld(void)
>  	regs.gpr[3] = (unsigned long) &a;
>  
>  	/* ld r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
>  
>  	if (stepped == 1 && regs.gpr[5] == a)
>  		show_result("ld", "PASS");
> @@ -121,7 +121,7 @@ static void __init test_lwz(void)
>  	regs.gpr[3] = (unsigned long) &a;
>  
>  	/* lwz r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
>  
>  	if (stepped == 1 && regs.gpr[5] == a)
>  		show_result("lwz", "PASS");
> @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
>  	regs.gpr[5] = 0x8765;
>  
>  	/* lwzx r5, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
>  	if (stepped == 1 && regs.gpr[5] == a[2])
>  		show_result("lwzx", "PASS");
>  	else
> @@ -159,7 +159,7 @@ static void __init test_std(void)
>  	regs.gpr[5] = 0x5678;
>  
>  	/* std r5, 0(r3) */
> -	stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> +	stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
>  	if (stepped == 1 || regs.gpr[5] == a)
>  		show_result("std", "PASS");
>  	else
> @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
>  	regs.gpr[5] = 0x5678;
>  
>  	/* ldarx r5, r3, r4, 0 */
> -	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> +	stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
>  
>  	/*
>  	 * Don't touch 'a' here. Touching 'a' can do Load/store
> @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
>  	regs.gpr[5] = 0x9ABC;
>  
>  	/* stdcx. r5, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
>  
>  	/*
>  	 * Two possible scenarios that indicates successful emulation
> @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lfsx frt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lfsx", "PASS");
> @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
>  	c.a = 678.91;
>  
>  	/* stfsx frs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && c.b == cached_b)
>  		show_result("stfsx", "PASS");
> @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lfdx frt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lfdx", "PASS");
> @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
>  	c.a = 987654.32;
>  
>  	/* stfdx frs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && c.b == cached_b)
>  		show_result("stfdx", "PASS");
> @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lvx vrt10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
>  
>  	if (stepped == 1)
>  		show_result("lvx", "PASS");
> @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
>  	c.b[3] = 498532;
>  
>  	/* stvx vrs10, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
>  
>  	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
>  	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
>  	regs.gpr[4] = 0;
>  
>  	/* lxvd2x vsr39, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> +	stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
>  
>  	if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
>  		show_result("lxvd2x", "PASS");
> @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
>  	c.b[3] = 4;
>  
>  	/* stxvd2x vsr39, r3, r4 */
> -	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> +	stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
>  
>  	if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
>  	    cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
>  	if (!regs || !instr)
>  		return -EINVAL;
>  
> -	if (analyse_instr(&op, regs, instr) != 1 ||
> +	if (analyse_instr(&op, regs, instr, 0) != 1 ||
>  	    GETTYPE(op.type) != COMPUTE) {
>  		pr_info("emulation failed, instruction = 0x%08x\n", instr);
>  		return -EFAULT;
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index a7056049709e..f47bd843dc52 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
>  	if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
>  		bp = at_breakpoint(regs->nip);
>  		if (bp != NULL) {
> -			int stepped = emulate_step(regs, bp->instr[0]);
> +			int stepped = emulate_step(regs, bp->instr[0], 0);
>  			if (stepped == 0) {
>  				regs->nip = (unsigned long) &bp->instr[0];
>  				atomic_inc(&bp->ref_count);
> @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
>  	/* check we are in 64-bit kernel mode, translation enabled */
>  	if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
>  		if (mread(regs->nip, &instr, 4) == 4) {
> -			stepped = emulate_step(regs, instr);
> +			stepped = emulate_step(regs, instr, 0);
>  			if (stepped < 0) {
>  				printf("Couldn't single-step %s instruction\n",
>  				       (IS_RFID(instr)? "rfid": "mtmsrd"));
> -- 
> 2.20.1
> 

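For readers following the `__get_user_instr()` discussion, here is a minimal
userspace sketch of the same idea: read the first word, and read the suffix
only if that word is a prefix. `fetch_instr()` and its array-based interface
are hypothetical and not part of the patch; only `IS_PREFIX()` is taken from
it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same test as the IS_PREFIX() macro in the patch: major opcode 1. */
#define IS_PREFIX(x) (((x) >> 26) == 1)

/*
 * Hypothetical helper: read one instruction from an array of words.
 * The suffix is read only when the first word is a prefix; otherwise
 * *sufx is left at 0, mirroring what __get_user_instr() does with 'y'.
 * Returns the instruction length in bytes, or -1 if the suffix would
 * lie past the end of the buffer.
 */
static int fetch_instr(const uint32_t *words, size_t nwords, size_t idx,
		       uint32_t *instr, uint32_t *sufx)
{
	assert(words != NULL && idx < nwords);

	*instr = words[idx];
	*sufx = 0;
	if (!IS_PREFIX(*instr))
		return 4;
	if (idx + 1 >= nwords)
		return -1;
	*sufx = words[idx + 1];
	return 8;
}
```

The returned length is what a caller would add to the nip, which is why the
callers that still pass 0 as the suffix need to be converted later in the
series.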

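The `PREFIXED` flag sits just below the size field in the type word, so it
can be ORed in without disturbing either the size or the type. A small
sketch of that packing and of the nip advance follows. `EXAMPLE_LOAD` and
`next_pc()` are hypothetical stand-ins, and the `truncate_if_32bit()` step
from the patch is left out.

```c
#include <assert.h>
#include <stdint.h>

#define PREFIXED	0x800		/* flag added by the patch */
#define SIZE(n)		((n) << 12)	/* size field, as in sstep.h */
#define GETSIZE(w)	((w) >> 12)

/* Hypothetical stand-in for one of the instruction_type values. */
#define EXAMPLE_LOAD	1

/* Advance by 8 bytes for a prefixed instruction, 4 bytes otherwise. */
static uint64_t next_pc(uint64_t nip, int type)
{
	assert(type >= 0);
	return nip + ((type & PREFIXED) ? 8 : 4);
}
```

With `type = EXAMPLE_LOAD | PREFIXED | SIZE(8)`, `GETSIZE(type)` still
yields 8 and `next_pc()` advances by 8 bytes.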

* Re: [PATCH 15/18] powerpc/uprobes: Add support for prefixed instructions
  2019-11-26  5:21 ` [PATCH 15/18] powerpc/uprobes: Add support for " Jordan Niethe
@ 2020-01-13 11:30   ` Balamuruhan S
  2020-02-06 23:09     ` Jordan Niethe
  0 siblings, 1 reply; 42+ messages in thread
From: Balamuruhan S @ 2020-01-13 11:30 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: alistair, linuxppc-dev

On Tue, Nov 26, 2019 at 04:21:38PM +1100, Jordan Niethe wrote:
> Uprobes can execute instructions out of line. Increase the size of the
> buffer used  for this so that this works for prefixed instructions. Take
> into account the length of prefixed instructions when fixing up the nip.
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/include/asm/uprobes.h | 18 ++++++++++++++----
>  arch/powerpc/kernel/uprobes.c      |  4 ++--
>  2 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/uprobes.h b/arch/powerpc/include/asm/uprobes.h
> index 2bbdf27d09b5..5b5e8a3d2f55 100644
> --- a/arch/powerpc/include/asm/uprobes.h
> +++ b/arch/powerpc/include/asm/uprobes.h
> @@ -14,18 +14,28 @@
>  
>  typedef ppc_opcode_t uprobe_opcode_t;
>  
> +/*
> + * We have to ensure we have enought space for prefixed instructions, which

Minor typo: `enought`. We could also reword the comment as below:

s/We have to ensure we have enought/Ensure we have enough

-- Bala

> + * are double the size of a word instruction, i.e. 8 bytes. However,
> + * sometimes it is simpler to treat a prefixed instruction like 2 word
> + * instructions.
> + */
>  #define MAX_UINSN_BYTES		4
> -#define UPROBE_XOL_SLOT_BYTES	(MAX_UINSN_BYTES)
> +#define UPROBE_XOL_SLOT_BYTES	(2 * MAX_UINSN_BYTES)
>  
>  /* The following alias is needed for reference from arch-agnostic code */
>  #define UPROBE_SWBP_INSN	BREAKPOINT_INSTRUCTION
>  #define UPROBE_SWBP_INSN_SIZE	4 /* swbp insn size in bytes */
>  
>  struct arch_uprobe {
> +	 /*
> +	  * Ensure there is enough space for prefixed instructions. Prefixed
> +	  * instructions must not cross 64-byte boundaries.
> +	  */
>  	union {
> -		u32	insn;
> -		u32	ixol;
> -	};
> +		uprobe_opcode_t	insn[2];
> +		uprobe_opcode_t	ixol[2];
> +	} __aligned(64);
>  };
>  
>  struct arch_uprobe_task {
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index ab1077dc6148..cfcea6946f8b 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -111,7 +111,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
>  	 * support doesn't exist and have to fix-up the next instruction
>  	 * to be executed.
>  	 */
> -	regs->nip = utask->vaddr + MAX_UINSN_BYTES;
> +	regs->nip = utask->vaddr + ((IS_PREFIX(auprobe->insn[0])) ? 8 : 4);
>  
>  	user_disable_single_step(current);
>  	return 0;
> @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
>  	 * emulate_step() returns 1 if the insn was successfully emulated.
>  	 * For all other cases, we need to single-step in hardware.
>  	 */
> -	ret = emulate_step(regs, auprobe->insn, 0);
> +	ret = emulate_step(regs, auprobe->insn[0], auprobe->insn[1]);
>  	if (ret > 0)
>  		return true;
>  
> -- 
> 2.20.1
> 

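The two points in this patch, the nip fix-up and the 64-byte boundary rule,
can be sketched in plain C as below. The helper names are hypothetical and
the code only illustrates the arithmetic. It is not the uprobes
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IS_PREFIX(x) (((x) >> 26) == 1)

/* Length in bytes of the instruction whose first word is 'first'. */
static unsigned int instr_len(uint32_t first)
{
	return IS_PREFIX(first) ? 8 : 4;
}

/* Address to resume at once the probed instruction has been stepped. */
static uint64_t next_nip(uint64_t vaddr, uint32_t first)
{
	return vaddr + instr_len(first);
}

/*
 * True if an instruction of 'len' bytes placed at 'addr' would straddle
 * a 64-byte boundary, i.e. its first and last bytes fall in different
 * 64-byte blocks.
 */
static bool crosses_64b(uint64_t addr, unsigned int len)
{
	assert(len > 0);
	return (addr >> 6) != ((addr + len - 1) >> 6);
}
```

A slot that is 64-byte aligned, as `__aligned(64)` requests for the copy in
`struct arch_uprobe`, can never fail the `crosses_64b()` check for an
8-byte instruction placed at its start.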


* Re: [PATCH 14/18] powerpc/kprobes: Support kprobes on prefixed instructions
  2019-11-26  5:21 ` [PATCH 14/18] powerpc/kprobes: Support kprobes on " Jordan Niethe
@ 2020-01-14  7:19   ` Balamuruhan S
  2020-01-16  6:09     ` Michael Ellerman
  0 siblings, 1 reply; 42+ messages in thread
From: Balamuruhan S @ 2020-01-14  7:19 UTC (permalink / raw)
  To: Jordan Niethe; +Cc: alistair, linuxppc-dev

On Tue, Nov 26, 2019 at 04:21:37PM +1100, Jordan Niethe wrote:
> A prefixed instruction is composed of a word prefix followed by a word
> suffix. It does not make sense to be able to have a kprobe on the suffix
> of a prefixed instruction, so make this impossible.
> 
> Kprobes work by replacing an instruction with a trap and saving that
> instruction to be single stepped out of place later. Currently there is
> not enough space allocated to keep a prefixed instruction for single
> stepping. Increase the amount of space allocated for holding the
> instruction copy.
> 
> kprobe_post_handler() expects all instructions to be 4 bytes long which
> means that it does not function correctly for prefixed instructions.
> Add checks for prefixed instructions which will use a length of 8 bytes
> instead.
> 
> For optprobes we normally patch in loading the instruction we put a
> probe on into r4 before calling emulate_step(). We now make space and
> patch in loading the suffix into r5 as well.
> 
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>  arch/powerpc/include/asm/kprobes.h   |  5 +--
>  arch/powerpc/kernel/kprobes.c        | 46 +++++++++++++++++++++-------
>  arch/powerpc/kernel/optprobes.c      | 31 +++++++++++--------
>  arch/powerpc/kernel/optprobes_head.S |  6 ++++
>  4 files changed, 62 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
> index 66b3f2983b22..1f03a1cacb1e 100644
> --- a/arch/powerpc/include/asm/kprobes.h
> +++ b/arch/powerpc/include/asm/kprobes.h
> @@ -38,12 +38,13 @@ extern kprobe_opcode_t optprobe_template_entry[];
>  extern kprobe_opcode_t optprobe_template_op_address[];
>  extern kprobe_opcode_t optprobe_template_call_handler[];
>  extern kprobe_opcode_t optprobe_template_insn[];
> +extern kprobe_opcode_t optprobe_template_sufx[];
>  extern kprobe_opcode_t optprobe_template_call_emulate[];
>  extern kprobe_opcode_t optprobe_template_ret[];
>  extern kprobe_opcode_t optprobe_template_end[];
>  
> -/* Fixed instruction size for powerpc */
> -#define MAX_INSN_SIZE		1
> +/* Prefixed instructions are two words */
> +#define MAX_INSN_SIZE		2
>  #define MAX_OPTIMIZED_LENGTH	sizeof(kprobe_opcode_t)	/* 4 bytes */
>  #define MAX_OPTINSN_SIZE	(optprobe_template_end - optprobe_template_entry)
>  #define RELATIVEJUMP_SIZE	sizeof(kprobe_opcode_t)	/* 4 bytes */
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index 7303fe3856cc..aa15b3480385 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -104,17 +104,30 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
>  
>  int arch_prepare_kprobe(struct kprobe *p)
>  {
> +	int len;
>  	int ret = 0;
> +	struct kprobe *prev;
>  	kprobe_opcode_t insn = *p->addr;
> +	kprobe_opcode_t prfx = *(p->addr - 1);
>  
> +	preempt_disable();
>  	if ((unsigned long)p->addr & 0x03) {
>  		printk("Attempt to register kprobe at an unaligned address\n");
>  		ret = -EINVAL;
>  	} else if (IS_MTMSRD(insn) || IS_RFID(insn) || IS_RFI(insn)) {
>  		printk("Cannot register a kprobe on rfi/rfid or mtmsr[d]\n");
>  		ret = -EINVAL;
> +	} else if (IS_PREFIX(prfx)) {
> +		printk("Cannot register a kprobe on the second word of prefixed instruction\n");

Let's keep this line within 80 columns.

> +		ret = -EINVAL;
> +	}
> +	prev = get_kprobe(p->addr - 1);
> +	if (prev && IS_PREFIX(*prev->ainsn.insn)) {
> +		printk("Cannot register a kprobe on the second word of prefixed instruction\n");

Same here.

-- Bala
> +		ret = -EINVAL;
>  	}
>  
> +
>  	/* insn must be on a special executable page on ppc64.  This is
>  	 * not explicitly required on ppc32 (right now), but it doesn't hurt */
>  	if (!ret) {
> @@ -124,14 +137,18 @@ int arch_prepare_kprobe(struct kprobe *p)
>  	}
>  
>  	if (!ret) {
> -		memcpy(p->ainsn.insn, p->addr,
> -				MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
> +		if (IS_PREFIX(insn))
> +			len = MAX_INSN_SIZE * sizeof(kprobe_opcode_t);
> +		else
> +			len = sizeof(kprobe_opcode_t);
> +		memcpy(p->ainsn.insn, p->addr, len);
>  		p->opcode = *p->addr;
>  		flush_icache_range((unsigned long)p->ainsn.insn,
>  			(unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
>  	}
>  
>  	p->ainsn.boostable = 0;
> +	preempt_enable_no_resched();
>  	return ret;
>  }
>  NOKPROBE_SYMBOL(arch_prepare_kprobe);
> @@ -216,10 +233,11 @@ NOKPROBE_SYMBOL(arch_prepare_kretprobe);
>  static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
>  {
>  	int ret;
> -	unsigned int insn = *p->ainsn.insn;
> +	unsigned int insn = p->ainsn.insn[0];
> +	unsigned int sufx = p->ainsn.insn[1];
>  
>  	/* regs->nip is also adjusted if emulate_step returns 1 */
> -	ret = emulate_step(regs, insn, 0);
> +	ret = emulate_step(regs, insn, sufx);
>  	if (ret > 0) {
>  		/*
>  		 * Once this instruction has been boosted
> @@ -233,7 +251,10 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
>  		 * So, we should never get here... but, its still
>  		 * good to catch them, just in case...
>  		 */
> -		printk("Can't step on instruction %x\n", insn);
> +		if (!IS_PREFIX(insn))
> +			printk("Can't step on instruction %x\n", insn);
> +		else
> +			printk("Can't step on instruction %x %x\n", insn, sufx);
>  		BUG();
>  	} else {
>  		/*
> @@ -275,7 +296,7 @@ int kprobe_handler(struct pt_regs *regs)
>  	if (kprobe_running()) {
>  		p = get_kprobe(addr);
>  		if (p) {
> -			kprobe_opcode_t insn = *p->ainsn.insn;
> +			kprobe_opcode_t insn = p->ainsn.insn[0];
>  			if (kcb->kprobe_status == KPROBE_HIT_SS &&
>  					is_trap(insn)) {
>  				/* Turn off 'trace' bits */
> @@ -448,9 +469,10 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
>  	 * the link register properly so that the subsequent 'blr' in
>  	 * kretprobe_trampoline jumps back to the right instruction.
>  	 *
> -	 * For nip, we should set the address to the previous instruction since
> -	 * we end up emulating it in kprobe_handler(), which increments the nip
> -	 * again.
> +	 * To keep the nip at the correct address we need to counter the
> +	 * increment that happens when we emulate the kretprobe_trampoline noop
> +	 * in kprobe_handler(). We do this by decrementing the address by the
> +	 * length of the noop which is always 4 bytes.
>  	 */
>  	regs->nip = orig_ret_address - 4;
>  	regs->link = orig_ret_address;
> @@ -478,12 +500,14 @@ int kprobe_post_handler(struct pt_regs *regs)
>  {
>  	struct kprobe *cur = kprobe_running();
>  	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> +	kprobe_opcode_t insn;
>  
>  	if (!cur || user_mode(regs))
>  		return 0;
>  
> +	insn = *cur->ainsn.insn;
>  	/* make sure we got here for instruction we have a kprobe on */
> -	if (((unsigned long)cur->ainsn.insn + 4) != regs->nip)
> +	if (((unsigned long)cur->ainsn.insn + (IS_PREFIX(insn) ? 8 : 4)) != regs->nip)
>  		return 0;
>  
>  	if ((kcb->kprobe_status != KPROBE_REENTER) && cur->post_handler) {
> @@ -492,7 +516,7 @@ int kprobe_post_handler(struct pt_regs *regs)
>  	}
>  
>  	/* Adjust nip to after the single-stepped instruction */
> -	regs->nip = (unsigned long)cur->addr + 4;
> +	regs->nip = (unsigned long)cur->addr + (IS_PREFIX(insn) ? 8 : 4);
>  	regs->msr |= kcb->kprobe_saved_msr;
>  
>  	/*Restore back the original saved kprobes variables and continue. */
> diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> index 82dc8a589c87..b2aef27bac27 100644
> --- a/arch/powerpc/kernel/optprobes.c
> +++ b/arch/powerpc/kernel/optprobes.c
> @@ -27,6 +27,8 @@
>  	(optprobe_template_op_address - optprobe_template_entry)
>  #define TMPL_INSN_IDX		\
>  	(optprobe_template_insn - optprobe_template_entry)
> +#define TMPL_SUFX_IDX		\
> +	(optprobe_template_sufx - optprobe_template_entry)
>  #define TMPL_END_IDX		\
>  	(optprobe_template_end - optprobe_template_entry)
>  
> @@ -100,7 +102,8 @@ static unsigned long can_optimize(struct kprobe *p)
>  	 * and that can be emulated.
>  	 */
>  	if (!is_conditional_branch(*p->ainsn.insn) &&
> -			analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
> +			analyse_instr(&op, &regs, p->ainsn.insn[0],
> +				      p->ainsn.insn[1]) == 1) {
>  		emulate_update_regs(&regs, &op);
>  		nip = regs.nip;
>  	}
> @@ -140,27 +143,27 @@ void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
>  }
>  
>  /*
> - * emulate_step() requires insn to be emulated as
> - * second parameter. Load register 'r4' with the
> - * instruction.
> + * emulate_step() requires insn to be emulated as second parameter, and the
> + * suffix as the third parameter. Load these into registers.
>   */
> -void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
> +static void patch_imm32_load_insns(int reg, unsigned int val,
> +				   kprobe_opcode_t *addr)
>  {
> -	/* addis r4,0,(insn)@h */
> -	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(4) |
> +	/* addis reg,0,(insn)@h */
> +	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(reg) |
>  			  ((val >> 16) & 0xffff));
>  	addr++;
>  
> -	/* ori r4,r4,(insn)@l */
> -	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(4) |
> -			  ___PPC_RS(4) | (val & 0xffff));
> +	/* ori reg,reg,(insn)@l */
> +	patch_instruction(addr, PPC_INST_ORI | ___PPC_RA(reg) |
> +			  ___PPC_RS(reg) | (val & 0xffff));
>  }
>  
>  /*
>   * Generate instructions to load provided immediate 64-bit value
>   * to register 'r3' and patch these instructions at 'addr'.
>   */
> -void patch_imm64_load_insns(unsigned long val, kprobe_opcode_t *addr)
> +static void patch_imm64_load_insns(unsigned long val, kprobe_opcode_t *addr)
>  {
>  	/* lis r3,(op)@highest */
>  	patch_instruction(addr, PPC_INST_ADDIS | ___PPC_RT(3) |
> @@ -266,9 +269,11 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
>  	patch_instruction(buff + TMPL_EMULATE_IDX, branch_emulate_step);
>  
>  	/*
> -	 * 3. load instruction to be emulated into relevant register, and
> +	 * 3. load instruction and suffix to be emulated into the relevant
> +	 * registers, and
>  	 */
> -	patch_imm32_load_insns(*p->ainsn.insn, buff + TMPL_INSN_IDX);
> +	patch_imm32_load_insns(4, p->ainsn.insn[0], buff + TMPL_INSN_IDX);
> +	patch_imm32_load_insns(5, p->ainsn.insn[1], buff + TMPL_SUFX_IDX);
>  
>  	/*
>  	 * 4. branch back from trampoline
> diff --git a/arch/powerpc/kernel/optprobes_head.S b/arch/powerpc/kernel/optprobes_head.S
> index cf383520843f..998359ae44ec 100644
> --- a/arch/powerpc/kernel/optprobes_head.S
> +++ b/arch/powerpc/kernel/optprobes_head.S
> @@ -95,6 +95,12 @@ optprobe_template_insn:
>  	nop
>  	nop
>  
> +	.global optprobe_template_sufx
> +optprobe_template_sufx:
> +	/* Pass suffix to be emulated in r5 */
> +	nop
> +	nop
> +
>  	.global optprobe_template_call_emulate
>  optprobe_template_call_emulate:
>  	/* Branch to emulate_step()  */
> -- 
> 2.20.1
> 

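The suffix check in arch_prepare_kprobe() comes down to looking at the word
before the probe address. A minimal sketch follows, assuming the text is
available as an array of words. `probe_allowed()` is a hypothetical name,
and unlike the patch it guards against reading before the first word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IS_PREFIX(x) (((x) >> 26) == 1)

/*
 * Hypothetical check: may a probe be placed on text[idx]?
 * Returns 0 if so, or -1 if the preceding word is a prefix, which
 * makes text[idx] the suffix of a prefixed instruction.
 */
static int probe_allowed(const uint32_t *text, size_t idx)
{
	assert(text != NULL);

	/* The first word has no predecessor, so it cannot be a suffix. */
	if (idx > 0 && IS_PREFIX(text[idx - 1]))
		return -1;
	return 0;
}
```

This looks only one word back, as the patch does, so it assumes the
preceding word is itself the start of an instruction.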

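For the optprobes change, the pair of instructions that
patch_imm32_load_insns() writes can be sketched as plain encoding
arithmetic. `build_imm32_load()` and the `FIELD_*`/`INST_*` names are
hypothetical; the values follow the addis (major opcode 15) and ori (major
opcode 24) encodings and the field macros quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define INST_ADDIS	0x3c000000u	/* major opcode 15 */
#define INST_ORI	0x60000000u	/* major opcode 24 */
#define FIELD_RT(t)	(((uint32_t)(t) & 0x1f) << 21)
#define FIELD_RS(s)	(((uint32_t)(s) & 0x1f) << 21)
#define FIELD_RA(a)	(((uint32_t)(a) & 0x1f) << 16)

/* Encode "addis reg,0,val@h" followed by "ori reg,reg,val@l". */
static void build_imm32_load(int reg, uint32_t val, uint32_t out[2])
{
	assert(reg >= 0 && reg < 32);

	/* addis reg,0,val@h: upper 16 bits of val */
	out[0] = INST_ADDIS | FIELD_RT(reg) | ((val >> 16) & 0xffff);
	/* ori reg,reg,val@l: lower 16 bits of val */
	out[1] = INST_ORI | FIELD_RA(reg) | FIELD_RS(reg) | (val & 0xffff);
}
```

Calling it once with reg 4 for the first word and once with reg 5 for the
suffix matches the two template slots, `optprobe_template_insn` and
`optprobe_template_sufx`, that the patch fills in.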

* Re: [PATCH 14/18] powerpc/kprobes: Support kprobes on prefixed instructions
  2020-01-14  7:19   ` Balamuruhan S
@ 2020-01-16  6:09     ` Michael Ellerman
  0 siblings, 0 replies; 42+ messages in thread
From: Michael Ellerman @ 2020-01-16  6:09 UTC (permalink / raw)
  To: Balamuruhan S, Jordan Niethe; +Cc: alistair, linuxppc-dev

Balamuruhan S <bala24@linux.ibm.com> writes:
> On Tue, Nov 26, 2019 at 04:21:37PM +1100, Jordan Niethe wrote:
...
>> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
>> index 7303fe3856cc..aa15b3480385 100644
>> --- a/arch/powerpc/kernel/kprobes.c
>> +++ b/arch/powerpc/kernel/kprobes.c
>> @@ -104,17 +104,30 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
>>  
>>  int arch_prepare_kprobe(struct kprobe *p)
>>  {
>> +	int len;
>>  	int ret = 0;
>> +	struct kprobe *prev;
>>  	kprobe_opcode_t insn = *p->addr;
>> +	kprobe_opcode_t prfx = *(p->addr - 1);
>>  
>> +	preempt_disable();
>>  	if ((unsigned long)p->addr & 0x03) {
>>  		printk("Attempt to register kprobe at an unaligned address\n");
>>  		ret = -EINVAL;
>>  	} else if (IS_MTMSRD(insn) || IS_RFID(insn) || IS_RFI(insn)) {
>>  		printk("Cannot register a kprobe on rfi/rfid or mtmsr[d]\n");
>>  		ret = -EINVAL;
>> +	} else if (IS_PREFIX(prfx)) {
>> +		printk("Cannot register a kprobe on the second word of prefixed instruction\n");
>
> Let's keep this line within 80 columns.

We don't split printk strings across lines. So this is fine.

cheers


* Re: [PATCH 15/18] powerpc/uprobes: Add support for prefixed instructions
  2020-01-13 11:30   ` Balamuruhan S
@ 2020-02-06 23:09     ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2020-02-06 23:09 UTC (permalink / raw)
  To: Balamuruhan S; +Cc: Alistair Popple, linuxppc-dev

On Mon, Jan 13, 2020 at 10:30 PM Balamuruhan S <bala24@linux.ibm.com> wrote:
>
> On Tue, Nov 26, 2019 at 04:21:38PM +1100, Jordan Niethe wrote:
> > Uprobes can execute instructions out of line. Increase the size of the
> > buffer used  for this so that this works for prefixed instructions. Take
> > into account the length of prefixed instructions when fixing up the nip.
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/include/asm/uprobes.h | 18 ++++++++++++++----
> >  arch/powerpc/kernel/uprobes.c      |  4 ++--
> >  2 files changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/uprobes.h b/arch/powerpc/include/asm/uprobes.h
> > index 2bbdf27d09b5..5b5e8a3d2f55 100644
> > --- a/arch/powerpc/include/asm/uprobes.h
> > +++ b/arch/powerpc/include/asm/uprobes.h
> > @@ -14,18 +14,28 @@
> >
> >  typedef ppc_opcode_t uprobe_opcode_t;
> >
> > +/*
> > + * We have to ensure we have enought space for prefixed instructions, which
>
> Minor typo: `enought`. We could also reword the comment as below:
Thanks for catching that.
>
> s/We have to ensure we have enought/Ensure we have enough
Will do.
>
> -- Bala
>
> > + * are double the size of a word instruction, i.e. 8 bytes. However,
> > + * sometimes it is simpler to treat a prefixed instruction like 2 word
> > + * instructions.
> > + */
> >  #define MAX_UINSN_BYTES              4
> > -#define UPROBE_XOL_SLOT_BYTES        (MAX_UINSN_BYTES)
> > +#define UPROBE_XOL_SLOT_BYTES        (2 * MAX_UINSN_BYTES)
> >
> >  /* The following alias is needed for reference from arch-agnostic code */
> >  #define UPROBE_SWBP_INSN     BREAKPOINT_INSTRUCTION
> >  #define UPROBE_SWBP_INSN_SIZE        4 /* swbp insn size in bytes */
> >
> >  struct arch_uprobe {
> > +      /*
> > +       * Ensure there is enough space for prefixed instructions. Prefixed
> > +       * instructions must not cross 64-byte boundaries.
> > +       */
> >       union {
> > -             u32     insn;
> > -             u32     ixol;
> > -     };
> > +             uprobe_opcode_t insn[2];
> > +             uprobe_opcode_t ixol[2];
> > +     } __aligned(64);
> >  };
> >
> >  struct arch_uprobe_task {
> > diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> > index ab1077dc6148..cfcea6946f8b 100644
> > --- a/arch/powerpc/kernel/uprobes.c
> > +++ b/arch/powerpc/kernel/uprobes.c
> > @@ -111,7 +111,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, struct pt_regs *regs)
> >        * support doesn't exist and have to fix-up the next instruction
> >        * to be executed.
> >        */
> > -     regs->nip = utask->vaddr + MAX_UINSN_BYTES;
> > +     regs->nip = utask->vaddr + ((IS_PREFIX(auprobe->insn[0])) ? 8 : 4);
> >
> >       user_disable_single_step(current);
> >       return 0;
> > @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> >        * emulate_step() returns 1 if the insn was successfully emulated.
> >        * For all other cases, we need to single-step in hardware.
> >        */
> > -     ret = emulate_step(regs, auprobe->insn, 0);
> > +     ret = emulate_step(regs, auprobe->insn[0], auprobe->insn[1]);
> >       if (ret > 0)
> >               return true;
> >
> > --
> > 2.20.1
> >
>
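
For reference, the nip fix-up discussed above can be sketched in userspace C
as below. This is only an illustration under stated assumptions: IS_PREFIX()
is copied from patch 05 of this series, while insn_len() and post_xol_nip()
are hypothetical helper names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* From patch 05: a word is a prefix iff its primary opcode (top 6 bits) is 1 */
#define IS_PREFIX(x) (((x) >> 26) == 1)

/* Hypothetical helper: byte length of the instruction that starts with 'word' */
static inline unsigned int insn_len(uint32_t word)
{
	return IS_PREFIX(word) ? 8 : 4;
}

/* Hypothetical helper: address of the instruction following the probed one */
static inline uint64_t post_xol_nip(uint64_t vaddr, uint32_t first_word)
{
	assert((vaddr & 3) == 0);	/* instructions are word aligned */
	return vaddr + insn_len(first_word);
}
```

Only the first word needs to be inspected to pick the length, which is why
the fix-up can key off auprobe->insn[0] alone.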

* Re: [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions
  2020-01-13  6:18   ` Balamuruhan S
@ 2020-02-06 23:12     ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2020-02-06 23:12 UTC (permalink / raw)
  To: Balamuruhan S; +Cc: Alistair Popple, linuxppc-dev

On Mon, Jan 13, 2020 at 5:18 PM Balamuruhan S <bala24@linux.ibm.com> wrote:
>
> On Tue, Nov 26, 2019 at 04:21:28PM +1100, Jordan Niethe wrote:
> > Currently all instructions are a single word long. A future ISA version
> > will include prefixed instructions which have a double word length. The
> > functions used for analysing and emulating instructions need to be
> > modified so that they can handle these new instruction types.
> >
> > A prefixed instruction is a word prefix followed by a word suffix. All
> > prefixes uniquely have the primary op-code 1. Suffixes may be valid word
> > instructions or instructions that only exist as suffixes.
> >
> > In handling prefixed instructions it will be convenient to treat the
> > suffix and prefix as separate words. To facilitate this modify
> > analyse_instr() and emulate_step() to take a take a suffix as a
>
> typo - s/take a take a/take a
Thanks for catching this.
>
> > parameter. For word instructions it does not matter what is passed in
> > here - it will be ignored.
> >
> > We also define a new flag, PREFIXED, to be used in instruction_op:type.
> > This flag will indicate when emulating an analysed instruction if the
> > NIP should be advanced by word length or double word length.
> >
> > The callers of analyse_instr() and emulate_step() will need their own
> > changes to be able to support prefixed instructions. For now modify them
> > to pass in 0 as a suffix.
> >
> > Note that at this point no prefixed instructions are emulated or
> > analysed - this is just making it possible to do so.
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/include/asm/ppc-opcode.h |  3 +++
> >  arch/powerpc/include/asm/sstep.h      |  8 +++++--
> >  arch/powerpc/include/asm/uaccess.h    | 30 +++++++++++++++++++++++++++
> >  arch/powerpc/kernel/align.c           |  2 +-
> >  arch/powerpc/kernel/hw_breakpoint.c   |  4 ++--
> >  arch/powerpc/kernel/kprobes.c         |  2 +-
> >  arch/powerpc/kernel/mce_power.c       |  2 +-
> >  arch/powerpc/kernel/optprobes.c       |  2 +-
> >  arch/powerpc/kernel/uprobes.c         |  2 +-
> >  arch/powerpc/kvm/emulate_loadstore.c  |  2 +-
> >  arch/powerpc/lib/sstep.c              | 12 ++++++-----
> >  arch/powerpc/lib/test_emulate_step.c  | 30 +++++++++++++--------------
> >  arch/powerpc/xmon/xmon.c              |  4 ++--
> >  13 files changed, 71 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
> > index c1df75edde44..a1dfa4bdd22f 100644
> > --- a/arch/powerpc/include/asm/ppc-opcode.h
> > +++ b/arch/powerpc/include/asm/ppc-opcode.h
> > @@ -377,6 +377,9 @@
> >  #define PPC_INST_VCMPEQUD            0x100000c7
> >  #define PPC_INST_VCMPEQUB            0x10000006
> >
> > +/* macro to check if a word is a prefix */
> > +#define IS_PREFIX(x) (((x) >> 26) == 1)
> > +
> >  /* macros to insert fields into opcodes */
> >  #define ___PPC_RA(a) (((a) & 0x1f) << 16)
> >  #define ___PPC_RB(b) (((b) & 0x1f) << 11)
> > diff --git a/arch/powerpc/include/asm/sstep.h b/arch/powerpc/include/asm/sstep.h
> > index 769f055509c9..6d4cb602e231 100644
> > --- a/arch/powerpc/include/asm/sstep.h
> > +++ b/arch/powerpc/include/asm/sstep.h
> > @@ -89,6 +89,9 @@ enum instruction_type {
> >  #define VSX_LDLEFT   4       /* load VSX register from left */
> >  #define VSX_CHECK_VEC        8       /* check MSR_VEC not MSR_VSX for reg >= 32 */
> >
> > +/* Prefixed flag, ORed in with type */
> > +#define PREFIXED     0x800
> > +
> >  /* Size field in type word */
> >  #define SIZE(n)              ((n) << 12)
> >  #define GETSIZE(w)   ((w) >> 12)
> > @@ -132,7 +135,7 @@ union vsx_reg {
> >   * otherwise.
> >   */
> >  extern int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > -                      unsigned int instr);
> > +                      unsigned int instr, unsigned int sufx);
> >
> >  /*
> >   * Emulate an instruction that can be executed just by updating
> > @@ -149,7 +152,8 @@ void emulate_update_regs(struct pt_regs *reg, struct instruction_op *op);
> >   * 0 if it could not be emulated, or -1 for an instruction that
> >   * should not be emulated (rfid, mtmsrd clearing MSR_RI, etc.).
> >   */
> > -extern int emulate_step(struct pt_regs *regs, unsigned int instr);
> > +extern int emulate_step(struct pt_regs *regs, unsigned int instr,
> > +                     unsigned int sufx);
> >
> >  /*
> >   * Emulate a load or store instruction by reading/writing the
> > diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> > index 15002b51ff18..bc585399e0c7 100644
> > --- a/arch/powerpc/include/asm/uaccess.h
> > +++ b/arch/powerpc/include/asm/uaccess.h
> > @@ -423,4 +423,34 @@ extern long __copy_from_user_flushcache(void *dst, const void __user *src,
> >  extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
> >                          size_t len);
> >
> > +/*
> > + * When reading an instruction iff it is a prefix, the suffix needs to be also
> > + * loaded.
> > + */
> > +#define __get_user_instr(x, y, ptr)                  \
>
> It would be better to put `__get_user_instr()` and
> `__get_user_instr_inatomic()` in a separate commit, or to squash them into
> [PATCH 10/18] powerpc: Support prefixed instructions in alignment handler.
>
> -- Bala
Will do.
>
> > +({                                                   \
> > +     long __gui_ret = 0;                             \
> > +     y = 0;                                          \
> > +     __gui_ret = __get_user(x, ptr);                 \
> > +     if (!__gui_ret) {                               \
> > +             if (IS_PREFIX(x))                       \
> > +                     __gui_ret = __get_user(y, ptr + 1);     \
> > +     }                                               \
> > +                                                     \
> > +     __gui_ret;                                      \
> > +})
> > +
> > +#define __get_user_instr_inatomic(x, y, ptr)         \
> > +({                                                   \
> > +     long __gui_ret = 0;                             \
> > +     y = 0;                                          \
> > +     __gui_ret = __get_user_inatomic(x, ptr);        \
> > +     if (!__gui_ret) {                               \
> > +             if (IS_PREFIX(x))                       \
> > +                     __gui_ret = __get_user_inatomic(y, ptr + 1);    \
> > +     }                                               \
> > +                                                     \
> > +     __gui_ret;                                      \
> > +})
> > +
> >  #endif       /* _ARCH_POWERPC_UACCESS_H */
> > diff --git a/arch/powerpc/kernel/align.c b/arch/powerpc/kernel/align.c
> > index 92045ed64976..245e79792a01 100644
> > --- a/arch/powerpc/kernel/align.c
> > +++ b/arch/powerpc/kernel/align.c
> > @@ -334,7 +334,7 @@ int fix_alignment(struct pt_regs *regs)
> >       if ((instr & 0xfc0006fe) == (PPC_INST_COPY & 0xfc0006fe))
> >               return -EIO;
> >
> > -     r = analyse_instr(&op, regs, instr);
> > +     r = analyse_instr(&op, regs, instr, 0);
> >       if (r < 0)
> >               return -EINVAL;
> >
> > diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> > index 58ce3d37c2a3..f4530961998c 100644
> > --- a/arch/powerpc/kernel/hw_breakpoint.c
> > +++ b/arch/powerpc/kernel/hw_breakpoint.c
> > @@ -248,7 +248,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> >       if (__get_user_inatomic(instr, (unsigned int *)regs->nip))
> >               goto fail;
> >
> > -     ret = analyse_instr(&op, regs, instr);
> > +     ret = analyse_instr(&op, regs, instr, 0);
> >       type = GETTYPE(op.type);
> >       size = GETSIZE(op.type);
> >
> > @@ -272,7 +272,7 @@ static bool stepping_handler(struct pt_regs *regs, struct perf_event *bp,
> >               return false;
> >       }
> >
> > -     if (!emulate_step(regs, instr))
> > +     if (!emulate_step(regs, instr, 0))
> >               goto fail;
> >
> >       return true;
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index 2d27ec4feee4..7303fe3856cc 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -219,7 +219,7 @@ static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
> >       unsigned int insn = *p->ainsn.insn;
> >
> >       /* regs->nip is also adjusted if emulate_step returns 1 */
> > -     ret = emulate_step(regs, insn);
> > +     ret = emulate_step(regs, insn, 0);
> >       if (ret > 0) {
> >               /*
> >                * Once this instruction has been boosted
> > diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> > index 1cbf7f1a4e3d..d862bb549158 100644
> > --- a/arch/powerpc/kernel/mce_power.c
> > +++ b/arch/powerpc/kernel/mce_power.c
> > @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
> >       if (pfn != ULONG_MAX) {
> >               instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
> >               instr = *(unsigned int *)(instr_addr);
> > -             if (!analyse_instr(&op, &tmp, instr)) {
> > +             if (!analyse_instr(&op, &tmp, instr, 0)) {
> >                       pfn = addr_to_pfn(regs, op.ea);
> >                       *addr = op.ea;
> >                       *phys_addr = (pfn << PAGE_SHIFT);
> > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> > index 024f7aad1952..82dc8a589c87 100644
> > --- a/arch/powerpc/kernel/optprobes.c
> > +++ b/arch/powerpc/kernel/optprobes.c
> > @@ -100,7 +100,7 @@ static unsigned long can_optimize(struct kprobe *p)
> >        * and that can be emulated.
> >        */
> >       if (!is_conditional_branch(*p->ainsn.insn) &&
> > -                     analyse_instr(&op, &regs, *p->ainsn.insn) == 1) {
> > +                     analyse_instr(&op, &regs, *p->ainsn.insn, 0) == 1) {
> >               emulate_update_regs(&regs, &op);
> >               nip = regs.nip;
> >       }
> > diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> > index 1cfef0e5fec5..ab1077dc6148 100644
> > --- a/arch/powerpc/kernel/uprobes.c
> > +++ b/arch/powerpc/kernel/uprobes.c
> > @@ -173,7 +173,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> >        * emulate_step() returns 1 if the insn was successfully emulated.
> >        * For all other cases, we need to single-step in hardware.
> >        */
> > -     ret = emulate_step(regs, auprobe->insn);
> > +     ret = emulate_step(regs, auprobe->insn, 0);
> >       if (ret > 0)
> >               return true;
> >
> > diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> > index 2e496eb86e94..fcab1f31b48d 100644
> > --- a/arch/powerpc/kvm/emulate_loadstore.c
> > +++ b/arch/powerpc/kvm/emulate_loadstore.c
> > @@ -100,7 +100,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
> >
> >       emulated = EMULATE_FAIL;
> >       vcpu->arch.regs.msr = vcpu->arch.shared->msr;
> > -     if (analyse_instr(&op, &vcpu->arch.regs, inst) == 0) {
> > +     if (analyse_instr(&op, &vcpu->arch.regs, inst, 0) == 0) {
> >               int type = op.type & INSTR_TYPE_MASK;
> >               int size = GETSIZE(op.type);
> >
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index c077acb983a1..ade3f5eba2e5 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -1163,7 +1163,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
> >   * otherwise.
> >   */
> >  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > -               unsigned int instr)
> > +               unsigned int instr, unsigned int sufx)
> >  {
> >       unsigned int opcode, ra, rb, rc, rd, spr, u;
> >       unsigned long int imm;
> > @@ -2756,7 +2756,8 @@ void emulate_update_regs(struct pt_regs *regs, struct instruction_op *op)
> >  {
> >       unsigned long next_pc;
> >
> > -     next_pc = truncate_if_32bit(regs->msr, regs->nip + 4);
> > +     next_pc = truncate_if_32bit(regs->msr,
> > +                                 regs->nip + ((op->type & PREFIXED) ? 8 : 4));
> >       switch (GETTYPE(op->type)) {
> >       case COMPUTE:
> >               if (op->type & SETREG)
> > @@ -3101,14 +3102,14 @@ NOKPROBE_SYMBOL(emulate_loadstore);
> >   * or -1 if the instruction is one that should not be stepped,
> >   * such as an rfid, or a mtmsrd that would clear MSR_RI.
> >   */
> > -int emulate_step(struct pt_regs *regs, unsigned int instr)
> > +int emulate_step(struct pt_regs *regs, unsigned int instr, unsigned int sufx)
> >  {
> >       struct instruction_op op;
> >       int r, err, type;
> >       unsigned long val;
> >       unsigned long ea;
> >
> > -     r = analyse_instr(&op, regs, instr);
> > +     r = analyse_instr(&op, regs, instr, sufx);
> >       if (r < 0)
> >               return r;
> >       if (r > 0) {
> > @@ -3200,7 +3201,8 @@ int emulate_step(struct pt_regs *regs, unsigned int instr)
> >       return 0;
> >
> >   instr_done:
> > -     regs->nip = truncate_if_32bit(regs->msr, regs->nip + 4);
> > +     regs->nip = truncate_if_32bit(regs->msr,
> > +                                   regs->nip + ((op.type & PREFIXED) ? 8 : 4));
> >       return 1;
> >  }
> >  NOKPROBE_SYMBOL(emulate_step);
> > diff --git a/arch/powerpc/lib/test_emulate_step.c b/arch/powerpc/lib/test_emulate_step.c
> > index 42347067739c..9288dc6fc715 100644
> > --- a/arch/powerpc/lib/test_emulate_step.c
> > +++ b/arch/powerpc/lib/test_emulate_step.c
> > @@ -103,7 +103,7 @@ static void __init test_ld(void)
> >       regs.gpr[3] = (unsigned long) &a;
> >
> >       /* ld r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_LD(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_LD(5, 3, 0), 0);
> >
> >       if (stepped == 1 && regs.gpr[5] == a)
> >               show_result("ld", "PASS");
> > @@ -121,7 +121,7 @@ static void __init test_lwz(void)
> >       regs.gpr[3] = (unsigned long) &a;
> >
> >       /* lwz r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_LWZ(5, 3, 0), 0);
> >
> >       if (stepped == 1 && regs.gpr[5] == a)
> >               show_result("lwz", "PASS");
> > @@ -141,7 +141,7 @@ static void __init test_lwzx(void)
> >       regs.gpr[5] = 0x8765;
> >
> >       /* lwzx r5, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LWZX(5, 3, 4), 0);
> >       if (stepped == 1 && regs.gpr[5] == a[2])
> >               show_result("lwzx", "PASS");
> >       else
> > @@ -159,7 +159,7 @@ static void __init test_std(void)
> >       regs.gpr[5] = 0x5678;
> >
> >       /* std r5, 0(r3) */
> > -     stepped = emulate_step(&regs, TEST_STD(5, 3, 0));
> > +     stepped = emulate_step(&regs, TEST_STD(5, 3, 0), 0);
> >       if (stepped == 1 || regs.gpr[5] == a)
> >               show_result("std", "PASS");
> >       else
> > @@ -184,7 +184,7 @@ static void __init test_ldarx_stdcx(void)
> >       regs.gpr[5] = 0x5678;
> >
> >       /* ldarx r5, r3, r4, 0 */
> > -     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0));
> > +     stepped = emulate_step(&regs, TEST_LDARX(5, 3, 4, 0), 0);
> >
> >       /*
> >        * Don't touch 'a' here. Touching 'a' can do Load/store
> > @@ -202,7 +202,7 @@ static void __init test_ldarx_stdcx(void)
> >       regs.gpr[5] = 0x9ABC;
> >
> >       /* stdcx. r5, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STDCX(5, 3, 4), 0);
> >
> >       /*
> >        * Two possible scenarios that indicates successful emulation
> > @@ -242,7 +242,7 @@ static void __init test_lfsx_stfsx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lfsx frt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LFSX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lfsx", "PASS");
> > @@ -255,7 +255,7 @@ static void __init test_lfsx_stfsx(void)
> >       c.a = 678.91;
> >
> >       /* stfsx frs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STFSX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && c.b == cached_b)
> >               show_result("stfsx", "PASS");
> > @@ -285,7 +285,7 @@ static void __init test_lfdx_stfdx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lfdx frt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LFDX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lfdx", "PASS");
> > @@ -298,7 +298,7 @@ static void __init test_lfdx_stfdx(void)
> >       c.a = 987654.32;
> >
> >       /* stfdx frs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STFDX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && c.b == cached_b)
> >               show_result("stfdx", "PASS");
> > @@ -344,7 +344,7 @@ static void __init test_lvx_stvx(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lvx vrt10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LVX(10, 3, 4), 0);
> >
> >       if (stepped == 1)
> >               show_result("lvx", "PASS");
> > @@ -360,7 +360,7 @@ static void __init test_lvx_stvx(void)
> >       c.b[3] = 498532;
> >
> >       /* stvx vrs10, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STVX(10, 3, 4), 0);
> >
> >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3])
> > @@ -401,7 +401,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> >       regs.gpr[4] = 0;
> >
> >       /* lxvd2x vsr39, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_LXVD2X(39, 3, 4), 0);
> >
> >       if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
> >               show_result("lxvd2x", "PASS");
> > @@ -421,7 +421,7 @@ static void __init test_lxvd2x_stxvd2x(void)
> >       c.b[3] = 4;
> >
> >       /* stxvd2x vsr39, r3, r4 */
> > -     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4));
> > +     stepped = emulate_step(&regs, TEST_STXVD2X(39, 3, 4), 0);
> >
> >       if (stepped == 1 && cached_b[0] == c.b[0] && cached_b[1] == c.b[1] &&
> >           cached_b[2] == c.b[2] && cached_b[3] == c.b[3] &&
> > @@ -848,7 +848,7 @@ static int __init emulate_compute_instr(struct pt_regs *regs,
> >       if (!regs || !instr)
> >               return -EINVAL;
> >
> > -     if (analyse_instr(&op, regs, instr) != 1 ||
> > +     if (analyse_instr(&op, regs, instr, 0) != 1 ||
> >           GETTYPE(op.type) != COMPUTE) {
> >               pr_info("emulation failed, instruction = 0x%08x\n", instr);
> >               return -EFAULT;
> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > index a7056049709e..f47bd843dc52 100644
> > --- a/arch/powerpc/xmon/xmon.c
> > +++ b/arch/powerpc/xmon/xmon.c
> > @@ -705,7 +705,7 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
> >       if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
> >               bp = at_breakpoint(regs->nip);
> >               if (bp != NULL) {
> > -                     int stepped = emulate_step(regs, bp->instr[0]);
> > +                     int stepped = emulate_step(regs, bp->instr[0], 0);
> >                       if (stepped == 0) {
> >                               regs->nip = (unsigned long) &bp->instr[0];
> >                               atomic_inc(&bp->ref_count);
> > @@ -1170,7 +1170,7 @@ static int do_step(struct pt_regs *regs)
> >       /* check we are in 64-bit kernel mode, translation enabled */
> >       if ((regs->msr & (MSR_64BIT|MSR_PR|MSR_IR)) == (MSR_64BIT|MSR_IR)) {
> >               if (mread(regs->nip, &instr, 4) == 4) {
> > -                     stepped = emulate_step(regs, instr);
> > +                     stepped = emulate_step(regs, instr, 0);
> >                       if (stepped < 0) {
> >                               printf("Couldn't single-step %s instruction\n",
> >                                      (IS_RFID(instr)? "rfid": "mtmsrd"));
> > --
> > 2.20.1
> >
>
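
To make the two ideas in this patch concrete (load the suffix only when the
first word is a prefix, and advance the NIP by 4 or 8 depending on the
PREFIXED flag), here is a minimal userspace sketch. IS_PREFIX, PREFIXED,
SIZE() and GETSIZE() are taken from the diff above; fetch_instr() and
next_nip() are hypothetical stand-ins written for this illustration, reading
from a plain array rather than user memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IS_PREFIX(x)	(((x) >> 26) == 1)	/* from this patch */
#define PREFIXED	0x800			/* from this patch */
#define SIZE(n)		((n) << 12)		/* from sstep.h */
#define GETSIZE(w)	((w) >> 12)		/* from sstep.h */

/*
 * Hypothetical stand-in for __get_user_instr(): read the word at index
 * 'idx' and, only if it is a prefix, the word after it. Returns 0 on
 * success or -1 if a needed word lies outside the buffer.
 */
static int fetch_instr(const uint32_t *mem, size_t nwords, size_t idx,
		       uint32_t *instr, uint32_t *sufx)
{
	assert(mem && instr && sufx);
	*sufx = 0;
	if (idx >= nwords)
		return -1;
	*instr = mem[idx];
	if (IS_PREFIX(*instr)) {
		if (idx + 1 >= nwords)
			return -1;
		*sufx = mem[idx + 1];
	}
	return 0;
}

/* Hypothetical helper: NIP after emulating an op with the given type word */
static uint64_t next_nip(uint64_t nip, unsigned int type)
{
	return nip + ((type & PREFIXED) ? 8 : 4);
}
```

Because PREFIXED sits below bit 12, ORing it into the type word leaves the
value returned by GETSIZE() unchanged.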

* Re: [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores
  2020-01-10 10:38   ` Balamuruhan S
@ 2020-02-07  0:18     ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2020-02-07  0:18 UTC (permalink / raw)
  To: Balamuruhan S; +Cc: Alistair Popple, linuxppc-dev

On Fri, Jan 10, 2020 at 9:38 PM Balamuruhan S <bala24@linux.ibm.com> wrote:
>
> On Tue, Nov 26, 2019 at 04:21:29PM +1100, Jordan Niethe wrote:
> > This adds emulation support for the following prefixed integer
> > load/stores:
> >   * Prefixed Load Byte and Zero (plbz)
> >   * Prefixed Load Halfword and Zero (plhz)
> >   * Prefixed Load Halfword Algebraic (plha)
> >   * Prefixed Load Word and Zero (plwz)
> >   * Prefixed Load Word Algebraic (plwa)
> >   * Prefixed Load Doubleword (pld)
> >   * Prefixed Store Byte (pstb)
> >   * Prefixed Store Halfword (psth)
> >   * Prefixed Store Word (pstw)
> >   * Prefixed Store Doubleword (pstd)
> >   * Prefixed Load Quadword (plq)
> >   * Prefixed Store Quadword (pstq)
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/lib/sstep.c | 110 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 110 insertions(+)
> >
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index ade3f5eba2e5..4f5ad1f602d8 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned int instr,
> >       return ea;
> >  }
> >
> > +/*
> > + * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
> > + */
> > +static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
> > +                                               unsigned int sufx,
> > +                                               const struct pt_regs *regs)
> > +{
> > +     int ra, prefix_r;
> > +     unsigned int  dd;
> > +     unsigned long ea, d0, d1, d;
> > +
> > +     prefix_r = instr & (1ul << 20);
> > +     ra = (sufx >> 16) & 0x1f;
> > +
> > +     d0 = instr & 0x3ffff;
> > +     d1 = sufx & 0xffff;
> > +     d = (d0 << 16) | d1;
> > +
> > +     /*
> > +      * sign extend a 34 bit number
> > +      */
> > +     dd = (unsigned int) (d >> 2);
> > +     ea = (signed int) dd;
> > +     ea = (ea << 2) | (d & 0x3);
> > +
> > +     if (!prefix_r && ra)
> > +             ea += regs->gpr[ra];
> > +     else if (!prefix_r && !ra)
> > +             ; /* Leave ea as is */
> > +     else if (prefix_r && !ra)
> > +             ea += regs->nip;
> > +     else if (prefix_r && ra)
> > +             ; /* Invalid form. Should already be checked for by caller! */
> > +
> > +     return ea;
> > +}
> > +
> >  /*
> >   * Return the largest power of 2, not greater than sizeof(unsigned long),
> >   * such that x is a multiple of it.
> > @@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> >                 unsigned int instr, unsigned int sufx)
> >  {
> >       unsigned int opcode, ra, rb, rc, rd, spr, u;
> > +     unsigned int sufxopcode, prefixtype, prefix_r;
> >       unsigned long int imm;
> >       unsigned long int val, val2;
> >       unsigned int mb, me, sh;
> > @@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> >
> >       }
> >
> > +/*
> > + * Prefixed instructions
> > + */
> > +     switch (opcode) {
> > +     case 1:
> > +             prefix_r = instr & (1ul << 20);
> > +             ra = (sufx >> 16) & 0x1f;
> > +             op->update_reg = ra;
> > +             rd = (sufx >> 21) & 0x1f;
> > +             op->reg = rd;
> > +             op->val = regs->gpr[rd];
> > +
> > +             sufxopcode = sufx >> 26;
> > +             prefixtype = (instr >> 24) & 0x3;
> > +             switch (prefixtype) {
> > +             case 0: /* Type 00  Eight-Byte Load/Store */
> > +                     if (prefix_r && ra)
> > +                             break;
> > +                     op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> > +                     switch (sufxopcode) {
> > +                     case 41:        /* plwa */
> > +                             op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
> > +                             break;
> > +                     case 56:        /* plq */
> > +                             op->type = MKOP(LOAD, PREFIXED, 16);
> > +                             break;
> > +                     case 57:        /* pld */
> > +                             op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
> > +                             break;
> > +                     case 60:        /* stq */
> > +                             op->type = MKOP(STORE, PREFIXED, 16);
> > +                             break;
> > +                     case 61:        /* pstd */
> > +                             op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);
>
> For the 8-byte and 1-byte sizes (the latter mentioned below for Type 10
> instructions), `do_signext()` has no corresponding cases. I am not sure
> whether this is a typo or an omission.

This was a mistake. pstd/pld should not have been flagged with SIGNEXT.
There are still only algebraic loads for word and halfword lengths, so
do_signext() is fine.

>
> > +                             break;
> > +                     }
> > +                     break;
> > +             case 1: /* Type 01 Modified Register-to-Register */
>
> Type 01 would be Eight-Byte Register-to-Register.
Thanks, you are right.
>
> -- Bala
> > +                     break;
> > +             case 2: /* Type 10 Modified Load/Store */
> > +                     if (prefix_r && ra)
> > +                             break;
> > +                     op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> > +                     switch (sufxopcode) {
> > +                     case 32:        /* plwz */
> > +                             op->type = MKOP(LOAD, PREFIXED, 4);
> > +                             break;
> > +                     case 34:        /* plbz */
> > +                             op->type = MKOP(LOAD, PREFIXED, 1);
> > +                             break;
> > +                     case 36:        /* pstw */
> > +                             op->type = MKOP(STORE, PREFIXED, 4);
> > +                             break;
> > +                     case 38:        /* pstb */
> > +                             op->type = MKOP(STORE, PREFIXED, 1);
> > +                             break;
> > +                     case 40:        /* plhz */
> > +                             op->type = MKOP(LOAD, PREFIXED, 2);
> > +                             break;
> > +                     case 42:        /* plha */
> > +                             op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 2);
> > +                             break;
> > +                     case 44:        /* psth */
> > +                             op->type = MKOP(STORE, PREFIXED, 2);
> > +                             break;
> > +                     }
> > +                     break;
> > +             case 3: /* Type 11 Modified Register-to-Register */
> > +                     break;
> > +             }
> > +     }
> > +
> >  #ifdef CONFIG_VSX
> >       if ((GETTYPE(op->type) == LOAD_VSX ||
> >            GETTYPE(op->type) == STORE_VSX) &&
> > --
> > 2.20.1
> >
>
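
As an aside, the effective address calculation quoted above can be modelled
in userspace as follows. This is a sketch under assumptions, not the code in
the patch: sext34() and prefixed_ea() are hypothetical names, the register
value and NIP are passed in directly instead of coming from pt_regs, and the
sign extension uses an xor/subtract idiom in place of the 32-bit cast used
in mlsd_8lsd_ea().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign extend the low 34 bits of 'd' to 64 bits */
static uint64_t sext34(uint64_t d)
{
	const uint64_t sign = 1ULL << 33;

	d &= (1ULL << 34) - 1;
	return (d ^ sign) - sign;
}

/*
 * Hypothetical model of the MLS:D / 8LS:D effective address.
 * 'gpr_ra' is the value of GPR[RA] and 'nip' the address of the prefix.
 */
static uint64_t prefixed_ea(uint32_t prefix, uint32_t sufx,
			    uint64_t gpr_ra, uint64_t nip)
{
	unsigned int r = (prefix >> 20) & 1;
	unsigned int ra = (sufx >> 16) & 0x1f;
	uint64_t d0 = prefix & 0x3ffff;	/* upper 18 bits of displacement */
	uint64_t d1 = sufx & 0xffff;	/* lower 16 bits of displacement */
	uint64_t ea = sext34((d0 << 16) | d1);

	assert(!(r && ra));	/* invalid form, caller must reject it first */
	if (r)
		ea += nip;	/* PC-relative addressing */
	else if (ra)
		ea += gpr_ra;	/* base register addressing */
	return ea;
}
```

For example, a prefix of 0x0403ffff with a suffix of 0xe464fff8 encodes a
displacement of -8 from r4, and a prefix of 0x04100000 with a suffix of
0xe4600010 encodes a displacement of 16 from the NIP.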

* Re: [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores
  2020-01-10 15:13   ` Balamuruhan S
@ 2020-02-07  0:20     ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2020-02-07  0:20 UTC (permalink / raw)
  To: Balamuruhan S; +Cc: Alistair Popple, linuxppc-dev

On Sat, Jan 11, 2020 at 2:13 AM Balamuruhan S <bala24@linux.ibm.com> wrote:
>
> On Tue, Nov 26, 2019 at 04:21:29PM +1100, Jordan Niethe wrote:
> > This adds emulation support for the following prefixed integer
> > load/stores:
> >   * Prefixed Load Byte and Zero (plbz)
> >   * Prefixed Load Halfword and Zero (plhz)
> >   * Prefixed Load Halfword Algebraic (plha)
> >   * Prefixed Load Word and Zero (plwz)
> >   * Prefixed Load Word Algebraic (plwa)
> >   * Prefixed Load Doubleword (pld)
> >   * Prefixed Store Byte (pstb)
> >   * Prefixed Store Halfword (psth)
> >   * Prefixed Store Word (pstw)
> >   * Prefixed Store Doubleword (pstd)
> >   * Prefixed Load Quadword (plq)
> >   * Prefixed Store Quadword (pstq)
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/lib/sstep.c | 110 +++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 110 insertions(+)
> >
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index ade3f5eba2e5..4f5ad1f602d8 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned int instr,
> >       return ea;
> >  }
> >
> > +/*
> > + * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
> > + */
> > +static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
> > +                                               unsigned int sufx,
> > +                                               const struct pt_regs *regs)
> > +{
> > +     int ra, prefix_r;
> > +     unsigned int  dd;
> > +     unsigned long ea, d0, d1, d;
> > +
> > +     prefix_r = instr & (1ul << 20);
> > +     ra = (sufx >> 16) & 0x1f;
> > +
> > +     d0 = instr & 0x3ffff;
> > +     d1 = sufx & 0xffff;
> > +     d = (d0 << 16) | d1;
> > +
> > +     /*
> > +      * sign extend a 34 bit number
> > +      */
> > +     dd = (unsigned int) (d >> 2);
> > +     ea = (signed int) dd;
> > +     ea = (ea << 2) | (d & 0x3);
> > +
> > +     if (!prefix_r && ra)
> > +             ea += regs->gpr[ra];
> > +     else if (!prefix_r && !ra)
> > +             ; /* Leave ea as is */
> > +     else if (prefix_r && !ra)
> > +             ea += regs->nip;
> > +     else if (prefix_r && ra)
> > +             ; /* Invalid form. Should already be checked for by caller! */
> > +
> > +     return ea;
> > +}
> > +
> >  /*
> >   * Return the largest power of 2, not greater than sizeof(unsigned long),
> >   * such that x is a multiple of it.
> > @@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> >                 unsigned int instr, unsigned int sufx)
> >  {
> >       unsigned int opcode, ra, rb, rc, rd, spr, u;
> > +     unsigned int sufxopcode, prefixtype, prefix_r;
> >       unsigned long int imm;
> >       unsigned long int val, val2;
> >       unsigned int mb, me, sh;
> > @@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> >
> >       }
> >
> > +/*
> > + * Prefixed instructions
> > + */
> > +     switch (opcode) {
> > +     case 1:
> > +             prefix_r = instr & (1ul << 20);
> > +             ra = (sufx >> 16) & 0x1f;
> > +             op->update_reg = ra;
> > +             rd = (sufx >> 21) & 0x1f;
> > +             op->reg = rd;
> > +             op->val = regs->gpr[rd];
> > +
> > +             sufxopcode = sufx >> 26;
> > +             prefixtype = (instr >> 24) & 0x3;
> > +             switch (prefixtype) {
> > +             case 0: /* Type 00  Eight-Byte Load/Store */
> > +                     if (prefix_r && ra)
> > +                             break;
> > +                     op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> > +                     switch (sufxopcode) {
> > +                     case 41:        /* plwa */
> > +                             op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
> > +                             break;
> > +                     case 56:        /* plq */
> > +                             op->type = MKOP(LOAD, PREFIXED, 16);
> > +                             break;
> > +                     case 57:        /* pld */
> > +                             op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
> > +                             break;
> > +                     case 60:        /* pstq */
> > +                             op->type = MKOP(STORE, PREFIXED, 16);
> > +                             break;
> > +                     case 61:        /* pstd */
> > +                             op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);
>
> Sorry, we don't use SIGNEXT for the 1-byte case in Type 10 below, so is it
> used deliberately for the 8-byte case, even though `do_signext()` has no
> 8-byte definition and we don't really need to do anything?
No, it was a mistake. Those instructions should not have been marked as SIGNEXT.
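
For illustration only, a minimal userspace sketch of why SIGNEXT is meaningful
for the 2- and 4-byte loads (plha, plwa) but not for an 8-byte access. It
assumes, as the question above implies, that the sign-extension step only
handles sizes 2 and 4. `sketch_signext` is a hypothetical stand-in written for
this example, not the kernel's helper.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the sign-extension step: only 2- and 4-byte
 * values are extended; every other size is returned unchanged.
 */
static uint64_t sketch_signext(uint64_t val, int size)
{
	switch (size) {
	case 2:
		/* keep the low 16 bits, then propagate bit 15 upwards */
		val &= 0xffffULL;
		return (val ^ 0x8000ULL) - 0x8000ULL;
	case 4:
		/* keep the low 32 bits, then propagate bit 31 upwards */
		val &= 0xffffffffULL;
		return (val ^ 0x80000000ULL) - 0x80000000ULL;
	default:
		/* 1, 8 and 16 bytes: nothing to extend in this sketch */
		return val;
	}
}
```

With this shape, an 8-byte value already fills the register, so flagging pld or
pstd as SIGNEXT would change nothing.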
>
> -- Bala
>
> > +                             break;
> > +                     }
> > +                     break;
> > +             case 1: /* Type 01 Eight-Byte Register-to-Register */
> > +                     break;
> > +             case 2: /* Type 10 Modified Load/Store */
> > +                     if (prefix_r && ra)
> > +                             break;
> > +                     op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> > +                     switch (sufxopcode) {
> > +                     case 32:        /* plwz */
> > +                             op->type = MKOP(LOAD, PREFIXED, 4);
> > +                             break;
> > +                     case 34:        /* plbz */
> > +                             op->type = MKOP(LOAD, PREFIXED, 1);
> > +                             break;
> > +                     case 36:        /* pstw */
> > +                             op->type = MKOP(STORE, PREFIXED, 4);
> > +                             break;
> > +                     case 38:        /* pstb */
> > +                             op->type = MKOP(STORE, PREFIXED, 1);
> > +                             break;
> > +                     case 40:        /* plhz */
> > +                             op->type = MKOP(LOAD, PREFIXED, 2);
> > +                             break;
> > +                     case 42:        /* plha */
> > +                             op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 2);
> > +                             break;
> > +                     case 44:        /* psth */
> > +                             op->type = MKOP(STORE, PREFIXED, 2);
> > +                             break;
> > +                     }
> > +                     break;
> > +             case 3: /* Type 11 Modified Register-to-Register */
> > +                     break;
> > +             }
> > +     }
> > +
> >  #ifdef CONFIG_VSX
> >       if ((GETTYPE(op->type) == LOAD_VSX ||
> >            GETTYPE(op->type) == STORE_VSX) &&
> > --
> > 2.20.1
> >
>
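
As a side note on the effective address calculation in the patch quoted above,
here is a minimal userspace sketch of the same idea: build the 34-bit
displacement from the 18 low bits of the prefix and the 16 low bits of the
suffix, sign-extend it from bit 33, then add either GPR[RA] or the instruction
address depending on the R bit. The function names, the `ra_val` parameter
(the caller-supplied content of GPR[RA]) and the `nip` parameter are
assumptions made for this example, not the kernel interface.

```c
#include <stdint.h>

/* Sign-extended 34-bit displacement d0||d1 of an MLS:D / 8LS:D instruction */
static uint64_t sketch_disp34(uint32_t prefix, uint32_t suffix)
{
	uint64_t d = ((uint64_t)(prefix & 0x3ffff) << 16) | (suffix & 0xffff);

	/* propagate bit 33 into bits 34..63 */
	return (d ^ (1ULL << 33)) - (1ULL << 33);
}

/* The form with R = 1 and RA != 0 is invalid and must be rejected first */
static int sketch_form_valid(uint32_t prefix, uint32_t suffix)
{
	unsigned int r = (prefix >> 20) & 1;
	unsigned int ra = (suffix >> 16) & 0x1f;

	return !(r && ra);
}

/*
 * Effective address for a valid form:
 *   R = 0, RA != 0: GPR[RA] + displacement
 *   R = 0, RA == 0: displacement only
 *   R = 1, RA == 0: instruction address + displacement
 */
static uint64_t sketch_ea(uint32_t prefix, uint32_t suffix,
			  uint64_t ra_val, uint64_t nip)
{
	unsigned int r = (prefix >> 20) & 1;
	unsigned int ra = (suffix >> 16) & 0x1f;
	uint64_t ea = sketch_disp34(prefix, suffix);

	if (r)
		ea += nip;
	else if (ra)
		ea += ra_val;
	return ea;
}
```

For example, a prefix of 0x06000000 with a suffix of 0x80610008 (RA = 1,
displacement 8) and GPR[1] = 0x1000 gives an effective address of 0x1008 in
this sketch.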


* Re: [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp
  2019-12-18 14:11   ` Daniel Axtens
@ 2020-02-07  8:15     ` Greg Kurz
  2020-02-08  0:28       ` Jordan Niethe
  0 siblings, 1 reply; 42+ messages in thread
From: Greg Kurz @ 2020-02-07  8:15 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Jordan Niethe, linuxppc-dev, alistair

On Thu, 19 Dec 2019 01:11:33 +1100
Daniel Axtens <dja@axtens.net> wrote:

> Jordan Niethe <jniethe5@gmail.com> writes:
> 
> > A user-mode access to an address a long way below the stack pointer is
> > only valid if the instruction is one that would update the stack pointer
> > to the address accessed. This is checked by directly looking at the
> > instruction's opcode. As a result it does not take into account prefixed
> > instructions. Instead of looking at the instruction ourselves, use
> > analyse_instr() to determine if this is a store instruction that will update
> > the stack pointer.
> >
> > Something to note is that there currently are not any store with update
> > prefixed instructions. Actually there is no plan for prefixed
> > update-form loads and stores. So this patch is probably not needed but
> > it might be preferable to use analyse_instr() rather than open coding
> > the test anyway.
> 
> Yes please. I was looking through this code recently and was
> horrified. This improves things a lot and I think is justification
> enough as-is.
> 

Except it doesn't work... I'm now experiencing a systematic crash of
systemd at boot in my fedora31 guest:

[    3.322912] systemd[1]: segfault (11) at 7ffff3eaf550 nip 7ce4d42f8d78 lr 9d82c098fc0 code 1 in libsystemd-shared-243.so[7ce4d4150000+2e0000]
[    3.323112] systemd[1]: code: 00000480 60420000 3c4c001e 3842edb0 7c0802a6 3d81fff0 fb81ffe0 fba1ffe8 
[    3.323244] systemd[1]: code: fbc1fff0 fbe1fff8 f8010010 7c200b78 <f801f001> 7c216000 4082fff8 f801ff71 

f801f001 is

0x1a8d78 <serialize_item_format+40>: stdu    r0,-4096(r1)

which analyse_instr() is supposed to decode as a STORE that
updates r1 so we should be good... Unfortunately analyse_instr()
forbids partial register sets, since it might return op->val
based on some register content depending on the instruction:

	/* Following cases refer to regs->gpr[], so we need all regs */
	if (!FULL_REGS(regs))
		return -1;

analyse_instr() was introduced with instruction emulation in mind, which
goes far beyond the need we have in store_updates_sp(). Especially the
fault path doesn't care for the register content at all...

Not sure how to cope with that correctly (refactor analyse_instr() ? ) but
until someone comes up with a solution, please don't merge this patch.
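
To illustrate the point (this is only a userspace sketch, not a proposed
kernel change), the check the fault path needs depends on the instruction word
alone and never on register content. The function below mirrors the open-coded
test that the patch removes. The opcode constants are defined locally for this
example and the `SK_`/`sketch_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Major opcodes, defined locally for this sketch */
enum { SK_OP_31 = 31, SK_OP_STWU = 37, SK_OP_STBU = 39, SK_OP_STHU = 45,
       SK_OP_STFSU = 53, SK_OP_STFDU = 55, SK_OP_STD = 62 };

/* Extended opcodes under major opcode 31, defined locally for this sketch */
enum { SK_XOP_STDUX = 181, SK_XOP_STWUX = 183, SK_XOP_STBUX = 247,
       SK_XOP_STHUX = 439, SK_XOP_STFSUX = 695, SK_XOP_STFDUX = 759 };

/* True if inst is an update-form store whose rA field is r1 */
static bool sketch_store_updates_sp(uint32_t inst)
{
	/* check for 1 in the rA field */
	if (((inst >> 16) & 0x1f) != 1)
		return false;

	/* check major opcode */
	switch (inst >> 26) {
	case SK_OP_STWU:
	case SK_OP_STBU:
	case SK_OP_STHU:
	case SK_OP_STFSU:
	case SK_OP_STFDU:
		return true;
	case SK_OP_STD:		/* std or stdu: low two bits select the form */
		return (inst & 3) == 1;
	case SK_OP_31:
		/* check minor opcode */
		switch ((inst >> 1) & 0x3ff) {
		case SK_XOP_STDUX:
		case SK_XOP_STWUX:
		case SK_XOP_STBUX:
		case SK_XOP_STHUX:
		case SK_XOP_STFSUX:
		case SK_XOP_STFDUX:
			return true;
		}
	}
	return false;
}
```

Calling it on f801f001 (the faulting stdu above) returns true without looking
at any register, whereas f8010010 from the same code dump (a plain std) returns
false.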

Cheers,

--
Greg

> Regards,
> Daniel
> >
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> >  arch/powerpc/mm/fault.c | 39 +++++++++++----------------------------
> >  1 file changed, 11 insertions(+), 28 deletions(-)
> >
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index b5047f9b5dec..cb78b3ca1800 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -41,37 +41,17 @@
> >  #include <asm/siginfo.h>
> >  #include <asm/debug.h>
> >  #include <asm/kup.h>
> > +#include <asm/sstep.h>
> >  
> >  /*
> >   * Check whether the instruction inst is a store using
> >   * an update addressing form which will update r1.
> >   */
> > -static bool store_updates_sp(unsigned int inst)
> > +static bool store_updates_sp(struct instruction_op *op)
> >  {
> > -	/* check for 1 in the rA field */
> > -	if (((inst >> 16) & 0x1f) != 1)
> > -		return false;
> > -	/* check major opcode */
> > -	switch (inst >> 26) {
> > -	case OP_STWU:
> > -	case OP_STBU:
> > -	case OP_STHU:
> > -	case OP_STFSU:
> > -	case OP_STFDU:
> > -		return true;
> > -	case OP_STD:	/* std or stdu */
> > -		return (inst & 3) == 1;
> > -	case OP_31:
> > -		/* check minor opcode */
> > -		switch ((inst >> 1) & 0x3ff) {
> > -		case OP_31_XOP_STDUX:
> > -		case OP_31_XOP_STWUX:
> > -		case OP_31_XOP_STBUX:
> > -		case OP_31_XOP_STHUX:
> > -		case OP_31_XOP_STFSUX:
> > -		case OP_31_XOP_STFDUX:
> > +	if (GETTYPE(op->type) == STORE) {
> > +		if ((op->type & UPDATE) && (op->update_reg == 1))
> >  			return true;
> > -		}
> >  	}
> >  	return false;
> >  }
> > @@ -278,14 +258,17 @@ static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
> >  
> >  		if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
> >  		    access_ok(nip, sizeof(*nip))) {
> > -			unsigned int inst;
> > +			unsigned int inst, sufx;
> > +			struct instruction_op op;
> >  			int res;
> >  
> >  			pagefault_disable();
> > -			res = __get_user_inatomic(inst, nip);
> > +			res = __get_user_instr_inatomic(inst, sufx, nip);
> >  			pagefault_enable();
> > -			if (!res)
> > -				return !store_updates_sp(inst);
> > +			if (!res) {
> > +				analyse_instr(&op, uregs, inst, sufx);
> > +				return !store_updates_sp(&op);
> > +			}
> >  			*must_retry = true;
> >  		}
> >  		return true;
> > -- 
> > 2.20.1



* Re: [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp
  2020-02-07  8:15     ` Greg Kurz
@ 2020-02-08  0:28       ` Jordan Niethe
  0 siblings, 0 replies; 42+ messages in thread
From: Jordan Niethe @ 2020-02-08  0:28 UTC (permalink / raw)
  To: Greg Kurz; +Cc: Alistair Popple, linuxppc-dev, Daniel Axtens

On Fri, Feb 7, 2020 at 7:16 PM Greg Kurz <groug@kaod.org> wrote:
>
> On Thu, 19 Dec 2019 01:11:33 +1100
> Daniel Axtens <dja@axtens.net> wrote:
>
> > Jordan Niethe <jniethe5@gmail.com> writes:
> >
> > > A user-mode access to an address a long way below the stack pointer is
> > > only valid if the instruction is one that would update the stack pointer
> > > to the address accessed. This is checked by directly looking at the
> > > > instruction's opcode. As a result it does not take into account prefixed
> > > > instructions. Instead of looking at the instruction ourselves, use
> > > > analyse_instr() to determine if this is a store instruction that will update
> > > the stack pointer.
> > >
> > > Something to note is that there currently are not any store with update
> > > prefixed instructions. Actually there is no plan for prefixed
> > > update-form loads and stores. So this patch is probably not needed but
> > > it might be preferable to use analyse_instr() rather than open coding
> > > the test anyway.
> >
> > Yes please. I was looking through this code recently and was
> > horrified. This improves things a lot and I think is justification
> > enough as-is.
> >
>
> Except it doesn't work... I'm now experiencing a systematic crash of
> systemd at boot in my fedora31 guest:
>
> [    3.322912] systemd[1]: segfault (11) at 7ffff3eaf550 nip 7ce4d42f8d78 lr 9d82c098fc0 code 1 in libsystemd-shared-243.so[7ce4d4150000+2e0000]
> [    3.323112] systemd[1]: code: 00000480 60420000 3c4c001e 3842edb0 7c0802a6 3d81fff0 fb81ffe0 fba1ffe8
> [    3.323244] systemd[1]: code: fbc1fff0 fbe1fff8 f8010010 7c200b78 <f801f001> 7c216000 4082fff8 f801ff71
>
> f801f001 is
>
> 0x1a8d78 <serialize_item_format+40>: stdu    r0,-4096(r1)
>
> which analyse_instr() is supposed to decode as a STORE that
> updates r1 so we should be good... Unfortunately analyse_instr()
> forbids partial register sets, since it might return op->val
> based on some register content depending on the instruction:
>
>         /* Following cases refer to regs->gpr[], so we need all regs */
>         if (!FULL_REGS(regs))
>                 return -1;
>
> analyse_instr() was introduced with instruction emulation in mind, which
> goes far beyond the need we have in store_updates_sp(). Especially the
> fault path doesn't care for the register content at all...
>
> Not sure how to cope with that correctly (refactor analyse_instr() ? ) but
> until someone comes up with a solution, please don't merge this patch.
>
> Cheers,
>
> --
> Greg
Thank you for this information. I agree analyse_instr() is overkill for the
situation, especially as there are no prefixed store-with-updates. I am going
to drop this patch from the series.
>
> > Regards,
> > Daniel
> > >
> > > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > > ---
> > >  arch/powerpc/mm/fault.c | 39 +++++++++++----------------------------
> > >  1 file changed, 11 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > > index b5047f9b5dec..cb78b3ca1800 100644
> > > --- a/arch/powerpc/mm/fault.c
> > > +++ b/arch/powerpc/mm/fault.c
> > > @@ -41,37 +41,17 @@
> > >  #include <asm/siginfo.h>
> > >  #include <asm/debug.h>
> > >  #include <asm/kup.h>
> > > +#include <asm/sstep.h>
> > >
> > >  /*
> > >   * Check whether the instruction inst is a store using
> > >   * an update addressing form which will update r1.
> > >   */
> > > -static bool store_updates_sp(unsigned int inst)
> > > +static bool store_updates_sp(struct instruction_op *op)
> > >  {
> > > -   /* check for 1 in the rA field */
> > > -   if (((inst >> 16) & 0x1f) != 1)
> > > -           return false;
> > > -   /* check major opcode */
> > > -   switch (inst >> 26) {
> > > -   case OP_STWU:
> > > -   case OP_STBU:
> > > -   case OP_STHU:
> > > -   case OP_STFSU:
> > > -   case OP_STFDU:
> > > -           return true;
> > > -   case OP_STD:    /* std or stdu */
> > > -           return (inst & 3) == 1;
> > > -   case OP_31:
> > > -           /* check minor opcode */
> > > -           switch ((inst >> 1) & 0x3ff) {
> > > -           case OP_31_XOP_STDUX:
> > > -           case OP_31_XOP_STWUX:
> > > -           case OP_31_XOP_STBUX:
> > > -           case OP_31_XOP_STHUX:
> > > -           case OP_31_XOP_STFSUX:
> > > -           case OP_31_XOP_STFDUX:
> > > +   if (GETTYPE(op->type) == STORE) {
> > > +           if ((op->type & UPDATE) && (op->update_reg == 1))
> > >                     return true;
> > > -           }
> > >     }
> > >     return false;
> > >  }
> > > @@ -278,14 +258,17 @@ static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
> > >
> > >             if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
> > >                 access_ok(nip, sizeof(*nip))) {
> > > -                   unsigned int inst;
> > > +                   unsigned int inst, sufx;
> > > +                   struct instruction_op op;
> > >                     int res;
> > >
> > >                     pagefault_disable();
> > > -                   res = __get_user_inatomic(inst, nip);
> > > +                   res = __get_user_instr_inatomic(inst, sufx, nip);
> > >                     pagefault_enable();
> > > -                   if (!res)
> > > -                           return !store_updates_sp(inst);
> > > +                   if (!res) {
> > > +                           analyse_instr(&op, uregs, inst, sufx);
> > > +                           return !store_updates_sp(&op);
> > > +                   }
> > >                     *must_retry = true;
> > >             }
> > >             return true;
> > > --
> > > 2.20.1
>


Thread overview: 42+ messages
2019-11-26  5:21 [PATCH 00/18] Initial Prefixed Instruction support Jordan Niethe
2019-11-26  5:21 ` [PATCH 01/18] powerpc: Enable Prefixed Instructions Jordan Niethe
2019-11-26  5:21 ` [PATCH 02/18] powerpc: Add BOUNDARY SRR1 bit for future ISA version Jordan Niethe
2019-11-26  5:21 ` [PATCH 03/18] powerpc: Add PREFIXED " Jordan Niethe
2019-12-18  8:23   ` Daniel Axtens
2019-12-20  5:09     ` Jordan Niethe
2019-11-26  5:21 ` [PATCH 04/18] powerpc: Rename Bit 35 of SRR1 to indicate new purpose Jordan Niethe
2019-11-26  5:21 ` [PATCH 05/18] powerpc sstep: Prepare to support prefixed instructions Jordan Niethe
2019-12-18  8:35   ` Daniel Axtens
2019-12-20  5:11     ` Jordan Niethe
2019-12-20  5:40       ` Christophe Leroy
2019-12-18 14:15   ` Daniel Axtens
2019-12-20  5:17     ` Jordan Niethe
2020-01-07  3:01       ` Jordan Niethe
2020-01-13  6:18   ` Balamuruhan S
2020-02-06 23:12     ` Jordan Niethe
2019-11-26  5:21 ` [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores Jordan Niethe
2020-01-10 10:38   ` Balamuruhan S
2020-02-07  0:18     ` Jordan Niethe
2020-01-10 15:13   ` Balamuruhan S
2020-02-07  0:20     ` Jordan Niethe
2019-11-26  5:21 ` [PATCH 07/18] powerpc sstep: Add support for prefixed floating-point load/stores Jordan Niethe
2019-11-26  5:21 ` [PATCH 08/18] powerpc sstep: Add support for prefixed VSX load/stores Jordan Niethe
2019-12-18 14:05   ` Daniel Axtens
2019-11-26  5:21 ` [PATCH 09/18] powerpc sstep: Add support for prefixed fixed-point arithmetic Jordan Niethe
2019-11-26  5:21 ` [PATCH 10/18] powerpc: Support prefixed instructions in alignment handler Jordan Niethe
2019-11-26  5:21 ` [PATCH 11/18] powerpc/traps: Check for prefixed instructions in facility_unavailable_exception() Jordan Niethe
2019-11-26  5:21 ` [PATCH 12/18] powerpc/xmon: Add initial support for prefixed instructions Jordan Niethe
2019-11-26  5:21 ` [PATCH 13/18] powerpc/xmon: Dump " Jordan Niethe
2019-11-26  5:21 ` [PATCH 14/18] powerpc/kprobes: Support kprobes on " Jordan Niethe
2020-01-14  7:19   ` Balamuruhan S
2020-01-16  6:09     ` Michael Ellerman
2019-11-26  5:21 ` [PATCH 15/18] powerpc/uprobes: Add support for " Jordan Niethe
2020-01-13 11:30   ` Balamuruhan S
2020-02-06 23:09     ` Jordan Niethe
2019-11-26  5:21 ` [PATCH 16/18] powerpc/hw_breakpoints: Initial " Jordan Niethe
2019-11-26  5:21 ` [PATCH 17/18] powerpc: Add prefix support to mce_find_instr_ea_and_pfn() Jordan Niethe
2019-11-26  5:21 ` [PATCH 18/18] powerpc/fault: Use analyse_instr() to check for store with updates to sp Jordan Niethe
2019-12-18 14:11   ` Daniel Axtens
2020-02-07  8:15     ` Greg Kurz
2020-02-08  0:28       ` Jordan Niethe
2019-12-03  4:31 ` [PATCH 00/18] Initial Prefixed Instruction support Andrew Donnellan
