linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
@ 2020-11-24 10:19 Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 01/19] x86/insn: Rename insn_decode() to insn_decode_regs() Borislav Petkov
                   ` (19 more replies)
  0 siblings, 20 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Hi,

here's what I had in mind, finally split into proper patches. The final
goal is for all users of the decoder to simply call insn_decode() and
look at its retval. Simple.

As to amluto's question about detecting partial insns, see the diff
below.

Running that gives:

insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0

and the return value is always success.

Which means that that buf_len that gets supplied to the decoder
functions doesn't really work and I need to look into it.

That is, provided this is how we want to control what the instruction
decoder decodes - by supplying the length of the buffer it should look
at.

We could also say that probably there should be a way to say "decode
only the first insn in the buffer and ignore the rest". That is all up
to the use cases so I'm looking for suggestions here.

In any case, at least the case where I give it

0x48 0xcf 0x48 0x83

and say that buf size is 4, should return an error because the second
insn is incomplete. So I need to go look at that now.

Thx.

---

diff --git a/arch/x86/tools/insn_sanity.c b/arch/x86/tools/insn_sanity.c
index 51309df285b4..41e1ae0cd833 100644
--- a/arch/x86/tools/insn_sanity.c
+++ b/arch/x86/tools/insn_sanity.c
@@ -220,6 +220,45 @@ static void parse_args(int argc, char **argv)
 	}
 }
 
+static void run_insn_limits_test(void)
+{
+	unsigned char b[MAX_INSN_SIZE];
+	struct insn insn;
+	int ret, i, size;
+
+	memset(b, INSN_NOP, MAX_INSN_SIZE);
+
+	/* IRETQ */
+	b[0] = 0x48;
+	b[1] = 0xcf;
+
+	/* partial add $0x8, %rsp */
+	b[2] = 0x48;
+	b[3] = 0x83;
+
+	printf("insn buffer:\n");
+
+	for (i = 0; i < MAX_INSN_SIZE; i++)
+		printf("0x%hhx ", b[i]);
+	printf("\n");
+
+	size = MAX_INSN_SIZE;
+	ret = insn_decode(&insn, b, size, INSN_MODE_64);
+	printf("supplied buf size: %d, ret %d\n", size, ret);
+
+	size = 2;
+	ret = insn_decode(&insn, b, size, INSN_MODE_64);
+	printf("supplied buf size: %d, ret %d\n", size, ret);
+
+	size = 3;
+	ret = insn_decode(&insn, b, size, INSN_MODE_64);
+	printf("supplied buf size: %d, ret %d\n", size, ret);
+
+	size = 4;
+	ret = insn_decode(&insn, b, size, INSN_MODE_64);
+	printf("supplied buf size: %d, ret %d\n", size, ret);
+}
+
 int main(int argc, char **argv)
 {
 	int insns = 0, ret;
@@ -265,5 +304,7 @@ int main(int argc, char **argv)
 		errors,
 		seed);
 
+	run_insn_limits_test();
+
 	return errors ? 1 : 0;
 }

Borislav Petkov (19):
  x86/insn: Rename insn_decode() to insn_decode_regs()
  x86/insn: Add @buf_len param to insn_init() kernel-doc comment
  x86/insn: Add an insn_decode() API
  x86/insn-eval: Handle return values from the decoder
  x86/boot/compressed/sev-es: Convert to insn_decode()
  perf/x86/intel/ds: Check insn_get_length() retval
  perf/x86/intel/ds: Check return values of insn decoder functions
  x86/alternative: Use insn_decode()
  x86/mce: Convert to insn_decode()
  x86/kprobes: Convert to insn_decode()
  x86/sev-es: Convert to insn_decode()
  x86/traps: Convert to insn_decode()
  x86/uprobes: Convert to insn_decode()
  x86/tools/insn_decoder_test: Convert to insn_decode()
  tools/objtool: Convert to insn_decode()
  x86/tools/insn_sanity: Convert to insn_decode()
  tools/perf: Convert to insn_decode()
  x86/insn: Remove kernel_insn_init()
  x86/insn: Make insn_complete() static

 arch/x86/boot/compressed/sev-es.c             |  11 +-
 arch/x86/events/intel/ds.c                    |   4 +-
 arch/x86/events/intel/lbr.c                   |  10 +-
 arch/x86/include/asm/insn-eval.h              |   4 +-
 arch/x86/include/asm/insn.h                   |  42 ++--
 arch/x86/kernel/alternative.c                 |   6 +-
 arch/x86/kernel/cpu/mce/severity.c            |  12 +-
 arch/x86/kernel/kprobes/core.c                |  17 +-
 arch/x86/kernel/kprobes/opt.c                 |   9 +-
 arch/x86/kernel/sev-es.c                      |  15 +-
 arch/x86/kernel/traps.c                       |   7 +-
 arch/x86/kernel/umip.c                        |   2 +-
 arch/x86/kernel/uprobes.c                     |   8 +-
 arch/x86/lib/insn-eval.c                      |  20 +-
 arch/x86/lib/insn.c                           | 190 ++++++++++++++----
 arch/x86/tools/insn_decoder_test.c            |  10 +-
 arch/x86/tools/insn_sanity.c                  |   8 +-
 tools/arch/x86/include/asm/insn.h             |  42 ++--
 tools/arch/x86/lib/insn.c                     | 190 ++++++++++++++----
 tools/include/linux/kconfig.h                 |  73 +++++++
 tools/objtool/arch/x86/decode.c               |   9 +-
 tools/perf/arch/x86/tests/insn-x86.c          |   9 +-
 tools/perf/arch/x86/util/archinsn.c           |   9 +-
 .../intel-pt-decoder/intel-pt-insn-decoder.c  |  17 +-
 24 files changed, 507 insertions(+), 217 deletions(-)
 create mode 100644 tools/include/linux/kconfig.h

-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 01/19] x86/insn: Rename insn_decode() to insn_decode_regs()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 02/19] x86/insn: Add @buf_len param to insn_init() kernel-doc comment Borislav Petkov
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Rename insn_decode() to insn_decode_regs() to denote that it receives
regs as param and free the name for a generic version.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/insn-eval.h | 4 ++--
 arch/x86/kernel/sev-es.c         | 2 +-
 arch/x86/kernel/umip.c           | 2 +-
 arch/x86/lib/insn-eval.c         | 6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index a0f839aa144d..3797497a9270 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -23,7 +23,7 @@ unsigned long insn_get_seg_base(struct pt_regs *regs, int seg_reg_idx);
 int insn_get_code_seg_params(struct pt_regs *regs);
 int insn_fetch_from_user(struct pt_regs *regs,
 			 unsigned char buf[MAX_INSN_SIZE]);
-bool insn_decode(struct insn *insn, struct pt_regs *regs,
-		 unsigned char buf[MAX_INSN_SIZE], int buf_size);
+bool insn_decode_regs(struct insn *insn, struct pt_regs *regs,
+		      unsigned char buf[MAX_INSN_SIZE], int buf_size);
 
 #endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 0bd1a0fc587e..37736486603e 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -256,7 +256,7 @@ static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
 			return ES_EXCEPTION;
 		}
 
-		if (!insn_decode(&ctxt->insn, ctxt->regs, buffer, res))
+		if (!insn_decode_regs(&ctxt->insn, ctxt->regs, buffer, res))
 			return ES_DECODE_FAILED;
 	} else {
 		res = vc_fetch_insn_kernel(ctxt, buffer);
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index f6225bf22c02..e3584894b074 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -356,7 +356,7 @@ bool fixup_umip_exception(struct pt_regs *regs)
 	if (!nr_copied)
 		return false;
 
-	if (!insn_decode(&insn, regs, buf, nr_copied))
+	if (!insn_decode_regs(&insn, regs, buf, nr_copied))
 		return false;
 
 	umip_inst = identify_insn(&insn);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 58f7fb95c7f4..99fafbaf8555 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1454,7 +1454,7 @@ int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
 }
 
 /**
- * insn_decode() - Decode an instruction
+ * insn_decode_regs() - Decode an instruction
  * @insn:	Structure to store decoded instruction
  * @regs:	Structure with register values as seen when entering kernel mode
  * @buf:	Buffer containing the instruction bytes
@@ -1467,8 +1467,8 @@ int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
  *
  * True if instruction was decoded, False otherwise.
  */
-bool insn_decode(struct insn *insn, struct pt_regs *regs,
-		 unsigned char buf[MAX_INSN_SIZE], int buf_size)
+bool insn_decode_regs(struct insn *insn, struct pt_regs *regs,
+		      unsigned char buf[MAX_INSN_SIZE], int buf_size)
 {
 	int seg_defs;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 02/19] x86/insn: Add @buf_len param to insn_init() kernel-doc comment
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 01/19] x86/insn: Rename insn_decode() to insn_decode_regs() Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

It wasn't documented so add it. No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/lib/insn.c       | 1 +
 tools/arch/x86/lib/insn.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 404279563891..1ba994862b56 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -37,6 +37,7 @@
  * insn_init() - initialize struct insn
  * @insn:	&struct insn to be initialized
  * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
+ * @buf_len:	length of the insn buffer at @kaddr
  * @x86_64:	!0 for 64-bit kernel or 64-bit app
  */
 void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64)
diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
index 0151dfc6da61..f3277d6e4ef2 100644
--- a/tools/arch/x86/lib/insn.c
+++ b/tools/arch/x86/lib/insn.c
@@ -37,6 +37,7 @@
  * insn_init() - initialize struct insn
  * @insn:	&struct insn to be initialized
  * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
+ * @buf_len:	length of the insn buffer at @kaddr
  * @x86_64:	!0 for 64-bit kernel or 64-bit app
  */
 void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 01/19] x86/insn: Rename insn_decode() to insn_decode_regs() Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 02/19] x86/insn: Add @buf_len param to insn_init() kernel-doc comment Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-25 16:53   ` Masami Hiramatsu
  2020-11-24 10:19 ` [RFC PATCH v0 04/19] x86/insn-eval: Handle return values from the decoder Borislav Petkov
                   ` (16 subsequent siblings)
  19 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Users of the instruction decoder should use this to decode instruction
bytes. For that, have insn*() helpers return an int value to denote
success/failure.

While at it, make insn_get_opcode() more stricter as to whether what has
seen so far is a valid insn and if not.

Copy linux/kconfig.h for the tools-version of the decoder so that it can
use IS_ENABLED().

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/insn.h       |  24 ++--
 arch/x86/lib/insn.c               | 182 +++++++++++++++++++++++-------
 tools/arch/x86/include/asm/insn.h |  24 ++--
 tools/arch/x86/lib/insn.c         | 182 +++++++++++++++++++++++-------
 tools/include/linux/kconfig.h     |  73 ++++++++++++
 5 files changed, 391 insertions(+), 94 deletions(-)
 create mode 100644 tools/include/linux/kconfig.h

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index 5c1ae3eff9d4..e80ddfe86255 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -87,13 +87,23 @@ struct insn {
 #define X86_VEX_M_MAX	0x1f			/* VEX3.M Maximum value */
 
 extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64);
-extern void insn_get_prefixes(struct insn *insn);
-extern void insn_get_opcode(struct insn *insn);
-extern void insn_get_modrm(struct insn *insn);
-extern void insn_get_sib(struct insn *insn);
-extern void insn_get_displacement(struct insn *insn);
-extern void insn_get_immediate(struct insn *insn);
-extern void insn_get_length(struct insn *insn);
+extern int insn_get_prefixes(struct insn *insn);
+extern int insn_get_opcode(struct insn *insn);
+extern int insn_get_modrm(struct insn *insn);
+extern int insn_get_sib(struct insn *insn);
+extern int insn_get_displacement(struct insn *insn);
+extern int insn_get_immediate(struct insn *insn);
+extern int insn_get_length(struct insn *insn);
+
+enum insn_mode {
+	INSN_MODE_32,
+	INSN_MODE_64,
+	/* Mode is determined by the current kernel build. */
+	INSN_MODE_KERN,
+	INSN_NUM_MODES,
+};
+
+extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
 
 /* Attribute will be determined after getting ModRM (for opcode groups) */
 static inline void insn_get_attribute(struct insn *insn)
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 1ba994862b56..1f749e3b4a25 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -13,6 +13,9 @@
 #include <asm/inat.h>
 #include <asm/insn.h>
 
+#include <linux/errno.h>
+#include <linux/kconfig.h>
+
 #include <asm/emulate_prefix.h>
 
 /* Verify next sizeof(t) bytes can be on the same instruction */
@@ -98,8 +101,12 @@ static void insn_get_emulate_prefix(struct insn *insn)
  * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
  * to point to the (first) opcode.  No effect if @insn->prefixes.got
  * is already set.
+ *
+ * * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_prefixes(struct insn *insn)
+int insn_get_prefixes(struct insn *insn)
 {
 	struct insn_field *prefixes = &insn->prefixes;
 	insn_attr_t attr;
@@ -107,7 +114,7 @@ void insn_get_prefixes(struct insn *insn)
 	int i, nb;
 
 	if (prefixes->got)
-		return;
+		return 0;
 
 	insn_get_emulate_prefix(insn);
 
@@ -218,8 +225,10 @@ void insn_get_prefixes(struct insn *insn)
 
 	prefixes->got = 1;
 
+	return 0;
+
 err_out:
-	return;
+	return 1;
 }
 
 /**
@@ -231,14 +240,20 @@ void insn_get_prefixes(struct insn *insn)
  * If necessary, first collects any preceding (prefix) bytes.
  * Sets @insn->opcode.value = opcode1.  No effect if @insn->opcode.got
  * is already 1.
+ *
+ * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_opcode(struct insn *insn)
+int insn_get_opcode(struct insn *insn)
 {
 	struct insn_field *opcode = &insn->opcode;
 	insn_byte_t op;
 	int pfx_id;
+
 	if (opcode->got)
-		return;
+		return 0;
+
 	if (!insn->prefixes.got)
 		insn_get_prefixes(insn);
 
@@ -255,9 +270,13 @@ void insn_get_opcode(struct insn *insn)
 		insn->attr = inat_get_avx_attribute(op, m, p);
 		if ((inat_must_evex(insn->attr) && !insn_is_evex(insn)) ||
 		    (!inat_accept_vex(insn->attr) &&
-		     !inat_is_group(insn->attr)))
-			insn->attr = 0;	/* This instruction is bad */
-		goto end;	/* VEX has only 1 byte for opcode */
+		     !inat_is_group(insn->attr))) {
+			/* This instruction is bad */
+			insn->attr = 0;
+			return 1;
+		}
+		/* VEX has only 1 byte for opcode */
+		goto end;
 	}
 
 	insn->attr = inat_get_opcode_attribute(op);
@@ -268,13 +287,18 @@ void insn_get_opcode(struct insn *insn)
 		pfx_id = insn_last_prefix_id(insn);
 		insn->attr = inat_get_escape_attribute(op, pfx_id, insn->attr);
 	}
-	if (inat_must_vex(insn->attr))
-		insn->attr = 0;	/* This instruction is bad */
+
+	if (inat_must_vex(insn->attr)) {
+		/* This instruction is bad */
+		insn->attr = 0;
+		return 1;
+	}
 end:
 	opcode->got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 /**
@@ -284,15 +308,22 @@ void insn_get_opcode(struct insn *insn)
  * Populates @insn->modrm and updates @insn->next_byte to point past the
  * ModRM byte, if any.  If necessary, first collects the preceding bytes
  * (prefixes and opcode(s)).  No effect if @insn->modrm.got is already 1.
+ *
+ * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_modrm(struct insn *insn)
+int insn_get_modrm(struct insn *insn)
 {
 	struct insn_field *modrm = &insn->modrm;
 	insn_byte_t pfx_id, mod;
+
 	if (modrm->got)
-		return;
+		return 0;
+
 	if (!insn->opcode.got)
-		insn_get_opcode(insn);
+		if (insn_get_opcode(insn))
+			return 1;
 
 	if (inat_has_modrm(insn->attr)) {
 		mod = get_next(insn_byte_t, insn);
@@ -302,17 +333,22 @@ void insn_get_modrm(struct insn *insn)
 			pfx_id = insn_last_prefix_id(insn);
 			insn->attr = inat_get_group_attribute(mod, pfx_id,
 							      insn->attr);
-			if (insn_is_avx(insn) && !inat_accept_vex(insn->attr))
-				insn->attr = 0;	/* This is bad */
+			if (insn_is_avx(insn) && !inat_accept_vex(insn->attr)) {
+				/* Bad insn */
+				insn->attr = 0;
+				return 1;
+			}
 		}
 	}
 
 	if (insn->x86_64 && inat_is_force64(insn->attr))
 		insn->opnd_bytes = 8;
+
 	modrm->got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 
@@ -329,8 +365,11 @@ int insn_rip_relative(struct insn *insn)
 
 	if (!insn->x86_64)
 		return 0;
-	if (!modrm->got)
-		insn_get_modrm(insn);
+
+	if (!modrm->got) {
+		if (insn_get_modrm(insn))
+			return 0;
+	}
 	/*
 	 * For rip-relative instructions, the mod field (top 2 bits)
 	 * is zero and the r/m field (bottom 3 bits) is 0x5.
@@ -344,15 +383,23 @@ int insn_rip_relative(struct insn *insn)
  *
  * If necessary, first collects the instruction up to and including the
  * ModRM byte.
+ *
+ * Returns:
+ * 0: if decoding succeeded
+ * !0: otherwise.
  */
-void insn_get_sib(struct insn *insn)
+int insn_get_sib(struct insn *insn)
 {
 	insn_byte_t modrm;
 
 	if (insn->sib.got)
-		return;
-	if (!insn->modrm.got)
-		insn_get_modrm(insn);
+		return 0;
+
+	if (!insn->modrm.got) {
+		if (insn_get_modrm(insn))
+			return 1;
+	}
+
 	if (insn->modrm.nbytes) {
 		modrm = (insn_byte_t)insn->modrm.value;
 		if (insn->addr_bytes != 2 &&
@@ -363,8 +410,10 @@ void insn_get_sib(struct insn *insn)
 	}
 	insn->sib.got = 1;
 
+	return 0;
+
 err_out:
-	return;
+	return 1;
 }
 
 
@@ -375,15 +424,23 @@ void insn_get_sib(struct insn *insn)
  * If necessary, first collects the instruction up to and including the
  * SIB byte.
  * Displacement value is sign-expanded.
+ *
+ * * Returns:
+ * 0: if decoding succeeded
+ * !0: otherwise.
  */
-void insn_get_displacement(struct insn *insn)
+int insn_get_displacement(struct insn *insn)
 {
 	insn_byte_t mod, rm, base;
 
 	if (insn->displacement.got)
-		return;
-	if (!insn->sib.got)
-		insn_get_sib(insn);
+		return 0;
+
+	if (!insn->sib.got) {
+		if (insn_get_sib(insn))
+			return 1;
+	}
+
 	if (insn->modrm.nbytes) {
 		/*
 		 * Interpreting the modrm byte:
@@ -426,9 +483,10 @@ void insn_get_displacement(struct insn *insn)
 	}
 out:
 	insn->displacement.got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 /* Decode moffset16/32/64. Return 0 if failed */
@@ -539,20 +597,27 @@ static int __get_immptr(struct insn *insn)
 }
 
 /**
- * insn_get_immediate() - Get the immediates of instruction
+ * insn_get_immediate() - Get the immediate in an instruction
  * @insn:	&struct insn containing instruction
  *
  * If necessary, first collects the instruction up to and including the
  * displacement bytes.
  * Basically, most of immediates are sign-expanded. Unsigned-value can be
- * get by bit masking with ((1 << (nbytes * 8)) - 1)
+ * computed by bit masking with ((1 << (nbytes * 8)) - 1)
+ *
+ * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_immediate(struct insn *insn)
+int insn_get_immediate(struct insn *insn)
 {
 	if (insn->immediate.got)
-		return;
-	if (!insn->displacement.got)
-		insn_get_displacement(insn);
+		return 0;
+
+	if (!insn->displacement.got) {
+		if (insn_get_displacement(insn))
+			return 1;
+	}
 
 	if (inat_has_moffset(insn->attr)) {
 		if (!__get_moffset(insn))
@@ -605,9 +670,10 @@ void insn_get_immediate(struct insn *insn)
 	}
 done:
 	insn->immediate.got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 /**
@@ -616,13 +682,49 @@ void insn_get_immediate(struct insn *insn)
  *
  * If necessary, first collects the instruction up to and including the
  * immediates bytes.
- */
-void insn_get_length(struct insn *insn)
+ *
+ * Returns:
+ *  - 0 on success
+ *  - !0 on error
+*/
+int insn_get_length(struct insn *insn)
 {
 	if (insn->length)
-		return;
+		return 0;
+
 	if (!insn->immediate.got)
-		insn_get_immediate(insn);
+		if (insn_get_immediate(insn))
+			return 1;
+
 	insn->length = (unsigned char)((unsigned long)insn->next_byte
 				     - (unsigned long)insn->kaddr);
+
+	return 0;
+}
+
+/**
+ * insn_decode() - Decode an x86 instruction
+ * @insn:	&struct insn to be initialized
+ * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
+ * @buf_len:	length of the insn buffer at @kaddr
+ * @m:		insn mode, see enum insn_mode
+ *
+ * Returns:
+ * 0: if decoding succeeded
+ * < 0: otherwise.
+ */
+int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m)
+{
+	if (m == INSN_MODE_KERN)
+		insn_init(insn, kaddr, buf_len, IS_ENABLED(CONFIG_X86_64));
+	else
+		insn_init(insn, kaddr, buf_len, m == INSN_MODE_64);
+
+	if (insn_get_length(insn))
+		return -EINVAL;
+
+	if (insn_complete(insn))
+		return 0;
+
+	return -EINVAL;
 }
diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h
index 568854b14d0a..f223bd7ec27e 100644
--- a/tools/arch/x86/include/asm/insn.h
+++ b/tools/arch/x86/include/asm/insn.h
@@ -87,13 +87,23 @@ struct insn {
 #define X86_VEX_M_MAX	0x1f			/* VEX3.M Maximum value */
 
 extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64);
-extern void insn_get_prefixes(struct insn *insn);
-extern void insn_get_opcode(struct insn *insn);
-extern void insn_get_modrm(struct insn *insn);
-extern void insn_get_sib(struct insn *insn);
-extern void insn_get_displacement(struct insn *insn);
-extern void insn_get_immediate(struct insn *insn);
-extern void insn_get_length(struct insn *insn);
+extern int insn_get_prefixes(struct insn *insn);
+extern int insn_get_opcode(struct insn *insn);
+extern int insn_get_modrm(struct insn *insn);
+extern int insn_get_sib(struct insn *insn);
+extern int insn_get_displacement(struct insn *insn);
+extern int insn_get_immediate(struct insn *insn);
+extern int insn_get_length(struct insn *insn);
+
+enum insn_mode {
+	INSN_MODE_32,
+	INSN_MODE_64,
+	/* Mode is determined by the current kernel build. */
+	INSN_MODE_KERN,
+	INSN_NUM_MODES,
+};
+
+extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
 
 /* Attribute will be determined after getting ModRM (for opcode groups) */
 static inline void insn_get_attribute(struct insn *insn)
diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
index f3277d6e4ef2..41d2418302d9 100644
--- a/tools/arch/x86/lib/insn.c
+++ b/tools/arch/x86/lib/insn.c
@@ -13,6 +13,9 @@
 #include "../include/asm/inat.h"
 #include "../include/asm/insn.h"
 
+#include <linux/errno.h>
+#include <linux/kconfig.h>
+
 #include "../include/asm/emulate_prefix.h"
 
 /* Verify next sizeof(t) bytes can be on the same instruction */
@@ -98,8 +101,12 @@ static void insn_get_emulate_prefix(struct insn *insn)
  * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
  * to point to the (first) opcode.  No effect if @insn->prefixes.got
  * is already set.
+ *
+ * * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_prefixes(struct insn *insn)
+int insn_get_prefixes(struct insn *insn)
 {
 	struct insn_field *prefixes = &insn->prefixes;
 	insn_attr_t attr;
@@ -107,7 +114,7 @@ void insn_get_prefixes(struct insn *insn)
 	int i, nb;
 
 	if (prefixes->got)
-		return;
+		return 0;
 
 	insn_get_emulate_prefix(insn);
 
@@ -218,8 +225,10 @@ void insn_get_prefixes(struct insn *insn)
 
 	prefixes->got = 1;
 
+	return 0;
+
 err_out:
-	return;
+	return 1;
 }
 
 /**
@@ -231,14 +240,20 @@ void insn_get_prefixes(struct insn *insn)
  * If necessary, first collects any preceding (prefix) bytes.
  * Sets @insn->opcode.value = opcode1.  No effect if @insn->opcode.got
  * is already 1.
+ *
+ * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_opcode(struct insn *insn)
+int insn_get_opcode(struct insn *insn)
 {
 	struct insn_field *opcode = &insn->opcode;
 	insn_byte_t op;
 	int pfx_id;
+
 	if (opcode->got)
-		return;
+		return 0;
+
 	if (!insn->prefixes.got)
 		insn_get_prefixes(insn);
 
@@ -255,9 +270,13 @@ void insn_get_opcode(struct insn *insn)
 		insn->attr = inat_get_avx_attribute(op, m, p);
 		if ((inat_must_evex(insn->attr) && !insn_is_evex(insn)) ||
 		    (!inat_accept_vex(insn->attr) &&
-		     !inat_is_group(insn->attr)))
-			insn->attr = 0;	/* This instruction is bad */
-		goto end;	/* VEX has only 1 byte for opcode */
+		     !inat_is_group(insn->attr))) {
+			/* This instruction is bad */
+			insn->attr = 0;
+			return 1;
+		}
+		/* VEX has only 1 byte for opcode */
+		goto end;
 	}
 
 	insn->attr = inat_get_opcode_attribute(op);
@@ -268,13 +287,18 @@ void insn_get_opcode(struct insn *insn)
 		pfx_id = insn_last_prefix_id(insn);
 		insn->attr = inat_get_escape_attribute(op, pfx_id, insn->attr);
 	}
-	if (inat_must_vex(insn->attr))
-		insn->attr = 0;	/* This instruction is bad */
+
+	if (inat_must_vex(insn->attr)) {
+		/* This instruction is bad */
+		insn->attr = 0;
+		return 1;
+	}
 end:
 	opcode->got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 /**
@@ -284,15 +308,22 @@ void insn_get_opcode(struct insn *insn)
  * Populates @insn->modrm and updates @insn->next_byte to point past the
  * ModRM byte, if any.  If necessary, first collects the preceding bytes
  * (prefixes and opcode(s)).  No effect if @insn->modrm.got is already 1.
+ *
+ * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_modrm(struct insn *insn)
+int insn_get_modrm(struct insn *insn)
 {
 	struct insn_field *modrm = &insn->modrm;
 	insn_byte_t pfx_id, mod;
+
 	if (modrm->got)
-		return;
+		return 0;
+
 	if (!insn->opcode.got)
-		insn_get_opcode(insn);
+		if (insn_get_opcode(insn))
+			return 1;
 
 	if (inat_has_modrm(insn->attr)) {
 		mod = get_next(insn_byte_t, insn);
@@ -302,17 +333,22 @@ void insn_get_modrm(struct insn *insn)
 			pfx_id = insn_last_prefix_id(insn);
 			insn->attr = inat_get_group_attribute(mod, pfx_id,
 							      insn->attr);
-			if (insn_is_avx(insn) && !inat_accept_vex(insn->attr))
-				insn->attr = 0;	/* This is bad */
+			if (insn_is_avx(insn) && !inat_accept_vex(insn->attr)) {
+				/* Bad insn */
+				insn->attr = 0;
+				return 1;
+			}
 		}
 	}
 
 	if (insn->x86_64 && inat_is_force64(insn->attr))
 		insn->opnd_bytes = 8;
+
 	modrm->got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 
@@ -329,8 +365,11 @@ int insn_rip_relative(struct insn *insn)
 
 	if (!insn->x86_64)
 		return 0;
-	if (!modrm->got)
-		insn_get_modrm(insn);
+
+	if (!modrm->got) {
+		if (insn_get_modrm(insn))
+			return 0;
+	}
 	/*
 	 * For rip-relative instructions, the mod field (top 2 bits)
 	 * is zero and the r/m field (bottom 3 bits) is 0x5.
@@ -344,15 +383,23 @@ int insn_rip_relative(struct insn *insn)
  *
  * If necessary, first collects the instruction up to and including the
  * ModRM byte.
+ *
+ * Returns:
+ * 0: if decoding succeeded
+ * !0: otherwise.
  */
-void insn_get_sib(struct insn *insn)
+int insn_get_sib(struct insn *insn)
 {
 	insn_byte_t modrm;
 
 	if (insn->sib.got)
-		return;
-	if (!insn->modrm.got)
-		insn_get_modrm(insn);
+		return 0;
+
+	if (!insn->modrm.got) {
+		if (insn_get_modrm(insn))
+			return 1;
+	}
+
 	if (insn->modrm.nbytes) {
 		modrm = (insn_byte_t)insn->modrm.value;
 		if (insn->addr_bytes != 2 &&
@@ -363,8 +410,10 @@ void insn_get_sib(struct insn *insn)
 	}
 	insn->sib.got = 1;
 
+	return 0;
+
 err_out:
-	return;
+	return 1;
 }
 
 
@@ -375,15 +424,23 @@ void insn_get_sib(struct insn *insn)
  * If necessary, first collects the instruction up to and including the
  * SIB byte.
  * Displacement value is sign-expanded.
+ *
+ * * Returns:
+ * 0: if decoding succeeded
+ * !0: otherwise.
  */
-void insn_get_displacement(struct insn *insn)
+int insn_get_displacement(struct insn *insn)
 {
 	insn_byte_t mod, rm, base;
 
 	if (insn->displacement.got)
-		return;
-	if (!insn->sib.got)
-		insn_get_sib(insn);
+		return 0;
+
+	if (!insn->sib.got) {
+		if (insn_get_sib(insn))
+			return 1;
+	}
+
 	if (insn->modrm.nbytes) {
 		/*
 		 * Interpreting the modrm byte:
@@ -426,9 +483,10 @@ void insn_get_displacement(struct insn *insn)
 	}
 out:
 	insn->displacement.got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 /* Decode moffset16/32/64. Return 0 if failed */
@@ -539,20 +597,27 @@ static int __get_immptr(struct insn *insn)
 }
 
 /**
- * insn_get_immediate() - Get the immediates of instruction
+ * insn_get_immediate() - Get the immediate in an instruction
  * @insn:	&struct insn containing instruction
  *
  * If necessary, first collects the instruction up to and including the
  * displacement bytes.
  * Basically, most of immediates are sign-expanded. Unsigned-value can be
- * get by bit masking with ((1 << (nbytes * 8)) - 1)
+ * computed by bit masking with ((1 << (nbytes * 8)) - 1)
+ *
+ * Returns:
+ * 0:  on success
+ * !0: on error
  */
-void insn_get_immediate(struct insn *insn)
+int insn_get_immediate(struct insn *insn)
 {
 	if (insn->immediate.got)
-		return;
-	if (!insn->displacement.got)
-		insn_get_displacement(insn);
+		return 0;
+
+	if (!insn->displacement.got) {
+		if (insn_get_displacement(insn))
+			return 1;
+	}
 
 	if (inat_has_moffset(insn->attr)) {
 		if (!__get_moffset(insn))
@@ -605,9 +670,10 @@ void insn_get_immediate(struct insn *insn)
 	}
 done:
 	insn->immediate.got = 1;
+	return 0;
 
 err_out:
-	return;
+	return 1;
 }
 
 /**
@@ -616,13 +682,49 @@ void insn_get_immediate(struct insn *insn)
  *
  * If necessary, first collects the instruction up to and including the
  * immediates bytes.
- */
-void insn_get_length(struct insn *insn)
+ *
+ * Returns:
+ *  - 0 on success
+ *  - !0 on error
+*/
+int insn_get_length(struct insn *insn)
 {
 	if (insn->length)
-		return;
+		return 0;
+
 	if (!insn->immediate.got)
-		insn_get_immediate(insn);
+		if (insn_get_immediate(insn))
+			return 1;
+
 	insn->length = (unsigned char)((unsigned long)insn->next_byte
 				     - (unsigned long)insn->kaddr);
+
+	return 0;
+}
+
+/**
+ * insn_decode() - Decode an x86 instruction
+ * @insn:	&struct insn to be initialized
+ * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
+ * @buf_len:	length of the insn buffer at @kaddr
+ * @m:		insn mode, see enum insn_mode
+ *
+ * Returns:
+ * 0: if decoding succeeded
+ * < 0: otherwise.
+ */
+int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m)
+{
+	if (m == INSN_MODE_KERN)
+		insn_init(insn, kaddr, buf_len, IS_ENABLED(CONFIG_X86_64));
+	else
+		insn_init(insn, kaddr, buf_len, m == INSN_MODE_64);
+
+	if (insn_get_length(insn))
+		return -EINVAL;
+
+	if (insn_complete(insn))
+		return 0;
+
+	return -EINVAL;
 }
diff --git a/tools/include/linux/kconfig.h b/tools/include/linux/kconfig.h
new file mode 100644
index 000000000000..1555a0c4f345
--- /dev/null
+++ b/tools/include/linux/kconfig.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_LINUX_KCONFIG_H
+#define _TOOLS_LINUX_KCONFIG_H
+
+/* CONFIG_CC_VERSION_TEXT (Do not delete this comment. See help in Kconfig) */
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define __BIG_ENDIAN 4321
+#else
+#define __LITTLE_ENDIAN 1234
+#endif
+
+#define __ARG_PLACEHOLDER_1 0,
+#define __take_second_arg(__ignored, val, ...) val
+
+/*
+ * The use of "&&" / "||" is limited in certain expressions.
+ * The following enable to calculate "and" / "or" with macro expansion only.
+ */
+#define __and(x, y)			___and(x, y)
+#define ___and(x, y)			____and(__ARG_PLACEHOLDER_##x, y)
+#define ____and(arg1_or_junk, y)	__take_second_arg(arg1_or_junk y, 0)
+
+#define __or(x, y)			___or(x, y)
+#define ___or(x, y)			____or(__ARG_PLACEHOLDER_##x, y)
+#define ____or(arg1_or_junk, y)		__take_second_arg(arg1_or_junk 1, y)
+
+/*
+ * Helper macros to use CONFIG_ options in C/CPP expressions. Note that
+ * these only work with boolean and tristate options.
+ */
+
+/*
+ * Getting something that works in C and CPP for an arg that may or may
+ * not be defined is tricky.  Here, if we have "#define CONFIG_BOOGER 1"
+ * we match on the placeholder define, insert the "0," for arg1 and generate
+ * the triplet (0, 1, 0).  Then the last step cherry picks the 2nd arg (a one).
+ * When CONFIG_BOOGER is not defined, we generate a (... 1, 0) pair, and when
+ * the last step cherry picks the 2nd arg, we get a zero.
+ */
+#define __is_defined(x)			___is_defined(x)
+#define ___is_defined(val)		____is_defined(__ARG_PLACEHOLDER_##val)
+#define ____is_defined(arg1_or_junk)	__take_second_arg(arg1_or_junk 1, 0)
+
+/*
+ * IS_BUILTIN(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y', 0
+ * otherwise. For boolean options, this is equivalent to
+ * IS_ENABLED(CONFIG_FOO).
+ */
+#define IS_BUILTIN(option) __is_defined(option)
+
+/*
+ * IS_MODULE(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'm', 0
+ * otherwise.
+ */
+#define IS_MODULE(option) __is_defined(option##_MODULE)
+
+/*
+ * IS_REACHABLE(CONFIG_FOO) evaluates to 1 if the currently compiled
+ * code can call a function defined in code compiled based on CONFIG_FOO.
+ * This is similar to IS_ENABLED(), but returns false when invoked from
+ * built-in code when CONFIG_FOO is set to 'm'.
+ */
+#define IS_REACHABLE(option) __or(IS_BUILTIN(option), \
+				__and(IS_MODULE(option), __is_defined(MODULE)))
+
+/*
+ * IS_ENABLED(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y' or 'm',
+ * 0 otherwise.
+ */
+#define IS_ENABLED(option) __or(IS_BUILTIN(option), IS_MODULE(option))
+
+#endif /* _TOOLS_LINUX_KCONFIG_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 04/19] x86/insn-eval: Handle return values from the decoder
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (2 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 05/19] x86/boot/compressed/sev-es: Convert to insn_decode() Borislav Petkov
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Now that the different instruction inspecting functions return a value,
test that and return early from callers if error has been encountered.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/lib/insn-eval.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 99fafbaf8555..5825b4cf4386 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1117,7 +1117,8 @@ static int get_eff_addr_sib(struct insn *insn, struct pt_regs *regs,
 	if (insn->addr_bytes != 8 && insn->addr_bytes != 4)
 		return -EINVAL;
 
-	insn_get_modrm(insn);
+	if (insn_get_modrm(insn))
+		return -EINVAL;
 
 	if (!insn->modrm.nbytes)
 		return -EINVAL;
@@ -1125,7 +1126,8 @@ static int get_eff_addr_sib(struct insn *insn, struct pt_regs *regs,
 	if (X86_MODRM_MOD(insn->modrm.value) > 2)
 		return -EINVAL;
 
-	insn_get_sib(insn);
+	if (insn_get_sib(insn))
+		return -EINVAL;
 
 	if (!insn->sib.nbytes)
 		return -EINVAL;
@@ -1194,8 +1196,8 @@ static void __user *get_addr_ref_16(struct insn *insn, struct pt_regs *regs)
 	short eff_addr;
 	long tmp;
 
-	insn_get_modrm(insn);
-	insn_get_displacement(insn);
+	if (insn_get_modrm(insn) || insn_get_displacement(insn))
+		goto out;
 
 	if (insn->addr_bytes != 2)
 		goto out;
@@ -1491,7 +1493,9 @@ bool insn_decode_regs(struct insn *insn, struct pt_regs *regs,
 	insn->addr_bytes = INSN_CODE_SEG_ADDR_SZ(seg_defs);
 	insn->opnd_bytes = INSN_CODE_SEG_OPND_SZ(seg_defs);
 
-	insn_get_length(insn);
+	if (insn_get_length(insn))
+		return false;
+
 	if (buf_size < insn->length)
 		return false;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 05/19] x86/boot/compressed/sev-es: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (3 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 04/19] x86/insn-eval: Handle return values from the decoder Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 06/19] perf/x86/intel/ds: Check insn_get_length() retval Borislav Petkov
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Other than simplifying the code there should be no functional changes
resulting from this.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/boot/compressed/sev-es.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/compressed/sev-es.c b/arch/x86/boot/compressed/sev-es.c
index 954cb2702e23..6bb28170a770 100644
--- a/arch/x86/boot/compressed/sev-es.c
+++ b/arch/x86/boot/compressed/sev-es.c
@@ -79,16 +79,15 @@ static inline void sev_es_wr_ghcb_msr(u64 val)
 static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
 {
 	char buffer[MAX_INSN_SIZE];
-	enum es_result ret;
+	int ret;
 
 	memcpy(buffer, (unsigned char *)ctxt->regs->ip, MAX_INSN_SIZE);
 
-	insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE, 1);
-	insn_get_length(&ctxt->insn);
+	ret = insn_decode(&ctxt->insn, buffer, MAX_INSN_SIZE, INSN_MODE_64);
+	if (ret < 0)
+		return ES_DECODE_FAILED;
 
-	ret = ctxt->insn.immediate.got ? ES_OK : ES_DECODE_FAILED;
-
-	return ret;
+	return ES_OK;
 }
 
 static enum es_result vc_write_mem(struct es_em_ctxt *ctxt,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 06/19] perf/x86/intel/ds: Check insn_get_length() retval
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (4 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 05/19] x86/boot/compressed/sev-es: Convert to insn_decode() Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 07/19] perf/x86/intel/ds: Check return values of insn decoder functions Borislav Petkov
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

intel_pmu_pebs_fixup_ip() needs only the insn length so use the
appropriate helper instead of a full decode. A full decode differs only
in running insn_complete() on the decoded insn but that is not needed
here.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/events/intel/ds.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index fb327d11a04d..56cbcfee0ab1 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1263,14 +1263,14 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
 		is_64bit = kernel_ip(to) || any_64bit_mode(regs);
 #endif
 		insn_init(&insn, kaddr, size, is_64bit);
-		insn_get_length(&insn);
+
 		/*
 		 * Make sure there was not a problem decoding the
 		 * instruction and getting the length.  This is
 		 * doubly important because we have an infinite
 		 * loop if insn.length=0.
 		 */
-		if (!insn.length)
+		if (insn_get_length(&insn) || !insn.length)
 			break;
 
 		to += insn.length;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 07/19] perf/x86/intel/ds: Check return values of insn decoder functions
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (5 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 06/19] perf/x86/intel/ds: Check insn_get_length() retval Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 08/19] x86/alternative: Use insn_decode() Borislav Petkov
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

branch_type() doesn't need to call the full insn_decode() because it
doesn't need it in all cases thus leave the calls separate.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/events/intel/lbr.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 1aadb253d296..0b7d2cc3f001 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1224,8 +1224,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
 	is64 = kernel_ip((unsigned long)addr) || any_64bit_mode(current_pt_regs());
 #endif
 	insn_init(&insn, addr, bytes_read, is64);
-	insn_get_opcode(&insn);
-	if (!insn.opcode.got)
+	if (insn_get_opcode(&insn))
 		return X86_BR_ABORT;
 
 	switch (insn.opcode.bytes[0]) {
@@ -1262,8 +1261,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
 		ret = X86_BR_INT;
 		break;
 	case 0xe8: /* call near rel */
-		insn_get_immediate(&insn);
-		if (insn.immediate1.value == 0) {
+		if (insn_get_immediate(&insn) || insn.immediate1.value == 0) {
 			/* zero length call */
 			ret = X86_BR_ZERO_CALL;
 			break;
@@ -1279,7 +1277,9 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
 		ret = X86_BR_JMP;
 		break;
 	case 0xff: /* call near absolute, call far absolute ind */
-		insn_get_modrm(&insn);
+		if (insn_get_modrm(&insn))
+			return X86_BR_ABORT;
+
 		ext = (insn.modrm.bytes[0] >> 3) & 0x7;
 		switch (ext) {
 		case 2: /* near ind call */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 08/19] x86/alternative: Use insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (6 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 07/19] perf/x86/intel/ds: Check return values of insn decoder functions Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 09/19] x86/mce: Convert to insn_decode() Borislav Petkov
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

No functional changes, just simplification.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/alternative.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2400ad62f330..02845ea7354a 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1274,15 +1274,15 @@ static void text_poke_loc_init(struct text_poke_loc *tp, void *addr,
 			       const void *opcode, size_t len, const void *emulate)
 {
 	struct insn insn;
+	int ret;
 
 	memcpy((void *)tp->text, opcode, len);
 	if (!emulate)
 		emulate = opcode;
 
-	kernel_insn_init(&insn, emulate, MAX_INSN_SIZE);
-	insn_get_length(&insn);
+	ret = insn_decode(&insn, emulate, MAX_INSN_SIZE, INSN_MODE_KERN);
 
-	BUG_ON(!insn_complete(&insn));
+	BUG_ON(ret < 0);
 	BUG_ON(len != insn.length);
 
 	tp->rel_addr = addr - (void *)_stext;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 09/19] x86/mce: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (7 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 08/19] x86/alternative: Use insn_decode() Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 10/19] x86/kprobes: " Borislav Petkov
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mce/severity.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index 83df991314c5..a2136ced9d73 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -218,15 +218,15 @@ static struct severity {
 static bool is_copy_from_user(struct pt_regs *regs)
 {
 	u8 insn_buf[MAX_INSN_SIZE];
-	struct insn insn;
 	unsigned long addr;
+	struct insn insn;
+	int ret;
 
 	if (copy_from_kernel_nofault(insn_buf, (void *)regs->ip, MAX_INSN_SIZE))
 		return false;
 
-	kernel_insn_init(&insn, insn_buf, MAX_INSN_SIZE);
-	insn_get_opcode(&insn);
-	if (!insn.opcode.got)
+	ret = insn_decode(&insn, insn_buf, MAX_INSN_SIZE, INSN_MODE_KERN);
+	if (ret < 0)
 		return false;
 
 	switch (insn.opcode.value) {
@@ -234,10 +234,6 @@ static bool is_copy_from_user(struct pt_regs *regs)
 	case 0x8A: case 0x8B:
 	/* MOVZ mem,reg */
 	case 0xB60F: case 0xB70F:
-		insn_get_modrm(&insn);
-		insn_get_sib(&insn);
-		if (!insn.modrm.got || !insn.sib.got)
-			return false;
 		addr = (unsigned long)insn_get_addr_ref(&insn, regs);
 		break;
 	/* REP MOVS */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 10/19] x86/kprobes: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (8 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 09/19] x86/mce: Convert to insn_decode() Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-25 17:19   ` Masami Hiramatsu
  2020-11-24 10:19 ` [RFC PATCH v0 11/19] x86/sev-es: " Borislav Petkov
                   ` (9 subsequent siblings)
  19 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/kprobes/core.c | 17 +++++++++++------
 arch/x86/kernel/kprobes/opt.c  |  9 +++++++--
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 547c7abb39f5..43d4a3056d21 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -285,6 +285,8 @@ static int can_probe(unsigned long paddr)
 	/* Decode instructions */
 	addr = paddr - offset;
 	while (addr < paddr) {
+		int ret;
+
 		/*
 		 * Check if the instruction has been modified by another
 		 * kprobe, in which case we replace the breakpoint by the
@@ -296,8 +298,10 @@ static int can_probe(unsigned long paddr)
 		__addr = recover_probed_instruction(buf, addr);
 		if (!__addr)
 			return 0;
-		kernel_insn_init(&insn, (void *)__addr, MAX_INSN_SIZE);
-		insn_get_length(&insn);
+
+		ret = insn_decode(&insn, (void *)__addr, MAX_INSN_SIZE, INSN_MODE_KERN);
+		if (ret < 0)
+			return 0;
 
 		/*
 		 * Another debugging subsystem might insert this breakpoint.
@@ -340,8 +344,8 @@ static int is_IF_modifier(kprobe_opcode_t *insn)
 int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
 {
 	kprobe_opcode_t buf[MAX_INSN_SIZE];
-	unsigned long recovered_insn =
-		recover_probed_instruction(buf, (unsigned long)src);
+	unsigned long recovered_insn = recover_probed_instruction(buf, (unsigned long)src);
+	int ret;
 
 	if (!recovered_insn || !insn)
 		return 0;
@@ -351,8 +355,9 @@ int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
 			MAX_INSN_SIZE))
 		return 0;
 
-	kernel_insn_init(insn, dest, MAX_INSN_SIZE);
-	insn_get_length(insn);
+	ret = insn_decode(insn, dest, MAX_INSN_SIZE, INSN_MODE_KERN);
+	if (ret < 0)
+		return 0;
 
 	/* We can not probe force emulate prefixed instruction */
 	if (insn_has_emulate_prefix(insn))
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 041f0b50bc27..4d67571249e1 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -299,6 +299,8 @@ static int can_optimize(unsigned long paddr)
 	addr = paddr - offset;
 	while (addr < paddr - offset + size) { /* Decode until function end */
 		unsigned long recovered_insn;
+		int ret;
+
 		if (search_exception_tables(addr))
 			/*
 			 * Since some fixup code will jumps into this function,
@@ -308,8 +310,11 @@ static int can_optimize(unsigned long paddr)
 		recovered_insn = recover_probed_instruction(buf, addr);
 		if (!recovered_insn)
 			return 0;
-		kernel_insn_init(&insn, (void *)recovered_insn, MAX_INSN_SIZE);
-		insn_get_length(&insn);
+
+		ret = insn_decode(&insn, (void *)recovered_insn, MAX_INSN_SIZE, INSN_MODE_KERN);
+		if (ret < 0)
+			return 0;
+
 		/* Another subsystem puts a breakpoint */
 		if (insn.opcode.bytes[0] == INT3_INSN_OPCODE)
 			return 0;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 11/19] x86/sev-es: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (9 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 10/19] x86/kprobes: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 12/19] x86/traps: " Borislav Petkov
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/sev-es.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 37736486603e..564cc9fc693d 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -244,8 +244,7 @@ static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
 static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
 {
 	char buffer[MAX_INSN_SIZE];
-	enum es_result ret;
-	int res;
+	int res, ret;
 
 	if (user_mode(ctxt->regs)) {
 		res = insn_fetch_from_user(ctxt->regs, buffer);
@@ -267,13 +266,13 @@ static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
 			return ES_EXCEPTION;
 		}
 
-		insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE - res, 1);
-		insn_get_length(&ctxt->insn);
+		ret = insn_decode(&ctxt->insn, buffer, MAX_INSN_SIZE - res, INSN_MODE_64);
 	}
 
-	ret = ctxt->insn.immediate.got ? ES_OK : ES_DECODE_FAILED;
-
-	return ret;
+	if (ret < 0)
+		return ES_DECODE_FAILED;
+	else
+		return ES_OK;
 }
 
 static enum es_result vc_write_mem(struct es_em_ctxt *ctxt,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 12/19] x86/traps: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (10 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 11/19] x86/sev-es: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 13/19] x86/uprobes: " Borislav Petkov
                   ` (7 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/traps.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index e1b78829d909..4a06f79aaeeb 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -493,14 +493,15 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 {
 	u8 insn_buf[MAX_INSN_SIZE];
 	struct insn insn;
+	int ret;
 
 	if (copy_from_kernel_nofault(insn_buf, (void *)regs->ip,
 			MAX_INSN_SIZE))
 		return GP_NO_HINT;
 
-	kernel_insn_init(&insn, insn_buf, MAX_INSN_SIZE);
-	insn_get_modrm(&insn);
-	insn_get_sib(&insn);
+	ret = insn_decode(&insn, insn_buf, MAX_INSN_SIZE, INSN_MODE_KERN);
+	if (ret < 0)
+		return GP_NO_HINT;
 
 	*addr = (unsigned long)insn_get_addr_ref(&insn, regs);
 	if (*addr == -1UL)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 13/19] x86/uprobes: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (11 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 12/19] x86/traps: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 14/19] x86/tools/insn_decoder_test: " Borislav Petkov
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/uprobes.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 3fdaa042823d..e5759310f499 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -275,12 +275,12 @@ static bool is_prefix_bad(struct insn *insn)
 
 static int uprobe_init_insn(struct arch_uprobe *auprobe, struct insn *insn, bool x86_64)
 {
+	enum insn_mode m = x86_64 ? INSN_MODE_64 : INSN_MODE_32;
 	u32 volatile *good_insns;
+	int ret;
 
-	insn_init(insn, auprobe->insn, sizeof(auprobe->insn), x86_64);
-	/* has the side-effect of processing the entire instruction */
-	insn_get_length(insn);
-	if (!insn_complete(insn))
+	ret = insn_decode(insn, auprobe->insn, sizeof(auprobe->insn), m);
+	if (ret < 0)
 		return -ENOEXEC;
 
 	if (is_prefix_bad(insn))
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 14/19] x86/tools/insn_decoder_test: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (12 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 13/19] x86/uprobes: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 15/19] tools/objtool: " Borislav Petkov
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/tools/insn_decoder_test.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/tools/insn_decoder_test.c b/arch/x86/tools/insn_decoder_test.c
index 34eda63c124b..472540aeabc2 100644
--- a/arch/x86/tools/insn_decoder_test.c
+++ b/arch/x86/tools/insn_decoder_test.c
@@ -120,7 +120,7 @@ int main(int argc, char **argv)
 
 	while (fgets(line, BUFSIZE, stdin)) {
 		char copy[BUFSIZE], *s, *tab1, *tab2;
-		int nb = 0;
+		int nb = 0, ret;
 		unsigned int b;
 
 		if (line[0] == '<') {
@@ -148,10 +148,12 @@ int main(int argc, char **argv)
 			} else
 				break;
 		}
+
 		/* Decode an instruction */
-		insn_init(&insn, insn_buff, sizeof(insn_buff), x86_64);
-		insn_get_length(&insn);
-		if (insn.length != nb) {
+		ret = insn_decode(&insn, insn_buff, sizeof(insn_buff),
+				  x86_64 ? INSN_MODE_64 : INSN_MODE_32);
+
+		if (ret < 0 || insn.length != nb) {
 			warnings++;
 			pr_warn("Found an x86 instruction decoder bug, "
 				"please report this.\n", sym);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 15/19] tools/objtool: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (13 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 14/19] x86/tools/insn_decoder_test: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 16/19] x86/tools/insn_sanity: " Borislav Petkov
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 tools/objtool/arch/x86/decode.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index cde9c36e40ae..67ee8d2a9e5c 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -90,7 +90,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec,
 			    struct list_head *ops_list)
 {
 	struct insn insn;
-	int x86_64, sign;
+	int x86_64, sign, ret;
 	unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
 		      rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
 		      modrm_reg = 0, sib = 0;
@@ -101,10 +101,9 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec,
 	if (x86_64 == -1)
 		return -1;
 
-	insn_init(&insn, sec->data->d_buf + offset, maxlen, x86_64);
-	insn_get_length(&insn);
-
-	if (!insn_complete(&insn)) {
+	ret = insn_decode(&insn, sec->data->d_buf + offset, maxlen,
+			  x86_64 ? INSN_MODE_64 : INSN_MODE_32);
+	if (ret < 0) {
 		WARN("can't decode instruction at %s:0x%lx", sec->name, offset);
 		return -1;
 	}
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 16/19] x86/tools/insn_sanity: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (14 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 15/19] tools/objtool: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 17/19] tools/perf: " Borislav Petkov
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/tools/insn_sanity.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/tools/insn_sanity.c b/arch/x86/tools/insn_sanity.c
index 185ceba9d289..51309df285b4 100644
--- a/arch/x86/tools/insn_sanity.c
+++ b/arch/x86/tools/insn_sanity.c
@@ -222,8 +222,8 @@ static void parse_args(int argc, char **argv)
 
 int main(int argc, char **argv)
 {
+	int insns = 0, ret;
 	struct insn insn;
-	int insns = 0;
 	int errors = 0;
 	unsigned long i;
 	unsigned char insn_buff[MAX_INSN_SIZE * 2];
@@ -241,15 +241,15 @@ int main(int argc, char **argv)
 			continue;
 
 		/* Decode an instruction */
-		insn_init(&insn, insn_buff, sizeof(insn_buff), x86_64);
-		insn_get_length(&insn);
+		ret = insn_decode(&insn, insn_buff, sizeof(insn_buff),
+				  x86_64 ? INSN_MODE_64 : INSN_MODE_32);
 
 		if (insn.next_byte <= insn.kaddr ||
 		    insn.kaddr + MAX_INSN_SIZE < insn.next_byte) {
 			/* Access out-of-range memory */
 			dump_stream(stderr, "Error: Found an access violation", i, insn_buff, &insn);
 			errors++;
-		} else if (verbose && !insn_complete(&insn))
+		} else if (verbose && ret < 0)
 			dump_stream(stdout, "Info: Found an undecodable input", i, insn_buff, &insn);
 		else if (verbose >= 2)
 			dump_insn(stdout, &insn);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 17/19] tools/perf: Convert to insn_decode()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (15 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 16/19] x86/tools/insn_sanity: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 18/19] x86/insn: Remove kernel_insn_init() Borislav Petkov
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML, Arnaldo Carvalho de Melo

From: Borislav Petkov <bp@suse.de>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/x86/tests/insn-x86.c            |  9 ++++-----
 tools/perf/arch/x86/util/archinsn.c             |  9 +++++----
 .../intel-pt-decoder/intel-pt-insn-decoder.c    | 17 ++++++++++-------
 3 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/tools/perf/arch/x86/tests/insn-x86.c b/tools/perf/arch/x86/tests/insn-x86.c
index 745f29adb14b..6c7970b3acb6 100644
--- a/tools/perf/arch/x86/tests/insn-x86.c
+++ b/tools/perf/arch/x86/tests/insn-x86.c
@@ -95,13 +95,12 @@ static int get_branch(const char *branch_str)
 static int test_data_item(struct test_data *dat, int x86_64)
 {
 	struct intel_pt_insn intel_pt_insn;
+	int op, branch, ret;
 	struct insn insn;
-	int op, branch;
 
-	insn_init(&insn, dat->data, MAX_INSN_SIZE, x86_64);
-	insn_get_length(&insn);
-
-	if (!insn_complete(&insn)) {
+	ret = insn_decode(&insn, dat->data, MAX_INSN_SIZE,
+			  x86_64 ? INSN_MODE_64 : INSN_MODE_32);
+	if (ret < 0) {
 		pr_debug("Failed to decode: %s\n", dat->asm_rep);
 		return -1;
 	}
diff --git a/tools/perf/arch/x86/util/archinsn.c b/tools/perf/arch/x86/util/archinsn.c
index 3e6791531ca5..9fb12e8e67eb 100644
--- a/tools/perf/arch/x86/util/archinsn.c
+++ b/tools/perf/arch/x86/util/archinsn.c
@@ -11,7 +11,7 @@ void arch_fetch_insn(struct perf_sample *sample,
 		     struct machine *machine)
 {
 	struct insn insn;
-	int len;
+	int len, ret;
 	bool is64bit = false;
 
 	if (!sample->ip)
@@ -19,8 +19,9 @@ void arch_fetch_insn(struct perf_sample *sample,
 	len = thread__memcpy(thread, machine, sample->insn, sample->ip, sizeof(sample->insn), &is64bit);
 	if (len <= 0)
 		return;
-	insn_init(&insn, sample->insn, len, is64bit);
-	insn_get_length(&insn);
-	if (insn_complete(&insn) && insn.length <= len)
+
+	ret = insn_decode(&insn, sample->insn, len,
+			  is64bit ? INSN_MODE_64 : INSN_MODE_32);
+	if (ret >= 0 && insn.length <= len)
 		sample->insn_len = insn.length;
 }
diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
index fb8a3558d3d5..56b42545946e 100644
--- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c
@@ -158,11 +158,13 @@ int intel_pt_get_insn(const unsigned char *buf, size_t len, int x86_64,
 		      struct intel_pt_insn *intel_pt_insn)
 {
 	struct insn insn;
+	int ret;
 
-	insn_init(&insn, buf, len, x86_64);
-	insn_get_length(&insn);
-	if (!insn_complete(&insn) || insn.length > len)
+	ret = insn_decode(&insn, buf, len,
+			  x86_64 ? INSN_MODE_64 : INSN_MODE_32);
+	if (ret < 0 || insn.length > len)
 		return -1;
+
 	intel_pt_insn_decoder(&insn, intel_pt_insn);
 	if (insn.length < INTEL_PT_INSN_BUF_SZ)
 		memcpy(intel_pt_insn->buf, buf, insn.length);
@@ -183,12 +185,13 @@ const char *dump_insn(struct perf_insn *x, uint64_t ip __maybe_unused,
 		      u8 *inbuf, int inlen, int *lenp)
 {
 	struct insn insn;
-	int n, i;
+	int n, i, ret;
 	int left;
 
-	insn_init(&insn, inbuf, inlen, x->is64bit);
-	insn_get_length(&insn);
-	if (!insn_complete(&insn) || insn.length > inlen)
+	ret = insn_decode(&insn, inbuf, inlen,
+			  x->is64bit ? INSN_MODE_64 : INSN_MODE_32);
+
+	if (ret < 0 || insn.length > inlen)
 		return "<bad>";
 	if (lenp)
 		*lenp = insn.length;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 18/19] x86/insn: Remove kernel_insn_init()
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (16 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 17/19] tools/perf: " Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 10:19 ` [RFC PATCH v0 19/19] x86/insn: Make insn_complete() static Borislav Petkov
  2020-11-24 17:46 ` [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

Now that it is not needed anymore, drop it.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/insn.h       | 11 -----------
 tools/arch/x86/include/asm/insn.h | 11 -----------
 2 files changed, 22 deletions(-)

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index e80ddfe86255..ccf472ae4378 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -114,17 +114,6 @@ static inline void insn_get_attribute(struct insn *insn)
 /* Instruction uses RIP-relative addressing */
 extern int insn_rip_relative(struct insn *insn);
 
-/* Init insn for kernel text */
-static inline void kernel_insn_init(struct insn *insn,
-				    const void *kaddr, int buf_len)
-{
-#ifdef CONFIG_X86_64
-	insn_init(insn, kaddr, buf_len, 1);
-#else /* CONFIG_X86_32 */
-	insn_init(insn, kaddr, buf_len, 0);
-#endif
-}
-
 static inline int insn_is_avx(struct insn *insn)
 {
 	if (!insn->prefixes.got)
diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h
index f223bd7ec27e..6c8d6b167bea 100644
--- a/tools/arch/x86/include/asm/insn.h
+++ b/tools/arch/x86/include/asm/insn.h
@@ -114,17 +114,6 @@ static inline void insn_get_attribute(struct insn *insn)
 /* Instruction uses RIP-relative addressing */
 extern int insn_rip_relative(struct insn *insn);
 
-/* Init insn for kernel text */
-static inline void kernel_insn_init(struct insn *insn,
-				    const void *kaddr, int buf_len)
-{
-#ifdef CONFIG_X86_64
-	insn_init(insn, kaddr, buf_len, 1);
-#else /* CONFIG_X86_32 */
-	insn_init(insn, kaddr, buf_len, 0);
-#endif
-}
-
 static inline int insn_is_avx(struct insn *insn)
 {
 	if (!insn->prefixes.got)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC PATCH v0 19/19] x86/insn: Make insn_complete() static
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (17 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 18/19] x86/insn: Remove kernel_insn_init() Borislav Petkov
@ 2020-11-24 10:19 ` Borislav Petkov
  2020-11-24 17:46 ` [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
  19 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 10:19 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

From: Borislav Petkov <bp@suse.de>

... and move it above the only place it is used.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/insn.h       | 7 -------
 arch/x86/lib/insn.c               | 7 +++++++
 tools/arch/x86/include/asm/insn.h | 7 -------
 tools/arch/x86/lib/insn.c         | 7 +++++++
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index ccf472ae4378..331379889f9f 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -133,13 +133,6 @@ static inline int insn_has_emulate_prefix(struct insn *insn)
 	return !!insn->emulate_prefix_size;
 }
 
-/* Ensure this instruction is decoded completely */
-static inline int insn_complete(struct insn *insn)
-{
-	return insn->opcode.got && insn->modrm.got && insn->sib.got &&
-		insn->displacement.got && insn->immediate.got;
-}
-
 static inline insn_byte_t insn_vex_m_bits(struct insn *insn)
 {
 	if (insn->vex_prefix.nbytes == 2)	/* 2 bytes VEX */
diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 1f749e3b4a25..4dd05534fffb 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -702,6 +702,13 @@ int insn_get_length(struct insn *insn)
 	return 0;
 }
 
+/* Ensure this instruction is decoded completely */
+static inline int insn_complete(struct insn *insn)
+{
+	return insn->opcode.got && insn->modrm.got && insn->sib.got &&
+		insn->displacement.got && insn->immediate.got;
+}
+
 /**
  * insn_decode() - Decode an x86 instruction
  * @insn:	&struct insn to be initialized
diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h
index 6c8d6b167bea..5e57a4dcac42 100644
--- a/tools/arch/x86/include/asm/insn.h
+++ b/tools/arch/x86/include/asm/insn.h
@@ -133,13 +133,6 @@ static inline int insn_has_emulate_prefix(struct insn *insn)
 	return !!insn->emulate_prefix_size;
 }
 
-/* Ensure this instruction is decoded completely */
-static inline int insn_complete(struct insn *insn)
-{
-	return insn->opcode.got && insn->modrm.got && insn->sib.got &&
-		insn->displacement.got && insn->immediate.got;
-}
-
 static inline insn_byte_t insn_vex_m_bits(struct insn *insn)
 {
 	if (insn->vex_prefix.nbytes == 2)	/* 2 bytes VEX */
diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
index 41d2418302d9..bb83822b9e7b 100644
--- a/tools/arch/x86/lib/insn.c
+++ b/tools/arch/x86/lib/insn.c
@@ -702,6 +702,13 @@ int insn_get_length(struct insn *insn)
 	return 0;
 }
 
+/* Ensure this instruction is decoded completely */
+static inline int insn_complete(struct insn *insn)
+{
+	return insn->opcode.got && insn->modrm.got && insn->sib.got &&
+		insn->displacement.got && insn->immediate.got;
+}
+
 /**
  * insn_decode() - Decode an x86 instruction
  * @insn:	&struct insn to be initialized
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
                   ` (18 preceding siblings ...)
  2020-11-24 10:19 ` [RFC PATCH v0 19/19] x86/insn: Make insn_complete() static Borislav Petkov
@ 2020-11-24 17:46 ` Borislav Petkov
  2020-11-25  8:03   ` Masami Hiramatsu
  2020-11-27 17:45   ` Andy Lutomirski
  19 siblings, 2 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-24 17:46 UTC (permalink / raw)
  To: Andy Lutomirski, Masami Hiramatsu; +Cc: X86 ML, LKML

On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> In any case, at least the case where I give it
> 
> 0x48 0xcf 0x48 0x83
> 
> and say that buf size is 4, should return an error because the second
> insn is incomplete. So I need to go look at that now.

Ok, got it:

./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0
supplied buf size: 1, ret -22

the current decoder simply decodes the *first* insn in the buffer it
encounters and that's it.

When you give it a buffer of size smaller than the first instruction:

supplied buf size: 1, ret -22

while the first insn is 2 bytes long:

0x48 0xcf (IRETQ)

then it signals an error.

Andy, does that work for your use cases?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-24 17:46 ` [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
@ 2020-11-25  8:03   ` Masami Hiramatsu
  2020-11-27 17:45   ` Andy Lutomirski
  1 sibling, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-25  8:03 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, LKML

On Tue, 24 Nov 2020 18:46:47 +0100
Borislav Petkov <bp@alien8.de> wrote:

> On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> > In any case, at least the case where I give it
> > 
> > 0x48 0xcf 0x48 0x83
> > 
> > and say that buf size is 4, should return an error because the second
> > insn is incomplete. So I need to go look at that now.
> 
> Ok, got it:
> 
> ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
> insn buffer:
> 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 
> supplied buf size: 15, ret 0
> supplied buf size: 2, ret 0
> supplied buf size: 3, ret 0
> supplied buf size: 4, ret 0
> supplied buf size: 1, ret -22
> 
> the current decoder simply decodes the *first* insn in the buffer it
> encounters and that's it.

Yes, currently the buf_size is only for checking the maximum length of
the buffer, because we expect the user doesn't know the actual length of
the instruction before calling insn_get_length().
But yes, for the insn_sanity.c, the return length should be compared.

Thank you,

> 
> When you give it a buffer of size smaller than the first instruction:
> 
> supplied buf size: 1, ret -22
> 
> while the first insn is 2 bytes long:
> 
> 0x48 0xcf (IRETQ)
> 
> then it signals an error.
> 
> Andy, does that work for your use cases?
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API
  2020-11-24 10:19 ` [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API Borislav Petkov
@ 2020-11-25 16:53   ` Masami Hiramatsu
  2020-11-25 19:25     ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-25 16:53 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, LKML

Hi Borislav,

On Tue, 24 Nov 2020 11:19:36 +0100
Borislav Petkov <bp@alien8.de> wrote:

> From: Borislav Petkov <bp@suse.de>
> 
> Users of the instruction decoder should use this to decode instruction
> bytes. For that, have insn*() helpers return an int value to denote
> success/failure.
> 
> While at it, make insn_get_opcode() more stricter as to whether what has
> seen so far is a valid insn and if not.

(only from the viewpoint of VEX coding, a bit stricter, but not perfect.)


> Copy linux/kconfig.h for the tools-version of the decoder so that it can
> use IS_ENABLED().
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/include/asm/insn.h       |  24 ++--
>  arch/x86/lib/insn.c               | 182 +++++++++++++++++++++++-------
>  tools/arch/x86/include/asm/insn.h |  24 ++--
>  tools/arch/x86/lib/insn.c         | 182 +++++++++++++++++++++++-------
>  tools/include/linux/kconfig.h     |  73 ++++++++++++
>  5 files changed, 391 insertions(+), 94 deletions(-)
>  create mode 100644 tools/include/linux/kconfig.h
> 
> diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
> index 5c1ae3eff9d4..e80ddfe86255 100644
> --- a/arch/x86/include/asm/insn.h
> +++ b/arch/x86/include/asm/insn.h
> @@ -87,13 +87,23 @@ struct insn {
>  #define X86_VEX_M_MAX	0x1f			/* VEX3.M Maximum value */
>  
>  extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64);
> -extern void insn_get_prefixes(struct insn *insn);
> -extern void insn_get_opcode(struct insn *insn);
> -extern void insn_get_modrm(struct insn *insn);
> -extern void insn_get_sib(struct insn *insn);
> -extern void insn_get_displacement(struct insn *insn);
> -extern void insn_get_immediate(struct insn *insn);
> -extern void insn_get_length(struct insn *insn);
> +extern int insn_get_prefixes(struct insn *insn);
> +extern int insn_get_opcode(struct insn *insn);
> +extern int insn_get_modrm(struct insn *insn);
> +extern int insn_get_sib(struct insn *insn);
> +extern int insn_get_displacement(struct insn *insn);
> +extern int insn_get_immediate(struct insn *insn);
> +extern int insn_get_length(struct insn *insn);
> +
> +enum insn_mode {
> +	INSN_MODE_32,
> +	INSN_MODE_64,
> +	/* Mode is determined by the current kernel build. */
> +	INSN_MODE_KERN,
> +	INSN_NUM_MODES,
> +};
> +
> +extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
>  
>  /* Attribute will be determined after getting ModRM (for opcode groups) */
>  static inline void insn_get_attribute(struct insn *insn)
> diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
> index 1ba994862b56..1f749e3b4a25 100644
> --- a/arch/x86/lib/insn.c
> +++ b/arch/x86/lib/insn.c
> @@ -13,6 +13,9 @@
>  #include <asm/inat.h>
>  #include <asm/insn.h>
>  
> +#include <linux/errno.h>
> +#include <linux/kconfig.h>
> +
>  #include <asm/emulate_prefix.h>
>  
>  /* Verify next sizeof(t) bytes can be on the same instruction */
> @@ -98,8 +101,12 @@ static void insn_get_emulate_prefix(struct insn *insn)
>   * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
>   * to point to the (first) opcode.  No effect if @insn->prefixes.got
>   * is already set.
> + *
> + * * Returns:
> + * 0:  on success
> + * !0: on error
>   */

So this is different from...

[..]
> +
> +/**
> + * insn_decode() - Decode an x86 instruction
> + * @insn:	&struct insn to be initialized
> + * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
> + * @buf_len:	length of the insn buffer at @kaddr
> + * @m:		insn mode, see enum insn_mode
> + *
> + * Returns:
> + * 0: if decoding succeeded
> + * < 0: otherwise.

this return value.

Even for the insn_get_*(), I would like to see them returning -EINVAL
as same as insn_decode(). Same API group has different return value is
confusing.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 10/19] x86/kprobes: Convert to insn_decode()
  2020-11-24 10:19 ` [RFC PATCH v0 10/19] x86/kprobes: " Borislav Petkov
@ 2020-11-25 17:19   ` Masami Hiramatsu
  0 siblings, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-25 17:19 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, LKML

On Tue, 24 Nov 2020 11:19:43 +0100
Borislav Petkov <bp@alien8.de> wrote:

> From: Borislav Petkov <bp@suse.de>
> 
> Simplify code, no functional changes.

You've made a functional change. Improve decoding error check :)
Anyway, this looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!

> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/kernel/kprobes/core.c | 17 +++++++++++------
>  arch/x86/kernel/kprobes/opt.c  |  9 +++++++--
>  2 files changed, 18 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index 547c7abb39f5..43d4a3056d21 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -285,6 +285,8 @@ static int can_probe(unsigned long paddr)
>  	/* Decode instructions */
>  	addr = paddr - offset;
>  	while (addr < paddr) {
> +		int ret;
> +
>  		/*
>  		 * Check if the instruction has been modified by another
>  		 * kprobe, in which case we replace the breakpoint by the
> @@ -296,8 +298,10 @@ static int can_probe(unsigned long paddr)
>  		__addr = recover_probed_instruction(buf, addr);
>  		if (!__addr)
>  			return 0;
> -		kernel_insn_init(&insn, (void *)__addr, MAX_INSN_SIZE);
> -		insn_get_length(&insn);
> +
> +		ret = insn_decode(&insn, (void *)__addr, MAX_INSN_SIZE, INSN_MODE_KERN);
> +		if (ret < 0)
> +			return 0;
>  
>  		/*
>  		 * Another debugging subsystem might insert this breakpoint.
> @@ -340,8 +344,8 @@ static int is_IF_modifier(kprobe_opcode_t *insn)
>  int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
>  {
>  	kprobe_opcode_t buf[MAX_INSN_SIZE];
> -	unsigned long recovered_insn =
> -		recover_probed_instruction(buf, (unsigned long)src);
> +	unsigned long recovered_insn = recover_probed_instruction(buf, (unsigned long)src);
> +	int ret;
>  
>  	if (!recovered_insn || !insn)
>  		return 0;
> @@ -351,8 +355,9 @@ int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
>  			MAX_INSN_SIZE))
>  		return 0;
>  
> -	kernel_insn_init(insn, dest, MAX_INSN_SIZE);
> -	insn_get_length(insn);
> +	ret = insn_decode(insn, dest, MAX_INSN_SIZE, INSN_MODE_KERN);
> +	if (ret < 0)
> +		return 0;
>  
>  	/* We can not probe force emulate prefixed instruction */
>  	if (insn_has_emulate_prefix(insn))
> diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
> index 041f0b50bc27..4d67571249e1 100644
> --- a/arch/x86/kernel/kprobes/opt.c
> +++ b/arch/x86/kernel/kprobes/opt.c
> @@ -299,6 +299,8 @@ static int can_optimize(unsigned long paddr)
>  	addr = paddr - offset;
>  	while (addr < paddr - offset + size) { /* Decode until function end */
>  		unsigned long recovered_insn;
> +		int ret;
> +
>  		if (search_exception_tables(addr))
>  			/*
>  			 * Since some fixup code will jumps into this function,
> @@ -308,8 +310,11 @@ static int can_optimize(unsigned long paddr)
>  		recovered_insn = recover_probed_instruction(buf, addr);
>  		if (!recovered_insn)
>  			return 0;
> -		kernel_insn_init(&insn, (void *)recovered_insn, MAX_INSN_SIZE);
> -		insn_get_length(&insn);
> +
> +		ret = insn_decode(&insn, (void *)recovered_insn, MAX_INSN_SIZE, INSN_MODE_KERN);
> +		if (ret < 0)
> +			return 0;
> +
>  		/* Another subsystem puts a breakpoint */
>  		if (insn.opcode.bytes[0] == INT3_INSN_OPCODE)
>  			return 0;
> -- 
> 2.21.0
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API
  2020-11-25 16:53   ` Masami Hiramatsu
@ 2020-11-25 19:25     ` Borislav Petkov
  2020-11-26  1:37       ` Masami Hiramatsu
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2020-11-25 19:25 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Andy Lutomirski, X86 ML, LKML

On Thu, Nov 26, 2020 at 01:53:33AM +0900, Masami Hiramatsu wrote:
> (only from the viewpoint of VEX coding, a bit stricter, but not perfect.)

Yeah, I wanted to document the fact that it has changed behavior in the
commit message - we'll make it perfect later. :-)

> > @@ -98,8 +101,12 @@ static void insn_get_emulate_prefix(struct insn *insn)
> >   * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
> >   * to point to the (first) opcode.  No effect if @insn->prefixes.got
> >   * is already set.
> > + *
> > + * * Returns:
> > + * 0:  on success
> > + * !0: on error
> >   */
> 
> So this is different from...
> 
> [..]
> > +
> > +/**
> > + * insn_decode() - Decode an x86 instruction
> > + * @insn:	&struct insn to be initialized
> > + * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
> > + * @buf_len:	length of the insn buffer at @kaddr
> > + * @m:		insn mode, see enum insn_mode
> > + *
> > + * Returns:
> > + * 0: if decoding succeeded
> > + * < 0: otherwise.
> 
> this return value.
> 
> Even for the insn_get_*(), I would like to see them returning -EINVAL
> as same as insn_decode(). Same API group has different return value is
> confusing.

Right, my goal in the end here is to make *all* users of the decoder
call insn_decode() and nothing else. And there you can have different
return values so checking negative/positive is the proper way to go.

Those other helpers, though, should then become internal and for those I
think it is easier to use them when they return a boolean yes/no value,
meaning, they do one thing and one thing only:

For example, it is more readable to do:

	if (insn_...)

vs

	int ret;

	...

	ret = insn_,...()
	if (ret)
		...

which is 4 more lines of error handling and return variable, leading to
more code.

But if you want to be able to use those other helpers outside of the
decoder - for whatever reason - then sure, the function signatures
should be the same.

Thoughts?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API
  2020-11-25 19:25     ` Borislav Petkov
@ 2020-11-26  1:37       ` Masami Hiramatsu
  2020-11-26 17:50         ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-26  1:37 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, LKML

On Wed, 25 Nov 2020 20:25:53 +0100
Borislav Petkov <bp@alien8.de> wrote:

> On Thu, Nov 26, 2020 at 01:53:33AM +0900, Masami Hiramatsu wrote:
> > (only from the viewpoint of VEX coding, a bit stricter, but not perfect.)
> 
> Yeah, I wanted to document the fact that it has changed behavior in the
> commit message - we'll make it perfect later. :-)

It is just a note :) and no need to update the document.

BTW, the instruction validation depends on who needs it, because to
check the all invalid ops, we need more information in the x86-opcode-map.txt
and it will bloat up the table size and consumes more time to analysis.
(Moreover, it depends on the processor generation -- older processor will
not support VEX prefix, those are invalid)
If we can expect that the target instruction is kernel text which compiler
outputs, we may not need such instruction validation.
> 
> > > @@ -98,8 +101,12 @@ static void insn_get_emulate_prefix(struct insn *insn)
> > >   * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
> > >   * to point to the (first) opcode.  No effect if @insn->prefixes.got
> > >   * is already set.
> > > + *
> > > + * * Returns:
> > > + * 0:  on success
> > > + * !0: on error
> > >   */
> > 
> > So this is different from...
> > 
> > [..]
> > > +
> > > +/**
> > > + * insn_decode() - Decode an x86 instruction
> > > + * @insn:	&struct insn to be initialized
> > > + * @kaddr:	address (in kernel memory) of instruction (or copy thereof)
> > > + * @buf_len:	length of the insn buffer at @kaddr
> > > + * @m:		insn mode, see enum insn_mode
> > > + *
> > > + * Returns:
> > > + * 0: if decoding succeeded
> > > + * < 0: otherwise.
> > 
> > this return value.
> > 
> > Even for the insn_get_*(), I would like to see them returning -EINVAL
> > as same as insn_decode(). Same API group has different return value is
> > confusing.
> 
> Right, my goal in the end here is to make *all* users of the decoder
> call insn_decode() and nothing else. And there you can have different
> return values so checking negative/positive is the proper way to go.
> 
> Those other helpers, though, should then become internal and for those I
> think it is easier to use them when they return a boolean yes/no value,
> meaning, they do one thing and one thing only:
> 
> For example, it is more readable to do:
> 
> 	if (insn_...)
> 
> vs
> 
> 	int ret;
> 
> 	...
> 
> 	ret = insn_,...()
> 	if (ret)
> 		...
> 
> which is 4 more lines of error handling and return variable, leading to
> more code.

OK, then could you use -1 instead of 1? It may allow us to expand it
to return error code in the future.

> 
> But if you want to be able to use those other helpers outside of the
> decoder - for whatever reason - then sure, the function signatures
> should be the same.

I think insn_get_prefixes() can be used independently, because x86
perfix bytes is very complex. (I found kprobes is using its insn_attribute
directly, but that is not good. I'll fix it)

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API
  2020-11-26  1:37       ` Masami Hiramatsu
@ 2020-11-26 17:50         ` Borislav Petkov
  2020-11-27  5:54           ` Masami Hiramatsu
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2020-11-26 17:50 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Andy Lutomirski, X86 ML, LKML

On Thu, Nov 26, 2020 at 10:37:09AM +0900, Masami Hiramatsu wrote:
> BTW, the instruction validation depends on who needs it, because to
> check the all invalid ops, we need more information in the x86-opcode-map.txt
> and it will bloat up the table size and consumes more time to analysis.

Yes, the decoder is supposed to serve the kernel's needs, not be a
general purpose one.

> (Moreover, it depends on the processor generation -- older processor will
> not support VEX prefix, those are invalid)

Why does the processor VEX support matter? Isn't the decoder supposed to
decode any instruction it knows about, regardless of the CPU it runs on?

> OK, then could you use -1 instead of 1? It may allow us to expand it
> to return error code in the future.

Ok, sure.

> I think insn_get_prefixes() can be used independently, because x86
> perfix bytes is very complex.

Yah, it all depends on what API interfaces we want to give to users and
make those other helpers internal. Time and usecases will tell.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API
  2020-11-26 17:50         ` Borislav Petkov
@ 2020-11-27  5:54           ` Masami Hiramatsu
  0 siblings, 0 replies; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-27  5:54 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, LKML

On Thu, 26 Nov 2020 18:50:11 +0100
Borislav Petkov <bp@alien8.de> wrote:

> On Thu, Nov 26, 2020 at 10:37:09AM +0900, Masami Hiramatsu wrote:
> > BTW, the instruction validation depends on who needs it, because to
> > check the all invalid ops, we need more information in the x86-opcode-map.txt
> > and it will bloat up the table size and consumes more time to analysis.
> 
> Yes, the decoder is supposed to serve the kernel's needs, not be a
> general purpose one.
> 
> > (Moreover, it depends on the processor generation -- older processor will
> > not support VEX prefix, those are invalid)
> 
> Why does the processor VEX support matter? Isn't the decoder supposed to
> decode any instruction it knows about, regardless of the CPU it runs on?

Hm, you meant the "invalid" means "that can not be decoded" ?
Then it is OK. I Thought "invalid" means "the processor can not execute
(some exception will occur)".

> 
> > OK, then could you use -1 instead of 1? It may allow us to expand it
> > to return error code in the future.
> 
> Ok, sure.

Thanks!

> 
> > I think insn_get_prefixes() can be used independently, because x86
> > perfix bytes is very complex.
> 
> Yah, it all depends on what API interfaces we want to give to users and
> make those other helpers internal. Time and usecases will tell.
> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-24 17:46 ` [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
  2020-11-25  8:03   ` Masami Hiramatsu
@ 2020-11-27 17:45   ` Andy Lutomirski
  2020-11-27 22:35     ` Borislav Petkov
  2020-11-29  8:50     ` Masami Hiramatsu
  1 sibling, 2 replies; 34+ messages in thread
From: Andy Lutomirski @ 2020-11-27 17:45 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Masami Hiramatsu, X86 ML, LKML

On Tue, Nov 24, 2020 at 9:46 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> > In any case, at least the case where I give it
> >
> > 0x48 0xcf 0x48 0x83
> >
> > and say that buf size is 4, should return an error because the second
> > insn is incomplete. So I need to go look at that now.
>
> Ok, got it:
>
> ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
> insn buffer:
> 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
> supplied buf size: 15, ret 0
> supplied buf size: 2, ret 0
> supplied buf size: 3, ret 0
> supplied buf size: 4, ret 0
> supplied buf size: 1, ret -22
>
> the current decoder simply decodes the *first* insn in the buffer it
> encounters and that's it.
>
> When you give it a buffer of size smaller than the first instruction:
>
> supplied buf size: 1, ret -22
>
> while the first insn is 2 bytes long:
>
> 0x48 0xcf (IRETQ)
>
> then it signals an error.
>
> Andy, does that work for your use cases?

Is -22 (-EINVAL) the same error it returns if you pass in garbage?
How hard would it be to teach it to return a different error code when
the buffer is too small?

>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-27 17:45   ` Andy Lutomirski
@ 2020-11-27 22:35     ` Borislav Petkov
  2020-11-29  8:50     ` Masami Hiramatsu
  1 sibling, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-11-27 22:35 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Masami Hiramatsu, X86 ML, LKML

On Fri, Nov 27, 2020 at 09:45:39AM -0800, Andy Lutomirski wrote:
> Is -22 (-EINVAL) the same error it returns if you pass in garbage?

Define garbage.

Yes, if you have a sequence of bytes which you can unambiguously
determine to be

 - an invalid instruction in some of the tables

 - REX prefix with the wrong bits set

 - a byte says that some insn part like ModRM or SIB is following but
 the buffer falls short

 - ... other error condition

then maybe you can say, yes, I'm looking at garbage and can error out
right then and there. But you need to have enough bytes of information
to determine that.

For example (those are random bytes):

00000000000011ff <.asm_start>:
    11ff:       95                      xchg   %eax,%ebp
    1200:       14 60                   adc    $0x60,%al
    1202:       77 74                   ja     1278 <__libc_csu_init+0x28>
    1204:       82                      (bad)  
    1205:       67 dc 55 35             fcoml  0x35(%ebp)

The 0x82 is usually in opcode group 1 but that opcode is invalid in
64-bit mode. So if this is a 64-bit executable, you know that that is an
invalid insn.

Another example:

              18:       a0                      .byte 0xa0
              19:       17                       (bad)
              1a:       27                       (bad)
              1b:       ea                       (bad)
              1c:       90                      nop
              1d:       90                      nop
              1e:       90                      nop
              1f:       90                      nop

0xa0 is the opcode for

	MOV AL, moffset8

where moffset8 is an address-sized memory offset, which in 64-bit mode
is 64-bit. But we have only 7 bytes after the 0xa0 thus we know that the
buffer is truncated.

If it had one byte more, it would be a valid insn:

               18:       a0 17 27 ea 90 90 90    movabs 0x9090909090ea2717,%al
               20:       90 90 


I'm sure you get the idea: if you have enough unambiguous bits which
tell you that this cannot be a valid insn, then you can return early
from the decoder and signal that fact.

I'm not sure that you can do that for all possible byte combinations
and also I'm not sure that it won't ever happen that it per chance
misinterprets garbage data as valid instructions.

> How hard would it be to teach it to return a different error code when
> the buffer is too small?

Yap, see above. Unambiguous cases are clear but I don't know it would
work in all cases. For example, let's say you give it a zeroed out
buffer of 8 bytes which doesn't contain anything - you've just zeroed it
out and haven't put any insns in there yet.

But those are perfectly valid insns:

               0:       00 00                   add %al,(%rax)
               2:       00 00                   add %al,(%rax)
               4:       00 00                   add %al,(%rax)
               6:       00 00                   add %al,(%rax)

So now you go about your merry way working with those although they're
not real instructions which some tool generated.

See what I mean?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-27 17:45   ` Andy Lutomirski
  2020-11-27 22:35     ` Borislav Petkov
@ 2020-11-29  8:50     ` Masami Hiramatsu
  2020-11-30 13:44       ` Borislav Petkov
  1 sibling, 1 reply; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-29  8:50 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Borislav Petkov, Masami Hiramatsu, X86 ML, LKML

On Fri, 27 Nov 2020 09:45:39 -0800
Andy Lutomirski <luto@amacapital.net> wrote:

> On Tue, Nov 24, 2020 at 9:46 AM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> > > In any case, at least the case where I give it
> > >
> > > 0x48 0xcf 0x48 0x83
> > >
> > > and say that buf size is 4, should return an error because the second
> > > insn is incomplete. So I need to go look at that now.
> >
> > Ok, got it:
> >
> > ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
> > insn buffer:
> > 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
> > supplied buf size: 15, ret 0
> > supplied buf size: 2, ret 0
> > supplied buf size: 3, ret 0
> > supplied buf size: 4, ret 0
> > supplied buf size: 1, ret -22
> >
> > the current decoder simply decodes the *first* insn in the buffer it
> > encounters and that's it.
> >
> > When you give it a buffer of size smaller than the first instruction:
> >
> > supplied buf size: 1, ret -22
> >
> > while the first insn is 2 bytes long:
> >
> > 0x48 0xcf (IRETQ)
> >
> > then it signals an error.
> >
> > Andy, does that work for your use cases?
> 
> Is -22 (-EINVAL) the same error it returns if you pass in garbage?
> How hard would it be to teach it to return a different error code when
> the buffer is too small?
> 

Good point. I think we can return, e.g. -EFAULT if we failed in get_next(). Then, we can read out next page, for example.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-29  8:50     ` Masami Hiramatsu
@ 2020-11-30 13:44       ` Borislav Petkov
  2020-11-30 17:21         ` Masami Hiramatsu
  0 siblings, 1 reply; 34+ messages in thread
From: Borislav Petkov @ 2020-11-30 13:44 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Andy Lutomirski, X86 ML, LKML

On Sun, Nov 29, 2020 at 05:50:05PM +0900, Masami Hiramatsu wrote:
> Good point. I think we can return, e.g. -EFAULT if we failed in
> get_next(). Then, we can read out next page, for example.

Why -EFAULT?

Running this

        size = 1;
        ret = insn_decode(&insn, b, size, INSN_MODE_64)

i.e., buffer size is 1:

./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x9994a137)
insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0
insn_decode: entry
insn_decode: will get_length
insn_get_immediate: getting immediate
insn_get_displacement: getting displacement
insn_get_sib: getting sib
insn_get_modrm: entry
insn_get_opcode: entry
insn_get_prefixes: entry, prefixes->got: 0
insn_get_prefixes: 1
insn_get_prefixes: REX
insn_get_prefixes: VEX
insn_get_prefixes: validate_next: 0
insn_get_prefixes: insn->next_byte: 0x7ffec297c3e1, insn->end_kaddr: 0x7ffec297c3e1
insn_get_prefixes: errored out
supplied buf size: 1, ret -22

is caught in validate_next() where ->next_byte == ->end_kaddr.

I'm thinking we should return EOF here, to denote that we're reached the
end and then propagate that error up the callchain.

We don't have "define EOF" in the kernel but we can designate one for
the insn decoder, perhaps

#define EOF      -1

as stdio.h does:

/* The value returned by fgetc and similar functions to indicate the
   end of the file.  */
#define EOF (-1)

Hmm, but then the callers would need to know EOF too so maybe EIO or
something.

In any case, it should be a value which callers should be able to use to
get told that input buffer is truncated...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-30 13:44       ` Borislav Petkov
@ 2020-11-30 17:21         ` Masami Hiramatsu
  2020-12-02 18:04           ` Borislav Petkov
  0 siblings, 1 reply; 34+ messages in thread
From: Masami Hiramatsu @ 2020-11-30 17:21 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, X86 ML, LKML

On Mon, 30 Nov 2020 14:44:42 +0100
Borislav Petkov <bp@alien8.de> wrote:

> On Sun, Nov 29, 2020 at 05:50:05PM +0900, Masami Hiramatsu wrote:
> > Good point. I think we can return, e.g. -EFAULT if we failed in
> > get_next(). Then, we can read out next page, for example.
> 
> Why -EFAULT?

Because it overruns the buffer. Maybe -E2BIG/ENODATA or any other
error (except for -EINVAL) is OK :)
 
> Running this
> 
>         size = 1;
>         ret = insn_decode(&insn, b, size, INSN_MODE_64)
> 
> i.e., buffer size is 1:
> 
> ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x9994a137)
> insn buffer:
> 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 
> supplied buf size: 15, ret 0
> supplied buf size: 2, ret 0
> supplied buf size: 3, ret 0
> supplied buf size: 4, ret 0
> insn_decode: entry
> insn_decode: will get_length
> insn_get_immediate: getting immediate
> insn_get_displacement: getting displacement
> insn_get_sib: getting sib
> insn_get_modrm: entry
> insn_get_opcode: entry
> insn_get_prefixes: entry, prefixes->got: 0
> insn_get_prefixes: 1
> insn_get_prefixes: REX
> insn_get_prefixes: VEX
> insn_get_prefixes: validate_next: 0
> insn_get_prefixes: insn->next_byte: 0x7ffec297c3e1, insn->end_kaddr: 0x7ffec297c3e1
> insn_get_prefixes: errored out
> supplied buf size: 1, ret -22
> 
> is caught in validate_next() where ->next_byte == ->end_kaddr.
> 
> I'm thinking we should return EOF here, to denote that we're reached the
> end and then propagate that error up the callchain.

EOF means the end of file and it is not an error. Here, since the
decoder fails to decode the data, so it should return an error code.

> 
> We don't have "define EOF" in the kernel but we can designate one for
> the insn decoder, perhaps
> 
> #define EOF      -1
> 
> as stdio.h does:
> 
> /* The value returned by fgetc and similar functions to indicate the
>    end of the file.  */
> #define EOF (-1)

It is because libc fgetc() returns -1 in any error case...

> 
> Hmm, but then the callers would need to know EOF too so maybe EIO or
> something.
> 
> In any case, it should be a value which callers should be able to use to
> get told that input buffer is truncated...

Yes!

Thank you,

> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
  2020-11-30 17:21         ` Masami Hiramatsu
@ 2020-12-02 18:04           ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2020-12-02 18:04 UTC (permalink / raw)
  To: Masami Hiramatsu; +Cc: Andy Lutomirski, X86 ML, LKML

On Tue, Dec 01, 2020 at 02:21:45AM +0900, Masami Hiramatsu wrote:
> Because it overruns the buffer. Maybe -E2BIG/ENODATA or any other
> error (except for -EINVAL) is OK :)

ENODATA it is. :)

And propagating that error value is easy because the err_out: labels are
all coming from the validate_next() error path so we basically know that
the buffer has ended.

./insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x7bdfa56e)
insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0
insn_decode: entry
insn_decode: will get_length
insn_get_immediate: getting immediate
insn_get_displacement: getting displacement
insn_get_sib: getting sib
insn_get_modrm: entry
insn_get_opcode: entry
insn_get_prefixes: entry, prefixes->got: 0
insn_get_prefixes: 1
insn_get_prefixes: REX
insn_get_prefixes: VEX
insn_get_prefixes: validate_next: 0
insn_get_prefixes: insn->next_byte: 0x7ffc211eb661, insn->end_kaddr: 0x7ffc211eb661
insn_get_prefixes: errored out
supplied buf size: 1, ret -61

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2020-12-02 18:05 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-24 10:19 [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 01/19] x86/insn: Rename insn_decode() to insn_decode_regs() Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 02/19] x86/insn: Add @buf_len param to insn_init() kernel-doc comment Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 03/19] x86/insn: Add an insn_decode() API Borislav Petkov
2020-11-25 16:53   ` Masami Hiramatsu
2020-11-25 19:25     ` Borislav Petkov
2020-11-26  1:37       ` Masami Hiramatsu
2020-11-26 17:50         ` Borislav Petkov
2020-11-27  5:54           ` Masami Hiramatsu
2020-11-24 10:19 ` [RFC PATCH v0 04/19] x86/insn-eval: Handle return values from the decoder Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 05/19] x86/boot/compressed/sev-es: Convert to insn_decode() Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 06/19] perf/x86/intel/ds: Check insn_get_length() retval Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 07/19] perf/x86/intel/ds: Check return values of insn decoder functions Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 08/19] x86/alternative: Use insn_decode() Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 09/19] x86/mce: Convert to insn_decode() Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 10/19] x86/kprobes: " Borislav Petkov
2020-11-25 17:19   ` Masami Hiramatsu
2020-11-24 10:19 ` [RFC PATCH v0 11/19] x86/sev-es: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 12/19] x86/traps: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 13/19] x86/uprobes: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 14/19] x86/tools/insn_decoder_test: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 15/19] tools/objtool: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 16/19] x86/tools/insn_sanity: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 17/19] tools/perf: " Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 18/19] x86/insn: Remove kernel_insn_init() Borislav Petkov
2020-11-24 10:19 ` [RFC PATCH v0 19/19] x86/insn: Make insn_complete() static Borislav Petkov
2020-11-24 17:46 ` [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API Borislav Petkov
2020-11-25  8:03   ` Masami Hiramatsu
2020-11-27 17:45   ` Andy Lutomirski
2020-11-27 22:35     ` Borislav Petkov
2020-11-29  8:50     ` Masami Hiramatsu
2020-11-30 13:44       ` Borislav Petkov
2020-11-30 17:21         ` Masami Hiramatsu
2020-12-02 18:04           ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).