From: tip-bot for Jiri Kosina <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@kernel.org,
	jkosina@suse.cz, masami.hiramatsu.pt@hitachi.com,
	rostedt@goodmis.org, tglx@linutronix.de, hpa@linux.intel.com
Subject: [tip:x86/jumplabel] x86: Introduce int3 (breakpoint) -based instruction patching
Date: Tue, 16 Jul 2013 18:18:29 -0700
Message-ID: <tip-fd4363fff3d96795d3feb1b3fb48ce590f186bdd@git.kernel.org>
In-Reply-To: <alpine.LNX.2.00.1307121102440.29788@pobox.suse.cz>

Commit-ID:  fd4363fff3d96795d3feb1b3fb48ce590f186bdd
Gitweb:     http://git.kernel.org/tip/fd4363fff3d96795d3feb1b3fb48ce590f186bdd
Author:     Jiri Kosina <jkosina@suse.cz>
AuthorDate: Fri, 12 Jul 2013 11:21:48 +0200
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Tue, 16 Jul 2013 17:55:29 -0700

x86: Introduce int3 (breakpoint)-based instruction patching

Introduce a method for run-time instruction patching on a live SMP kernel
based on int3 breakpoint, completely avoiding the need for stop_machine().

The way this is achieved:

	- add an int3 trap to the address that will be patched
	- sync cores
	- update all but the first byte of the patched range
	- sync cores
	- replace the first byte (int3) by the first byte of
	  replacing opcode
	- sync cores

According to

	http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/01530.html

the synchronization after replacing all but the first byte should not be
necessary (on Intel hardware), as the syncing after the subsequent
patching of the first byte provides enough safety. But there is not only
Intel hardware out there, and we would rather be on the safe side.
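
The sequence above can be modelled in user space roughly as follows. This
is only a sketch under stated assumptions: poke_bp_model(),
sync_cores_model() and struct poke_trace are hypothetical names that do
not exist in the kernel, plain stores stand in for text_poke(), and the
empty sync_cores_model() stands in for on_each_cpu(do_sync_core, NULL, 1).
It records the bytes after each step so the intermediate states are
visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3_OPCODE 0xcc
#define MAX_INSN    16

/* Snapshots of the patched bytes after each of the three steps. */
struct poke_trace {
	unsigned char step[3][MAX_INSN];
};

/* Hypothetical placeholder for on_each_cpu(do_sync_core, NULL, 1). */
static void sync_cores_model(void)
{
}

/* Model of the three-step update; not the kernel implementation. */
static void poke_bp_model(unsigned char *addr, const unsigned char *opcode,
			  size_t len, struct poke_trace *trace)
{
	assert(len > 0 && len <= MAX_INSN);
	memset(trace, 0, sizeof(*trace));

	/* Step 1: put an int3 on the first byte, then sync. */
	addr[0] = INT3_OPCODE;
	sync_cores_model();
	memcpy(trace->step[0], addr, len);

	/* Step 2: update all but the first byte, then sync. */
	if (len > 1) {
		memcpy(addr + 1, opcode + 1, len - 1);
		sync_cores_model();
	}
	memcpy(trace->step[1], addr, len);

	/* Step 3: replace the int3 with the new first byte, then sync. */
	addr[0] = opcode[0];
	sync_cores_model();
	memcpy(trace->step[2], addr, len);
}
```

In this model the first byte is an int3 for as long as the tail of the
range is a mix of old and new bytes, which is the property the real
sequence relies on.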

If any CPU's instruction execution collides with the patching, it is
trapped by the int3 breakpoint and redirected to the provided "handler"
(which typically means just skipping over the patched region, acting as
if a "nop" had been there, in case we are doing nop -> jump and
jump -> nop transitions).
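
The redirect decision can be sketched in user space as below. This is a
simplified model, not the notifier from the patch: struct model_regs,
struct bp_state, bp_state_begin() and int3_notify_model() are hypothetical
names, and memory barriers are left out. As in the patch, the trap is
matched against the patched address plus one, because the instruction
pointer reported for an int3 points just past the one-byte breakpoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_NOTIFY_DONE, MODEL_NOTIFY_STOP };

/* Hypothetical, reduced stand-in for the trapped register state. */
struct model_regs {
	uintptr_t ip;
	bool user_mode;
};

/* Hypothetical stand-in for the patch's three static variables. */
struct bp_state {
	bool in_progress;
	uintptr_t int3_addr;	/* patched address + 1 */
	uintptr_t handler;	/* where a trapped CPU resumes */
};

static void bp_state_begin(struct bp_state *st, uintptr_t addr,
			   uintptr_t handler)
{
	assert(st != NULL);
	st->handler = handler;
	st->int3_addr = addr + 1;
	st->in_progress = true;
}

static int int3_notify_model(const struct bp_state *st,
			     struct model_regs *regs)
{
	/* Nothing is being patched: let other handlers look at it. */
	if (!st->in_progress)
		return MODEL_NOTIFY_DONE;

	/* Ignore user-mode traps and breakpoints at other addresses. */
	if (!regs || regs->user_mode || regs->ip != st->int3_addr)
		return MODEL_NOTIFY_DONE;

	/* Resume at the handler instead of the half-patched range. */
	regs->ip = st->handler;
	return MODEL_NOTIFY_STOP;
}
```

For a 5-byte nop -> jump transition at some address A, a caller of this
model would pass A + 5 as the handler, so a trapped CPU continues as if
the nop had been executed.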

Ftrace has been using this very technique since 08d636b ("ftrace/x86:
Have arch x86_64 use breakpoints instead of stop machine"), and jump
labels are another obvious potential user of it.

Based on work done by Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
a few years ago.

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1307121102440.29788@pobox.suse.cz
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/alternative.h |   1 +
 arch/x86/kernel/alternative.c      | 106 +++++++++++++++++++++++++++++++++++++
 kernel/kprobes.c                   |   2 +-
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 58ed6d9..3abf8dd 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -233,6 +233,7 @@ struct text_poke_param {
 };
 
 extern void *text_poke(void *addr, const void *opcode, size_t len);
+extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
 extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
 extern void text_poke_smp_batch(struct text_poke_param *params, int n);
 
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index c15cf9a..0ab4936 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -11,6 +11,7 @@
 #include <linux/memory.h>
 #include <linux/stop_machine.h>
 #include <linux/slab.h>
+#include <linux/kdebug.h>
 #include <asm/alternative.h>
 #include <asm/sections.h>
 #include <asm/pgtable.h>
@@ -596,6 +597,111 @@ void *__kprobes text_poke(void *addr, const void *opcode, size_t len)
 	return addr;
 }
 
+static void do_sync_core(void *info)
+{
+	sync_core();
+}
+
+static bool bp_patching_in_progress;
+static void *bp_int3_handler, *bp_int3_addr;
+
+static int int3_notify(struct notifier_block *self, unsigned long val, void *data)
+{
+	struct die_args *args = data;
+
+	/* bp_patching_in_progress */
+	smp_rmb();
+
+	if (likely(!bp_patching_in_progress))
+		return NOTIFY_DONE;
+
+	/* we are not interested in non-int3 faults and ring > 0 faults */
+	if (val != DIE_INT3 || !args->regs || user_mode_vm(args->regs)
+			    || args->regs->ip != (unsigned long)bp_int3_addr)
+		return NOTIFY_DONE;
+
+	/* set up the specified breakpoint handler */
+	args->regs->ip = (unsigned long) bp_int3_handler;
+
+	return NOTIFY_STOP;
+}
+/**
+ * text_poke_bp() -- update instructions on live kernel on SMP
+ * @addr:	address to patch
+ * @opcode:	opcode of new instruction
+ * @len:	length to copy
+ * @handler:	address to jump to when the temporary breakpoint is hit
+ *
+ * Modify multi-byte instruction by using int3 breakpoint on SMP.
+ * In contrast to text_poke_smp(), we completely avoid stop_machine() here,
+ * and achieve the synchronization using int3 breakpoint.
+ *
+ * The way it is done:
+ *	- add an int3 trap to the address that will be patched
+ *	- sync cores
+ *	- update all but the first byte of the patched range
+ *	- sync cores
+ *	- replace the first byte (int3) by the first byte of
+ *	  replacing opcode
+ *	- sync cores
+ *
+ * Note: must be called under text_mutex.
+ */
+void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
+{
+	unsigned char int3 = 0xcc;
+
+	bp_int3_handler = handler;
+	bp_int3_addr = (u8 *)addr + sizeof(int3);
+	bp_patching_in_progress = true;
+	/*
+	 * Corresponding read barrier in int3 notifier for
+	 * making sure the in_progress flag is correctly ordered wrt.
+	 * patching
+	 */
+	smp_wmb();
+
+	text_poke(addr, &int3, sizeof(int3));
+
+	on_each_cpu(do_sync_core, NULL, 1);
+
+	if (len - sizeof(int3) > 0) {
+		/* patch all but the first byte */
+		text_poke((char *)addr + sizeof(int3),
+			  (const char *) opcode + sizeof(int3),
+			  len - sizeof(int3));
+		/*
+		 * According to Intel, this core syncing is very likely
+		 * not necessary and we'd be safe even without it. But
+		 * better safe than sorry (plus there's not only Intel).
+		 */
+		on_each_cpu(do_sync_core, NULL, 1);
+	}
+
+	/* patch the first byte */
+	text_poke(addr, opcode, sizeof(int3));
+
+	on_each_cpu(do_sync_core, NULL, 1);
+
+	bp_patching_in_progress = false;
+	smp_wmb();
+
+	return addr;
+}
+
+/* this one needs to run before anything else handles it as a
+ * regular exception */
+static struct notifier_block int3_nb = {
+	.priority = 0x7fffffff,
+	.notifier_call = int3_notify
+};
+
+static int __init int3_init(void)
+{
+	return register_die_notifier(&int3_nb);
+}
+
+arch_initcall(int3_init);
 /*
  * Cross-modifying kernel text with stop_machine().
  * This code originally comes from immediate value.
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 6e33498..b58b490 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1709,7 +1709,7 @@ EXPORT_SYMBOL_GPL(unregister_kprobes);
 
 static struct notifier_block kprobe_exceptions_nb = {
 	.notifier_call = kprobe_exceptions_notify,
-	.priority = 0x7fffffff /* we need to be notified first */
+	.priority = 0x7ffffff0 /* High priority, but not first.  */
 };
 
 unsigned long __weak arch_deref_entry_point(void *entry)
