All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] kpatch: dynamic kernel patching
@ 2014-05-01 15:52 Josh Poimboeuf
  2014-05-01 15:52 ` [RFC PATCH 1/2] kpatch: add TAINT_KPATCH flag Josh Poimboeuf
                   ` (5 more replies)
  0 siblings, 6 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-01 15:52 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby
  Cc: linux-kernel

Hi,

Since Jiri posted the kGraft patches [1], I wanted to share an
alternative live patching solution called kpatch, which is something
we've been working on at Red Hat for quite a while.

The kernel piece of it ("kpatch core module") is completely
self-contained in a GPL module.  It compiles and works without needing
to change any kernel code, and in fact we already have it working fine
with Fedora 20 [2] without any distro kernel patches needed.  We'd
definitely like to see it (or some combination of it and kGraft) merged
into Linux.

This patch set is for the core module, which provides the kernel
infrastructure for kpatch.  It has a kpatch_register() interface which
allows kernel modules ("patch modules") to replace old functions with
new functions which are loaded with the modules.

There are also some user space tools [2] which aren't included in this
patch set, which magically generate binary patch modules from source
diffs, and manage the loading and unloading of these modules.   I didn't
include them here because I think we should agree on what the kernel
parts should look like before trying to discuss the user space tools
(and whether they should be in-tree).


kpatch vs kGraft
----------------

I think the biggest difference between kpatch and kGraft is how they
ensure that the patch is applied atomically and safely.

kpatch checks the backtraces of all tasks in stop_machine() to ensure
that no instances of the old function are running when the new function
is applied.  I think the biggest downside of this approach is that
stop_machine() has to idle all other CPUs during the patching process,
so it inserts a small amount of latency (a few ms on an idle system).

Instead, kGraft uses per-task consistency: each task either sees the old
version or the new version of the function.  This gives a consistent
view with respect to functions, but _not_ data, because the old and new
functions are allowed to run simultaneously and share data.  This could
be dangerous if a patch changes how a function uses a data structure.
The new function could make a data change that the old function wasn't
expecting.

With kpatch, that's not an issue because all the functions are patched
at the same time.  So kpatch is safer with respect to data interactions.

Other advantages of the kpatch stop_machine() approach:

- IMO, the kpatch code is much simpler than kGraft.  The implementation
  is very straightforward and is completely self-contained.  It requires
  zero changes to the kernel.
 
  (However a new TAINT_KPATCH flag would be a good idea, and we do
  anticipate some minor changes to kprobes and ftrace for better
  compatibility.)

- The use of stop_machine() will enable an important not-yet-implemented
  feature to call a user-supplied callback function at loading time
  which can be used to atomically update data structures when applying a
  patch.  I don't see how such a feature would be possible with the
  kGraft approach.

- kpatch applies patches immediately without having to send signals to
  sleeping processes, and without having to hope that those processes
  handle the signal appropriately.

- kpatch's patching behavior is more deterministic because
  stop_machine() ensures that all tasks are sleeping and interrupts are
  disabled when the patching occurs.

- kpatch already supports other cool features like:
    - removing patches and rolling back to the original functions
    - atomically replacing existing patches
    - incremental patching
    - loading multiple patch modules


TODO
----

Here are the only outstanding issues:

- A new FTRACE_OPS_FL_PERMANENT flag is needed to tell ftrace to never
  disable the handler.  Otherwise a patch could be temporarily or
  permanently removed in certain situations.

- A few kprobes compatibility issues:
    - Patching of a kprobed function doesn't take effect until the
      kprobe is removed.
    - kretprobes removes the probed function's calling function's IP
      from the stack, which could lead to a false negative in the kpatch
      backtrace safety check.

[1] http://thread.gmane.org/gmane.linux.kernel/1694304
[2] https://github.com/dynup/kpatch


Josh Poimboeuf (2):
  kpatch: add TAINT_KPATCH flag
  kpatch: add kpatch core module

 Documentation/kpatch.txt        | 193 +++++++++++++
 Documentation/oops-tracing.txt  |   3 +
 Documentation/sysctl/kernel.txt |   1 +
 MAINTAINERS                     |   9 +
 arch/Kconfig                    |  14 +
 include/linux/kernel.h          |   1 +
 include/linux/kpatch.h          |  61 ++++
 kernel/Makefile                 |   1 +
 kernel/kpatch/Makefile          |   1 +
 kernel/kpatch/kpatch.c          | 615 ++++++++++++++++++++++++++++++++++++++++
 kernel/panic.c                  |   2 +
 11 files changed, 901 insertions(+)
 create mode 100644 Documentation/kpatch.txt
 create mode 100644 include/linux/kpatch.h
 create mode 100644 kernel/kpatch/Makefile
 create mode 100644 kernel/kpatch/kpatch.c

-- 
1.9.0


^ permalink raw reply	[flat|nested] 60+ messages in thread

* [RFC PATCH 1/2] kpatch: add TAINT_KPATCH flag
  2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
@ 2014-05-01 15:52 ` Josh Poimboeuf
  2014-05-01 15:52 ` [RFC PATCH 2/2] kpatch: add kpatch core module Josh Poimboeuf
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-01 15:52 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby
  Cc: linux-kernel

Add a TAINT_KPATCH flag to be set whenever a kpatch patch module
successfully replaces a function.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Seth Jennings <sjenning@redhat.com>
---
 Documentation/oops-tracing.txt  | 3 +++
 Documentation/sysctl/kernel.txt | 1 +
 include/linux/kernel.h          | 1 +
 kernel/panic.c                  | 2 ++
 4 files changed, 7 insertions(+)

diff --git a/Documentation/oops-tracing.txt b/Documentation/oops-tracing.txt
index e315599..2ed7ed1 100644
--- a/Documentation/oops-tracing.txt
+++ b/Documentation/oops-tracing.txt
@@ -268,6 +268,9 @@ characters, each representing a particular tainted value.
  14: 'E' if an unsigned module has been loaded in a kernel supporting
      module signature.
 
+ 15: 'H' if a kpatch hot patch module has replaced any functions in the
+     running kernel.
+
 The primary reason for the 'Tainted: ' string is to tell kernel
 debuggers if this is a clean kernel or if anything unusual has
 occurred.  Tainting is permanent: even if an offending module is
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 9886c3d..7d31b1e 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -788,6 +788,7 @@ can be ORed together:
 4096 - An out-of-tree module has been loaded.
 8192 - An unsigned module has been loaded in a kernel supporting module
        signature.
+16384 - A kpatch module has replaced a function in the running kernel.
 
 ==============================================================
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 4c52907..cdfbe73 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -470,6 +470,7 @@ extern enum system_states {
 #define TAINT_FIRMWARE_WORKAROUND	11
 #define TAINT_OOT_MODULE		12
 #define TAINT_UNSIGNED_MODULE		13
+#define TAINT_KPATCH			14
 
 extern const char hex_asc[];
 #define hex_asc_lo(x)	hex_asc[((x) & 0x0f)]
diff --git a/kernel/panic.c b/kernel/panic.c
index d02fa9f..2c04a88 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -212,6 +212,7 @@ static const struct tnt tnts[] = {
 	{ TAINT_FIRMWARE_WORKAROUND,	'I', ' ' },
 	{ TAINT_OOT_MODULE,		'O', ' ' },
 	{ TAINT_UNSIGNED_MODULE,	'E', ' ' },
+	{ TAINT_KPATCH,			'H', ' ' },
 };
 
 /**
@@ -231,6 +232,7 @@ static const struct tnt tnts[] = {
  *  'I' - Working around severe firmware bug.
  *  'O' - Out-of-tree module has been loaded.
  *  'E' - Unsigned module has been loaded.
+ *  'H' - kpatch replaced a function.
  *
  *	The string is overwritten by the next call to print_tainted().
  */
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH 2/2] kpatch: add kpatch core module
  2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
  2014-05-01 15:52 ` [RFC PATCH 1/2] kpatch: add TAINT_KPATCH flag Josh Poimboeuf
@ 2014-05-01 15:52 ` Josh Poimboeuf
  2014-05-01 20:45 ` [RFC PATCH 0/2] kpatch: dynamic kernel patching Andi Kleen
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-01 15:52 UTC (permalink / raw)
  To: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby
  Cc: linux-kernel

Add the kpatch core module.  It's a self-contained module with a kernel
patching infrastructure that enables patching a running kernel without
rebooting or restarting any processes.  Kernel modules ("patch modules")
can call kpatch_register() to replace new functions with old ones.

Before applying a patch, kpatch checks the stacks of all tasks in
stop_machine() to ensure that the patch is applied atomically, to
prevent any weird effects from function interface changes or data
semantic changes that might occur between old and new versions of the
functions running simultaneously.  If any of the to-be-patched functions
are on the stack, it fails with -EBUSY.

ftrace is used to do the code modification.  For each function to be
patched, kpatch registers an ftrace ops handler.  When called, the
handler modifies regs->ip on the stack and returns back to ftrace, which
restores the RIP and "returns" to the beginning of the new function.

Other features:

* Safe unpatching

* Atomic repatching (replacing one patch module with another)

* Support for multiple patch modules

* kpatch_[register|unregister] are properly synchronized with
  kpatch_ftrace_handler() when it runs in NMI context (thanks to Masami
  for helping with this)

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
---
 Documentation/kpatch.txt | 193 +++++++++++++++
 MAINTAINERS              |   9 +
 arch/Kconfig             |  14 ++
 include/linux/kpatch.h   |  61 +++++
 kernel/Makefile          |   1 +
 kernel/kpatch/Makefile   |   1 +
 kernel/kpatch/kpatch.c   | 615 +++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 894 insertions(+)
 create mode 100644 Documentation/kpatch.txt
 create mode 100644 include/linux/kpatch.h
 create mode 100644 kernel/kpatch/Makefile
 create mode 100644 kernel/kpatch/kpatch.c

diff --git a/Documentation/kpatch.txt b/Documentation/kpatch.txt
new file mode 100644
index 0000000..697c054
--- /dev/null
+++ b/Documentation/kpatch.txt
@@ -0,0 +1,193 @@
+kpatch: dynamic kernel patching
+===============================
+
+kpatch is a Linux dynamic kernel patching infrastructure which allows you to
+patch a running kernel without rebooting or restarting any processes.  It
+enables sysadmins to apply critical security patches to the kernel immediately,
+without having to wait for long-running tasks to complete, users to log off, or
+for scheduled reboot windows.  It gives more control over uptime without
+sacrificing security or stability.
+
+How it works
+------------
+
+kpatch works at a function granularity: old functions are replaced with new
+ones.  It has four main components:
+
+- **kpatch-build**: a collection of tools which convert a source diff patch to
+  a patch module.  They work by compiling the kernel both with and without
+  the source patch, comparing the binaries, and generating a patch module
+  which includes new binary versions of the functions to be replaced.
+
+- **patch module**: a kernel module (.ko file) which includes the
+  replacement functions and metadata about the original functions.
+
+- **kpatch core module**: the kernel infrastructure which provides an interface
+  for the patch modules to register new functions for replacement.  It uses the
+  kernel ftrace subsystem to hook into the original function's mcount call
+  instruction, so that a call to the original function is redirected to the
+  replacement function.
+
+- **kpatch utility:** a command-line tool which allows a user to manage a
+  collection of patch modules.  One or more patch modules may be
+  configured to load at boot time, so that a system can remain patched
+  even after a reboot into the same version of the kernel.
+
+
+How to use it
+-------------
+
+Currently, only the core module is in the kernel tree.  The supporting
+kpatch-build and kpatch utility tools can be found at:
+
+  https://github.com/dynup/kpatch
+
+You can also find directions there for how to create binary patch modules and
+load them into your kernel.
+
+
+Limitations
+-----------
+
+- Patches which modify kernel modules are not supported (yet).  Only
+  functions in the vmlinux file (listed in System.map) can be patched.
+
+- Patches to functions which are always on the stack of at least one
+  process in the system are not supported.  Examples: schedule(),
+  sys_poll(), sys_select(), sys_read(), sys_nanosleep().  Attempting to
+  apply such a patch will cause the insmod of the patch module to return
+  an error.
+
+- Patches which modify init functions (annotated with `__init`) are not
+  supported.  kpatch-build will return an error if the patch attempts
+  to do so.
+
+- Patches which modify statically allocated data are not supported.
+  kpatch-build will detect that and return an error.  (In the future
+  we will add a facility to support it.  It will probably require the
+  user to write code which runs at patch module loading time which manually
+  updates the data.)
+
+- Patches which change the way a function interacts with dynamically
+  allocated data might be safe, or might not.  It isn't possible for
+  kpatch-build to verify the safety of this kind of patch.  It's up to
+  the user to understand what the patch does, whether the new functions
+  interact with dynamically allocated data in a different way than the
+  old functions did, and whether it would be safe to atomically apply
+  such a patch to a running kernel.
+
+
+Frequently Asked Questions
+--------------------------
+
+**Q. Isn't this just a virus/rootkit injection framework?**
+
+kpatch uses kernel modules to replace code.  It requires the `CAP_SYS_MODULE`
+capability.  If you already have that capability, then you already have the
+ability to arbitrarily modify the kernel, with or without kpatch.
+
+**Q. How can I detect if somebody has patched the kernel?**
+
+When a patch module is loaded, the TAINT_KPATCH flag is set.  To test for the
+taint flag, run:
+
+    [ `cat /proc/sys/kernel/tainted` -ge 16384 ] && echo tainted
+
+If TAINT_KPATCH is set, you'll see "tainted".
+
+**Q. Will it destabilize my system?**
+
+No, as long as the patch is chosen carefully.  See the Limitations section
+above.
+
+**Q. Why does kpatch use ftrace to jump to the replacement function instead of
+adding the jump directly?**
+
+ftrace owns the first "call mcount" instruction of every kernel function.  In
+order to keep compatibility with ftrace, we go through ftrace rather than
+updating the instruction directly.
+
+**Q Is kpatch compatible with \<insert kernel debugging subsystem here\>?**
+
+We aim to be good kernel citizens and maintain compatibility.  A hot patch
+replacement function is no different than a function loaded by any other kernel
+module.  Each replacement function has its own symbol name and kallsyms entry,
+so it looks like a normal function to the kernel.
+
+- **oops stack traces**: Yes.  If the replacement function is involved in an
+  oops, the stack trace will show the function and kernel module name of the
+  replacement function, just like any other kernel module function.  The oops
+  message will also show the taint flag (currently `TAINT_USER`).
+- **kdump/crash**: Yes.  Replacement functions are normal functions, so crash
+  will have no issues. [TODO: create patch module debuginfo symbols and crash
+  warning message]
+- **ftrace**: Yes, see previous question.
+- **systemtap/kprobes**: Some incompatibilities exist.
+  - If you setup a kprobe module at the beginning of a function before loading
+    a kpatch module, and they both affect the same function, kprobes "wins"
+    until the kprobe has been unregistered.  This is tracked in issue
+    [#47](https://github.com/dynup/kpatch/issues/47).
+  - Setting a kretprobe before loading a kpatch module could be unsafe.  See
+    issue [#67](https://github.com/dynup/kpatch/issues/67).
+- **perf**: TODO: try it out
+
+**Q. Why not use something like kexec instead?**
+
+If you want to avoid a hardware reboot, but are ok with restarting processes,
+kexec is a good alternative.
+
+**Q. If an application can't handle a reboot, it's designed wrong.**
+
+That's a good poi... [system reboots]
+
+**Q. What changes are needed in other upstream projects?**
+
+We hope to make the following changes to other projects:
+
+- kernel:
+	- ftrace improvements to close any windows that would allow a patch to
+	  be inadvertently disabled
+	- hot patch taint flag
+	- possibly the kpatch core module itself
+
+- crash:
+	- make it glaringly obvious that you're debugging a patched kernel
+	- point it to where the patch modules and corresponding debug symbols
+	  live on the file system
+
+**Q: Is it possible to register a function that gets called atomically with
+`stop_machine` when the patch module loads and unloads?**
+
+We do have plans to implement something like that.
+
+**Q. What kernels are supported?**
+
+kpatch needs gcc >= 4.6 and Linux >= 3.7 for use of the -mfentry flag.
+
+**Q. Is it possible to remove a patch?**
+
+Yes.  Just run `kpatch unload` which will disable and unload the patch module
+and restore the function to its original state.
+
+**Q. Can you apply multiple patches?**
+
+Yes.  Also, a single function can even be patched multiple times if needed.
+
+**Q. Why did kpatch-build detect a changed function that wasn't touched by the
+source patch?**
+
+There could be a variety of reasons for this, such as:
+
+- The patch changed an inline function.
+- The compiler decided to inline a changed function, resulting in the outer
+  function getting recompiled.  This is common in the case where the inner
+  function is static and is only called once.
+- The function uses a WARN() or WARN_ON() macro.  These macros embed the source
+  code line number (`__LINE__`) into an instruction.  If a function was changed
+  higher up in the file, it will affect the line numbers for all subsequent
+  WARN calls in the file, resulting in recompilation of their functions.  If
+  this happens to you, you can usually just ignore it, as patching a few extra
+  functions isn't typically a problem.  If it becomes a problem for whatever
+  reason, you can change the source patch to redefine the WARN macro for the
+  affected files, such that it hard codes the old line number instead of using
+  `__LINE__`, for example.
diff --git a/MAINTAINERS b/MAINTAINERS
index ea44a57..711dc3b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5200,6 +5200,15 @@ F:	include/linux/kmemleak.h
 F:	mm/kmemleak.c
 F:	mm/kmemleak-test.c
 
+KPATCH
+M:	Josh Poimboeuf <jpoimboe@redhat.com>
+M:	Seth Jennings <sjenning@redhat.com>
+W:	https://github.com/dynup/kpatch
+S:	Maintained
+F:	Documentation/kpatch.txt
+F:	include/linux/kpatch.h
+F:	kernel/kpatch/*
+
 KPROBES
 M:	Ananth N Mavinakayanahalli <ananth@in.ibm.com>
 M:	Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
diff --git a/arch/Kconfig b/arch/Kconfig
index 97ff872..8693aae 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -472,6 +472,20 @@ config HAVE_IRQ_EXIT_ON_IRQ_STACK
 	  This spares a stack switch and improves cache usage on softirq
 	  processing.
 
+config KPATCH
+	tristate "kpatch dynamic kernel updating support"
+	default n
+	depends on MODULES
+	depends on FTRACE
+	depends on HAVE_FENTRY
+	depends on SYSFS
+	help
+	  kpatch is a dynamic kernel patching infrastructure which allows you
+	  to dynamically update the code of a running kernel without rebooting
+	  or restarting any processes.
+
+	  If in doubt, say "N".
+
 #
 # ABI hall of shame
 #
diff --git a/include/linux/kpatch.h b/include/linux/kpatch.h
new file mode 100644
index 0000000..121e383
--- /dev/null
+++ b/include/linux/kpatch.h
@@ -0,0 +1,61 @@
+/*
+ * kpatch.h
+ *
+ * Copyright (C) 2014 Seth Jennings <sjenning@redhat.com>
+ * Copyright (C) 2013-2014 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <https://www.gnu.org/licenses/>.
+ *
+ * Contains the API for the core kpatch module used by the patch modules
+ */
+
+#ifndef _KPATCH_H_
+#define _KPATCH_H_
+
+#include <linux/types.h>
+#include <linux/module.h>
+
+enum kpatch_op {
+	KPATCH_OP_NONE,
+	KPATCH_OP_PATCH,
+	KPATCH_OP_UNPATCH,
+};
+
+struct kpatch_func {
+	/* public */
+	unsigned long new_addr;
+	unsigned long new_size;
+	unsigned long old_addr;
+	unsigned long old_size;
+
+	/* private */
+	struct hlist_node node;
+	struct kpatch_module *kpmod;
+	enum kpatch_op op;
+};
+
+struct kpatch_module {
+	struct module *mod;
+	struct kpatch_func *funcs;
+	int num_funcs;
+
+	bool enabled;
+};
+
+extern struct kobject *kpatch_patches_kobj;
+
+extern int kpatch_register(struct kpatch_module *kpmod, bool replace);
+extern int kpatch_unregister(struct kpatch_module *kpmod);
+
+#endif /* _KPATCH_H_ */
diff --git a/kernel/Makefile b/kernel/Makefile
index f2a8b62..2e56269 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -87,6 +87,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/
 obj-$(CONFIG_TRACEPOINTS) += trace/
 obj-$(CONFIG_IRQ_WORK) += irq_work.o
 obj-$(CONFIG_CPU_PM) += cpu_pm.o
+obj-$(CONFIG_KPATCH) += kpatch/
 
 obj-$(CONFIG_PERF_EVENTS) += events/
 
diff --git a/kernel/kpatch/Makefile b/kernel/kpatch/Makefile
new file mode 100644
index 0000000..800241a
--- /dev/null
+++ b/kernel/kpatch/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_KPATCH) := kpatch.o
diff --git a/kernel/kpatch/kpatch.c b/kernel/kpatch/kpatch.c
new file mode 100644
index 0000000..390e926
--- /dev/null
+++ b/kernel/kpatch/kpatch.c
@@ -0,0 +1,615 @@
+/*
+ * Copyright (C) 2014 Seth Jennings <sjenning@redhat.com>
+ * Copyright (C) 2013-2014 Josh Poimboeuf <jpoimboe@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* Contains the code for the core kpatch module.  Each patch module registers
+ * with this module to redirect old functions to new functions.
+ *
+ * Each patch module can contain one or more new functions.  This information
+ * is contained in the .patches section of the patch module.  For each function
+ * patched by the module we must:
+ * - Call stop_machine
+ * - Ensure that no execution thread is currently in the old function (or has
+ *   it in the call stack)
+ * - Add the new function address to the kpatch_funcs table
+ *
+ * After that, each call to the old function calls into kpatch_ftrace_handler()
+ * which finds the new function in the kpatch_funcs table and updates the
+ * return instruction pointer so that ftrace will return to the new function.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kpatch.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/stop_machine.h>
+#include <linux/ftrace.h>
+#include <linux/hashtable.h>
+#include <linux/preempt_mask.h>
+#include <asm/stacktrace.h>
+#include <asm/cacheflush.h>
+
+#define KPATCH_HASH_BITS 8
+static DEFINE_HASHTABLE(kpatch_func_hash, KPATCH_HASH_BITS);
+
+static DEFINE_SEMAPHORE(kpatch_mutex);
+
+static int kpatch_num_registered;
+
+static struct kobject *kpatch_root_kobj;
+struct kobject *kpatch_patches_kobj;
+EXPORT_SYMBOL_GPL(kpatch_patches_kobj);
+
+struct kpatch_backtrace_args {
+	struct kpatch_module *kpmod;
+	int ret;
+};
+
+/*
+ * The kpatch core module has a state machine which allows for proper
+ * synchronization with kpatch_ftrace_handler() when it runs in NMI context.
+ *
+ *         +-----------------------------------------------------+
+ *         |                                                     |
+ *         |                                                     +
+ *         v                                     +---> KPATCH_STATE_SUCCESS
+ * KPATCH_STATE_IDLE +---> KPATCH_STATE_UPDATING |
+ *         ^                                     +---> KPATCH_STATE_FAILURE
+ *         |                                                     +
+ *         |                                                     |
+ *         +-----------------------------------------------------+
+ *
+ * KPATCH_STATE_IDLE: No updates are pending.  The func hash is valid, and the
+ * reader doesn't need to check func->op.
+ *
+ * KPATCH_STATE_UPDATING: An update is in progress.  The reader must call
+ * kpatch_state_finish(KPATCH_STATE_FAILURE) before accessing the func hash.
+ *
+ * KPATCH_STATE_FAILURE: An update failed, and the func hash might be
+ * inconsistent (pending patched funcs might not have been removed yet).  If
+ * func->op is KPATCH_OP_PATCH, then rollback to the previous version of the
+ * func.
+ *
+ * KPATCH_STATE_SUCCESS: An update succeeded, but the func hash might be
+ * inconsistent (pending unpatched funcs might not have been removed yet).  If
+ * func->op is KPATCH_OP_UNPATCH, then rollback to the previous version of the
+ * func.
+ */
+enum {
+	KPATCH_STATE_IDLE,
+	KPATCH_STATE_UPDATING,
+	KPATCH_STATE_SUCCESS,
+	KPATCH_STATE_FAILURE,
+};
+static atomic_t kpatch_state;
+
+
+static inline void kpatch_state_idle(void)
+{
+	int state = atomic_read(&kpatch_state);
+	WARN_ON(state != KPATCH_STATE_SUCCESS && state != KPATCH_STATE_FAILURE);
+	atomic_set(&kpatch_state, KPATCH_STATE_IDLE);
+}
+
+static inline void kpatch_state_updating(void)
+{
+	WARN_ON(atomic_read(&kpatch_state) != KPATCH_STATE_IDLE);
+	atomic_set(&kpatch_state, KPATCH_STATE_UPDATING);
+}
+
+/* If state is updating, change it to success or failure and return new state */
+static inline int kpatch_state_finish(int state)
+{
+	int result;
+	WARN_ON(state != KPATCH_STATE_SUCCESS && state != KPATCH_STATE_FAILURE);
+	result = atomic_cmpxchg(&kpatch_state, KPATCH_STATE_UPDATING, state);
+	return result == KPATCH_STATE_UPDATING ? state : result;
+}
+
+static struct kpatch_func *kpatch_get_func(unsigned long ip)
+{
+	struct kpatch_func *f;
+
+	/* Here, we have to use rcu safe hlist because of NMI concurrency */
+	hash_for_each_possible_rcu(kpatch_func_hash, f, node, ip)
+		if (f->old_addr == ip)
+			return f;
+	return NULL;
+}
+
+static struct kpatch_func *kpatch_get_prev_func(struct kpatch_func *f,
+						unsigned long ip)
+{
+	hlist_for_each_entry_continue_rcu(f, node)
+		if (f->old_addr == ip)
+			return f;
+	return NULL;
+}
+
+static inline int kpatch_compare_addresses(unsigned long stack_addr,
+					   unsigned long func_addr,
+					   unsigned long func_size)
+{
+	if (stack_addr >= func_addr && stack_addr < func_addr + func_size) {
+		/* TODO: use kallsyms to print symbol name */
+		pr_err("activeness safety check failed for function at address 0x%lx\n",
+		       stack_addr);
+		return -EBUSY;
+	}
+	return 0;
+}
+
+static void kpatch_backtrace_address_verify(void *data, unsigned long address,
+				     int reliable)
+{
+	struct kpatch_backtrace_args *args = data;
+	struct kpatch_module *kpmod = args->kpmod;
+	struct kpatch_func *func;
+	int i;
+
+	if (args->ret)
+		return;
+
+	/* check kpmod funcs */
+	for (i = 0; i < kpmod->num_funcs; i++) {
+		unsigned long func_addr, func_size;
+		struct kpatch_func *active_func;
+
+		func = &kpmod->funcs[i];
+		active_func = kpatch_get_func(func->old_addr);
+		if (!active_func) {
+			/* patching an unpatched func */
+			func_addr = func->old_addr;
+			func_size = func->old_size;
+		} else {
+			/* repatching or unpatching */
+			func_addr = active_func->new_addr;
+			func_size = active_func->new_size;
+		}
+
+		args->ret = kpatch_compare_addresses(address, func_addr,
+						     func_size);
+		if (args->ret)
+			return;
+	}
+
+	/* in the replace case, need to check the func hash as well */
+	hash_for_each_rcu(kpatch_func_hash, i, func, node) {
+		if (func->op == KPATCH_OP_UNPATCH) {
+			args->ret = kpatch_compare_addresses(address,
+							     func->new_addr,
+							     func->new_size);
+			if (args->ret)
+				return;
+		}
+	}
+}
+
+static int kpatch_backtrace_stack(void *data, char *name)
+{
+	return 0;
+}
+
+static const struct stacktrace_ops kpatch_backtrace_ops = {
+	.address	= kpatch_backtrace_address_verify,
+	.stack		= kpatch_backtrace_stack,
+	.walk_stack	= print_context_stack_bp,
+};
+
+/*
+ * Verify activeness safety, i.e. that none of the to-be-patched functions are
+ * on the stack of any task.
+ *
+ * This function is called from stop_machine() context.
+ */
+static int kpatch_verify_activeness_safety(struct kpatch_module *kpmod)
+{
+	struct task_struct *g, *t;
+	int ret = 0;
+
+	struct kpatch_backtrace_args args = {
+		.kpmod = kpmod,
+		.ret = 0
+	};
+
+	/* Check the stacks of all tasks. */
+	do_each_thread(g, t) {
+		dump_trace(t, NULL, NULL, 0, &kpatch_backtrace_ops, &args);
+		if (args.ret) {
+			ret = args.ret;
+			goto out;
+		}
+	} while_each_thread(g, t);
+
+out:
+	return ret;
+}
+
+/* Called from stop_machine */
+static int kpatch_apply_patch(void *data)
+{
+	struct kpatch_module *kpmod = data;
+	struct kpatch_func *funcs = kpmod->funcs;
+	int num_funcs = kpmod->num_funcs;
+	int i, ret;
+
+	ret = kpatch_verify_activeness_safety(kpmod);
+	if (ret) {
+		kpatch_state_finish(KPATCH_STATE_FAILURE);
+		return ret;
+	}
+
+	/* tentatively add the new funcs to the global func hash */
+	for (i = 0; i < num_funcs; i++)
+		hash_add_rcu(kpatch_func_hash, &funcs[i].node,
+			     funcs[i].old_addr);
+
+	/* memory barrier between func hash add and state change */
+	smp_wmb();
+
+	/*
+	 * Check if any inconsistent NMI has happened while updating.  If not,
+	 * move to success state.
+	 */
+	ret = kpatch_state_finish(KPATCH_STATE_SUCCESS);
+	if (ret == KPATCH_STATE_FAILURE) {
+		pr_err("NMI activeness safety check failed\n");
+
+		/* Failed, we have to rollback patching process */
+		for (i = 0; i < num_funcs; i++)
+			hash_del_rcu(&funcs[i].node);
+
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+/* Called from stop_machine */
+static int kpatch_remove_patch(void *data)
+{
+	struct kpatch_module *kpmod = data;
+	struct kpatch_func *funcs = kpmod->funcs;
+	int num_funcs = kpmod->num_funcs;
+	int ret, i;
+
+	ret = kpatch_verify_activeness_safety(kpmod);
+	if (ret) {
+		kpatch_state_finish(KPATCH_STATE_FAILURE);
+		return ret;
+	}
+
+	/* Check if any inconsistent NMI has happened while updating */
+	ret = kpatch_state_finish(KPATCH_STATE_SUCCESS);
+	if (ret == KPATCH_STATE_FAILURE)
+		return -EBUSY;
+
+	/* Succeeded, remove all updating funcs from hash table */
+	for (i = 0; i < num_funcs; i++)
+		hash_del_rcu(&funcs[i].node);
+
+	return 0;
+}
+
+/*
+ * This is where the magic happens.  Update regs->ip to tell ftrace to return
+ * to the new function.
+ *
+ * If there are multiple patch modules that have registered to patch the same
+ * function, the last one to register wins, as it'll be first in the hash
+ * bucket.
+ */
+static void notrace kpatch_ftrace_handler(unsigned long ip, unsigned long parent_ip,
+				   struct ftrace_ops *fops,
+				   struct pt_regs *regs)
+{
+	struct kpatch_func *func;
+	int state;
+
+	preempt_disable_notrace();
+
+	if (likely(!in_nmi()))
+		func = kpatch_get_func(ip);
+	else {
+		/* Checking for NMI inconsistency */
+		state = kpatch_state_finish(KPATCH_STATE_FAILURE);
+
+		/* no memory reordering between state and func hash read */
+		smp_rmb();
+
+		func = kpatch_get_func(ip);
+
+		if (likely(state == KPATCH_STATE_IDLE))
+			goto done;
+
+		if (state == KPATCH_STATE_SUCCESS) {
+			/*
+			 * Patching succeeded.  If the function was being
+			 * unpatched, roll back to the previous version.
+			 */
+			if (func && func->op == KPATCH_OP_UNPATCH)
+				func = kpatch_get_prev_func(func, ip);
+		} else {
+			/*
+			 * Patching failed.  If the function was being patched,
+			 * roll back to the previous version.
+			 */
+			if (func && func->op == KPATCH_OP_PATCH)
+				func = kpatch_get_prev_func(func, ip);
+		}
+	}
+done:
+	if (func)
+		regs->ip = func->new_addr;
+
+	preempt_enable_notrace();
+}
+
+static struct ftrace_ops kpatch_ftrace_ops __read_mostly = {
+	.func = kpatch_ftrace_handler,
+	.flags = FTRACE_OPS_FL_SAVE_REGS,
+};
+
+/* Remove kpatch_funcs from ftrace filter */
+static void kpatch_remove_funcs_from_filter(struct kpatch_func *funcs,
+					    int num_funcs)
+{
+	int i, ret = 0;
+
+	for (i = 0; i < num_funcs; i++) {
+		struct kpatch_func *func = &funcs[i];
+
+		/*
+		 * If any other modules have also patched this function, don't
+		 * remove its ftrace handler.
+		 */
+		if (kpatch_get_func(func->old_addr))
+			continue;
+
+		/* Remove the ftrace handler for this function. */
+		ret = ftrace_set_filter_ip(&kpatch_ftrace_ops, func->old_addr,
+					   1, 0);
+
+		WARN(ret, "can't remove ftrace filter at address 0x%lx (rc=%d)",
+		     func->old_addr, ret);
+	}
+}
+
+int kpatch_register(struct kpatch_module *kpmod, bool replace)
+{
+	int ret, i;
+	struct kpatch_func *funcs = kpmod->funcs;
+	struct kpatch_func *func;
+	int num_funcs = kpmod->num_funcs;
+
+	if (!kpmod->mod || !funcs || !num_funcs)
+		return -EINVAL;
+
+	kpmod->enabled = false;
+
+	down(&kpatch_mutex);
+
+	if (!try_module_get(kpmod->mod)) {
+		ret = -ENODEV;
+		goto err_up;
+	}
+
+	for (i = 0; i < num_funcs; i++) {
+		func = &funcs[i];
+
+		func->op = KPATCH_OP_PATCH;
+		func->kpmod = kpmod;
+
+		/*
+		 * If any other modules have also patched this function, it
+		 * already has an ftrace handler.
+		 */
+		if (kpatch_get_func(func->old_addr))
+			continue;
+
+		/* Add an ftrace handler for this function. */
+		ret = ftrace_set_filter_ip(&kpatch_ftrace_ops, func->old_addr,
+					   0, 0);
+		if (ret) {
+			pr_err("can't set ftrace filter at address 0x%lx\n",
+			       func->old_addr);
+			num_funcs = i;
+			goto err_rollback;
+		}
+	}
+
+	/* Register the ftrace trampoline if it hasn't been done already. */
+	if (!kpatch_num_registered) {
+		ret = register_ftrace_function(&kpatch_ftrace_ops);
+		if (ret) {
+			pr_err("can't register ftrace handler\n");
+			goto err_rollback;
+		}
+	}
+	kpatch_num_registered++;
+
+	if (replace)
+		hash_for_each_rcu(kpatch_func_hash, i, func, node)
+			func->op = KPATCH_OP_UNPATCH;
+
+	/* memory barrier between func hash and state write */
+	smp_wmb();
+
+	kpatch_state_updating();
+
+	/*
+	 * Idle the CPUs, verify activeness safety, and atomically make the new
+	 * functions visible to the trampoline.
+	 */
+	ret = stop_machine(kpatch_apply_patch, kpmod, NULL);
+
+	/*
+	 * For the replace case, remove any obsolete funcs from the hash and
+	 * the ftrace filter, and disable the owning patch module so that it
+	 * can be removed.
+	 */
+	if (!ret && replace)
+		hash_for_each_rcu(kpatch_func_hash, i, func, node) {
+			if (func->op != KPATCH_OP_UNPATCH)
+				continue;
+			hash_del_rcu(&func->node);
+			kpatch_remove_funcs_from_filter(func, 1);
+			if (func->kpmod->enabled) {
+				kpatch_num_registered--;
+				func->kpmod->enabled = false;
+				pr_notice("unloaded patch module \"%s\"\n",
+					  func->kpmod->mod->name);
+				module_put(func->kpmod->mod);
+			}
+		}
+
+	/* memory barrier between func hash and state write */
+	smp_wmb();
+
+	/* NMI handlers can return to normal now */
+	kpatch_state_idle();
+
+	/*
+	 * Wait for all existing NMI handlers to complete so that they don't
+	 * see any changes to funcs or funcs->op that might occur after this
+	 * point.
+	 *
+	 * Any NMI handlers starting after this point will see the IDLE state.
+	 */
+	synchronize_rcu();
+
+	if (ret)
+		goto err_unregister;
+
+	for (i = 0; i < num_funcs; i++)
+		funcs[i].op = KPATCH_OP_NONE;
+
+	pr_notice_once("tainting kernel with TAINT_KPATCH\n");
+	add_taint(TAINT_KPATCH, LOCKDEP_STILL_OK);
+
+	pr_notice("loaded patch module \"%s\"\n", kpmod->mod->name);
+
+	kpmod->enabled = true;
+
+	up(&kpatch_mutex);
+	return 0;
+
+err_unregister:
+	if (replace)
+		hash_for_each_rcu(kpatch_func_hash, i, func, node)
+			func->op = KPATCH_OP_NONE;
+	if (kpatch_num_registered == 1) {
+		int ret2 = unregister_ftrace_function(&kpatch_ftrace_ops);
+		if (ret2) {
+			pr_err("ftrace unregister failed (%d)\n", ret2);
+			goto err_rollback;
+		}
+	}
+	kpatch_num_registered--;
+err_rollback:
+	kpatch_remove_funcs_from_filter(funcs, num_funcs);
+	module_put(kpmod->mod);
+err_up:
+	up(&kpatch_mutex);
+	return ret;
+}
+EXPORT_SYMBOL(kpatch_register);
+
+int kpatch_unregister(struct kpatch_module *kpmod)
+{
+	struct kpatch_func *funcs = kpmod->funcs;
+	int num_funcs = kpmod->num_funcs;
+	int i, ret;
+
+	WARN_ON(!kpmod->enabled);
+
+	down(&kpatch_mutex);
+
+	for (i = 0; i < num_funcs; i++)
+		funcs[i].op = KPATCH_OP_UNPATCH;
+
+	/* memory barrier between func hash and state write */
+	smp_wmb();
+
+	kpatch_state_updating();
+
+	ret = stop_machine(kpatch_remove_patch, kpmod, NULL);
+
+	/* NMI handlers can return to normal now */
+	kpatch_state_idle();
+
+	/*
+	 * Wait for all existing NMI handlers to complete so that they don't
+	 * see any changes to funcs or funcs->op that might occur after this
+	 * point.
+	 *
+	 * Any NMI handlers starting after this point will see the IDLE state.
+	 */
+	synchronize_rcu();
+
+	if (ret) {
+		for (i = 0; i < num_funcs; i++)
+			funcs[i].op = KPATCH_OP_NONE;
+		goto out;
+	}
+
+	if (kpatch_num_registered == 1) {
+		ret = unregister_ftrace_function(&kpatch_ftrace_ops);
+		if (ret)
+			WARN(1, "can't unregister ftrace handler");
+		else
+			kpatch_num_registered--;
+	}
+
+	kpatch_remove_funcs_from_filter(funcs, num_funcs);
+
+	pr_notice("unloaded patch module \"%s\"\n", kpmod->mod->name);
+
+	kpmod->enabled = false;
+	module_put(kpmod->mod);
+
+out:
+	up(&kpatch_mutex);
+	return ret;
+}
+EXPORT_SYMBOL(kpatch_unregister);
+
+static int kpatch_init(void)
+{
+	kpatch_root_kobj = kobject_create_and_add("kpatch", kernel_kobj);
+	if (!kpatch_root_kobj)
+		return -ENOMEM;
+
+	kpatch_patches_kobj = kobject_create_and_add("patches",
+						     kpatch_root_kobj);
+	if (!kpatch_patches_kobj)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void kpatch_exit(void)
+{
+	WARN_ON(kpatch_num_registered != 0);
+	kobject_put(kpatch_patches_kobj);
+	kobject_put(kpatch_root_kobj);
+}
+
+module_init(kpatch_init);
+module_exit(kpatch_exit);
+MODULE_LICENSE("GPL");
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
  2014-05-01 15:52 ` [RFC PATCH 1/2] kpatch: add TAINT_KPATCH flag Josh Poimboeuf
  2014-05-01 15:52 ` [RFC PATCH 2/2] kpatch: add kpatch core module Josh Poimboeuf
@ 2014-05-01 20:45 ` Andi Kleen
  2014-05-01 21:01   ` Josh Poimboeuf
  2014-05-02  8:37 ` Jiri Kosina
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 60+ messages in thread
From: Andi Kleen @ 2014-05-01 20:45 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

Josh Poimboeuf <jpoimboe@redhat.com> writes:
>
> kpatch checks the backtraces of all tasks in stop_machine() to ensure
> that no instances of the old function are running when the new function
> is applied.

How does that work for tail calls? 

      call foo
foo:
      ...
      jmp bar

bar:
      ... code executing ...

When you backtrace you will see foo, but you are running in bar.

Note that tail calls can be indirect, so they cannot be resolved
statically.

CONFIG_DEBUG_INFO usually disables tail calls, but not supporting 
it would seem like a large limitation, as the cost can be high.

It wouldn't surprise me if there are some similar special cases that
can even happen with them disabled.

In theory you could read LBRs, but even that may miss some extreme
cases.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 20:45 ` [RFC PATCH 0/2] kpatch: dynamic kernel patching Andi Kleen
@ 2014-05-01 21:01   ` Josh Poimboeuf
  2014-05-01 21:06     ` Andi Kleen
  0 siblings, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-01 21:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Thu, May 01, 2014 at 01:45:33PM -0700, Andi Kleen wrote:
> Josh Poimboeuf <jpoimboe@redhat.com> writes:
> >
> > kpatch checks the backtraces of all tasks in stop_machine() to ensure
> > that no instances of the old function are running when the new function
> > is applied.
> 
> How does that work for tail calls? 
> 
>       call foo
> foo:
>       ...
>       jmp bar
> 
> bar:
>       ... code executing ...
> 
> When you backtrace you will see foo, but you are running in bar.

> Note that tail calls can be indirect, so they cannot be resolved
> statically.
> 
> CONFIG_DEBUG_INFO usually disables tail calls, but not supporting 
> it would seem like a large limitation, as the cost can be high.
> 
> It wouldn't surprise me if there are some similar special cases that
> can even happen with them disabled.
> 
> In theory you could read LBRs, but even that may miss some extreme
> cases.

When bar returns, would it skip foo and go straight back to foo's
caller?  If so, then it should be safe to patch foo after it jumps to
bar.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 21:01   ` Josh Poimboeuf
@ 2014-05-01 21:06     ` Andi Kleen
  2014-05-01 21:27       ` Josh Poimboeuf
  2014-05-02  1:30       ` Masami Hiramatsu
  0 siblings, 2 replies; 60+ messages in thread
From: Andi Kleen @ 2014-05-01 21:06 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Andi Kleen, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

> When bar returns, would it skip foo and go straight back to foo's
> caller?  If so, then it should be safe to patch foo after it jumps to
> bar.

foo is no problem, you see it in the backtrace.
But you don't see bar.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 21:06     ` Andi Kleen
@ 2014-05-01 21:27       ` Josh Poimboeuf
  2014-05-01 21:39         ` Josh Poimboeuf
  2014-05-02  1:30       ` Masami Hiramatsu
  1 sibling, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-01 21:27 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Thu, May 01, 2014 at 11:06:01PM +0200, Andi Kleen wrote:
> > When bar returns, would it skip foo and go straight back to foo's
> > caller?  If so, then it should be safe to patch foo after it jumps to
> > bar.
> 
> foo is no problem, you see it in the backtrace.
> But you don't see bar.

Sorry, I missed your point the first time.  Good question.

stop_machine schedules a high priority thread on each CPU, which means
every other task will be waiting in a schedule() call (assuming a
non-preemptible kernel).  In my local kernel, a quick grep of the
disassembly doesn't show any jumps to schedule:

  $ egrep 'j.*<.*>' vmlinux.asm |grep -v '\+' |grep schedule
  ffffffff816b89b5:	e9 e2 fe ff ff       	jmpq   ffffffff816b889c <retint_with_reschedule>
  ffffffff816b8cec:	75 1e                	jne    ffffffff816b8d0c <paranoid_schedule>

But yes, that would be a problem if any tail call jumps to schedule()
ever showed up.  We may need to detect that case in our patch generation
tooling and fail to create the patch module binary if the patch affects
a function which does this.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 21:27       ` Josh Poimboeuf
@ 2014-05-01 21:39         ` Josh Poimboeuf
  0 siblings, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-01 21:39 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Thu, May 01, 2014 at 04:27:46PM -0500, Josh Poimboeuf wrote:
> On Thu, May 01, 2014 at 11:06:01PM +0200, Andi Kleen wrote:
> > > When bar returns, would it skip foo and go straight back to foo's
> > > caller?  If so, then it should be safe to patch foo after it jumps to
> > > bar.
> > 
> > foo is no problem, you see it in the backtrace.
> > But you don't see bar.
> 
> Sorry, I missed your point the first time.  Good question.
> 
> stop_machine schedules a high priority thread on each CPU, which means
> every other task will be waiting in a schedule() call (assuming a
> non-preemptible kernel).  In my local kernel, a quick grep of the
> disassembly doesn't show any jumps to schedule:
> 
>   $ egrep 'j.*<.*>' vmlinux.asm |grep -v '\+' |grep schedule
>   ffffffff816b89b5:	e9 e2 fe ff ff       	jmpq   ffffffff816b889c <retint_with_reschedule>
>   ffffffff816b8cec:	75 1e                	jne    ffffffff816b8d0c <paranoid_schedule>
> 
> But yes, that would be a problem if any tail call jumps to schedule()
> ever showed up.  We may need to detect that case in our patch generation
> tooling and fail to create the patch module binary if the patch affects
> a function which does this.

Thinking more about this... Even if it jumps to schedule(), I think
there's no problem, since the function is basically done, and we already
know not to patch schedule().

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 21:06     ` Andi Kleen
  2014-05-01 21:27       ` Josh Poimboeuf
@ 2014-05-02  1:30       ` Masami Hiramatsu
  1 sibling, 0 replies; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-02  1:30 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Josh Poimboeuf, Seth Jennings, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

(2014/05/02 6:06), Andi Kleen wrote:
>> When bar returns, would it skip foo and go straight back to foo's
>> caller?  If so, then it should be safe to patch foo after it jumps to
>> bar.
> 
> foo is no problem, you see it in the backtrace.
> But you don't see bar.

No, there is no "foo" in backtrace.
As Josh said, stop_machine() schedules highest priority tasks on
each running thread, as below.

buz:
  ...
  call foo -> push buz on the stack

foo:
 ...
 jmp bar

bar:
 ...
 call schedule -> push bar on the stack

Thus, we'll not see foo on the stack. However, since the tail call
doesn't return to foo, we can treat it already finished. So, there
is no consistency problem on it. :)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
                   ` (2 preceding siblings ...)
  2014-05-01 20:45 ` [RFC PATCH 0/2] kpatch: dynamic kernel patching Andi Kleen
@ 2014-05-02  8:37 ` Jiri Kosina
  2014-05-02 13:29   ` Josh Poimboeuf
  2014-05-02 13:10 ` Jiri Kosina
  2014-05-05  8:55 ` Ingo Molnar
  5 siblings, 1 reply; 60+ messages in thread
From: Jiri Kosina @ 2014-05-02  8:37 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Thu, 1 May 2014, Josh Poimboeuf wrote:

> Since Jiri posted the kGraft patches [1], I wanted to share an
> alternative live patching solution called kpatch, which is something
> we've been working on at Red Hat for quite a while.

Hi Josh, Seth,

thanks a lot for following up to our RFC with your submission, I am pretty 
sure this will help to energize the discussion and will provoke ideas for 
further improvements.

[ ... snip ... ]
> kpatch vs kGraft
> ----------------
> 
> I think the biggest difference between kpatch and kGraft is how they
> ensure that the patch is applied atomically and safely.
> 
> kpatch checks the backtraces of all tasks in stop_machine() to ensure
> that no instances of the old function are running when the new function
> is applied.  I think the biggest downside of this approach is that
> stop_machine() has to idle all other CPUs during the patching process,
> so it inserts a small amount of latency (a few ms on an idle system).

Completely agreed with your comparative analysis, thanks for a nice 
summary.

Additional thing that I believe is important to add here: with the 
"stop-machine / check all tasks" aproach, there might be situations where 
you'll always fail to patch the system; if there is a long-time sleeper in 
the patched callchain, such a single sleeper is enough to make the 
patching of the whole system impossible.

With the lazy/gradual aproach implemented in kGraft, the whole system is 
gradually moving towards "fully patched" state and once all the sleepers 
blocking the process wake up, it ultimately converges to the fully patched 
state.

Thanks,

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
                   ` (3 preceding siblings ...)
  2014-05-02  8:37 ` Jiri Kosina
@ 2014-05-02 13:10 ` Jiri Kosina
  2014-05-02 13:37   ` Josh Poimboeuf
  2014-05-05 23:34   ` David Lang
  2014-05-05  8:55 ` Ingo Molnar
  5 siblings, 2 replies; 60+ messages in thread
From: Jiri Kosina @ 2014-05-02 13:10 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Thu, 1 May 2014, Josh Poimboeuf wrote:

> kpatch vs kGraft
> ----------------
> 
> I think the biggest difference between kpatch and kGraft is how they
> ensure that the patch is applied atomically and safely.
> 
> kpatch checks the backtraces of all tasks in stop_machine() to ensure
> that no instances of the old function are running when the new function
> is applied.  I think the biggest downside of this approach is that
> stop_machine() has to idle all other CPUs during the patching process,
> so it inserts a small amount of latency (a few ms on an idle system).
> 
> Instead, kGraft uses per-task consistency: each task either sees the old
> version or the new version of the function.  This gives a consistent
> view with respect to functions, but _not_ data, because the old and new
> functions are allowed to run simultaneously and share data.  This could
> be dangerous if a patch changes how a function uses a data structure.
> The new function could make a data change that the old function wasn't
> expecting.

Please correct me if I am wrong, but with kPatch, you are also unable to 
do a "flip and forget" switch between functions that expect different 
format of in-memory data without performing a non-trivial all-memory 
lookup to find structures in question and perfoming corresponding 
transformations.

What we can do with kGraft si to perform the patching in two steps

(1) redirect to a temporary band-aid function that can handle both 
    semantics of the data (persumably in highly sub-optimal way)
(2) patching in (1) succeeds completely (kGraft claims victory), start a 
    new round of patching with redirect to the final function which 
    expects only the new semantics

This basically implies that both aproaches need "human inspection" in this 
respect anyway.

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-02  8:37 ` Jiri Kosina
@ 2014-05-02 13:29   ` Josh Poimboeuf
  0 siblings, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-02 13:29 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Fri, May 02, 2014 at 10:37:53AM +0200, Jiri Kosina wrote:
> On Thu, 1 May 2014, Josh Poimboeuf wrote:
> 
> > Since Jiri posted the kGraft patches [1], I wanted to share an
> > alternative live patching solution called kpatch, which is something
> > we've been working on at Red Hat for quite a while.
> 
> Hi Josh, Seth,
> 
> thanks a lot for following up to our RFC with your submission, I am pretty 
> sure this will help to energize the discussion and will provoke ideas for 
> further improvements.

Hi Jiri!

Thanks for your comments.  I'm pretty sure we have the same basic goals
and we definitely want to find something that works for everybody.

> 
> [ ... snip ... ]
> > kpatch vs kGraft
> > ----------------
> > 
> > I think the biggest difference between kpatch and kGraft is how they
> > ensure that the patch is applied atomically and safely.
> > 
> > kpatch checks the backtraces of all tasks in stop_machine() to ensure
> > that no instances of the old function are running when the new function
> > is applied.  I think the biggest downside of this approach is that
> > stop_machine() has to idle all other CPUs during the patching process,
> > so it inserts a small amount of latency (a few ms on an idle system).
> 
> Completely agreed with your comparative analysis, thanks for a nice 
> summary.
> 
> Additional thing that I believe is important to add here: with the 
> "stop-machine / check all tasks" aproach, there might be situations where 
> you'll always fail to patch the system; if there is a long-time sleeper in 
> the patched callchain, such a single sleeper is enough to make the 
> patching of the whole system impossible.

Yeah, this is true with our current implementation.  With
stop_machine(), functions which are always on the stack of at least one
process in the system can't be patched.  In practice this means
functions like schedule(), poll(), select() and nanosleep().

Here are all the top-level sleeping functions on my system:

  $ for i in /proc/*/wchan; do cat $i; echo; done | sort |uniq |egrep -v '^0$'
  devtmpfsd
  do_sigtimedwait
  do_wait
  ep_poll
  fsnotify_mark_destroy
  futex_wait_queue_me
  hrtimer_nanosleep
  hub_thread
  inotify_read
  irq_thread
  kauditd_thread
  khugepaged
  kjournald2
  ksm_scan_thread
  kswapd
  kthreadd
  n_tty_read
  pipe_wait
  poll_schedule_timeout
  rcu_gp_kthread
  rescuer_thread
  rfcomm_run
  scsi_error_handler
  smpboot_thread_fn
  unix_stream_recvmsg
  worker_thread

In addition to these functions, their call chain path down to schedule()
also can't be patched.  But it works out to be a tiny ratio of
unpatchable-to-patchable.

I think this is a very minor limitation, but if it turned out to be a
problem, we could do something like installing an ftrace handler for the
blocking function, waking the sleeping task, and looping in the ftrace
handler until the patching process is complete.

> 
> With the lazy/gradual aproach implemented in kGraft, the whole system is 
> gradually moving towards "fully patched" state and once all the sleepers 
> blocking the process wake up, it ultimately converges to the fully patched 
> state.
> 
> Thanks,
> 
> -- 
> Jiri Kosina
> SUSE Labs
> 

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-02 13:10 ` Jiri Kosina
@ 2014-05-02 13:37   ` Josh Poimboeuf
  2014-05-05 23:34   ` David Lang
  1 sibling, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-02 13:37 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Fri, May 02, 2014 at 03:10:58PM +0200, Jiri Kosina wrote:
> On Thu, 1 May 2014, Josh Poimboeuf wrote:
> 
> > kpatch vs kGraft
> > ----------------
> > 
> > I think the biggest difference between kpatch and kGraft is how they
> > ensure that the patch is applied atomically and safely.
> > 
> > kpatch checks the backtraces of all tasks in stop_machine() to ensure
> > that no instances of the old function are running when the new function
> > is applied.  I think the biggest downside of this approach is that
> > stop_machine() has to idle all other CPUs during the patching process,
> > so it inserts a small amount of latency (a few ms on an idle system).
> > 
> > Instead, kGraft uses per-task consistency: each task either sees the old
> > version or the new version of the function.  This gives a consistent
> > view with respect to functions, but _not_ data, because the old and new
> > functions are allowed to run simultaneously and share data.  This could
> > be dangerous if a patch changes how a function uses a data structure.
> > The new function could make a data change that the old function wasn't
> > expecting.
> 
> Please correct me if I am wrong, but with kPatch, you are also unable to 
> do a "flip and forget" switch between functions that expect different 
> format of in-memory data without performing a non-trivial all-memory 
> lookup to find structures in question and perfoming corresponding 
> transformations.
> 
> What we can do with kGraft si to perform the patching in two steps
> 
> (1) redirect to a temporary band-aid function that can handle both 
>     semantics of the data (persumably in highly sub-optimal way)
> (2) patching in (1) succeeds completely (kGraft claims victory), start a 
>     new round of patching with redirect to the final function which 
>     expects only the new semantics
> 
> This basically implies that both aproaches need "human inspection" in this 
> respect anyway.

Ah, interesting.  The intermediate function which knows how to handle
both versions of the data could get pretty tricky though.

With kpatch we'd just need to allow the user to install a callback
function which runs in stop_machine() and manually updates all the data
structures.  I think we'll be implementing something like this soon.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
                   ` (4 preceding siblings ...)
  2014-05-02 13:10 ` Jiri Kosina
@ 2014-05-05  8:55 ` Ingo Molnar
  2014-05-05 13:26   ` Josh Poimboeuf
  5 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2014-05-05  8:55 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> [...]
> 
> kpatch checks the backtraces of all tasks in stop_machine() to 
> ensure that no instances of the old function are running when the 
> new function is applied.  I think the biggest downside of this 
> approach is that stop_machine() has to idle all other CPUs during 
> the patching process, so it inserts a small amount of latency (a few 
> ms on an idle system).

When live patching the kernel, how about achieving an even 'cleaner' 
state for all tasks in the system: to freeze all tasks, as the suspend 
and hibernation code (and kexec) does, via freeze_processes()?

That means no tasks in the system have any real kernel execution 
state, and there's also no problem with long-sleeping tasks, as 
freeze_processes() is supposed to be fast as well.

I.e. go for the most conservative live patching state first, and relax 
it only once the initial model is upstream and is working robustly.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05  8:55 ` Ingo Molnar
@ 2014-05-05 13:26   ` Josh Poimboeuf
  2014-05-05 14:10     ` Frederic Weisbecker
  0 siblings, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-05 13:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

On Mon, May 05, 2014 at 10:55:37AM +0200, Ingo Molnar wrote:
> 
> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > [...]
> > 
> > kpatch checks the backtraces of all tasks in stop_machine() to 
> > ensure that no instances of the old function are running when the 
> > new function is applied.  I think the biggest downside of this 
> > approach is that stop_machine() has to idle all other CPUs during 
> > the patching process, so it inserts a small amount of latency (a few 
> > ms on an idle system).
> 
> When live patching the kernel, how about achieving an even 'cleaner' 
> state for all tasks in the system: to freeze all tasks, as the suspend 
> and hibernation code (and kexec) does, via freeze_processes()?
> 
> That means no tasks in the system have any real kernel execution 
> state, and there's also no problem with long-sleeping tasks, as 
> freeze_processes() is supposed to be fast as well.
> 
> I.e. go for the most conservative live patching state first, and relax 
> it only once the initial model is upstream and is working robustly.

I had considered doing this before, but the problem I found is that many
kernel threads are unfreezable.  So we wouldn't be able to check whether
its safe to replace any functions in use by those kernel threads.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 13:26   ` Josh Poimboeuf
@ 2014-05-05 14:10     ` Frederic Weisbecker
  2014-05-05 18:43       ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Frederic Weisbecker @ 2014-05-05 14:10 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

On Mon, May 05, 2014 at 08:26:38AM -0500, Josh Poimboeuf wrote:
> On Mon, May 05, 2014 at 10:55:37AM +0200, Ingo Molnar wrote:
> > 
> > * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > 
> > > [...]
> > > 
> > > kpatch checks the backtraces of all tasks in stop_machine() to 
> > > ensure that no instances of the old function are running when the 
> > > new function is applied.  I think the biggest downside of this 
> > > approach is that stop_machine() has to idle all other CPUs during 
> > > the patching process, so it inserts a small amount of latency (a few 
> > > ms on an idle system).
> > 
> > When live patching the kernel, how about achieving an even 'cleaner' 
> > state for all tasks in the system: to freeze all tasks, as the suspend 
> > and hibernation code (and kexec) does, via freeze_processes()?
> > 
> > That means no tasks in the system have any real kernel execution 
> > state, and there's also no problem with long-sleeping tasks, as 
> > freeze_processes() is supposed to be fast as well.
> > 
> > I.e. go for the most conservative live patching state first, and relax 
> > it only once the initial model is upstream and is working robustly.
> 
> I had considered doing this before, but the problem I found is that many
> kernel threads are unfreezable.  So we wouldn't be able to check whether
> its safe to replace any functions in use by those kernel threads.

OTOH many kernel threads are parkable. Which achieves kind of similar desired
behaviour: the kernel threads then aren't running.

And in fact we could implement freezing on top of park for kthreads.

But unfortunately there are still quite some of them which don't support parking.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 14:10     ` Frederic Weisbecker
@ 2014-05-05 18:43       ` Ingo Molnar
  2014-05-05 21:49         ` Frederic Weisbecker
  2014-05-06 11:45         ` Masami Hiramatsu
  0 siblings, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2014-05-05 18:43 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Mon, May 05, 2014 at 08:26:38AM -0500, Josh Poimboeuf wrote:
> > On Mon, May 05, 2014 at 10:55:37AM +0200, Ingo Molnar wrote:
> > > 
> > > * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > 
> > > > [...]
> > > > 
> > > > kpatch checks the backtraces of all tasks in stop_machine() to 
> > > > ensure that no instances of the old function are running when the 
> > > > new function is applied.  I think the biggest downside of this 
> > > > approach is that stop_machine() has to idle all other CPUs during 
> > > > the patching process, so it inserts a small amount of latency (a few 
> > > > ms on an idle system).
> > > 
> > > When live patching the kernel, how about achieving an even 'cleaner' 
> > > state for all tasks in the system: to freeze all tasks, as the suspend 
> > > and hibernation code (and kexec) does, via freeze_processes()?
> > > 
> > > That means no tasks in the system have any real kernel execution 
> > > state, and there's also no problem with long-sleeping tasks, as 
> > > freeze_processes() is supposed to be fast as well.
> > > 
> > > I.e. go for the most conservative live patching state first, and relax 
> > > it only once the initial model is upstream and is working robustly.
> > 
> > I had considered doing this before, but the problem I found is 
> > that many kernel threads are unfreezable.  So we wouldn't be able 
> > to check whether its safe to replace any functions in use by those 
> > kernel threads.
> 
> OTOH many kernel threads are parkable. Which achieves kind of 
> similar desired behaviour: the kernel threads then aren't running.
> 
> And in fact we could implement freezing on top of park for kthreads.
> 
> But unfortunately there are still quite some of them which don't 
> support parking.

Well, if distros are moving towards live patching (and they are!), 
then it looks rather necessary to me that something scary as flipping 
out live kernel instructions with substantially different code should 
be as safe as possible, and only then fast.

If a kernel refuses to patch with certain threads running, that will 
drive those kernel threads being fixed and such. It's a deterministic, 
recoverable, reportable bug situation, so fixing it should be fast.

We learned these robustness lessons the hard way with kprobes and 
ftrace dynamic code patching... which are utterly simple compared to 
live kernel patching!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 18:43       ` Ingo Molnar
@ 2014-05-05 21:49         ` Frederic Weisbecker
  2014-05-06 12:12           ` Josh Poimboeuf
  2014-05-06 11:45         ` Masami Hiramatsu
  1 sibling, 1 reply; 60+ messages in thread
From: Frederic Weisbecker @ 2014-05-05 21:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

On Mon, May 05, 2014 at 08:43:04PM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Mon, May 05, 2014 at 08:26:38AM -0500, Josh Poimboeuf wrote:
> > > On Mon, May 05, 2014 at 10:55:37AM +0200, Ingo Molnar wrote:
> > > > 
> > > > * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > > > 
> > > > > [...]
> > > > > 
> > > > > kpatch checks the backtraces of all tasks in stop_machine() to 
> > > > > ensure that no instances of the old function are running when the 
> > > > > new function is applied.  I think the biggest downside of this 
> > > > > approach is that stop_machine() has to idle all other CPUs during 
> > > > > the patching process, so it inserts a small amount of latency (a few 
> > > > > ms on an idle system).
> > > > 
> > > > When live patching the kernel, how about achieving an even 'cleaner' 
> > > > state for all tasks in the system: to freeze all tasks, as the suspend 
> > > > and hibernation code (and kexec) does, via freeze_processes()?
> > > > 
> > > > That means no tasks in the system have any real kernel execution 
> > > > state, and there's also no problem with long-sleeping tasks, as 
> > > > freeze_processes() is supposed to be fast as well.
> > > > 
> > > > I.e. go for the most conservative live patching state first, and relax 
> > > > it only once the initial model is upstream and is working robustly.
> > > 
> > > I had considered doing this before, but the problem I found is 
> > > that many kernel threads are unfreezable.  So we wouldn't be able 
> > > to check whether its safe to replace any functions in use by those 
> > > kernel threads.
> > 
> > OTOH many kernel threads are parkable. Which achieves kind of 
> > similar desired behaviour: the kernel threads then aren't running.
> > 
> > And in fact we could implement freezing on top of park for kthreads.
> > 
> > But unfortunately there are still quite some of them which don't 
> > support parking.
> 
> Well, if distros are moving towards live patching (and they are!), 
> then it looks rather necessary to me that something scary as flipping 
> out live kernel instructions with substantially different code should 
> be as safe as possible, and only then fast.

Sure I won't argue that.

> 
> If a kernel refuses to patch with certain threads running, that will 
> drive those kernel threads being fixed and such. It's a deterministic, 
> recoverable, reportable bug situation, so fixing it should be fast.
> 
> We learned these robustness lessons the hard way with kprobes and 
> ftrace dynamic code patching... which are utterly simple compared to 
> live kernel patching!

Yeah, agreed. More rationale behind: we want to put the kthreads into
semantic sleeps, not just random sleeping point. This way we lower the
chances to execute new code messing up living state that is expecting old
code after random preemption or sleeping points.

But by semantic sleeps I mean more than just explicit calls to schedule()
as opposed to preemption points.
It also implies shutting down as well the living states handled by the kthread
such that some sort of re-initialization of the state is also needed when
the kthread gets back to run.

And that's exactly what good implementations of kthread park provide.

Consider kernel/watchdog.c as an example: when we park the lockup
detector kthread, it disables the perf event and the hrtimer before it goes
to actually park and sleep. When the kthread is later unparked, the kthread
restarts the hrtimer and the perf event.

If we live patch code that has obscure relations with perf or hrtimer here,
we lower a lot the chances for a crash when the watchdog kthread is parked.

So I'm in favour of parking all possible kthreads before live patching. Freezing
alone doesn't provide the same state shutdown than parking.

Now since parking looks more widely implemented than kthread freezing, we could
even think about implementing kthread freezing using parking as backend.


> 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-02 13:10 ` Jiri Kosina
  2014-05-02 13:37   ` Josh Poimboeuf
@ 2014-05-05 23:34   ` David Lang
  2014-05-05 23:52     ` Jiri Kosina
  1 sibling, 1 reply; 60+ messages in thread
From: David Lang @ 2014-05-05 23:34 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Fri, 2 May 2014, Jiri Kosina wrote:

> On Thu, 1 May 2014, Josh Poimboeuf wrote:
>
>> kpatch vs kGraft
>> ----------------
>>
>> I think the biggest difference between kpatch and kGraft is how they
>> ensure that the patch is applied atomically and safely.
>>
>> kpatch checks the backtraces of all tasks in stop_machine() to ensure
>> that no instances of the old function are running when the new function
>> is applied.  I think the biggest downside of this approach is that
>> stop_machine() has to idle all other CPUs during the patching process,
>> so it inserts a small amount of latency (a few ms on an idle system).
>>
>> Instead, kGraft uses per-task consistency: each task either sees the old
>> version or the new version of the function.  This gives a consistent
>> view with respect to functions, but _not_ data, because the old and new
>> functions are allowed to run simultaneously and share data.  This could
>> be dangerous if a patch changes how a function uses a data structure.
>> The new function could make a data change that the old function wasn't
>> expecting.
>
> Please correct me if I am wrong, but with kPatch, you are also unable to
> do a "flip and forget" switch between functions that expect different
> format of in-memory data without performing a non-trivial all-memory
> lookup to find structures in question and perfoming corresponding
> transformations.
>
> What we can do with kGraft si to perform the patching in two steps
>
> (1) redirect to a temporary band-aid function that can handle both
>    semantics of the data (persumably in highly sub-optimal way)
> (2) patching in (1) succeeds completely (kGraft claims victory), start a
>    new round of patching with redirect to the final function which
>    expects only the new semantics
>
> This basically implies that both aproaches need "human inspection" in this
> respect anyway.

how would you know that all instances of the datastructure in memory have been 
touched? just because all tasks have run and are outside the function in 
question doesn't tell you data structures have been converted. You have no way 
of knowing when (or if) the next call to the modified function will take place 
on any potential in-memory structure.

David Lang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 23:34   ` David Lang
@ 2014-05-05 23:52     ` Jiri Kosina
  2014-05-06  1:59       ` David Lang
  2014-05-06  7:32       ` Ingo Molnar
  0 siblings, 2 replies; 60+ messages in thread
From: Jiri Kosina @ 2014-05-05 23:52 UTC (permalink / raw)
  To: David Lang
  Cc: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Mon, 5 May 2014, David Lang wrote:

> how would you know that all instances of the datastructure in memory 
> have= been touched? just because all tasks have run and are outside the 
> function in question doesn't tell you data structures have been 
> converted. You have n= o way of knowing when (or if) the next call to 
> the modified function will take place on any potential in-memory 
> structure.

The problem you are trying to avoid here is functions expecting to read 
"v2" format of the data from memory, while there are still tasks that are 
unpredictably writing "v1" format of the data to the memory.

There are several ways to attack this problem:

- stop the whole system, convert all the existing data structures to new 
  format (which might potentially be non-trivial, mostly because you 
  have to *know* where all the data structures have been allocated), apply 
  patch, resume operation [ksplice, probably kpatch in future]
- restrict the data format to be backwards compatible [to be done 
  manually during patch creation, currently what kGraft needs to do in 
  such case]
- have a proxy code which can read both "v1" and "v2" formats, and writes 
  back in the same format it has seen the data structure on input
- once all the *code* has been converted, it still has to understand "v1" 
  and "v2", but it can now start writing out "v2" format only [possible 
  with kGraft, not implemented in automated fashion]

Ideas are of course more than welcome.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 23:52     ` Jiri Kosina
@ 2014-05-06  1:59       ` David Lang
  2014-05-06 12:17         ` Josh Poimboeuf
  2014-05-06  7:32       ` Ingo Molnar
  1 sibling, 1 reply; 60+ messages in thread
From: David Lang @ 2014-05-06  1:59 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Josh Poimboeuf, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Tue, 6 May 2014, Jiri Kosina wrote:

> On Mon, 5 May 2014, David Lang wrote:
>
>> how would you know that all instances of the datastructure in memory
>> have= been touched? just because all tasks have run and are outside the
>> function in question doesn't tell you data structures have been
>> converted. You have n= o way of knowing when (or if) the next call to
>> the modified function will take place on any potential in-memory
>> structure.
>
> The problem you are trying to avoid here is functions expecting to read
> "v2" format of the data from memory, while there are still tasks that are
> unpredictably writing "v1" format of the data to the memory.
>
> There are several ways to attack this problem:
>
> - stop the whole system, convert all the existing data structures to new
>  format (which might potentially be non-trivial, mostly because you
>  have to *know* where all the data structures have been allocated), apply
>  patch, resume operation [ksplice, probably kpatch in future]
> - restrict the data format to be backwards compatible [to be done
>  manually during patch creation, currently what kGraft needs to do in
>  such case]
> - have a proxy code which can read both "v1" and "v2" formats, and writes
>  back in the same format it has seen the data structure on input

doesn't this have the same problem of finding all the data?

> - once all the *code* has been converted, it still has to understand "v1"
>  and "v2", but it can now start writing out "v2" format only [possible
>  with kGraft, not implemented in automated fashion]

which is a varient of the second one in that all the data needs to be tagged 
with a version so that it can be converted.

I don't see that 'stop the world' ends up being much better for this.

it does avoid any possibility of v1 code reading v2 data, but it doesn't help in 
avoiding v2 code reading v1 data some time (potentially much) later.

David Lang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 23:52     ` Jiri Kosina
  2014-05-06  1:59       ` David Lang
@ 2014-05-06  7:32       ` Ingo Molnar
  2014-05-06  8:03         ` Jiri Kosina
  2014-05-06 12:23         ` Josh Poimboeuf
  1 sibling, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2014-05-06  7:32 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: David Lang, Josh Poimboeuf, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel


* Jiri Kosina <jkosina@suse.cz> wrote:

> On Mon, 5 May 2014, David Lang wrote:
> 
> > how would you know that all instances of the datastructure in memory 
> > have= been touched? just because all tasks have run and are outside the 
> > function in question doesn't tell you data structures have been 
> > converted. You have n= o way of knowing when (or if) the next call to 
> > the modified function will take place on any potential in-memory 
> > structure.
> 
> The problem you are trying to avoid here is functions expecting to read 
> "v2" format of the data from memory, while there are still tasks that are 
> unpredictably writing "v1" format of the data to the memory.
> 
> There are several ways to attack this problem:
> 
> - stop the whole system, convert all the existing data structures to new 
>   format (which might potentially be non-trivial, mostly because you 
>   have to *know* where all the data structures have been allocated), apply 
>   patch, resume operation [ksplice, probably kpatch in future]
> - restrict the data format to be backwards compatible [to be done 
>   manually during patch creation, currently what kGraft needs to do in 
>   such case]
> - have a proxy code which can read both "v1" and "v2" formats, and writes 
>   back in the same format it has seen the data structure on input
> - once all the *code* has been converted, it still has to understand "v1" 
>   and "v2", but it can now start writing out "v2" format only [possible 
>   with kGraft, not implemented in automated fashion]
> 
> Ideas are of course more than welcome.

So what I'm curious about, what is the actual 'in the field' distro 
experience, about the type of live-patches that get pushed with 
urgency?

My guess would be that the overwhelming majority of live-patches don't 
change data structures - and hence the right initial model would be to 
ensure (via tooling, and via review) that 'v1' and 'v2' data is 
exactly the same.

The most abstract way to 'live patch' a kernel is to do a checkpoint 
save on all application state, to reboot the kernel, boot into the 
patched kernel and then restore all application state seemlessly.

i.e. the CR effort that has been going on for years, to save/restore 
all state that is exposed through a kernel ABI. That allows a complete 
kernel to be switched out, without affecting applications.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06  7:32       ` Ingo Molnar
@ 2014-05-06  8:03         ` Jiri Kosina
  2014-05-06 12:23         ` Josh Poimboeuf
  1 sibling, 0 replies; 60+ messages in thread
From: Jiri Kosina @ 2014-05-06  8:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Lang, Josh Poimboeuf, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel

On Tue, 6 May 2014, Ingo Molnar wrote:

> So what I'm curious about, what is the actual 'in the field' distro 
> experience, about the type of live-patches that get pushed with 
> urgency?

This is of course a very good question. We've done some very light 
preparatory analysis and went through patches which would make most sense 
to be shipped as hot/live patches without enough time for proper downtime 
scheduling (i.e. CVE severity high enough (local root), etc). Most of the 
time, these turn out to be a one-or-few liners, mostly adding extra check, 
fixing bounds, etc. There were just one or two in a few years history 
where some extra care would be needed.

Of course, I guess that most valuable input regarding this could be coming 
from kSplice guys, as they have the biggest in-field experience with this, 
and are shipping a lot of non-trivial patches this way, not just the 
"super-critical" ones. But I am not really sure whether we can expect any 
input from them, unfortunately.

> My guess would be that the overwhelming majority of live-patches don't 
> change data structures - and hence the right initial model would be to 
> ensure (via tooling, and via review) that 'v1' and 'v2' data is exactly 
> the same.

I fully agree ... that's why we are not pro-actively dealing with that in 
kGraft, for the sake of keeping it stupid at simple (at least in the early 
stages).

> The most abstract way to 'live patch' a kernel is to do a checkpoint 
> save on all application state, to reboot the kernel, boot into the 
> patched kernel and then restore all application state seemlessly.

In some sense yes, but it seems to still be rather costly operation 
(especially checkpointing on a HUGE machine might take some time, if I 
understand the technology correctly) compared to just flipping a few bytes 
(while maintaining overall correctness, of course) in the kernel memory.

Basically, CR might be the best solution for very complex and large kernel 
updates. What I am currently most concerned about though, are scenarios 
where datacenter owners need to urgently apply a local root one-liner fix 
with as little hassle as possible.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 18:43       ` Ingo Molnar
  2014-05-05 21:49         ` Frederic Weisbecker
@ 2014-05-06 11:45         ` Masami Hiramatsu
  2014-05-06 12:26           ` Steven Rostedt
  1 sibling, 1 reply; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-06 11:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Josh Poimboeuf, Seth Jennings,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

(2014/05/06 3:43), Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
>> On Mon, May 05, 2014 at 08:26:38AM -0500, Josh Poimboeuf wrote:
>>> On Mon, May 05, 2014 at 10:55:37AM +0200, Ingo Molnar wrote:
>>>>
>>>> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>>>>
>>>>> [...]
>>>>>
>>>>> kpatch checks the backtraces of all tasks in stop_machine() to 
>>>>> ensure that no instances of the old function are running when the 
>>>>> new function is applied.  I think the biggest downside of this 
>>>>> approach is that stop_machine() has to idle all other CPUs during 
>>>>> the patching process, so it inserts a small amount of latency (a few 
>>>>> ms on an idle system).
>>>>
>>>> When live patching the kernel, how about achieving an even 'cleaner' 
>>>> state for all tasks in the system: to freeze all tasks, as the suspend 
>>>> and hibernation code (and kexec) does, via freeze_processes()?
>>>>
>>>> That means no tasks in the system have any real kernel execution 
>>>> state, and there's also no problem with long-sleeping tasks, as 
>>>> freeze_processes() is supposed to be fast as well.
>>>>
>>>> I.e. go for the most conservative live patching state first, and relax 
>>>> it only once the initial model is upstream and is working robustly.
>>>
>>> I had considered doing this before, but the problem I found is 
>>> that many kernel threads are unfreezable.  So we wouldn't be able 
>>> to check whether its safe to replace any functions in use by those 
>>> kernel threads.
>>
>> OTOH many kernel threads are parkable. Which achieves kind of 
>> similar desired behaviour: the kernel threads then aren't running.
>>
>> And in fact we could implement freezing on top of park for kthreads.
>>
>> But unfortunately there are still quite some of them which don't 
>> support parking.
> 
> Well, if distros are moving towards live patching (and they are!), 
> then it looks rather necessary to me that something scary as flipping 
> out live kernel instructions with substantially different code should 
> be as safe as possible, and only then fast.

Agreed.
At this point, I think we'd better take a safer way to live patch.

However, I also think if users can accept such freezing wait-time,
it means they can also accept kexec based "checkpoint-restart" patching.
So, I think the final goal of the kpatch will be live patching without
stopping the machine. I'm discussing the issue on github #138, but that is
off-topic. :)

> If a kernel refuses to patch with certain threads running, that will 
> drive those kernel threads being fixed and such. It's a deterministic, 
> recoverable, reportable bug situation, so fixing it should be fast.

That's nice to fix that. As Frederic said, we can make all kthreads
park-able.

> We learned these robustness lessons the hard way with kprobes and 
> ftrace dynamic code patching... which are utterly simple compared to 
> live kernel patching!

Yeah, thanks for your help :)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-05 21:49         ` Frederic Weisbecker
@ 2014-05-06 12:12           ` Josh Poimboeuf
  2014-05-06 12:33             ` Steven Rostedt
  2014-05-06 14:05             ` Frederic Weisbecker
  0 siblings, 2 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-06 12:12 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

On Mon, May 05, 2014 at 11:49:23PM +0200, Frederic Weisbecker wrote:
> On Mon, May 05, 2014 at 08:43:04PM +0200, Ingo Molnar wrote:
> > If a kernel refuses to patch with certain threads running, that will 
> > drive those kernel threads being fixed and such. It's a deterministic, 
> > recoverable, reportable bug situation, so fixing it should be fast.
> > 
> > We learned these robustness lessons the hard way with kprobes and 
> > ftrace dynamic code patching... which are utterly simple compared to 
> > live kernel patching!
> 
> Yeah, agreed. More rationale behind: we want to put the kthreads into
> semantic sleeps, not just random sleeping point. This way we lower the
> chances to execute new code messing up living state that is expecting old
> code after random preemption or sleeping points.
> 
> But by semantic sleeps I mean more than just explicit calls to schedule()
> as opposed to preemption points.
> It also implies shutting down as well the living states handled by the kthread
> such that some sort of re-initialization of the state is also needed when
> the kthread gets back to run.
> 
> And that's exactly what good implementations of kthread park provide.
> 
> Consider kernel/watchdog.c as an example: when we park the lockup
> detector kthread, it disables the perf event and the hrtimer before it goes
> to actually park and sleep. When the kthread is later unparked, the kthread
> restarts the hrtimer and the perf event.
> 
> If we live patch code that has obscure relations with perf or hrtimer here,
> we lower a lot the chances for a crash when the watchdog kthread is parked.
> 
> So I'm in favour of parking all possible kthreads before live patching. Freezing
> alone doesn't provide the same state shutdown than parking.
> 
> Now since parking looks more widely implemented than kthread freezing, we could
> even think about implementing kthread freezing using parking as backend.

The vast majority of kernel threads on my system don't seem to know
anything about parking or freezing.  I see one kthread function which
calls kthread_should_park(), which is smpboot_thread_fn(), used for
ksoftirqd/*, migration/* and watchdog/*.  But there are many other
kthread functions which seem to be parking ignorant, including:

  cpu_idle_loop
  kthreadd
  rcu_gp_kthread
  worker_thread
  rescuer_thread
  devtmpfsd
  hub_thread
  kswapd
  ksm_scan_thread
  khugepaged
  fsnotify_mark_destroy
  scsi_error_handler
  kauditd_thread
  kjournald2
  irq_thread
  rfcomm_run

Maybe we could modify all these thread functions (and probably more) to
be park and/or freezer capable.  But really it wouldn't make much of a
difference IMO.  It would only protect careless users from a tiny
percentage of all possible havoc that a careless user could create.

Live patching is a very sensitive and risky operation, and from a kernel
standpoint we should make it as safe as we reasonably can.  But we can't
do much about careless users.  Ultimately the risk is in the hands of
the user and their choice of patches.  They need to absolutely
understand all the implications of patching a particular function.  If
the patch changes the way a function interacts with some external data,
then they're starting to tempt fate and they need to be extra careful.
This care needs to be taken for *all* kernel functions, not just for the
few that are called from kernel threads.

Also, the top level kernel thread functions (like those listed above)
will never be patchable anyway, because we never patch an in-use
function (these functions are always in the threads' backtraces).  This
further diminishes the benefit of parking/freezing kernel threads.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06  1:59       ` David Lang
@ 2014-05-06 12:17         ` Josh Poimboeuf
  0 siblings, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-06 12:17 UTC (permalink / raw)
  To: David Lang
  Cc: Jiri Kosina, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Frederic Weisbecker, Ingo Molnar, Jiri Slaby, linux-kernel

On Mon, May 05, 2014 at 06:59:29PM -0700, David Lang wrote:
> On Tue, 6 May 2014, Jiri Kosina wrote:
> 
> >On Mon, 5 May 2014, David Lang wrote:
> >
> >>how would you know that all instances of the datastructure in memory
> >>have= been touched? just because all tasks have run and are outside the
> >>function in question doesn't tell you data structures have been
> >>converted. You have n= o way of knowing when (or if) the next call to
> >>the modified function will take place on any potential in-memory
> >>structure.
> >
> >The problem you are trying to avoid here is functions expecting to read
> >"v2" format of the data from memory, while there are still tasks that are
> >unpredictably writing "v1" format of the data to the memory.
> >
> >There are several ways to attack this problem:
> >
> >- stop the whole system, convert all the existing data structures to new
> > format (which might potentially be non-trivial, mostly because you
> > have to *know* where all the data structures have been allocated), apply
> > patch, resume operation [ksplice, probably kpatch in future]
> >- restrict the data format to be backwards compatible [to be done
> > manually during patch creation, currently what kGraft needs to do in
> > such case]
> >- have a proxy code which can read both "v1" and "v2" formats, and writes
> > back in the same format it has seen the data structure on input
> 
> doesn't this have the same problem of finding all the data?
> 
> >- once all the *code* has been converted, it still has to understand "v1"
> > and "v2", but it can now start writing out "v2" format only [possible
> > with kGraft, not implemented in automated fashion]
> 
> which is a varient of the second one in that all the data needs to be tagged
> with a version so that it can be converted.
> 
> I don't see that 'stop the world' ends up being much better for this.
> 
> it does avoid any possibility of v1 code reading v2 data, but it doesn't
> help in avoiding v2 code reading v1 data some time (potentially much) later.

In general we would avoid this type of situation, where data semantics
change.  If it's absolutely necessary, then it's up to the user to write
a hook function that will find all the data structures and convert them
at patch loading time.  If they can't write a function which does that
sufficiently and safely, then live patching isn't a good option.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06  7:32       ` Ingo Molnar
  2014-05-06  8:03         ` Jiri Kosina
@ 2014-05-06 12:23         ` Josh Poimboeuf
  2014-05-07 12:19           ` Ingo Molnar
  1 sibling, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-06 12:23 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jiri Kosina, David Lang, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel

On Tue, May 06, 2014 at 09:32:28AM +0200, Ingo Molnar wrote:
> 
> * Jiri Kosina <jkosina@suse.cz> wrote:
> 
> > On Mon, 5 May 2014, David Lang wrote:
> > 
> > > how would you know that all instances of the datastructure in memory 
> > > have= been touched? just because all tasks have run and are outside the 
> > > function in question doesn't tell you data structures have been 
> > > converted. You have n= o way of knowing when (or if) the next call to 
> > > the modified function will take place on any potential in-memory 
> > > structure.
> > 
> > The problem you are trying to avoid here is functions expecting to read 
> > "v2" format of the data from memory, while there are still tasks that are 
> > unpredictably writing "v1" format of the data to the memory.
> > 
> > There are several ways to attack this problem:
> > 
> > - stop the whole system, convert all the existing data structures to new 
> >   format (which might potentially be non-trivial, mostly because you 
> >   have to *know* where all the data structures have been allocated), apply 
> >   patch, resume operation [ksplice, probably kpatch in future]
> > - restrict the data format to be backwards compatible [to be done 
> >   manually during patch creation, currently what kGraft needs to do in 
> >   such case]
> > - have a proxy code which can read both "v1" and "v2" formats, and writes 
> >   back in the same format it has seen the data structure on input
> > - once all the *code* has been converted, it still has to understand "v1" 
> >   and "v2", but it can now start writing out "v2" format only [possible 
> >   with kGraft, not implemented in automated fashion]
> > 
> > Ideas are of course more than welcome.
> 
> So what I'm curious about, what is the actual 'in the field' distro 
> experience, about the type of live-patches that get pushed with 
> urgency?
> 
> My guess would be that the overwhelming majority of live-patches don't 
> change data structures - and hence the right initial model would be to 
> ensure (via tooling, and via review) that 'v1' and 'v2' data is 
> exactly the same.

Yes, in general we want to avoid data changes.  In practice, we expect
most patches to be small, localized security fixes, so it shouldn't be
an issue in most cases.

Currently the kpatch tooling detects any compile-time changes to static
data and refuses to build the patch module in that case.

But there's no way to programmatically detect changes to dynamic data.
Which is why the user always has to be very careful when selecting a
patch.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 11:45         ` Masami Hiramatsu
@ 2014-05-06 12:26           ` Steven Rostedt
  2014-05-06 22:33             ` Masami Hiramatsu
  2014-05-16 16:27             ` Jiri Kosina
  0 siblings, 2 replies; 60+ messages in thread
From: Steven Rostedt @ 2014-05-06 12:26 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Frederic Weisbecker, Josh Poimboeuf, Seth Jennings,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

On Tue, 06 May 2014 20:45:50 +0900
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:


> However, I also think if users can accept such freezing wait-time,
> it means they can also accept kexec based "checkpoint-restart" patching.
> So, I think the final goal of the kpatch will be live patching without
> stopping the machine. I'm discussing the issue on github #138, but that is
> off-topic. :)
> 

I agree with Ingo too. Being conservative at first is the right
approach here. We should start out with a stop_machine making sure that
everything is sane before we continue. Sure, that's not much different
than a kexec, but lets take things one step at a time.

ftrace did the stop_machine (and still does for some archs), and slowly
moved to a more efficient method. kpatch/kgraft should follow suit.

-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 12:12           ` Josh Poimboeuf
@ 2014-05-06 12:33             ` Steven Rostedt
  2014-05-06 22:49               ` Masami Hiramatsu
  2014-05-06 14:05             ` Frederic Weisbecker
  1 sibling, 1 reply; 60+ messages in thread
From: Steven Rostedt @ 2014-05-06 12:33 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Frederic Weisbecker, Ingo Molnar, Seth Jennings,
	Masami Hiramatsu, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

On Tue, 6 May 2014 07:12:11 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> Live patching is a very sensitive and risky operation, and from a kernel
> standpoint we should make it as safe as we reasonably can.  But we can't
> do much about careless users.  Ultimately the risk is in the hands of
> the user and their choice of patches.  They need to absolutely
> understand all the implications of patching a particular function.  If
> the patch changes the way a function interacts with some external data,
> then they're starting to tempt fate and they need to be extra careful.
> This care needs to be taken for *all* kernel functions, not just for the
> few that are called from kernel threads.

Ideally the kpatch tools should be able to somewhat prevent users from
doing damage. Or at least make them type a sentence that says:

  I know what I'm doing and will not blame anyone but myself if this
  kills the system along with all my puppies and kittens.

I'm guessing that kpatch needs to be marketed that a distro or "hired
help" will be creating the patch and the admin only needs to "trust"
the one that gave them the kpatch module to load. All the
testing/checking that the module works will be done by kernel
developers and not by any "users".


-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 12:12           ` Josh Poimboeuf
  2014-05-06 12:33             ` Steven Rostedt
@ 2014-05-06 14:05             ` Frederic Weisbecker
  2014-05-06 14:50               ` Josh Poimboeuf
  1 sibling, 1 reply; 60+ messages in thread
From: Frederic Weisbecker @ 2014-05-06 14:05 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

On Tue, May 06, 2014 at 07:12:11AM -0500, Josh Poimboeuf wrote:
> On Mon, May 05, 2014 at 11:49:23PM +0200, Frederic Weisbecker wrote:
> > On Mon, May 05, 2014 at 08:43:04PM +0200, Ingo Molnar wrote:
> > > If a kernel refuses to patch with certain threads running, that will 
> > > drive those kernel threads being fixed and such. It's a deterministic, 
> > > recoverable, reportable bug situation, so fixing it should be fast.
> > > 
> > > We learned these robustness lessons the hard way with kprobes and 
> > > ftrace dynamic code patching... which are utterly simple compared to 
> > > live kernel patching!
> > 
> > Yeah, agreed. More rationale behind: we want to put the kthreads into
> > semantic sleeps, not just random sleeping point. This way we lower the
> > chances to execute new code messing up living state that is expecting old
> > code after random preemption or sleeping points.
> > 
> > But by semantic sleeps I mean more than just explicit calls to schedule()
> > as opposed to preemption points.
> > It also implies shutting down as well the living states handled by the kthread
> > such that some sort of re-initialization of the state is also needed when
> > the kthread gets back to run.
> > 
> > And that's exactly what good implementations of kthread park provide.
> > 
> > Consider kernel/watchdog.c as an example: when we park the lockup
> > detector kthread, it disables the perf event and the hrtimer before it goes
> > to actually park and sleep. When the kthread is later unparked, the kthread
> > restarts the hrtimer and the perf event.
> > 
> > If we live patch code that has obscure relations with perf or hrtimer here,
> > we lower a lot the chances for a crash when the watchdog kthread is parked.
> > 
> > So I'm in favour of parking all possible kthreads before live patching. Freezing
> > alone doesn't provide the same state shutdown than parking.
> > 
> > Now since parking looks more widely implemented than kthread freezing, we could
> > even think about implementing kthread freezing using parking as backend.
> 
> The vast majority of kernel threads on my system don't seem to know
> anything about parking or freezing.  I see one kthread function which
> calls kthread_should_park(), which is smpboot_thread_fn(), used for
> ksoftirqd/*, migration/* and watchdog/*.  But there are many other
> kthread functions which seem to be parking ignorant, including:
> 
>   cpu_idle_loop
>   kthreadd
>   rcu_gp_kthread
>   worker_thread
>   rescuer_thread
>   devtmpfsd
>   hub_thread
>   kswapd
>   ksm_scan_thread
>   khugepaged
>   fsnotify_mark_destroy
>   scsi_error_handler
>   kauditd_thread
>   kjournald2
>   irq_thread
>   rfcomm_run

Yeah I now realize that only very few of them can park. Only infiniband, rcu,
stop_machine and the watchdog...

But that's still a good direction to take if we want to make the kernel
step by step more robust against live patching.

> 
> Maybe we could modify all these thread functions (and probably more) to
> be park and/or freezer capable.  But really it wouldn't make much of a
> difference IMO.  It would only protect careless users from a tiny
> percentage of all possible havoc that a careless user could create.
> 
> Live patching is a very sensitive and risky operation, and from a kernel
> standpoint we should make it as safe as we reasonably can.  But we can't
> do much about careless users.  Ultimately the risk is in the hands of
> the user and their choice of patches.  They need to absolutely
> understand all the implications of patching a particular function.

Kernel developers will be a tiny minority of users of live patching.

Very few sysadmins know about kernel internals. You can't really ask them
to judge if a patch is reasonably live patchable or not.

It's possible to appreciate a patch wrt. its size, or the fact that it came
through a stable tree, so it's at worst mildly invasive.

The only thing that could make live patching safe is that a community
eventually builds around it and carefully inspect patches in a stable release
then provide a selection of safely live patchable pieces.

Other than that, expect people to do crazy things.

> If the patch changes the way a function interacts with some external data,
> then they're starting to tempt fate and they need to be extra careful.
> This care needs to be taken for *all* kernel functions, not just for the
> few that are called from kernel threads.

It's actually very hard to tell if a given function isn't called by any
kernel thread.

> 
> Also, the top level kernel thread functions (like those listed above)
> will never be patchable anyway, because we never patch an in-use
> function (these functions are always in the threads' backtraces).  This
> further diminishes the benefit of parking/freezing kernel threads.

You're right to do some basic prevention. It's the bare minimum.

But keep in mind it's a tiny protection if you consider the top level
function is only a small part of what is called by a kernel thread. And
there is still a risk that the patched part is incompatible with previously
executed code. For example when a patched function depends on a patched
initialization but we executed the old version of the initialization already
and we are going to execute the new patched function. It's actually a common
pattern. And careful implementation of kthreads parking can help. But this is
something that we can do step by step.

Ah this reminds me when we chased kprobes dangerous spots and we tried to
declare __kprobes the functions which were too dangerous to hot patch.

We eventually gave up because it was impossible to fix everything. And that
was only for kprobes!

So you can never tell if a given patch will impact a given kthread.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 14:05             ` Frederic Weisbecker
@ 2014-05-06 14:50               ` Josh Poimboeuf
  2014-05-07 12:24                 ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-06 14:50 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, Seth Jennings, Masami Hiramatsu, Steven Rostedt,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

On Tue, May 06, 2014 at 04:05:21PM +0200, Frederic Weisbecker wrote:
> On Tue, May 06, 2014 at 07:12:11AM -0500, Josh Poimboeuf wrote:
> > On Mon, May 05, 2014 at 11:49:23PM +0200, Frederic Weisbecker wrote:
> > > On Mon, May 05, 2014 at 08:43:04PM +0200, Ingo Molnar wrote:
> > > > If a kernel refuses to patch with certain threads running, that will 
> > > > drive those kernel threads being fixed and such. It's a deterministic, 
> > > > recoverable, reportable bug situation, so fixing it should be fast.
> > > > 
> > > > We learned these robustness lessons the hard way with kprobes and 
> > > > ftrace dynamic code patching... which are utterly simple compared to 
> > > > live kernel patching!
> > > 
> > > Yeah, agreed. More rationale behind: we want to put the kthreads into
> > > semantic sleeps, not just random sleeping point. This way we lower the
> > > chances to execute new code messing up living state that is expecting old
> > > code after random preemption or sleeping points.
> > > 
> > > But by semantic sleeps I mean more than just explicit calls to schedule()
> > > as opposed to preemption points.
> > > It also implies shutting down as well the living states handled by the kthread
> > > such that some sort of re-initialization of the state is also needed when
> > > the kthread gets back to run.
> > > 
> > > And that's exactly what good implementations of kthread park provide.
> > > 
> > > Consider kernel/watchdog.c as an example: when we park the lockup
> > > detector kthread, it disables the perf event and the hrtimer before it goes
> > > to actually park and sleep. When the kthread is later unparked, the kthread
> > > restarts the hrtimer and the perf event.
> > > 
> > > If we live patch code that has obscure relations with perf or hrtimer here,
> > > we lower a lot the chances for a crash when the watchdog kthread is parked.
> > > 
> > > So I'm in favour of parking all possible kthreads before live patching. Freezing
> > > alone doesn't provide the same state shutdown than parking.
> > > 
> > > Now since parking looks more widely implemented than kthread freezing, we could
> > > even think about implementing kthread freezing using parking as backend.
> > 
> > The vast majority of kernel threads on my system don't seem to know
> > anything about parking or freezing.  I see one kthread function which
> > calls kthread_should_park(), which is smpboot_thread_fn(), used for
> > ksoftirqd/*, migration/* and watchdog/*.  But there are many other
> > kthread functions which seem to be parking ignorant, including:
> > 
> >   cpu_idle_loop
> >   kthreadd
> >   rcu_gp_kthread
> >   worker_thread
> >   rescuer_thread
> >   devtmpfsd
> >   hub_thread
> >   kswapd
> >   ksm_scan_thread
> >   khugepaged
> >   fsnotify_mark_destroy
> >   scsi_error_handler
> >   kauditd_thread
> >   kjournald2
> >   irq_thread
> >   rfcomm_run
> 
> Yeah I now realize that only very few of them can park. Only infiniband, rcu,
> stop_machine and the watchdog...
> 
> But that's still a good direction to take if we want to make the kernel
> step by step more robust against live patching.
> 
> > 
> > Maybe we could modify all these thread functions (and probably more) to
> > be park and/or freezer capable.  But really it wouldn't make much of a
> > difference IMO.  It would only protect careless users from a tiny
> > percentage of all possible havoc that a careless user could create.
> > 
> > Live patching is a very sensitive and risky operation, and from a kernel
> > standpoint we should make it as safe as we reasonably can.  But we can't
> > do much about careless users.  Ultimately the risk is in the hands of
> > the user and their choice of patches.  They need to absolutely
> > understand all the implications of patching a particular function.
> 
> Kernel developers will be a tiny minority of users of live patching.
> 
> Very few sysadmins know about kernel internals. You can't really ask them
> to judge if a patch is reasonably live patchable or not.
> 
> It's possible to appreciate a patch wrt. its size, or the fact that it came
> through a stable tree, so it's at worst mildly invasive.
> 
> The only thing that could make live patching safe is that a community
> eventually builds around it and carefully inspect patches in a stable release
> then provide a selection of safely live patchable pieces.

I agree, and this is exactly the approach we're taking.  It's not
possible for the tools or the kernel to determine whether a patch is
safe.  We don't expect a sysadmin to make that determination either.  We
have to rely on kernel experts to analyze the patches, test them, and
then deliver them (or the binary patch modules) to the sysadmin.

> Other than that, expect people to do crazy things.

Yes.  This is why we taint the kernel with TAINT_KPATCH.

> > If the patch changes the way a function interacts with some external data,
> > then they're starting to tempt fate and they need to be extra careful.
> > This care needs to be taken for *all* kernel functions, not just for the
> > few that are called from kernel threads.
> 
> It's actually very hard to tell if a given function isn't called by any
> kernel thread.

I agree.  The point I was trying to make is that the user needs to be
very careful with analyzing the patch, regardless of whether the
function is called by a kernel thread.

> > Also, the top level kernel thread functions (like those listed above)
> > will never be patchable anyway, because we never patch an in-use
> > function (these functions are always in the threads' backtraces).  This
> > further diminishes the benefit of parking/freezing kernel threads.
> 
> You're right to do some basic prevention. It's the bare minimum.
> 
> But keep in mind it's a tiny protection if you consider the top level
> function is only a small part of what is called by a kernel thread. And
> there is still a risk that the patched part is incompatible with previously
> executed code. For example when a patched function depends on a patched
> initialization but we executed the old version of the initialization already
> and we are going to execute the new patched function. It's actually a common
> pattern. And careful implementation of kthreads parking can help. But this is
> something that we can do step by step.

We prevent many situations like this by disallowing the user from
patching __init functions when building the patch module binary.

> Ah this reminds me when we chased kprobes dangerous spots and we tried to
> declare __kprobes the functions which were too dangerous to hot patch.
> 
> We eventually gave up because it was impossible to fix everything. And that
> was only for kprobes!
> 
> So you can never tell if a given patch will impact a given kthread.

If the user (or the person creating the patch for them) doesn't
understand all impacts of the patch, they have no business patching
their kernel with it.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 12:26           ` Steven Rostedt
@ 2014-05-06 22:33             ` Masami Hiramatsu
  2014-05-16 16:27             ` Jiri Kosina
  1 sibling, 0 replies; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-06 22:33 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Frederic Weisbecker, Josh Poimboeuf, Seth Jennings,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

(2014/05/06 21:26), Steven Rostedt wrote:
> On Tue, 06 May 2014 20:45:50 +0900
> Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
> 
> 
>> However, I also think if users can accept such freezing wait-time,
>> it means they can also accept kexec based "checkpoint-restart" patching.
>> So, I think the final goal of the kpatch will be live patching without
>> stopping the machine. I'm discussing the issue on github #138, but that is
>> off-topic. :)
>>
> 
> I agree with Ingo too. Being conservative at first is the right
> approach here. We should start out with a stop_machine making sure that
> everything is sane before we continue. Sure, that's not much different
> than a kexec, but lets take things one step at a time.

Agreed. that is a correct way to move things forward.
Anyway, my stop_machine-less approach still has many implementation
issues. It should be solved in upstream, not out-of-tree. So, this
topic is off-topic at this stage. :) We need to focus on how to
merge live-patch into upstream kernel.

> ftrace did the stop_machine (and still does for some archs), and slowly
> moved to a more efficient method. kpatch/kgraft should follow suit.

Sure, that's a best story of how things should be evolved on the kernel :)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 12:33             ` Steven Rostedt
@ 2014-05-06 22:49               ` Masami Hiramatsu
  0 siblings, 0 replies; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-06 22:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Josh Poimboeuf, Frederic Weisbecker, Ingo Molnar, Seth Jennings,
	Ingo Molnar, Jiri Slaby, linux-kernel, Peter Zijlstra,
	Andrew Morton, Linus Torvalds, Thomas Gleixner

(2014/05/06 21:33), Steven Rostedt wrote:
> On Tue, 6 May 2014 07:12:11 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
>> Live patching is a very sensitive and risky operation, and from a kernel
>> standpoint we should make it as safe as we reasonably can.  But we can't
>> do much about careless users.  Ultimately the risk is in the hands of
>> the user and their choice of patches.  They need to absolutely
>> understand all the implications of patching a particular function.  If
>> the patch changes the way a function interacts with some external data,
>> then they're starting to tempt fate and they need to be extra careful.
>> This care needs to be taken for *all* kernel functions, not just for the
>> few that are called from kernel threads.
> 
> Ideally the kpatch tools should be able to somewhat prevent users from
> doing damage. Or at least make them type a sentence that says:
> 
>   I know what I'm doing and will not blame anyone but myself if this
>   kills the system along with all my puppies and kittens.
> 
> I'm guessing that kpatch needs to be marketed that a distro or "hired
> help" will be creating the patch and the admin only needs to "trust"
> the one that gave them the kpatch module to load. All the
> testing/checking that the module works will be done by kernel
> developers and not by any "users".

I strongly agreed. Desktop/Cloud users may not care about it, most of
users just update kernel and reboot. Somewhat "enterprise" users may
also like to do "rolling update" their cluster systems. Only limited
mission critical non-stop systems require it. And since they usually
don't want to loose distro support, all binary patches will be provided
and tested by distro kernel developers, not by users itself. :)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 12:23         ` Josh Poimboeuf
@ 2014-05-07 12:19           ` Ingo Molnar
  2014-05-09  1:46             ` David Lang
  0 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2014-05-07 12:19 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jiri Kosina, David Lang, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Tue, May 06, 2014 at 09:32:28AM +0200, Ingo Molnar wrote:
> > 
> > * Jiri Kosina <jkosina@suse.cz> wrote:
> > 
> > > On Mon, 5 May 2014, David Lang wrote:
> > > 
> > > > how would you know that all instances of the datastructure in memory 
> > > > have= been touched? just because all tasks have run and are outside the 
> > > > function in question doesn't tell you data structures have been 
> > > > converted. You have n= o way of knowing when (or if) the next call to 
> > > > the modified function will take place on any potential in-memory 
> > > > structure.
> > > 
> > > The problem you are trying to avoid here is functions expecting to read 
> > > "v2" format of the data from memory, while there are still tasks that are 
> > > unpredictably writing "v1" format of the data to the memory.
> > > 
> > > There are several ways to attack this problem:
> > > 
> > > - stop the whole system, convert all the existing data structures to new 
> > >   format (which might potentially be non-trivial, mostly because you 
> > >   have to *know* where all the data structures have been allocated), apply 
> > >   patch, resume operation [ksplice, probably kpatch in future]
> > > - restrict the data format to be backwards compatible [to be done 
> > >   manually during patch creation, currently what kGraft needs to do in 
> > >   such case]
> > > - have a proxy code which can read both "v1" and "v2" formats, and writes 
> > >   back in the same format it has seen the data structure on input
> > > - once all the *code* has been converted, it still has to understand "v1" 
> > >   and "v2", but it can now start writing out "v2" format only [possible 
> > >   with kGraft, not implemented in automated fashion]
> > > 
> > > Ideas are of course more than welcome.
> > 
> > So what I'm curious about, what is the actual 'in the field' distro 
> > experience, about the type of live-patches that get pushed with 
> > urgency?
> > 
> > My guess would be that the overwhelming majority of live-patches don't 
> > change data structures - and hence the right initial model would be to 
> > ensure (via tooling, and via review) that 'v1' and 'v2' data is 
> > exactly the same.
> 
> Yes, in general we want to avoid data changes.  In practice, we expect
> most patches to be small, localized security fixes, so it shouldn't be
> an issue in most cases.
> 
> Currently the kpatch tooling detects any compile-time changes to 
> static data and refuses to build the patch module in that case.
> 
> But there's no way to programmatically detect changes to dynamic 
> data. Which is why the user always has to be very careful when 
> selecting a patch.

And since this is about the system kernel it's dead easy to mess up a 
new kernel function and make the system unbootable - so it's not like 
'be careful' isn't something implied already.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 14:50               ` Josh Poimboeuf
@ 2014-05-07 12:24                 ` Ingo Molnar
  2014-05-07 15:41                   ` Josh Poimboeuf
  0 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2014-05-07 12:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Frederic Weisbecker, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> > Ah this reminds me when we chased kprobes dangerous spots and we 
> > tried to declare __kprobes the functions which were too dangerous 
> > to hot patch.
> > 
> > We eventually gave up because it was impossible to fix everything. 
> > And that was only for kprobes!
> > 
> > So you can never tell if a given patch will impact a given 
> > kthread.
> 
> If the user (or the person creating the patch for them) doesn't 
> understand all impacts of the patch, they have no business patching 
> their kernel with it.

I think what is being somewhat lost is this discussion is the 
distinction between:

 1) is the patch safe
 2) is the _live patching_ safe

It's really two different things. We should absolutely strive for live 
patching to be safe under all circumstances, as long as the patch 
being fed to it is safe in itself when building a new kernel the old 
fashioned way.

I.e. it's natural that a kernel can be messed up via a patch, but this 
subsystem should absolutely make sure that it will safely reject 
totally fine patches that are unsafe to live patch.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-07 12:24                 ` Ingo Molnar
@ 2014-05-07 15:41                   ` Josh Poimboeuf
  2014-05-07 15:57                     ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-07 15:41 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

On Wed, May 07, 2014 at 02:24:44PM +0200, Ingo Molnar wrote:
> 
> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > > Ah this reminds me when we chased kprobes dangerous spots and we 
> > > tried to declare __kprobes the functions which were too dangerous 
> > > to hot patch.
> > > 
> > > We eventually gave up because it was impossible to fix everything. 
> > > And that was only for kprobes!
> > > 
> > > So you can never tell if a given patch will impact a given 
> > > kthread.
> > 
> > If the user (or the person creating the patch for them) doesn't 
> > understand all impacts of the patch, they have no business patching 
> > their kernel with it.
> 
> I think what is being somewhat lost is this discussion is the 
> distinction between:
> 
>  1) is the patch safe
>  2) is the _live patching_ safe
> 
> It's really two different things. We should absolutely strive for live 
> patching to be safe under all circumstances, as long as the patch 
> being fed to it is safe in itself when building a new kernel the old 
> fashioned way.
> 
> I.e. it's natural that a kernel can be messed up via a patch, but this 
> subsystem should absolutely make sure that it will safely reject 
> totally fine patches that are unsafe to live patch.

Thanks, that's a very succinct way to put it.  They are indeed two
different things, but at the same time they're interrelated: determining
whether a patch is safe requires making assumptions about how it will be
applied.

Here's how kpatch draws the lines:

1) Is the patch safe?

   Determined by the user (and partially enforced by the kpatch-build
   tools).

   The user can assume that the old function(s) will not be in use by
   any task at the time of replacement, so there's no risk of unexpected
   interactions between the old and the new.  There is no guarantee that
   all tasks will be frozen.

2) Is the live patching safe?

   Determined by the kernel.  The kernel only needs to ensure that the
   old function(s) are not in use.  We do this with stop_machine() and
   backtrace checks of all tasks.

It's a very simple contract between user and kernel.

I think the proposal was that we change this contract such that the user
can additionally assume that all tasks will be frozen.  I could be
missing something but I don't see a real benefit from it.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-07 15:41                   ` Josh Poimboeuf
@ 2014-05-07 15:57                     ` Ingo Molnar
  2014-05-07 16:43                       ` Josh Poimboeuf
  2014-05-07 22:56                       ` David Lang
  0 siblings, 2 replies; 60+ messages in thread
From: Ingo Molnar @ 2014-05-07 15:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Frederic Weisbecker, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Wed, May 07, 2014 at 02:24:44PM +0200, Ingo Molnar wrote:
> > 
> > * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > 
> > > > Ah this reminds me when we chased kprobes dangerous spots and we 
> > > > tried to declare __kprobes the functions which were too dangerous 
> > > > to hot patch.
> > > > 
> > > > We eventually gave up because it was impossible to fix everything. 
> > > > And that was only for kprobes!
> > > > 
> > > > So you can never tell if a given patch will impact a given 
> > > > kthread.
> > > 
> > > If the user (or the person creating the patch for them) doesn't 
> > > understand all impacts of the patch, they have no business patching 
> > > their kernel with it.
> > 
> > I think what is being somewhat lost is this discussion is the 
> > distinction between:
> > 
> >  1) is the patch safe
> >  2) is the _live patching_ safe
> > 
> > It's really two different things. We should absolutely strive for live 
> > patching to be safe under all circumstances, as long as the patch 
> > being fed to it is safe in itself when building a new kernel the old 
> > fashioned way.
> > 
> > I.e. it's natural that a kernel can be messed up via a patch, but this 
> > subsystem should absolutely make sure that it will safely reject 
> > totally fine patches that are unsafe to live patch.
> 
> Thanks, that's a very succinct way to put it.  They are indeed two
> different things, but at the same time they're interrelated: determining
> whether a patch is safe requires making assumptions about how it will be
> applied.

No!

A patch to the kernel source is 'safe' if it results in a correctly 
patched kernel source. Full stop!

Live patching does not enter into this question, ever. The correctness 
of a patch to the source does not depend on 'live patching' 
considerations in any way, shape or form.

Any mechanism that tries to blur these lines is broken by design.

My claim is that if a patch is correct/safe in the old fashioned way, 
then a fundamental principle is that a live patching subsystem must 
either safely apply, or safely reject the live patching attempt, 
independently from any user input.

It's similar to how kprobes (or ftrace) will safely reject or perform 
a live patching of the kernel.

So for example, there's this recent upstream kernel fix:

  3ca9e5d36afb agp: info leak in agpioc_info_wrap()

which fixes an information leak. The 'patch' is Git commit 
3ca9e5d36afb (i.e. it patches a very specific incoming kernel source 
tree that results in a specific outgoing source tree), and we know 
it's safe and correct.

Any live patching subsystem must make sure that if this patch is 
live-patched, that this attempt is either rejected safely or performed 
safely.

"We think/hope it won't blow up in most cases and we automated some 
checks halfways" or "the user must know what he is doing" is really 
not something that I think is a good concept for something as fragile 
as live patching.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-07 15:57                     ` Ingo Molnar
@ 2014-05-07 16:43                       ` Josh Poimboeuf
  2014-05-07 22:56                       ` David Lang
  1 sibling, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-07 16:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

On Wed, May 07, 2014 at 05:57:54PM +0200, Ingo Molnar wrote:
> Live patching does not enter into this question, ever. The correctness 
> of a patch to the source does not depend on 'live patching' 
> considerations in any way, shape or form.
> 
> Any mechanism that tries to blur these lines is broken by design.
> 
> My claim is that if a patch is correct/safe in the old fashioned way, 
> then a fundamental principle is that a live patching subsystem must 
> either safely apply, or safely reject the live patching attempt, 
> independently from any user input.

That's a valiant goal, but it's not going to happen unless you want to
rewrite Linux in Haskell.  It's just not possible for a program to prove
that a patch is safe to apply to a running kernel.  There are way too
many subtle interactions with dynamically allocated data between
functions.

I think the only way to achieve that is with CRIU, but it still requires
a kexec or a reboot, so you lose all kernel state and it's much more
disruptive.

> "We think/hope it won't blow up in most cases and we automated some 
> checks halfways" or "the user must know what he is doing" is really 
> not something that I think is a good concept for something as fragile 
> as live patching.

This is a distro tool, not a general purpose one.  If distros are
careful with their patch selection, it won't blow up.  It's a valuable
way for distros to help out sysadmins who need a hot security fix but
can't reboot immediately.

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-07 15:57                     ` Ingo Molnar
  2014-05-07 16:43                       ` Josh Poimboeuf
@ 2014-05-07 22:56                       ` David Lang
  2014-05-08  6:12                         ` Ingo Molnar
  1 sibling, 1 reply; 60+ messages in thread
From: David Lang @ 2014-05-07 22:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Wed, 7 May 2014, Ingo Molnar wrote:

> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
>> On Wed, May 07, 2014 at 02:24:44PM +0200, Ingo Molnar wrote:
>>>
>>> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>>>
>>>>> Ah this reminds me when we chased kprobes dangerous spots and we
>>>>> tried to declare __kprobes the functions which were too dangerous
>>>>> to hot patch.
>>>>>
>>>>> We eventually gave up because it was impossible to fix everything.
>>>>> And that was only for kprobes!
>>>>>
>>>>> So you can never tell if a given patch will impact a given
>>>>> kthread.
>>>>
>>>> If the user (or the person creating the patch for them) doesn't
>>>> understand all impacts of the patch, they have no business patching
>>>> their kernel with it.
>>>
>>> I think what is being somewhat lost is this discussion is the
>>> distinction between:
>>>
>>>  1) is the patch safe
>>>  2) is the _live patching_ safe
>>>
>>> It's really two different things. We should absolutely strive for live
>>> patching to be safe under all circumstances, as long as the patch
>>> being fed to it is safe in itself when building a new kernel the old
>>> fashioned way.
>>>
>>> I.e. it's natural that a kernel can be messed up via a patch, but this
>>> subsystem should absolutely make sure that it will safely reject
>>> totally fine patches that are unsafe to live patch.
>>
>> Thanks, that's a very succinct way to put it.  They are indeed two
>> different things, but at the same time they're interrelated: determining
>> whether a patch is safe requires making assumptions about how it will be
>> applied.
>
> No!
>
> A patch to the kernel source is 'safe' if it results in a correctly
> patched kernel source. Full stop!
>
> Live patching does not enter into this question, ever. The correctness
> of a patch to the source does not depend on 'live patching'
> considerations in any way, shape or form.
>
> Any mechanism that tries to blur these lines is broken by design.
>
> My claim is that if a patch is correct/safe in the old fashioned way,
> then a fundamental principle is that a live patching subsystem must
> either safely apply, or safely reject the live patching attempt,
> independently from any user input.
>
> It's similar to how kprobes (or ftrace) will safely reject or perform
> a live patching of the kernel.
>
> So for example, there's this recent upstream kernel fix:
>
>  3ca9e5d36afb agp: info leak in agpioc_info_wrap()
>
> which fixes an information leak. The 'patch' is Git commit
> 3ca9e5d36afb (i.e. it patches a very specific incoming kernel source
> tree that results in a specific outgoing source tree), and we know
> it's safe and correct.
>
> Any live patching subsystem must make sure that if this patch is
> live-patched, that this attempt is either rejected safely or performed
> safely.
>
> "We think/hope it won't blow up in most cases and we automated some
> checks halfways" or "the user must know what he is doing" is really
> not something that I think is a good concept for something as fragile
> as live patching.

In that case you will have to reject any kernel patch that changes any memory 
structure, because it's impossible as a general rule to say that changing memory 
structures is going to be safe (or even possible) to change.

that includes any access to memory that moves around a lock

David Lang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-07 22:56                       ` David Lang
@ 2014-05-08  6:12                         ` Ingo Molnar
  2014-05-08  6:50                           ` David Lang
  0 siblings, 1 reply; 60+ messages in thread
From: Ingo Molnar @ 2014-05-08  6:12 UTC (permalink / raw)
  To: David Lang
  Cc: Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner


* David Lang <david@lang.hm> wrote:

> On Wed, 7 May 2014, Ingo Molnar wrote:
> 
> >* Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> >>On Wed, May 07, 2014 at 02:24:44PM +0200, Ingo Molnar wrote:
> >>>
> >>>* Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >>>
> >>>>>Ah this reminds me when we chased kprobes dangerous spots and we
> >>>>>tried to declare __kprobes the functions which were too dangerous
> >>>>>to hot patch.
> >>>>>
> >>>>>We eventually gave up because it was impossible to fix everything.
> >>>>>And that was only for kprobes!
> >>>>>
> >>>>>So you can never tell if a given patch will impact a given
> >>>>>kthread.
> >>>>
> >>>>If the user (or the person creating the patch for them) doesn't
> >>>>understand all impacts of the patch, they have no business patching
> >>>>their kernel with it.
> >>>
> >>>I think what is being somewhat lost is this discussion is the
> >>>distinction between:
> >>>
> >>> 1) is the patch safe
> >>> 2) is the _live patching_ safe
> >>>
> >>>It's really two different things. We should absolutely strive for live
> >>>patching to be safe under all circumstances, as long as the patch
> >>>being fed to it is safe in itself when building a new kernel the old
> >>>fashioned way.
> >>>
> >>>I.e. it's natural that a kernel can be messed up via a patch, but this
> >>>subsystem should absolutely make sure that it will safely reject
> >>>totally fine patches that are unsafe to live patch.
> >>
> >>Thanks, that's a very succinct way to put it.  They are indeed two
> >>different things, but at the same time they're interrelated: determining
> >>whether a patch is safe requires making assumptions about how it will be
> >>applied.
> >
> >No!
> >
> >A patch to the kernel source is 'safe' if it results in a correctly
> >patched kernel source. Full stop!
> >
> >Live patching does not enter into this question, ever. The correctness
> >of a patch to the source does not depend on 'live patching'
> >considerations in any way, shape or form.
> >
> >Any mechanism that tries to blur these lines is broken by design.
> >
> >My claim is that if a patch is correct/safe in the old fashioned way,
> >then a fundamental principle is that a live patching subsystem must
> >either safely apply, or safely reject the live patching attempt,
> >independently from any user input.
> >
> >It's similar to how kprobes (or ftrace) will safely reject or perform
> >a live patching of the kernel.
> >
> >So for example, there's this recent upstream kernel fix:
> >
> > 3ca9e5d36afb agp: info leak in agpioc_info_wrap()
> >
> >which fixes an information leak. The 'patch' is Git commit
> >3ca9e5d36afb (i.e. it patches a very specific incoming kernel source
> >tree that results in a specific outgoing source tree), and we know
> >it's safe and correct.
> >
> >Any live patching subsystem must make sure that if this patch is
> >live-patched, that this attempt is either rejected safely or performed
> >safely.
> >
> >"We think/hope it won't blow up in most cases and we automated some
> >checks halfways" or "the user must know what he is doing" is really
> >not something that I think is a good concept for something as fragile
> >as live patching.
> 
> In that case you will have to reject any kernel patch that changes 
> any memory structure, because it's impossible as a general rule to 
> say that changing memory structures is going to be safe (or even 
> possible) to change.
> 
> that includes any access to memory that moves around a lock

Initially restricting it to such patches would be a good beginning - 
most of the security fixes are just failed checks, i.e. they don't 
typically even change any external (not on stack) memory structure, 
right?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-08  6:12                         ` Ingo Molnar
@ 2014-05-08  6:50                           ` David Lang
  2014-05-08  7:08                             ` Ingo Molnar
  0 siblings, 1 reply; 60+ messages in thread
From: David Lang @ 2014-05-08  6:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Thu, 8 May 2014, Ingo Molnar wrote:

>>>
>>> No!
>>>
>>> A patch to the kernel source is 'safe' if it results in a correctly
>>> patched kernel source. Full stop!
>>>
>>> Live patching does not enter into this question, ever. The correctness
>>> of a patch to the source does not depend on 'live patching'
>>> considerations in any way, shape or form.
>>>
>>> Any mechanism that tries to blur these lines is broken by design.
>>>
>>> My claim is that if a patch is correct/safe in the old fashioned way,
>>> then a fundamental principle is that a live patching subsystem must
>>> either safely apply, or safely reject the live patching attempt,
>>> independently from any user input.
>>>
>>> It's similar to how kprobes (or ftrace) will safely reject or perform
>>> a live patching of the kernel.
>>>
>>> So for example, there's this recent upstream kernel fix:
>>>
>>> 3ca9e5d36afb agp: info leak in agpioc_info_wrap()
>>>
>>> which fixes an information leak. The 'patch' is Git commit
>>> 3ca9e5d36afb (i.e. it patches a very specific incoming kernel source
>>> tree that results in a specific outgoing source tree), and we know
>>> it's safe and correct.
>>>
>>> Any live patching subsystem must make sure that if this patch is
>>> live-patched, that this attempt is either rejected safely or performed
>>> safely.
>>>
>>> "We think/hope it won't blow up in most cases and we automated some
>>> checks halfways" or "the user must know what he is doing" is really
>>> not something that I think is a good concept for something as fragile
>>> as live patching.
>>
>> In that case you will have to reject any kernel patch that changes
>> any memory structure, because it's impossible as a general rule to
>> say that changing memory structures is going to be safe (or even
>> possible) to change.
>>
>> that includes any access to memory that moves around a lock
>
> Initially restricting it to such patches would be a good beginning -
> most of the security fixes are just failed checks, i.e. they don't
> typically even change any external (not on stack) memory structure,
> right?

in terms of hit-patching kernels you are correct.

but that's a far cry from what it sounded like you were demanding (that it must 
handle any kernel patch)

David Lang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-08  6:50                           ` David Lang
@ 2014-05-08  7:08                             ` Ingo Molnar
  2014-05-08  7:29                               ` Masami Hiramatsu
                                                 ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Ingo Molnar @ 2014-05-08  7:08 UTC (permalink / raw)
  To: David Lang
  Cc: Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner


* David Lang <david@lang.hm> wrote:

> On Thu, 8 May 2014, Ingo Molnar wrote:
> 
> >>>
> >>>No!
> >>>
> >>>A patch to the kernel source is 'safe' if it results in a correctly
> >>>patched kernel source. Full stop!
> >>>
> >>>Live patching does not enter into this question, ever. The correctness
> >>>of a patch to the source does not depend on 'live patching'
> >>>considerations in any way, shape or form.
> >>>
> >>>Any mechanism that tries to blur these lines is broken by design.
> >>>
> >>>My claim is that if a patch is correct/safe in the old fashioned way,
> >>>then a fundamental principle is that a live patching subsystem must
> >>>either safely apply, or safely reject the live patching attempt,
> >>>independently from any user input.
> >>>
> >>>It's similar to how kprobes (or ftrace) will safely reject or perform
> >>>a live patching of the kernel.
> >>>
> >>>So for example, there's this recent upstream kernel fix:
> >>>
> >>>3ca9e5d36afb agp: info leak in agpioc_info_wrap()
> >>>
> >>>which fixes an information leak. The 'patch' is Git commit
> >>>3ca9e5d36afb (i.e. it patches a very specific incoming kernel source
> >>>tree that results in a specific outgoing source tree), and we know
> >>>it's safe and correct.
> >>>
> >>>Any live patching subsystem must make sure that if this patch is
> >>>live-patched, that this attempt is either rejected safely or performed
> >>>safely.
> >>>
> >>>"We think/hope it won't blow up in most cases and we automated some
> >>>checks halfways" or "the user must know what he is doing" is really
> >>>not something that I think is a good concept for something as fragile
> >>>as live patching.
> >>
> >>In that case you will have to reject any kernel patch that changes
> >>any memory structure, because it's impossible as a general rule to
> >>say that changing memory structures is going to be safe (or even
> >>possible) to change.
> >>
> >>that includes any access to memory that moves around a lock
> >
> >Initially restricting it to such patches would be a good beginning -
> >most of the security fixes are just failed checks, i.e. they don't
> >typically even change any external (not on stack) memory structure,
> >right?
> 
> in terms of hit-patching kernels you are correct.
> 
> but that's a far cry from what it sounded like you were demanding 
> (that it must handle any kernel patch)

No, I was not demanding that at all, my suggestion was:

   > My claim is that if a patch is correct/safe in the old fashioned 
   > way, then a fundamental principle is that a live patching 
   > subsystem must either safely apply, or safely reject the live 
   > patching attempt, independently from any user input.

Note the 'if'. It could start simple and stupid, and only allow cases 
where we know the patch must be trivially safe (because it does not do 
much in terms of disturbing globally visible state). That needs some 
tooling help, but apparently tooling help is in place already.

And then we can complicate it from there - but have a reasonably 
robust starting point that we _know_ works (as long as the 
implementation is correct).

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-08  7:08                             ` Ingo Molnar
@ 2014-05-08  7:29                               ` Masami Hiramatsu
  2014-05-08 12:48                               ` Josh Poimboeuf
  2014-06-14 20:31                               ` Pavel Machek
  2 siblings, 0 replies; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-08  7:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Lang, Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

(2014/05/08 16:08), Ingo Molnar wrote:
> 
> * David Lang <david@lang.hm> wrote:
> 
>> On Thu, 8 May 2014, Ingo Molnar wrote:
>>
>>>>>
>>>>> No!
>>>>>
>>>>> A patch to the kernel source is 'safe' if it results in a correctly
>>>>> patched kernel source. Full stop!
>>>>>
>>>>> Live patching does not enter into this question, ever. The correctness
>>>>> of a patch to the source does not depend on 'live patching'
>>>>> considerations in any way, shape or form.
>>>>>
>>>>> Any mechanism that tries to blur these lines is broken by design.
>>>>>
>>>>> My claim is that if a patch is correct/safe in the old fashioned way,
>>>>> then a fundamental principle is that a live patching subsystem must
>>>>> either safely apply, or safely reject the live patching attempt,
>>>>> independently from any user input.
>>>>>
>>>>> It's similar to how kprobes (or ftrace) will safely reject or perform
>>>>> a live patching of the kernel.
>>>>>
>>>>> So for example, there's this recent upstream kernel fix:
>>>>>
>>>>> 3ca9e5d36afb agp: info leak in agpioc_info_wrap()
>>>>>
>>>>> which fixes an information leak. The 'patch' is Git commit
>>>>> 3ca9e5d36afb (i.e. it patches a very specific incoming kernel source
>>>>> tree that results in a specific outgoing source tree), and we know
>>>>> it's safe and correct.
>>>>>
>>>>> Any live patching subsystem must make sure that if this patch is
>>>>> live-patched, that this attempt is either rejected safely or performed
>>>>> safely.
>>>>>
>>>>> "We think/hope it won't blow up in most cases and we automated some
>>>>> checks halfways" or "the user must know what he is doing" is really
>>>>> not something that I think is a good concept for something as fragile
>>>>> as live patching.
>>>>
>>>> In that case you will have to reject any kernel patch that changes
>>>> any memory structure, because it's impossible as a general rule to
>>>> say that changing memory structures is going to be safe (or even
>>>> possible) to change.
>>>>
>>>> that includes any access to memory that moves around a lock
>>>
>>> Initially restricting it to such patches would be a good beginning -
>>> most of the security fixes are just failed checks, i.e. they don't
>>> typically even change any external (not on stack) memory structure,
>>> right?
>>
>> in terms of hit-patching kernels you are correct.
>>
>> but that's a far cry from what it sounded like you were demanding 
>> (that it must handle any kernel patch)
> 
> No, I was not demanding that at all, my suggestion was:
> 
>    > My claim is that if a patch is correct/safe in the old fashioned 
>    > way, then a fundamental principle is that a live patching 
>    > subsystem must either safely apply, or safely reject the live 
>    > patching attempt, independently from any user input.

What would you mean "safely apply" and "safely reject"?
Current kpatch checks the target functions are in use and reject safely
if it is in use.

> Note the 'if'. It could start simple and stupid, and only allow cases 
> where we know the patch must be trivially safe (because it does not do 
> much in terms of disturbing globally visible state). That needs some 
> tooling help, but apparently tooling help is in place already.

Hm, it is possible to check the binary patch itself doesn't refer
the global memory, but it's hard to find it derefers some address
of global memory. Also, there can be a case that the function caller
passes the address of a global memory, and the binary patch just
calls another function with that address. This can change the global
state (but correct way).

> And then we can complicate it from there - but have a reasonably 
> robust starting point that we _know_ works (as long as the 
> implementation is correct).

I agreed if you mean the starting with a simple "tool" and
complicate it. I doubt kernel module can check the patch logic....

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-08  7:08                             ` Ingo Molnar
  2014-05-08  7:29                               ` Masami Hiramatsu
@ 2014-05-08 12:48                               ` Josh Poimboeuf
  2014-05-09  6:21                                 ` Masami Hiramatsu
  2014-06-14 20:31                               ` Pavel Machek
  2 siblings, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-08 12:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Lang, Frederic Weisbecker, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

On Thu, May 08, 2014 at 09:08:15AM +0200, Ingo Molnar wrote:
> 
> * David Lang <david@lang.hm> wrote:
> 
> > On Thu, 8 May 2014, Ingo Molnar wrote:
> > 
> > >>>
> > >>>No!
> > >>>
> > >>>A patch to the kernel source is 'safe' if it results in a correctly
> > >>>patched kernel source. Full stop!
> > >>>
> > >>>Live patching does not enter into this question, ever. The correctness
> > >>>of a patch to the source does not depend on 'live patching'
> > >>>considerations in any way, shape or form.
> > >>>
> > >>>Any mechanism that tries to blur these lines is broken by design.
> > >>>
> > >>>My claim is that if a patch is correct/safe in the old fashioned way,
> > >>>then a fundamental principle is that a live patching subsystem must
> > >>>either safely apply, or safely reject the live patching attempt,
> > >>>independently from any user input.
> > >>>
> > >>>It's similar to how kprobes (or ftrace) will safely reject or perform
> > >>>a live patching of the kernel.
> > >>>
> > >>>So for example, there's this recent upstream kernel fix:
> > >>>
> > >>>3ca9e5d36afb agp: info leak in agpioc_info_wrap()
> > >>>
> > >>>which fixes an information leak. The 'patch' is Git commit
> > >>>3ca9e5d36afb (i.e. it patches a very specific incoming kernel source
> > >>>tree that results in a specific outgoing source tree), and we know
> > >>>it's safe and correct.
> > >>>
> > >>>Any live patching subsystem must make sure that if this patch is
> > >>>live-patched, that this attempt is either rejected safely or performed
> > >>>safely.
> > >>>
> > >>>"We think/hope it won't blow up in most cases and we automated some
> > >>>checks halfways" or "the user must know what he is doing" is really
> > >>>not something that I think is a good concept for something as fragile
> > >>>as live patching.
> > >>
> > >>In that case you will have to reject any kernel patch that changes
> > >>any memory structure, because it's impossible as a general rule to
> > >>say that changing memory structures is going to be safe (or even
> > >>possible) to change.
> > >>
> > >>that includes any access to memory that moves around a lock
> > >
> > >Initially restricting it to such patches would be a good beginning -
> > >most of the security fixes are just failed checks, i.e. they don't
> > >typically even change any external (not on stack) memory structure,
> > >right?
> > 
> > in terms of hit-patching kernels you are correct.
> > 
> > but that's a far cry from what it sounded like you were demanding 
> > (that it must handle any kernel patch)
> 
> No, I was not demanding that at all, my suggestion was:
> 
>    > My claim is that if a patch is correct/safe in the old fashioned 
>    > way, then a fundamental principle is that a live patching 
>    > subsystem must either safely apply, or safely reject the live 
>    > patching attempt, independently from any user input.
> 
> Note the 'if'. It could start simple and stupid, and only allow cases 
> where we know the patch must be trivially safe (because it does not do 
> much in terms of disturbing globally visible state). That needs some 
> tooling help, but apparently tooling help is in place already.
> 
> And then we can complicate it from there - but have a reasonably 
> robust starting point that we _know_ works (as long as the 
> implementation is correct).

I really wonder if detecting a "trivially safe" patch is even possible.

Where do you draw the line with the following patches?

- add a call to another function which modifies global data

- add an early return or a goto which changes the way the function
  modifies (or no longer modifies) global data

- touch a local stack variable which results in global data being
  modified later in the function

- return a different value which causes the function's caller to modify
  data

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-07 12:19           ` Ingo Molnar
@ 2014-05-09  1:46             ` David Lang
  2014-05-09  2:45               ` Steven Rostedt
  2014-05-09  4:07               ` Masami Hiramatsu
  0 siblings, 2 replies; 60+ messages in thread
From: David Lang @ 2014-05-09  1:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Josh Poimboeuf, Jiri Kosina, Seth Jennings, Masami Hiramatsu,
	Steven Rostedt, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel

On Wed, 7 May 2014, Ingo Molnar wrote:

> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
>> On Tue, May 06, 2014 at 09:32:28AM +0200, Ingo Molnar wrote:
>>>
>>> * Jiri Kosina <jkosina@suse.cz> wrote:
>>>
>>>> On Mon, 5 May 2014, David Lang wrote:
>>>>
>>>>> how would you know that all instances of the datastructure in memory
>>>>> have= been touched? just because all tasks have run and are outside the
>>>>> function in question doesn't tell you data structures have been
>>>>> converted. You have n= o way of knowing when (or if) the next call to
>>>>> the modified function will take place on any potential in-memory
>>>>> structure.
>>>>
>>>> The problem you are trying to avoid here is functions expecting to read
>>>> "v2" format of the data from memory, while there are still tasks that are
>>>> unpredictably writing "v1" format of the data to the memory.
>>>>
>>>> There are several ways to attack this problem:
>>>>
>>>> - stop the whole system, convert all the existing data structures to new
>>>>   format (which might potentially be non-trivial, mostly because you
>>>>   have to *know* where all the data structures have been allocated), apply
>>>>   patch, resume operation [ksplice, probably kpatch in future]
>>>> - restrict the data format to be backwards compatible [to be done
>>>>   manually during patch creation, currently what kGraft needs to do in
>>>>   such case]
>>>> - have a proxy code which can read both "v1" and "v2" formats, and writes
>>>>   back in the same format it has seen the data structure on input
>>>> - once all the *code* has been converted, it still has to understand "v1"
>>>>   and "v2", but it can now start writing out "v2" format only [possible
>>>>   with kGraft, not implemented in automated fashion]
>>>>
>>>> Ideas are of course more than welcome.
>>>
>>> So what I'm curious about, what is the actual 'in the field' distro
>>> experience, about the type of live-patches that get pushed with
>>> urgency?
>>>
>>> My guess would be that the overwhelming majority of live-patches don't
>>> change data structures - and hence the right initial model would be to
>>> ensure (via tooling, and via review) that 'v1' and 'v2' data is
>>> exactly the same.
>>
>> Yes, in general we want to avoid data changes.  In practice, we expect
>> most patches to be small, localized security fixes, so it shouldn't be
>> an issue in most cases.
>>
>> Currently the kpatch tooling detects any compile-time changes to
>> static data and refuses to build the patch module in that case.
>>
>> But there's no way to programmatically detect changes to dynamic
>> data. Which is why the user always has to be very careful when
>> selecting a patch.
>
> And since this is about the system kernel it's dead easy to mess up a
> new kernel function and make the system unbootable - so it's not like
> 'be careful' isn't something implied already.

It's possible to have two versions of code that each work independently, but 
that you can't switch between easily on the fly.

If the new code assumes a lock is held that the old code didn't take, then when 
you switch, you are eventually going to hit a case where the new code trys to 
release a lock it doesn't hold.

detecting all possible cases progromatically seems close to impossible.

but this means that there are two categories of patches

1. patches that are safe to put in a kernel that you are going to boot from

2. patches that are able to be applied on the fly

and the tool isn't going to be able to tell you which category the patch is in. 
It can identify some of the items that make it unlikely or impossible for the 
patch to belong to #2, but don't rely on the tool catching all of them

David Lang

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-09  1:46             ` David Lang
@ 2014-05-09  2:45               ` Steven Rostedt
  2014-05-09  4:07               ` Masami Hiramatsu
  1 sibling, 0 replies; 60+ messages in thread
From: Steven Rostedt @ 2014-05-09  2:45 UTC (permalink / raw)
  To: David Lang
  Cc: Ingo Molnar, Josh Poimboeuf, Jiri Kosina, Seth Jennings,
	Masami Hiramatsu, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel

On Thu, 2014-05-08 at 18:46 -0700, David Lang wrote:
> On Wed, 7 May 2014, Ingo Molnar wrote:

> It's possible to have two versions of code that each work independently, but 
> that you can't switch between easily on the fly.
> 
> If the new code assumes a lock is held that the old code didn't take, then when 
> you switch, you are eventually going to hit a case where the new code trys to 
> release a lock it doesn't hold.

But we are replacing full functions here. Never half way through. Thus,
there should never be "old code didn't take", because we replace the
entire function. If the new code expects a lock to be held where the old
code did not, there's nothing changing anything that leads up to this
new code. The lock still won't be taken.

A patch that expects that wont work on a reboot either.

> 
> detecting all possible cases progromatically seems close to impossible.
> 
> but this means that there are two categories of patches
> 
> 1. patches that are safe to put in a kernel that you are going to boot from
> 
> 2. patches that are able to be applied on the fly
> 
> and the tool isn't going to be able to tell you which category the patch is in. 
> It can identify some of the items that make it unlikely or impossible for the 
> patch to belong to #2, but don't rely on the tool catching all of them

I agree that live patching is going to be a bit trickier than patches
that are applied for reboot. And I also agree that no tool could
determine if the patch can be applied live even if it could be applied
safely for reboot. To do so would probably be on the difficulty scale of
the halting problem.

-- Steve



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-09  1:46             ` David Lang
  2014-05-09  2:45               ` Steven Rostedt
@ 2014-05-09  4:07               ` Masami Hiramatsu
  1 sibling, 0 replies; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-09  4:07 UTC (permalink / raw)
  To: David Lang
  Cc: Ingo Molnar, Josh Poimboeuf, Jiri Kosina, Seth Jennings,
	Steven Rostedt, Frederic Weisbecker, Ingo Molnar, Jiri Slaby,
	linux-kernel

(2014/05/09 10:46), David Lang wrote:
> On Wed, 7 May 2014, Ingo Molnar wrote:
> 
>> * Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>>
>>> On Tue, May 06, 2014 at 09:32:28AM +0200, Ingo Molnar wrote:
>>>>
>>>> * Jiri Kosina <jkosina@suse.cz> wrote:
>>>>
>>>>> On Mon, 5 May 2014, David Lang wrote:
>>>>>
>>>>>> how would you know that all instances of the datastructure in memory
>>>>>> have= been touched? just because all tasks have run and are outside the
>>>>>> function in question doesn't tell you data structures have been
>>>>>> converted. You have n= o way of knowing when (or if) the next call to
>>>>>> the modified function will take place on any potential in-memory
>>>>>> structure.
>>>>>
>>>>> The problem you are trying to avoid here is functions expecting to read
>>>>> "v2" format of the data from memory, while there are still tasks that are
>>>>> unpredictably writing "v1" format of the data to the memory.
>>>>>
>>>>> There are several ways to attack this problem:
>>>>>
>>>>> - stop the whole system, convert all the existing data structures to new
>>>>>   format (which might potentially be non-trivial, mostly because you
>>>>>   have to *know* where all the data structures have been allocated), apply
>>>>>   patch, resume operation [ksplice, probably kpatch in future]
>>>>> - restrict the data format to be backwards compatible [to be done
>>>>>   manually during patch creation, currently what kGraft needs to do in
>>>>>   such case]
>>>>> - have a proxy code which can read both "v1" and "v2" formats, and writes
>>>>>   back in the same format it has seen the data structure on input
>>>>> - once all the *code* has been converted, it still has to understand "v1"
>>>>>   and "v2", but it can now start writing out "v2" format only [possible
>>>>>   with kGraft, not implemented in automated fashion]
>>>>>
>>>>> Ideas are of course more than welcome.
>>>>
>>>> So what I'm curious about, what is the actual 'in the field' distro
>>>> experience, about the type of live-patches that get pushed with
>>>> urgency?
>>>>
>>>> My guess would be that the overwhelming majority of live-patches don't
>>>> change data structures - and hence the right initial model would be to
>>>> ensure (via tooling, and via review) that 'v1' and 'v2' data is
>>>> exactly the same.
>>>
>>> Yes, in general we want to avoid data changes.  In practice, we expect
>>> most patches to be small, localized security fixes, so it shouldn't be
>>> an issue in most cases.
>>>
>>> Currently the kpatch tooling detects any compile-time changes to
>>> static data and refuses to build the patch module in that case.
>>>
>>> But there's no way to programmatically detect changes to dynamic
>>> data. Which is why the user always has to be very careful when
>>> selecting a patch.
>>
>> And since this is about the system kernel it's dead easy to mess up a
>> new kernel function and make the system unbootable - so it's not like
>> 'be careful' isn't something implied already.
> 
> It's possible to have two versions of code that each work independently, but 
> that you can't switch between easily on the fly.
> 
> If the new code assumes a lock is held that the old code didn't take, then when 
> you switch, you are eventually going to hit a case where the new code trys to 
> release a lock it doesn't hold.
> 
> detecting all possible cases progromatically seems close to impossible.

Agreed. Perhaps, spinlock or locks which have small critical section are
usually able to make safe, because a lock caller also does unlock.
But mutex etc. usually have different locker/unlocker function.
In that case, we'll need to check running all functions which is in the
critical region.

> but this means that there are two categories of patches
> 
> 1. patches that are safe to put in a kernel that you are going to boot from
> 
> 2. patches that are able to be applied on the fly
> 
> and the tool isn't going to be able to tell you which category the patch is in. 
> It can identify some of the items that make it unlikely or impossible for the 
> patch to belong to #2, but don't rely on the tool catching all of them

Yeah, I think we'd better start with heuristic decision. Most of the
cases, it could be applied.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-08 12:48                               ` Josh Poimboeuf
@ 2014-05-09  6:21                                 ` Masami Hiramatsu
  0 siblings, 0 replies; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-09  6:21 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, David Lang, Frederic Weisbecker, Seth Jennings,
	Steven Rostedt, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

(2014/05/08 21:48), Josh Poimboeuf wrote:
>> No, I was not demanding that at all, my suggestion was:
>>
>>    > My claim is that if a patch is correct/safe in the old fashioned 
>>    > way, then a fundamental principle is that a live patching 
>>    > subsystem must either safely apply, or safely reject the live 
>>    > patching attempt, independently from any user input.
>>
>> Note the 'if'. It could start simple and stupid, and only allow cases 
>> where we know the patch must be trivially safe (because it does not do 
>> much in terms of disturbing globally visible state). That needs some 
>> tooling help, but apparently tooling help is in place already.
>>
>> And then we can complicate it from there - but have a reasonably 
>> robust starting point that we _know_ works (as long as the 
>> implementation is correct).
> 
> I really wonder if detecting a "trivially safe" patch is even possible.
> 
> Where do you draw the line with the following patches?
> 
> - add a call to another function which modifies global data

This depends on what global data and how. For example, the global data
is used only from the replaced functions, it's a kind of local data.
And also, the global data modification is as designed (e.g. acquiring/
releasing a spinlock), that is also safe.

I think, the bad case is modifying shared global data to new state which
unexpected by other data holders.

> - add an early return or a goto which changes the way the function
>   modifies (or no longer modifies) global data

Ditto, if it is unexpected at other parts, that will be unacceptable.

> - touch a local stack variable which results in global data being
>   modified later in the function
> 
> - return a different value which causes the function's caller to modify
>   data

I think if the local variable or return value change is correctly handled
by the caller (as expected), that is good too.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-06 12:26           ` Steven Rostedt
  2014-05-06 22:33             ` Masami Hiramatsu
@ 2014-05-16 16:27             ` Jiri Kosina
  2014-05-16 17:14               ` Josh Poimboeuf
                                 ` (2 more replies)
  1 sibling, 3 replies; 60+ messages in thread
From: Jiri Kosina @ 2014-05-16 16:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Ingo Molnar, Frederic Weisbecker,
	Josh Poimboeuf, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Tue, 6 May 2014, Steven Rostedt wrote:

> > However, I also think if users can accept such freezing wait-time,
> > it means they can also accept kexec based "checkpoint-restart" patching.
> > So, I think the final goal of the kpatch will be live patching without
> > stopping the machine. I'm discussing the issue on github #138, but that is
> > off-topic. :)
> 
> I agree with Ingo too. Being conservative at first is the right
> approach here. We should start out with a stop_machine making sure that
> everything is sane before we continue. Sure, that's not much different
> than a kexec, but lets take things one step at a time.
> 
> ftrace did the stop_machine (and still does for some archs), and slowly
> moved to a more efficient method. kpatch/kgraft should follow suit.

I don't really agree here.

I actually believe that "lazy" switching kgraft is doing provides a little 
bit more in the sense of consistency than stop_machine()-based aproach.

Consider this scenario:

	void foo()
	{
		for (i=0; i<10000; i++) {
			bar(i);
			something_else(i);
		}
	}

Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
you can easily end-up with old bar() and new bar() being called in two 
consecutive iterations before the loop is even exited, right? (especially 
on preemptible kernel, or if something_else() goes to sleep).

With lazy-switching implemented in kgraft, this can never happen.

So I'd like to ask for a little bit more explanation why you think the 
stop_machine()-based patching provides more sanity/consistency assurance 
than the lazy switching we're doing.

Thanks a lot,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 16:27             ` Jiri Kosina
@ 2014-05-16 17:14               ` Josh Poimboeuf
  2014-05-20  9:37                 ` Jiri Kosina
  2014-05-16 18:09               ` Masami Hiramatsu
  2014-05-16 18:55               ` Steven Rostedt
  2 siblings, 1 reply; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-16 17:14 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Steven Rostedt, Masami Hiramatsu, Ingo Molnar,
	Frederic Weisbecker, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Fri, May 16, 2014 at 06:27:27PM +0200, Jiri Kosina wrote:
> On Tue, 6 May 2014, Steven Rostedt wrote:
> 
> > > However, I also think if users can accept such freezing wait-time,
> > > it means they can also accept kexec based "checkpoint-restart" patching.
> > > So, I think the final goal of the kpatch will be live patching without
> > > stopping the machine. I'm discussing the issue on github #138, but that is
> > > off-topic. :)
> > 
> > I agree with Ingo too. Being conservative at first is the right
> > approach here. We should start out with a stop_machine making sure that
> > everything is sane before we continue. Sure, that's not much different
> > than a kexec, but lets take things one step at a time.
> > 
> > ftrace did the stop_machine (and still does for some archs), and slowly
> > moved to a more efficient method. kpatch/kgraft should follow suit.
> 
> I don't really agree here.
> 
> I actually believe that "lazy" switching kgraft is doing provides a little 
> bit more in the sense of consistency than stop_machine()-based aproach.
> 
> Consider this scenario:
> 
> 	void foo()
> 	{
> 		for (i=0; i<10000; i++) {
> 			bar(i);
> 			something_else(i);
> 		}
> 	}
> 
> Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
> you can easily end-up with old bar() and new bar() being called in two 
> consecutive iterations before the loop is even exited, right? (especially 
> on preemptible kernel, or if something_else() goes to sleep).

Can you clarify why this would be a problem?  Is it because the new
bar() changed some data semantics which confused foo() or
something_else()?

> 
> With lazy-switching implemented in kgraft, this can never happen.
> 
> So I'd like to ask for a little bit more explanation why you think the 
> stop_machine()-based patching provides more sanity/consistency assurance 
> than the lazy switching we're doing.
> 
> Thanks a lot,
> 
> -- 
> Jiri Kosina
> SUSE Labs

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 16:27             ` Jiri Kosina
  2014-05-16 17:14               ` Josh Poimboeuf
@ 2014-05-16 18:09               ` Masami Hiramatsu
  2014-05-17 22:46                 ` Vojtech Pavlik
  2014-05-16 18:55               ` Steven Rostedt
  2 siblings, 1 reply; 60+ messages in thread
From: Masami Hiramatsu @ 2014-05-16 18:09 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Steven Rostedt, Ingo Molnar, Frederic Weisbecker, Josh Poimboeuf,
	Seth Jennings, Ingo Molnar, Jiri Slaby, linux-kernel,
	Peter Zijlstra, Andrew Morton, Linus Torvalds, Thomas Gleixner

(2014/05/17 1:27), Jiri Kosina wrote:
> On Tue, 6 May 2014, Steven Rostedt wrote:
> 
>>> However, I also think if users can accept such freezing wait-time,
>>> it means they can also accept kexec based "checkpoint-restart" patching.
>>> So, I think the final goal of the kpatch will be live patching without
>>> stopping the machine. I'm discussing the issue on github #138, but that is
>>> off-topic. :)
>>
>> I agree with Ingo too. Being conservative at first is the right
>> approach here. We should start out with a stop_machine making sure that
>> everything is sane before we continue. Sure, that's not much different
>> than a kexec, but lets take things one step at a time.
>>
>> ftrace did the stop_machine (and still does for some archs), and slowly
>> moved to a more efficient method. kpatch/kgraft should follow suit.
> 
> I don't really agree here.
> 
> I actually believe that "lazy" switching kgraft is doing provides a little 
> bit more in the sense of consistency than stop_machine()-based aproach.
> 
> Consider this scenario:
> 
> 	void foo()
> 	{
> 		for (i=0; i<10000; i++) {
> 			bar(i);
> 			something_else(i);
> 		}
> 	}

In this case, I'd recommend you to add foo() to replacing target
as dummy. Then, kpatch can ensure foo() is actually not running. :)

> Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
> you can easily end-up with old bar() and new bar() being called in two 
> consecutive iterations before the loop is even exited, right? (especially 
> on preemptible kernel, or if something_else() goes to sleep).
> 
> With lazy-switching implemented in kgraft, this can never happen.

And I guess similar thing may happen with kgraft. If old function and
new function share a non-auto variable and they modify it different way,
the result will be unexpected by the mutual interference.

Thank you,

> 
> So I'd like to ask for a little bit more explanation why you think the 
> stop_machine()-based patching provides more sanity/consistency assurance 
> than the lazy switching we're doing.
> 
> Thanks a lot,
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 16:27             ` Jiri Kosina
  2014-05-16 17:14               ` Josh Poimboeuf
  2014-05-16 18:09               ` Masami Hiramatsu
@ 2014-05-16 18:55               ` Steven Rostedt
  2014-05-16 22:32                 ` Jiri Kosina
  2 siblings, 1 reply; 60+ messages in thread
From: Steven Rostedt @ 2014-05-16 18:55 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Masami Hiramatsu, Ingo Molnar, Frederic Weisbecker,
	Josh Poimboeuf, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Fri, 16 May 2014 18:27:27 +0200 (CEST)
Jiri Kosina <jkosina@suse.cz> wrote:


> > ftrace did the stop_machine (and still does for some archs), and slowly
> > moved to a more efficient method. kpatch/kgraft should follow suit.
> 
> I don't really agree here.
> 
> I actually believe that "lazy" switching kgraft is doing provides a little 
> bit more in the sense of consistency than stop_machine()-based aproach.
> 
> Consider this scenario:
> 
> 	void foo()
> 	{
> 		for (i=0; i<10000; i++) {
> 			bar(i);
> 			something_else(i);
> 		}
> 	}
> 
> Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
> you can easily end-up with old bar() and new bar() being called in two 
> consecutive iterations before the loop is even exited, right? (especially 
> on preemptible kernel, or if something_else() goes to sleep).

And bar() should still do the same result. Otherwise you would think
that foo should change too.

> 
> With lazy-switching implemented in kgraft, this can never happen.
> 
> So I'd like to ask for a little bit more explanation why you think the 
> stop_machine()-based patching provides more sanity/consistency assurance 
> than the lazy switching we're doing.

Here's what I'm more concerned with. With "lazy" switching you can have
two tasks running two different versions of bar(). What happens if the
locking of data within bar changes? Say the data was protected
incorrectly with mutex(X) and you now need to protect it with mutex(Y).

With stop machine, you can make sure everyone is out of bar() and all
tasks will use the same mutex to access the data. But with a lazy
approach, one task can be protecting the data with mutex(X) and the
other with mutex(Y) causing both tasks to be accessing the data at the
same time.

*That* is what I'm more concerned about.

I believe there are more issues with running the two different versions
of the same function at the same time than there are with iterations of
two different versions of the call. One would expect that the results
should stay the same and if not, then the callers would need to be
changed too.

-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 18:55               ` Steven Rostedt
@ 2014-05-16 22:32                 ` Jiri Kosina
  2014-05-17  0:27                   ` Steven Rostedt
  0 siblings, 1 reply; 60+ messages in thread
From: Jiri Kosina @ 2014-05-16 22:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Ingo Molnar, Frederic Weisbecker,
	Josh Poimboeuf, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Fri, 16 May 2014, Steven Rostedt wrote:

> > With lazy-switching implemented in kgraft, this can never happen.
> > 
> > So I'd like to ask for a little bit more explanation why you think the 
> > stop_machine()-based patching provides more sanity/consistency assurance 
> > than the lazy switching we're doing.
> 
> Here's what I'm more concerned with. With "lazy" switching you can have
> two tasks running two different versions of bar(). What happens if the
> locking of data within bar changes? Say the data was protected
> incorrectly with mutex(X) and you now need to protect it with mutex(Y).
> 
> With stop machine, you can make sure everyone is out of bar() and all
> tasks will use the same mutex to access the data. But with a lazy
> approach, one task can be protecting the data with mutex(X) and the
> other with mutex(Y) causing both tasks to be accessing the data at the
> same time.
> 
> *That* is what I'm more concerned about.

That's true, and we come back to what has been said at the very beginning 
for both aproaches -- you can't really get away without manual human 
inspection of the patches that are being applied.

The case you have outlined is indeed problematic for the "lazy switching" 
aproach, and can be worked around (interim function, which takes both 
mutexes in well defined order, for example).

You can construct a broken locking scenario for stop_machine() aproach as 
well -- consider a case when you are exchaing a function which changes the 
locking order of two locks/mutexes. How do you deal with the rest of the 
code where the locks are being acquired, but not through the functions 
you've exchanged?

So again -- there is no disagreement, I believe, about the fact that "live 
patches" can't be reliably auto-generated, and human inspection will 
always be necessary. Given the intended use-case (serious CVEs mostly, 
handled by distro vendors), this is fine.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 22:32                 ` Jiri Kosina
@ 2014-05-17  0:27                   ` Steven Rostedt
  2014-05-17  7:10                     ` Jiri Kosina
  0 siblings, 1 reply; 60+ messages in thread
From: Steven Rostedt @ 2014-05-17  0:27 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Masami Hiramatsu, Ingo Molnar, Frederic Weisbecker,
	Josh Poimboeuf, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Sat, 17 May 2014 00:32:10 +0200 (CEST)
Jiri Kosina <jkosina@suse.cz> wrote:


> That's true, and we come back to what has been said at the very beginning 
> for both aproaches -- you can't really get away without manual human 
> inspection of the patches that are being applied.
> 
> The case you have outlined is indeed problematic for the "lazy switching" 
> aproach, and can be worked around (interim function, which takes both 
> mutexes in well defined order, for example).
> 
> You can construct a broken locking scenario for stop_machine() aproach as 
> well -- consider a case when you are exchaing a function which changes the 
> locking order of two locks/mutexes. How do you deal with the rest of the 
> code where the locks are being acquired, but not through the functions 
> you've exchanged?

I'm a bit confused by this. If you change locking order and there's
other functions that acquire it in reverse order that's not in the
patch, that sounds like you just introduced a new bug.

> 
> So again -- there is no disagreement, I believe, about the fact that "live 
> patches" can't be reliably auto-generated, and human inspection will 
> always be necessary. Given the intended use-case (serious CVEs mostly, 
> handled by distro vendors), this is fine.
> 

Right, I absolutely agree that the real use case is to fix off by one
errors and buffer overflows. Anything that is more complex really needs
a reboot (at a minimum, a kexec reboot). I know our customers would love
to see this upgrading entire kernels, but that's rather unrealistic.

Why I still favor the stop_machine approach, is because the method of
patching is a bit simpler that way. A "lazy" approach will be more
complex and more likely to be buggy. The thing I'm arguing here is not
the end result being a problem, but the implementation of the patching
itself causing bugs.

I rather have a "lazy" approach, but like ftrace and its breakpoint
method, the stop_machine approach is the simpler way to make sure the
patching works before we try to optimize it.

-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-17  0:27                   ` Steven Rostedt
@ 2014-05-17  7:10                     ` Jiri Kosina
  0 siblings, 0 replies; 60+ messages in thread
From: Jiri Kosina @ 2014-05-17  7:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Ingo Molnar, Frederic Weisbecker,
	Josh Poimboeuf, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Fri, 16 May 2014, Steven Rostedt wrote:

> Why I still favor the stop_machine approach, is because the method of
> patching is a bit simpler that way. A "lazy" approach will be more
> complex and more likely to be buggy. The thing I'm arguing here is not
> the end result being a problem, but the implementation of the patching
> itself causing bugs.

Well, what can I say to this.

	21 files changed, 594 insertions(+), 10 deletions(-)

that's a complete implementation, including comments and some 
documentation.

Yes, it still has TODOs (such as patching modules as they are modprobed, 
we're working on multi-arch support, etc), but it's more or less complete 
working x86 skeleton.

> I rather have a "lazy" approach, 

I'm glad to hear that, thanks :)

> but like ftrace and its breakpoint method, the stop_machine approach is 
> the simpler way to make sure the patching works before we try to 
> optimize it.

I am still not convinced that it's more complex. It's actually lazy both 
in the way it performs patching and in implementation -- we basically set 
a flag, flip the switch, and let the universe converge to a new state by 
itself.

It's basically hard to argue about level of bugginess when no actual bugs 
are being pointed out :) (well, yes, the kthreads stuff needs to be taken 
care of, but both kgraft and kpatch have similar issues there).

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 18:09               ` Masami Hiramatsu
@ 2014-05-17 22:46                 ` Vojtech Pavlik
  0 siblings, 0 replies; 60+ messages in thread
From: Vojtech Pavlik @ 2014-05-17 22:46 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Kosina, Steven Rostedt, Ingo Molnar, Frederic Weisbecker,
	Josh Poimboeuf, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Sat, May 17, 2014 at 03:09:57AM +0900, Masami Hiramatsu wrote:
> (2014/05/17 1:27), Jiri Kosina wrote:
> > On Tue, 6 May 2014, Steven Rostedt wrote:
> > 
> >>> However, I also think if users can accept such freezing wait-time,
> >>> it means they can also accept kexec based "checkpoint-restart" patching.
> >>> So, I think the final goal of the kpatch will be live patching without
> >>> stopping the machine. I'm discussing the issue on github #138, but that is
> >>> off-topic. :)
> >>
> >> I agree with Ingo too. Being conservative at first is the right
> >> approach here. We should start out with a stop_machine making sure that
> >> everything is sane before we continue. Sure, that's not much different
> >> than a kexec, but lets take things one step at a time.
> >>
> >> ftrace did the stop_machine (and still does for some archs), and slowly
> >> moved to a more efficient method. kpatch/kgraft should follow suit.
> > 
> > I don't really agree here.
> > 
> > I actually believe that "lazy" switching kgraft is doing provides a little 
> > bit more in the sense of consistency than stop_machine()-based aproach.
> > 
> > Consider this scenario:
> > 
> > 	void foo()
> > 	{
> > 		for (i=0; i<10000; i++) {
> > 			bar(i);
> > 			something_else(i);
> > 		}
> > 	}
> 
> In this case, I'd recommend you to add foo() to replacing target
> as dummy. Then, kpatch can ensure foo() is actually not running. :)

The problem is: Where do you stop?

Adding the whole list of functions that transitively may ever call bar()
can be pretty much the whole kernel. And given that you can be calling
via pointer, you can't often even tell who is calling bar().

So yes, a developer could manually include foo() in the to be patched
list, so that it is checked that foo() isn't being executed while
patching.

But that would have to be done entirely manually after thoroughly
understanding the implications of the patch.

> > Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
> > you can easily end-up with old bar() and new bar() being called in two 
> > consecutive iterations before the loop is even exited, right? (especially 
> > on preemptible kernel, or if something_else() goes to sleep).
> > 
> > With lazy-switching implemented in kgraft, this can never happen.
> 
> And I guess similar thing may happen with kgraft. If old function and
> new function share a non-auto variable and they modify it different way,
> the result will be unexpected by the mutual interference.

By non-auto I assume you mean globally accessible variable or data
structures.

Yes, with kGraft the new function has to be backward compatible with
pre-patch global data layout and semantics.

The parameters, return value and their semantics are free to change,
though, and kGraft guarantees consistency there.

-- 
Vojtech Pavlik
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-16 17:14               ` Josh Poimboeuf
@ 2014-05-20  9:37                 ` Jiri Kosina
  2014-05-20 12:59                   ` Josh Poimboeuf
  0 siblings, 1 reply; 60+ messages in thread
From: Jiri Kosina @ 2014-05-20  9:37 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Steven Rostedt, Masami Hiramatsu, Ingo Molnar,
	Frederic Weisbecker, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Fri, 16 May 2014, Josh Poimboeuf wrote:

> > Consider this scenario:
> > 
> > 	void foo()
> > 	{
> > 		for (i=0; i<10000; i++) {
> > 			bar(i);
> > 			something_else(i);
> > 		}
> > 	}
> > 
> > Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
> > you can easily end-up with old bar() and new bar() being called in two 
> > consecutive iterations before the loop is even exited, right? (especially 
> > on preemptible kernel, or if something_else() goes to sleep).
> 
> Can you clarify why this would be a problem?  Is it because the new
> bar() changed some data semantics which confused foo() or
> something_else()?

I guess the example I used wasn't really completely illustrative, sorry 
for that. But I guess this has been answered later in the thread already; 
the thing is that you don't have a complete callgraph available (at least 
I believe you don't ...?), so you don't really know where your patched 
function will be called from, and thus you can't change function arguments 
or return value semantics; with lazy aproach, you can do that.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-20  9:37                 ` Jiri Kosina
@ 2014-05-20 12:59                   ` Josh Poimboeuf
  0 siblings, 0 replies; 60+ messages in thread
From: Josh Poimboeuf @ 2014-05-20 12:59 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Steven Rostedt, Masami Hiramatsu, Ingo Molnar,
	Frederic Weisbecker, Seth Jennings, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

On Tue, May 20, 2014 at 11:37:16AM +0200, Jiri Kosina wrote:
> On Fri, 16 May 2014, Josh Poimboeuf wrote:
> 
> > > Consider this scenario:
> > > 
> > > 	void foo()
> > > 	{
> > > 		for (i=0; i<10000; i++) {
> > > 			bar(i);
> > > 			something_else(i);
> > > 		}
> > > 	}
> > > 
> > > Let's say you want to live-patch bar(). With stop_machine()-based aproach, 
> > > you can easily end-up with old bar() and new bar() being called in two 
> > > consecutive iterations before the loop is even exited, right? (especially 
> > > on preemptible kernel, or if something_else() goes to sleep).
> > 
> > Can you clarify why this would be a problem?  Is it because the new
> > bar() changed some data semantics which confused foo() or
> > something_else()?
> 
> I guess the example I used wasn't really completely illustrative, sorry 
> for that. But I guess this has been answered later in the thread already; 
> the thing is that you don't have a complete callgraph available (at least 
> I believe you don't ...?), so you don't really know where your patched 
> function will be called from, and thus you can't change function arguments 
> or return value semantics; with lazy aproach, you can do that.

If the patch changes return semantics, it would also need to change all
the affected callers to handle the new semantics.  But I think that's a
general statement for all patches, not just for the live patching case.
Otherwise it's just a bad patch in general.

For changed function arguments, the compiler will catch that and
recompile all the callers, so the kpatch tooling would detect that the
callers changed and patch them too.

But to be fair, here's an example where kGraft does better:

  foo = bar();
  something_else();
  baz(foo);

Let's say that bar()'s return semantics change, and baz() is also
patched to handle the new semantics.  If the stop_machine occurs in
something_else(), then the new baz() will be called with the old
"version" of foo.

If the user were smart enough to be aware of this potential scenario, it
could be solved with a kGraft-esque approach for data semantic changes,
where the patched baz() would need to handle both versions of foo.

This is another example of why changing return or data semantics is
tricky and should be avoided.

BTW, this is also an example of why the stop_machine and lazy approaches
aren't interchangeable.  The patch creator has to know which method will
be used to apply the patch in order to be able to determine whether a
patch is "safe".

-- 
Josh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-05-08  7:08                             ` Ingo Molnar
  2014-05-08  7:29                               ` Masami Hiramatsu
  2014-05-08 12:48                               ` Josh Poimboeuf
@ 2014-06-14 20:31                               ` Pavel Machek
  2014-06-15  6:57                                 ` Ingo Molnar
  2 siblings, 1 reply; 60+ messages in thread
From: Pavel Machek @ 2014-06-14 20:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Lang, Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner

Hi!

> > in terms of hit-patching kernels you are correct.
> > 
> > but that's a far cry from what it sounded like you were demanding 
> > (that it must handle any kernel patch)
> 
> No, I was not demanding that at all, my suggestion was:
> 
>    > My claim is that if a patch is correct/safe in the old fashioned 
>    > way, then a fundamental principle is that a live patching 
>    > subsystem must either safely apply, or safely reject the live 
>    > patching attempt, independently from any user input.
> 
> Note the 'if'. It could start simple and stupid, and only allow cases 
> where we know the patch must be trivially safe (because it does not do 
> much in terms of disturbing globally visible state). That needs some 
> tooling help, but apparently tooling help is in place already.

Actually, even if patch is very trivial, it will be difficult to determine
if it is safe. Consider adding error check:

int
do_something(void)
{
#if 0
  if (1)
    return -1;
#endif
  return 0;
}

void
main(void)
{
  if (do_something())
    printf("error happened\n");
}

Imagine changing that #if 0 to #if 1. But gcc at -O3 already optimized out the error
message. So... do we compile whole second kernel and compare the binaries? I think I seen
remark "don't try to do binary compares" somewhere...

										Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching
  2014-06-14 20:31                               ` Pavel Machek
@ 2014-06-15  6:57                                 ` Ingo Molnar
  0 siblings, 0 replies; 60+ messages in thread
From: Ingo Molnar @ 2014-06-15  6:57 UTC (permalink / raw)
  To: Pavel Machek
  Cc: David Lang, Josh Poimboeuf, Frederic Weisbecker, Seth Jennings,
	Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Jiri Slaby,
	linux-kernel, Peter Zijlstra, Andrew Morton, Linus Torvalds,
	Thomas Gleixner


* Pavel Machek <pavel@ucw.cz> wrote:

> Hi!
> 
> > > in terms of hit-patching kernels you are correct.
> > > 
> > > but that's a far cry from what it sounded like you were demanding 
> > > (that it must handle any kernel patch)
> > 
> > No, I was not demanding that at all, my suggestion was:
> > 
> >    > My claim is that if a patch is correct/safe in the old fashioned 
> >    > way, then a fundamental principle is that a live patching 
> >    > subsystem must either safely apply, or safely reject the live 
> >    > patching attempt, independently from any user input.
> > 
> > Note the 'if'. It could start simple and stupid, and only allow 
> > cases where we know the patch must be trivially safe (because it 
> > does not do much in terms of disturbing globally visible state). 
> > That needs some tooling help, but apparently tooling help is in 
> > place already.
> 
> Actually, even if patch is very trivial, it will be difficult to 
> determine if it is safe. Consider adding error check:
> 
> int
> do_something(void)
> {
> #if 0
>   if (1)
>     return -1;
> #endif
>   return 0;
> }
> 
> void
> main(void)
> {
>   if (do_something())
>     printf("error happened\n");
> }
> 
> Imagine changing that #if 0 to #if 1. But gcc at -O3 already 
> optimized out the error message. So... do we compile whole second 
> kernel and compare the binaries? I think I seen remark "don't try to 
> do binary compares" somewhere...

Fair enough.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2014-06-15  6:57 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-01 15:52 [RFC PATCH 0/2] kpatch: dynamic kernel patching Josh Poimboeuf
2014-05-01 15:52 ` [RFC PATCH 1/2] kpatch: add TAINT_KPATCH flag Josh Poimboeuf
2014-05-01 15:52 ` [RFC PATCH 2/2] kpatch: add kpatch core module Josh Poimboeuf
2014-05-01 20:45 ` [RFC PATCH 0/2] kpatch: dynamic kernel patching Andi Kleen
2014-05-01 21:01   ` Josh Poimboeuf
2014-05-01 21:06     ` Andi Kleen
2014-05-01 21:27       ` Josh Poimboeuf
2014-05-01 21:39         ` Josh Poimboeuf
2014-05-02  1:30       ` Masami Hiramatsu
2014-05-02  8:37 ` Jiri Kosina
2014-05-02 13:29   ` Josh Poimboeuf
2014-05-02 13:10 ` Jiri Kosina
2014-05-02 13:37   ` Josh Poimboeuf
2014-05-05 23:34   ` David Lang
2014-05-05 23:52     ` Jiri Kosina
2014-05-06  1:59       ` David Lang
2014-05-06 12:17         ` Josh Poimboeuf
2014-05-06  7:32       ` Ingo Molnar
2014-05-06  8:03         ` Jiri Kosina
2014-05-06 12:23         ` Josh Poimboeuf
2014-05-07 12:19           ` Ingo Molnar
2014-05-09  1:46             ` David Lang
2014-05-09  2:45               ` Steven Rostedt
2014-05-09  4:07               ` Masami Hiramatsu
2014-05-05  8:55 ` Ingo Molnar
2014-05-05 13:26   ` Josh Poimboeuf
2014-05-05 14:10     ` Frederic Weisbecker
2014-05-05 18:43       ` Ingo Molnar
2014-05-05 21:49         ` Frederic Weisbecker
2014-05-06 12:12           ` Josh Poimboeuf
2014-05-06 12:33             ` Steven Rostedt
2014-05-06 22:49               ` Masami Hiramatsu
2014-05-06 14:05             ` Frederic Weisbecker
2014-05-06 14:50               ` Josh Poimboeuf
2014-05-07 12:24                 ` Ingo Molnar
2014-05-07 15:41                   ` Josh Poimboeuf
2014-05-07 15:57                     ` Ingo Molnar
2014-05-07 16:43                       ` Josh Poimboeuf
2014-05-07 22:56                       ` David Lang
2014-05-08  6:12                         ` Ingo Molnar
2014-05-08  6:50                           ` David Lang
2014-05-08  7:08                             ` Ingo Molnar
2014-05-08  7:29                               ` Masami Hiramatsu
2014-05-08 12:48                               ` Josh Poimboeuf
2014-05-09  6:21                                 ` Masami Hiramatsu
2014-06-14 20:31                               ` Pavel Machek
2014-06-15  6:57                                 ` Ingo Molnar
2014-05-06 11:45         ` Masami Hiramatsu
2014-05-06 12:26           ` Steven Rostedt
2014-05-06 22:33             ` Masami Hiramatsu
2014-05-16 16:27             ` Jiri Kosina
2014-05-16 17:14               ` Josh Poimboeuf
2014-05-20  9:37                 ` Jiri Kosina
2014-05-20 12:59                   ` Josh Poimboeuf
2014-05-16 18:09               ` Masami Hiramatsu
2014-05-17 22:46                 ` Vojtech Pavlik
2014-05-16 18:55               ` Steven Rostedt
2014-05-16 22:32                 ` Jiri Kosina
2014-05-17  0:27                   ` Steven Rostedt
2014-05-17  7:10                     ` Jiri Kosina

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.